Monthly Archives: February 2012

Avoiding a Hung NFS Client in an NFS Server and Client Architecture Setup

Abstract – We are setting up an NFS server and client architecture to provide high availability for websites running on a web cluster. The cluster consists of identical servers, each with its own copy of the data installed locally, so that if any node in the cluster goes down, the sites see minimal downtime.

The website has a guest column feature that lets guests write content for the site. The biggest challenge was sharing a disk so that this content lives in a shared location, not on an individual server's disk, and every node can read and write it simultaneously.

To solve this we settled on NFS, so that the shared resource is available to all cluster nodes. If the NFS server goes down, the content posted on the shared disk will not be shown, but the rest of the sites will keep working. There may be a few minutes of downtime, but that is better than running the sites on a standalone server, where an outage can last more than 8 hours.

After implementation we observed that when the NFS server goes down, the NFS client hangs, which defeats the purpose for which we adopted NFS in our architecture. The load on the NFS client increased, and it reported a stale NFS error caused by a modification-time mismatch between the NFS server and client. The websites stopped working.

While investigating, we considered writing a daemon on the NFS client that queries the NFS server once per second to check whether the NFS service is still listening on its configured port.

If the daemon gets no reply, it simply unmounts the NFS share on the NFS client, so that only the NFS-backed content is affected and the rest of the sites keep running without a load problem on the server.
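One detail this plan leaves open is what the daemon should do on later checks, once the share is already unmounted. A minimal sketch of a guard is shown here, assuming a Linux client where /proc/mounts lists active mounts. The function name is_nfs_mounted and its optional second argument are hypothetical and not part of the original setup. Reading /proc/mounts does not access the NFS share itself, so this check should not block on a dead server.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if MOUNTPOINT is currently an NFS mount.
# Usage: is_nfs_mounted MOUNTPOINT [MOUNTS_FILE]
# MOUNTS_FILE defaults to /proc/mounts (assumption: Linux client).
is_nfs_mounted() {
    local mountpoint=$1
    local mounts_file=${2:-/proc/mounts}
    # Fields in /proc/mounts: device, mount point, filesystem type, options, ...
    # Match the mount point exactly and accept the types nfs and nfs4.
    awk -v mp="$mountpoint" \
        '$2 == mp && $3 ~ /^nfs/ { found = 1 } END { exit !found }' \
        "$mounts_file"
}
```

The daemon could then call the unmount only when is_nfs_mounted /nfsdata succeeds, which avoids a "not mounted" error on every later check. Mount points containing spaces are escaped in /proc/mounts and are not handled by this sketch.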

Architectural Setup

Apache-cluster

Here are the steps for the NFS server and client architecture:

1 – Since we don't have dedicated NFS storage, we configured Apache Server 1 as the NFS server and Apache Server 2 as the NFS client.

2 – We exported a separate partition on Apache Server 1 and mounted it on Apache Server 2.

3 – We configured the following script as a daemon on Apache Server 2 (the NFS client) to check whether the NFS server is still alive. If the server is not reachable, the script unmounts the shared folder on Apache Server 2. This keeps the NFS client from hanging or building up load, and the rest of the content continues to be served from the client's own local disk.

4 – The script:

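#!/bin/bash
# Unmount the NFS share if the NFS server stops answering on its port.

NC=$(which nc)
IPADDRESS="x.x.x.x"   # address of the NFS server
NFSPORT=2049

# nc -z only tests whether the port accepts a connection; -w 2 limits the wait to 2 seconds.
# The exit status is tested because not every nc build prints a message on success.
if ! "$NC" -z -w 2 "$IPADDRESS" "$NFSPORT" >/dev/null 2>&1
then
    echo "$IPADDRESS is rebooting/shut down, or the NFS service is interrupted/stopped"
    /bin/umount -f /nfsdata
fi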
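The script above performs a single check, while the plan calls for a check every second. One possible way to get there is a small polling loop, sketched below. The function names poll_until_failure, nfs_alive, and drop_share are hypothetical, and the loop stops after the first failed check, which is an assumption since the post does not say what the daemon does after unmounting.

```shell
#!/bin/bash
# Hypothetical wrapper: call a check function at a fixed interval and
# call an action function the first time the check fails.

IPADDRESS="x.x.x.x"   # address of the NFS server (placeholder from the script)
NFSPORT=2049

# Succeeds while the NFS port accepts connections.
nfs_alive() {
    nc -z -w 2 "$IPADDRESS" "$NFSPORT" >/dev/null 2>&1
}

# Action taken when the server is unreachable.
drop_share() {
    echo "$IPADDRESS is not answering on port $NFSPORT, unmounting /nfsdata"
    /bin/umount -f /nfsdata
}

# poll_until_failure CHECK_FN ACTION_FN [INTERVAL_SECONDS] [MAX_CHECKS]
# MAX_CHECKS=0 (the default) means loop forever, as a daemon would.
# Returns 1 after running ACTION_FN, 0 if MAX_CHECKS passed without a failure.
poll_until_failure() {
    local check_fn=$1
    local action_fn=$2
    local interval=${3:-1}
    local max_checks=${4:-0}
    local count=0
    while [ "$max_checks" -eq 0 ] || [ "$count" -lt "$max_checks" ]; do
        if ! "$check_fn"; then
            "$action_fn"
            return 1
        fi
        count=$((count + 1))
        sleep "$interval"
    done
    return 0
}
```

Running poll_until_failure nfs_alive drop_share 1 in the background, or from an init script, would give the once-per-second behaviour described earlier. Passing the check and the action as functions keeps the loop separate from the nc and umount details.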