Problem - After a reboot, the *first* `mount -a -t nfs' gets stuck in `D'
2006-09-05 - By Keith Lewis
Hello Taroon-list,
Running Red Hat Enterprise Linux AS release 3 (Taroon Update 8).
Kernels - on one box I have tried 2.4.21-47.EL, 2.4.21-40.EL and 2.4.21-37.0.1.EL; on another, 2.4.21-40.ELsmp.
Hardware - IBM 1U rack mounted boxes (x-305 iirc), Intel(R) Pentium(R) 4 CPU 1.80GHz.
Last night something happened - possibly a network event. Since then:
When I reboot my machines they fail to come up all the way. They stop in netfs. They do come up all the way if I `chkconfig netfs off'.
*BUT* if I then log in and type `mount -a -t nfs &', the mount process hangs in state D.
However, if I type `mount -a -t nfs &' *again*, this time it succeeds quite normally. Disks are mounted. All is well.
This has been seen on two machines so far.
The first mount process is still in D:
# ps auxww | grep mount
root 1880 0.0 0.0 3776 812 pts/0 D 10:48 0:00 mount -a -t nfs
root 2532 0.0 0.0 3688 672 pts/0 S 11:18 0:00 grep mount
#
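For anyone wanting to check the other boxes for the same symptom, here is a rough sketch of how the `ps auxww' listing could be filtered for processes in state D. It is only an illustration: the function name `d_state_processes' and the `SAMPLE' text are made up for this example, and it assumes the usual eleven-column `ps aux' layout (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND) with no spaces inside the START column.

```python
# Sketch only: pick out processes in uninterruptible sleep (STAT starts with "D")
# from text in the usual `ps aux` column layout. The names here are hypothetical.

SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1880 0.0 0.0 3776 812 pts/0 D 10:48 0:00 mount -a -t nfs
root 2532 0.0 0.0 3688 672 pts/0 S 11:18 0:00 grep mount
"""


def d_state_processes(ps_output):
    """Return a list of (pid, stat, command) for every process in state D."""
    hits = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so the command keeps its own spaces.
        fields = line.split(None, 10)
        # Skip the header line, blank lines, and anything too short to parse.
        if len(fields) < 11 or not fields[1].isdigit():
            continue
        stat = fields[7]
        if stat.startswith("D"):
            hits.append((int(fields[1]), stat, fields[10]))
    return hits


if __name__ == "__main__":
    for pid, stat, command in d_state_processes(SAMPLE):
        print(pid, stat, command)
```

Run against the listing above, this should report only the first mount (pid 1880). On a live box the text would come from `ps auxww' instead of the sample string.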
I'm really really puzzled.
Turning off iptables made no difference. The NFS server machines can be pinged successfully with a range of packet sizes before, during and after the mount attempts. No software has been changed that we know about. The NFS mounts are of disks only used by an application. /bin/mount, indeed all the AS3 files, are on a local hard disk.
# dmesg | grep eth
...
eth0: Tigon3 [partno(BCM95703A30) rev 1002 PHY(5703)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:0f:4f:40
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0]
eth0: dma_rwctrl[769f4000]
divert: allocating divert_blk for eth1
eth1: Tigon3 [partno(BCM95703A30) rev 1002 PHY(5703)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:0f:4f:41
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0]
eth1: dma_rwctrl[769f4000]
tg3: eth0: Link is up at 1000 Mbps, full duplex.
tg3: eth0: Flow control is off for TX and off for RX.
Syslog only says:
Sep 6 10:48:54 <name> kernel: nfs: server <othername> not responding, still trying
Does anybody have any clues?
Keith
--
Taroon-list mailing list
Taroon-list@(protected)
https://www.redhat.com/mailman/listinfo/taroon-list