From: Andrew Gordon
Date: Fri, 30 Mar 2001 11:17:24 +0100 (BST)
To: freebsd-stable@freebsd.org
Subject: NFS problems in 4.3-RC (maybe Vinum?)

On Wednesday, I upgraded an NFS server to 4.3-RC. This machine has an IDE
drive with the system partitions and a Vinum RAID5 on 5 SCSI drives for
/home which is the main NFS export, plus a single SCSI drive (non-Vinum)
exported as /cd. Soft updates are enabled everywhere except the Vinum
volume.

The server had been running 4.2-STABLE without problems since mid-January
(at which time there were some Vinum-related panics, but nothing like the
current behaviour).

Since the upgrade, it has failed 4 times:

1) Apparently stopped serving NFS to one client - tcpdump showed incoming
   UDP from that client but no replies. Server rebooted cleanly and
   problem went away.

2) Stopped providing NFS service to any clients. On reboot,

   "syncing disks... 5 1 1 1 1 1 1 1 1 1 1 1 1 1 giving up on 1 buffers"

   The automatic fsck on all the filesystems threw up one error on /home
   (INCORRECT BLOCK COUNT I=12634345 (2 should be 0)), suggesting that
   the un-flushed block was in the Vinum volume.

3) Stopped serving NFS. This time I noticed on ps that the nfsd processes
   were all stuck:

   0   523     1   0   2  0   360  180 accept Is    ??    0:00.00 nfsd: master
   0   525   523   0  -2  0   352  172 getblk D     ??    0:06.24 nfsd: server
   0   526   523   0 -14  0   352  172 inode  D     ??    0:00.07 nfsd: server
   0   527   523   0 -14  0   352  172 inode  D     ??    0:00.01 nfsd: server
   0   528   523   0 -14  0   352  172 inode  D     ??    0:00.01 nfsd: server

   A reboot hung the machine: ctrl-T gave:

   load: 0.00  cmd: reboot 62014 [inode] 0.00u 0.00s 0% 252k

   After a hard reset, the fsck gave three "incorrect block count" errors
   on /home (also one unref file in /var), but again came up without
   needing manual fsck.

4) As for 2), except that this time the fsck found nothing wrong on
   /home, but a load of unref files on /var. A 'ps' before doing the
   reboot showed the nfsd processes stuck again:

   0   264     1   0   2  0   360  132 accept Is    ??    0:00.00 nfsd: master
   0   266   264   0 -14  0   352  124 inode  D     ??    0:06.15 nfsd: server
   0   267   264   0 -14  0   352  124 inode  D     ??    0:00.26 nfsd: server
   0   268   264   0 -14  0   352  124 inode  D     ??    0:00.02 nfsd: server
   0   269   264   0 -14  0   352  124 inode  D     ??    0:00.04 nfsd: server

The load on the machine would have been much lower than usual, since most
of the users are on holiday (which is why I did the upgrade in the first
place). The only thing that has changed apart from the upgrade is that the
/cd filesystem, while present on the machine for some time and full of
data, would not have been used until this week as various clients were
re-configured to use it; however it doesn't seem particularly involved
(and also one of the failures happened around 02:00 when all of the
machines mounting /cd were powered off: there would only have been me
(logged into another machine that mounts /home) and various cron jobs
active at the time).

I say "maybe Vinum?" in the subject since the main NFS export is on a
Vinum RAID5, but there isn't really any evidence to suggest Vinum is to
blame.

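The manual check described in failures 3) and 4), looking through ps
output for nfsd servers stuck in disk wait, could be automated along the
following lines. This is only a sketch: the function name stuck_nfsd is
made up for illustration, and it assumes the columns follow the usual
`ps -l` long layout (UID, PID, PPID, CPU, PRI, NI, VSZ, RSS, WCHAN, STAT,
TT, TIME, COMMAND), which the message itself does not state. The sample
input is the ps output quoted under failure 3).

```python
# Sample input: the ps lines quoted under failure 3) of the report.
SAMPLE = """\
0 523 1 0 2 0 360 180 accept Is ?? 0:00.00 nfsd: master
0 525 523 0 -2 0 352 172 getblk D ?? 0:06.24 nfsd: server
0 526 523 0 -14 0 352 172 inode D ?? 0:00.07 nfsd: server
0 527 523 0 -14 0 352 172 inode D ?? 0:00.01 nfsd: server
0 528 523 0 -14 0 352 172 inode D ?? 0:00.01 nfsd: server
"""


def stuck_nfsd(ps_text):
    """Return (pid, wchan) for each nfsd process whose STAT starts with 'D'.

    Assumes 13 whitespace-separated columns in the order
    UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND.
    """
    stuck = []
    for line in ps_text.splitlines():
        # Split into at most 13 fields so COMMAND keeps its embedded spaces.
        fields = line.split(None, 12)
        if len(fields) < 13 or not fields[1].isdigit():
            continue  # skip a header line or anything too short to parse
        pid, wchan, stat, command = fields[1], fields[8], fields[9], fields[12]
        if command.startswith("nfsd") and stat.startswith("D"):
            stuck.append((int(pid), wchan))
    return stuck


if __name__ == "__main__":
    for pid, wchan in stuck_nfsd(SAMPLE):
        print(pid, wchan)
```

On the sample this reports the four server processes and leaves out the
master, which is sleeping in accept rather than in disk wait. Such a check
only flags the symptom; it says nothing about which filesystem or driver
is holding the lock.
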
I re-cvsuped this morning in case a fix had appeared; I haven't rebuilt
yet, but none of the diffs look at all relevant:

 U contrib/sendmail/FREEBSD-upgrade
 U lib/libc/gen/glob.c
 U release/sysinstall/main.c
 U sys/dev/vinum/vinumconfig.c
 U sys/net/if.c
 U sys/net/if_vlan.c
 U sys/netinet/if_ether.c
 U sys/netinet/ip_icmp.c
 U sys/netinet/tcp_subr.c
 U usr.bin/fetch/fetch.c
 U usr.bin/netstat/if.c
 U usr.sbin/ppp/bundle.c
 U usr.sbin/ppp/ether.c
 U usr.sbin/ppp/iface.c
 U usr.sbin/ppp/iface.h
 U usr.sbin/ppp/ppp.8