From owner-freebsd-bugs Thu Apr 17 05:30:05 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id FAA28570 for bugs-outgoing; Thu, 17 Apr 1997 05:30:05 -0700 (PDT)
Received: (from gnats@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id FAA28562; Thu, 17 Apr 1997 05:30:02 -0700 (PDT)
Date: Thu, 17 Apr 1997 05:30:02 -0700 (PDT)
Message-Id: <199704171230.FAA28562@freefall.freebsd.org>
To: freebsd-bugs
Cc:
From: Thomas David Rivers
Subject: Re: kern/3304: NFS V2 readdir hangs
Reply-To: Thomas David Rivers
Sender: owner-bugs@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

The following reply was made to PR kern/3304; it has been noted by GNATS.

From: Thomas David Rivers
To: ponds!freebsd.org!gpalmer, ponds!lakes.water.net!rivers
Cc: ponds!freebsd.org!FreeBSD-gnats-submit
Subject: Re: kern/3304: NFS V2 readdir hangs
Date: Thu, 17 Apr 1997 07:23:31 -0400 (EDT)

>
> Thomas David Rivers wrote in message ID
> <199704160209.WAA01541@lakes.water.net>:
> > Mount a V2 NFS server (I've tried both Sunos 4.1.3 and HP/UX 9.05),
> > go to a rather large directory and do "ls -l". The ls -l will hang
> > in sbwait(). This apparently also needs a rather slow network
> > for a reliable reproduction - that is, it's somewhat timing dependent.
>
> I recently did something similar (ls -l on a 16,000 file directory)
> across NFS on a recent RELENG_2_2 box which was mounting /var/mail
> from a 2.1.x based mail server. Worked fine. This was probably 2 or 3
> weeks ago... I'll try again if you like.

16,000 files is more than enough :-)

I've also witnessed it "work", although never from my particular box
that reliably reproduces it, and not always from the box that
sometimes "works."

I believe the problem is "tickled" by some timing issue. For example,
maybe one of the 6 possible UDP packets is out of order and that throws
everything for a loop. This could be explained by network issues
between the server and client.
However, I now believe that we do an nfs_receive(); the packet isn't
yet there, so we go into sbwait() to be awakened by an sorwakeup()
(sowakeup) in udp_input().

Now, I may even have some evidence that udp_input() is doing the
right wakeup(), but we don't get woken up... but I had to leave work
early yesterday and didn't get that finished.

A possible idea for those people who don't see this problem: we could,
via software, corrupt or drop UDP packets and see if NFS recovers
properly. That could reproduce the problem I'm seeing that people on
more robust networks don't see.

What do you think?

	- Dave Rivers -

>
> Gary
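
The "drop or corrupt UDP packets in software" experiment suggested above
could be approximated outside the kernel with a small user-space relay
placed between client and server. What follows is only a rough sketch
under stated assumptions, not something from the PR: the names mangle()
and relay(), the rates, and the addresses are all hypothetical, and
whether a given NFS client and server can be pointed through such a relay
(port options, reserved-port checks) is assumed rather than verified.

```python
import random
import socket


def mangle(datagram, rng, drop_rate, corrupt_rate):
    """Return None to drop the datagram, a damaged copy, or it unchanged."""
    roll = rng.random()
    if roll < drop_rate:
        return None  # simulate a lost packet
    if roll < drop_rate + corrupt_rate and datagram:
        damaged = bytearray(datagram)
        pos = rng.randrange(len(damaged))
        damaged[pos] ^= 0xFF  # flip every bit of one byte
        return bytes(damaged)
    return datagram


def relay(listen_addr, server_addr, drop_rate=0.1, corrupt_rate=0.0, seed=None):
    """Forward datagrams between one client and server_addr, lossily.

    server_addr must be a numeric (ip, port) tuple so that it compares
    equal to the sender address reported by recvfrom().
    """
    rng = random.Random(seed)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(listen_addr)
    client = None  # learned from the first non-server datagram
    while True:
        data, sender = sock.recvfrom(65535)
        if sender == server_addr:
            dest = client
        else:
            client = sender
            dest = server_addr
        out = mangle(data, rng, drop_rate, corrupt_rate)
        if out is not None and dest is not None:
            sock.sendto(out, dest)


if __name__ == "__main__":
    # Hypothetical addresses: listen locally, forward to an NFS server.
    relay(("0.0.0.0", 12049), ("192.0.2.10", 2049), drop_rate=0.05, seed=1)
```

Because the relay re-sends each datagram itself, a damaged payload goes
out with a fresh, valid UDP checksum and so reaches the RPC layer on the
other side instead of being discarded by the receiving UDP code. A fixed
seed makes the pattern of drops repeatable from one run to the next.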