From owner-freebsd-bugs Fri Apr 18 10:38:17 1997
Return-Path:
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.5/8.8.5) id KAA20556
          for bugs-outgoing; Fri, 18 Apr 1997 10:38:17 -0700 (PDT)
Received: from nlsystems.com (nlsys.demon.co.uk [158.152.125.33])
          by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id KAA20548
          for ; Fri, 18 Apr 1997 10:38:12 -0700 (PDT)
Received: from herring.nlsystems.com (herring.nlsystems.com [10.0.0.2])
          by nlsystems.com (8.8.5/8.8.5) with SMTP id SAA01297;
          Fri, 18 Apr 1997 18:38:04 +0100 (BST)
Date: Fri, 18 Apr 1997 18:38:04 +0100 (BST)
From: Doug Rabson
To: Thomas David Rivers
cc: freebsd-bugs@freefall.freebsd.org
Subject: Re: kern/3304: NFS V2 readdir hangs
In-Reply-To: <199704181600.JAA13507@freefall.freebsd.org>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-bugs@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On Fri, 18 Apr 1997, Thomas David Rivers wrote:

> The following reply was made to PR kern/3304; it has been noted by GNATS.
>
> From: Thomas David Rivers
> To: ponds!lakes.water.net!rivers, ponds!khavrinen.lcs.mit.edu!wollman
> Cc: ponds!freefall.freebsd.org!freebsd-gnats-submit
> Subject: Re: kern/3304: NFS V2 readdir hangs
> Date: Fri, 18 Apr 1997 11:49:35 -0400 (EDT)
>
>  More information...
>
>
>  Here's the scenario I've now determined (via more printf()s in the
>  kernel):
>
>  1) nfs_request() is called from readdirrpc().
>
>  2) nfs_request malloc's a nfsreq block, which is used
>     by rcvlock()... the lock is granted; we go down to
>     soreceive() and wind up tsleeping in sbwait().
>
>  3) At this point, a vnode lookup() operation is called.
>     The lookup() isn't satisfied from the cache; so
>     we call nfs_request() to get the information.
>
>  4) This nfs_request() malloc's a different nfsreq block.
>     The "lock" is granted since rcvlock() works on addresses
>     from the nfsreq block; these are different addresses, the
>     lock is granted.  We wind down to soreceive()
>     again.
>
>  5) udp_intr() is called because a UDP packet arrived...
>     this is, presumably, the packet we're expecting from 2).
>     *however* the last request we received was from 4).
>     That is the nfsreq this packet winds up being associated
>     with; but - it is totally wrong.
>

Nope.  The lock is done with flags from the struct nfsmount (flagp =
&rep->r_nmp->nm_flag).  This is shared by all the requests and nfsnodes on
the same mountpoint.  The code in nfs_reply is supposed to continue
looping until the reply for myrep is received.  If any other replies are
received, they are matched against the list of outstanding requests and
their owners will notice when they wake up and try to re-get the rcvlock.

>  So; we're left with the lookup() failing with an ENOENT (#2),
>  and the nfs_request from #2 hanging; never being woken up.
>
>  I think that pretty well describes my findings.

I really need a packet trace to try and get a picture of what is happening
here.  Could you run 'tcpdump -vv -s300' on a third machine and send me
the trace?

>
>  Perhaps the rcvlock() needs to change to lock on something other
>  than the nfsreq block... does anyone have any suggestions?

As mentioned above, the lock is shared by all requests on the same mount
point.

--
Doug Rabson				Mail:  dfr@nlsystems.com
Nonlinear Systems Ltd.			Phone: +44 181 951 1891
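
The shared-lock behaviour described in the reply above can be sketched in
plain user-space C.  This is only an illustration under stated
assumptions, not the kernel code: the struct and function names ending in
_sketch, the RCVLOCK_SKETCH bit, and the idea of matching replies by an
xid field are all hypothetical stand-ins.  Only the relationship
flagp = &rep->r_nmp->nm_flag is taken from the message itself.

```c
#include <assert.h>
#include <stddef.h>

#define RCVLOCK_SKETCH 0x1          /* hypothetical "receiver active" bit */

struct mount_sketch {               /* stand-in for struct nfsmount */
    int nm_flag;                    /* one flag word per mount point */
};

struct req_sketch {                 /* stand-in for struct nfsreq */
    struct mount_sketch *r_nmp;     /* mount this request belongs to */
    unsigned xid;                   /* assumed id used to match replies */
    int replied;                    /* set once a matching reply is seen */
    struct req_sketch *next;        /* list of outstanding requests */
};

/* Try to take the receive lock.  The flag lives in the mount, not in the
 * request, so two requests on one mount contend for the same bit.
 * Returns 1 if the lock was taken, 0 if it is busy (a kernel would sleep). */
static int rcvlock_sketch(struct req_sketch *rep)
{
    int *flagp = &rep->r_nmp->nm_flag;
    if (*flagp & RCVLOCK_SKETCH)
        return 0;
    *flagp |= RCVLOCK_SKETCH;
    return 1;
}

static void rcvunlock_sketch(struct req_sketch *rep)
{
    rep->r_nmp->nm_flag &= ~RCVLOCK_SKETCH;
}

/* Consume incoming reply ids until the one for myrep arrives.  Replies for
 * other requests are matched against the outstanding list and marked.
 * Returns the number of replies consumed, or -1 if input ran out first. */
static int reply_loop_sketch(struct req_sketch *head, struct req_sketch *myrep,
                             const unsigned *xids, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (struct req_sketch *r = head; r != NULL; r = r->next) {
            if (r->xid == xids[i]) {
                r->replied = 1;
                break;
            }
        }
        if (myrep->replied)
            return (int)(i + 1);
    }
    return -1;
}

/* Walk through the scenario: readdir (xid 1) holds the lock, lookup
 * (xid 2) cannot take it, and the replies arrive out of order. */
static int demo_sketch(void)
{
    struct mount_sketch m = { 0 };
    struct req_sketch lookup = { &m, 2, 0, NULL };
    struct req_sketch readdir = { &m, 1, 0, &lookup };
    const unsigned arrivals[] = { 2, 1 };

    assert(rcvlock_sketch(&readdir) == 1);   /* first request gets the lock */
    assert(rcvlock_sketch(&lookup) == 0);    /* same mount, so it is busy */
    assert(reply_loop_sketch(&readdir, &readdir, arrivals, 2) == 2);
    assert(lookup.replied && readdir.replied);
    rcvunlock_sketch(&readdir);
    return 0;
}
```

In this sketch the second request is refused the lock because both
requests point at the same mount, and its reply is still delivered by
whichever request is currently receiving.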