Date: Mon, 21 Apr 1997 14:59:53 -0400 (EDT)
From: Thomas David Rivers <ponds!rivers@dg-rtp.dg.com>
To: ponds!nlsystems.com!dfr, ponds!lakes.water.net!rivers
Cc: ponds!freefall.cdrom.com!freebsd-bugs
Subject: Re: kern/3304: NFS V2 readdir hangs
Message-ID: <199704211859.OAA02323@lakes.water.net>
> 
> What appears to be happening is that numb is making a 4096-byte sized
> readdir request for the first block of the large directory.  You can see
> this in the trace as request id b6cff051 (btw. you may find it useful to
> grep the log for nfs to separate the wood from the trees; next time we
> should add 'port nfs' to the tcpdump command).  The reply is sent but for
> some reason it never makes it into soreceive.
> 
> You can see that numb retries the request with the same xid several times
> but never receives the reply.  My guess is that something between numb and
> sundog has corrupted the packet and it is failing the checksum in
> udp_input.  What we need to do is find out how far up the protocol stack
> the packet goes.  I suggest adding printfs to udp_input and ip_input where
> they drop packets with bad checksums (line 154 in udp_usrreq.c).  You
> should also be able to see it with 'netstat -p udp' and 'netstat -p ip'.

Here's the output of those netstat commands:

Script started on Mon Apr 21 14:11:18 1997
# netstat -p udp
udp:
        129 datagrams received
        0 with incomplete header
        0 with bad data length field
        0 with bad checksum
        0 dropped due to no socket
        13 broadcast/multicast datagrams dropped due to no socket
        5 dropped due to full socket buffers
        0 not for hashed pcb
        111 delivered
        116 datagrams output
# netstat -p ip
ip:
        180 total packets received
        0 bad header checksums
        0 with size smaller than minimum
        0 with data size < data length
        0 with header length < data size
        0 with data length < header length
        0 with bad options
        0 with incorrect version number
        15 fragments received
        0 fragments dropped (dup or out of space)
        0 fragments dropped after timeout
        5 packets reassembled ok
        130 packets for this host
        0 packets for unknown/unsupported protocol
        0 packets forwarded
        40 packets not forwardable
        0 redirects sent
        116 packets sent from this host
        0 packets sent with fabricated ip header
        0 output packets dropped due to no bufs, etc.
        0 output packets discarded due to no route
        0 output datagrams fragmented
        0 fragments created
        0 datagrams that can't be fragmented
# exit

Script done on Mon Apr 21 14:11:25 1997

No checksum problems - but I do notice the "5 dropped due to full socket
buffers" line... could that be the reason?

> 
> You might also try this (untested) hack which should limit readdirs to
> smaller bites:
> 
> Index: nfs_vfsops.c
> ===================================================================
> RCS file: /home/smp/sys/nfs/nfs_vfsops.c,v
> retrieving revision 1.1.1.5
> diff -u -r1.1.1.5 nfs_vfsops.c
> --- nfs_vfsops.c	1997/04/18 07:09:39	1.1.1.5
> +++ nfs_vfsops.c	1997/04/21 17:19:58
> @@ -748,6 +748,7 @@
>  	}
>  	if (nmp->nm_readdirsize > maxio)
>  		nmp->nm_readdirsize = maxio;
> +	nmp->nm_readdirsize = 1024;	/* XXX */
>  
>  	if ((argp->flags & NFSMNT_MAXGRPS) && argp->maxgrouplist >= 0 &&
>  		argp->maxgrouplist <= NFS_MAXGRPS)
> 

Yes! - this particular change does work around the problem.  I'm able
to run my "ls -lR" and have it complete successfully (although there
are some strange 'lags' every now and then).  I've been running it
continuously for a few minutes now; no hangs...

Now - a good question, which you asked, is why are those packets
getting blocked?

Another question I have is why this worked with 2.1.5: did it always
have a lower readdirsize, or is another problem in 2.2.1 simply masked
by lowering the readdirsize?

I'm happy to investigate this further - and *overjoyed* that NFS seems
to be working for me... let me know what I can do at this end.

        - Thanks! -
        - Dave Rivers -
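
For reference, a minimal sketch of the fragmentation arithmetic that may bear on the readdirsize question above. It is not from the thread. It assumes a 1500-byte Ethernet MTU, a 20-byte IP header with no options and an 8-byte UDP header, and it ignores the RPC header bytes a real NFS reply carries. The function name ip_fragments_for_udp is made up for illustration.

```c
#include <assert.h>

#define IP_HEADER_LEN  20u  /* assumption: IPv4 header without options */
#define UDP_HEADER_LEN  8u

/*
 * Hypothetical helper: number of IP packets needed to carry one UDP
 * datagram with `udp_payload` bytes of data over a link with the given MTU.
 */
unsigned ip_fragments_for_udp(unsigned udp_payload, unsigned mtu)
{
    unsigned total, per_frag;

    /* The MTU must leave room for the IP header plus one 8-byte unit. */
    assert(mtu >= IP_HEADER_LEN + 8u);

    /* Bytes of IP payload: the UDP header travels in the first fragment. */
    total = udp_payload + UDP_HEADER_LEN;

    /* Fits in a single packet: no fragmentation at all. */
    if (total <= mtu - IP_HEADER_LEN)
        return 1;

    /* Every fragment except the last carries a multiple of 8 bytes. */
    per_frag = ((mtu - IP_HEADER_LEN) / 8u) * 8u;

    /* Ceiling division. */
    return (total + per_frag - 1u) / per_frag;
}
```

Under those assumptions, ip_fragments_for_udp(4096, 1500) gives 3 and ip_fragments_for_udp(1024, 1500) gives 1. Three fragments per reply would be consistent with the "15 fragments received" and "5 packets reassembled ok" counters above, though that does not by itself explain the drops.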
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199704211859.OAA02323>
