Date: Thu, 13 Jul 2006 11:36:20 -0700
From: Sean McNeil <sean@mcneil.com>
To: Oliver Lehmann <lehmann@ans-netz.de>
Cc: amd64@freebsd.org
Subject: Re: NFS lockup when copying a "special" file
Message-ID: <1152815780.17757.4.camel@triton.mcneil.com>
In-Reply-To: <20060713201434.a5335637.lehmann@ans-netz.de>
References: <20060713201434.a5335637.lehmann@ans-netz.de>

On Thu, 2006-07-13 at 20:14 +0200, Oliver Lehmann wrote:
> Hi,
>
> while backing up some old irc logs I noticed that my nfs hung up while
> copying a logfile.
>
> nfs server www:/mnt/space/www: not responding
> nfs server www:/mnt/space/www: not responding
> nfs server www:/mnt/space/www: not responding
>
> I have to reboot to get that nfs share up and running again. I tried then
> copying the file again - same problem. I've rebooted the NFS server - no
> difference. I've tried to copy the file on an nfs share from another
> system - same problem, the nfs mount on my amd64 gets unresponsive.
> After that I FTPed the file on my alpha and tried then copying the file
> via nfs to another host - it just worked. So I wonder if
>
> a) something on amd64 is broken
> b) something on that file is special which produces problems on my router
>    between my amd64 and my NFS fileservers
> c) something on my own system is broken
>
> tcpdump just shows me:
>
> 19:18:39.582342 IP kartoffel.salatschuessel.net.2033686372 > dill.salatschuessel.net.nfs: 1472 write [|nfs]
> 19:18:39.582344 IP kartoffel.salatschuessel.net > dill.salatschuessel.net: udp
> 19:18:39.582346 IP kartoffel.salatschuessel.net > dill.salatschuessel.net: udp
> 19:18:39.582348 IP kartoffel.salatschuessel.net > dill.salatschuessel.net: udp
>
> I've put the logfile online here:
>
> http://pofo.de/tmp/file.tar
>
> Would be nice if someone with an amd64 could extract the "yang.xchatlog"
> in that tar, and try copying that onto a nfs mounted partition.
>
> olivleh1@kartoffel logs> cp yang.xchatlog /mnt/www/
> ^C
>
> Quite some time later I'm back at the prompt, but /mnt/www remains
> kinda unusable, and responds only after ages. (df for example takes
> quite some time to show up with something)

I used to have a similar problem and tracked it down to my NIC and
hardware checksums. Would this happen to be an if_re device?
Can you post your ifconfig output? If hardware checksums are on, try your
test again with them turned off (RXCSUM and TXCSUM). If turning off both
helps, you can then try different combinations of the two to see which
one is responsible.