Date: Sat, 28 Nov 2015 16:53:37 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "Mikhail T." <mi+thun@aldan.algebra.com>
Cc: stable@freebsd.org, freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: cp from NFS to ZFS hung in "fifoor"
Message-ID: <1797738664.110049243.1448747617558.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <5659CB64.5020105@aldan.algebra.com>
References: <5659CB64.5020105@aldan.algebra.com>
Mikhail T. wrote:
> I was copying /home from an old server (narawntapu) to a new one
> (aldan). The narawntapu:/home is mounted on aldan as /mnt with flags
> ro,intr. On narawntapu /home was simply located on an SSD, but on aldan
> I created a ZFS filesystem for it.
>
> The copying was started thus:
>
> root@aldan:/home (435) cp -Rpn /mnt/* .
>
> for a while this was proceeding at a decent clip with cp making
> newnfsreq-uests:
>
> load: 0.78 cmd: cp 38711 [newnfsreq] 802.84r 1.57u 140.63s 20% 10768k
> /mnt/mi/.kde/share/apps/kmail/dimap/.42838394.directory/sent/cur/1219621413.32392.hd8cl:2,S
> ->
> ./mi/.kde/share/apps/kmail/dimap/.42838394.directory/sent/cur/1219621413.32392.hd8cl:2,S
> 100%
> load: 1.23 cmd: cp 38711 [newnfsreq] 874.19r 1.66u 154.74s 17% 4576k
> /mnt/mi/.kde/share/apps/kmail/dimap/.42838394.directory/ML/cur/1219595347.32392.rMDFf:2,S
> ->
> ./mi/.kde/share/apps/kmail/dimap/.42838394.directory/ML/cur/1219595347.32392.rMDFf:2,S
> 100%
>
> ZFS on the destination compressing and writing stuff out and the traffic
> between the two ranging from 30 to 50Mb/s (according to systat), but
> then something happened and the cp-process is now hung:
>
> load: 0.55 cmd: cp 38711 [fifoor] 1107.67r 2.09u 194.12s 0% 3300k
> load: 0.50 cmd: cp 38711 [fifoor] 1112.66r 2.09u 194.12s 0% 3300k
> load: 0.22 cmd: cp 38711 [fifoor] 1642.37r 2.09u 194.12s 0% 3300k
>
Doing 'ps axHl' will show you what the "cp" process is stuck on (WCHAN).
If it is down inside ZFS, then I suspect it is ZFS resource related.
If it is stuck somewhere in NFS or the kernel RPC, then I'd suspect a
net driver issue:
- The number 1 issue for net drivers vs NFS is TSO, so disabling TSO
  is the first thing to try (if the processes aren't stuck inside zfs).
  (In the machine that is sending data, since it is a transmit segment
  limit problem. If I understood what you were doing, that would be the
  NFS server, but I'd disable it on both server and client.)
- If that doesn't fix it, try rsize=32768,wsize=32768 mount options for
  the NFS mount.

These TSO issues are slowly getting resolved, but some drivers may still
be broken, especially if you aren't running head. (For example, only a
very recent em(4) driver is fixed.)

You can also do things like 'netstat -m' to look for mbuf cluster
exhaustion and look at the stats for your net driver (usually a sysctl).

Good luck with it, rick

> There is nothing in the logs on the new system, but the old one has a
> number of entries like:
>
> Nov 28 10:28:45 narawntapu kernel: sonewconn: pcb
> 0xfffff80086231930: Listen queue overflow: 8 already in queue
> awaiting acceptance (62 occurrences)
> Nov 28 10:29:45 narawntapu kernel: sonewconn: pcb
> 0xfffff80086231930: Listen queue overflow: 8 already in queue
> awaiting acceptance (50 occurrences)
> Nov 28 10:30:46 narawntapu kernel: sonewconn: pcb
> 0xfffff80086231930: Listen queue overflow: 8 already in queue
> awaiting acceptance (59 occurrences)
> Nov 28 10:31:46 narawntapu kernel: sonewconn: pcb
> 0xfffff80086231930: Listen queue overflow: 8 already in queue
> awaiting acceptance (57 occurrences)
> Nov 28 10:32:46 narawntapu kernel: sonewconn: pcb
> 0xfffff80086231930: Listen queue overflow: 8 already in queue
> awaiting acceptance (68 occurrences)
>
> Both systems are largely idle now. I'm not in a hurry -- is anybody
> interested in investigating it in situ? What is "fifoor" -- does this
> point to a trouble in the ZFS, the NFS-client, or the NFS-server? Both
> systems run FreeBSD/amd64 of recent 10.x-vintage.
>
> Thanks!
>
> -mi
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>
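
The checks suggested in the reply above could be strung together roughly
as follows. This is only a sketch, not something posted in the thread: the
interface name em0 and the sysctl node dev.em.0 are assumptions (substitute
whatever NIC the machine actually has), and the step helper is a
hypothetical wrapper that just prints each command unless RUN=1 is set,
so nothing is changed by accident.

```shell
#!/bin/sh
# Sketch of the diagnostics from the reply. Assumed, not from the thread:
# the interface name (em0) and the driver sysctl node (dev.em.0).
IFACE="${IFACE:-em0}"

# Hypothetical helper: print the command, run it only when RUN=1.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# 1. See what the hung cp is sleeping on (the wait channel column).
step ps axHl

# 2. Disable TSO on the interface; do this on both server and client.
step ifconfig "$IFACE" -tso

# 3. Look for mbuf cluster exhaustion and at the driver statistics.
step netstat -m
step sysctl dev.em.0

# 4. Only if disabling TSO did not help: remount with smaller I/O sizes.
step umount /mnt
step mount -t nfs -o ro,intr,rsize=32768,wsize=32768 narawntapu:/home /mnt
```

Run it once as-is to review the command list, then as root with RUN=1
(and IFACE set to the real interface) on the machine being checked. The
remount in step 4 belongs on the client (aldan) and will likely fail
while the hung cp still holds files open under /mnt.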