Date: Sat, 28 Nov 2015 16:53:37 -0500 (EST)
From: Rick Macklem
To: "Mikhail T."
Cc: stable@freebsd.org, freebsd-fs
Message-ID: <1797738664.110049243.1448747617558.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <5659CB64.5020105@aldan.algebra.com>
Subject: Re: cp from NFS to ZFS hung in "fifoor"

Mikhail T.
wrote:
> I was copying /home from an old server (narawntapu) to a new one
> (aldan). The narawntapu:/home is mounted on aldan as /mnt with flags
> ro,intr. On narawntapu /home was simply located on an SSD, but on aldan
> I created a ZFS filesystem for it.
>
> The copying was started thus:
>
> root@aldan:/home (435) cp -Rpn /mnt/* .
>
> for a while this was proceeding at a decent clip with cp making
> newnfsreq-uests:
>
> load: 0.78 cmd: cp 38711 [newnfsreq] 802.84r 1.57u 140.63s 20% 10768k
> /mnt/mi/.kde/share/apps/kmail/dimap/.42838394.directory/sent/cur/1219621413.32392.hd8cl:2,S
> ->
> ./mi/.kde/share/apps/kmail/dimap/.42838394.directory/sent/cur/1219621413.32392.hd8cl:2,S
> 100%
> load: 1.23 cmd: cp 38711 [newnfsreq] 874.19r 1.66u 154.74s 17% 4576k
> /mnt/mi/.kde/share/apps/kmail/dimap/.42838394.directory/ML/cur/1219595347.32392.rMDFf:2,S
> ->
> ./mi/.kde/share/apps/kmail/dimap/.42838394.directory/ML/cur/1219595347.32392.rMDFf:2,S
> 100%
>
> ZFS on the destination compressing and writing stuff out and the traffic
> between the two ranging from 30 to 50Mb/s (according to systat), but
> then something happened and the cp-process is now hung:
>
> load: 0.55 cmd: cp 38711 [fifoor] 1107.67r 2.09u 194.12s 0% 3300k
> load: 0.50 cmd: cp 38711 [fifoor] 1112.66r 2.09u 194.12s 0% 3300k
> load: 0.22 cmd: cp 38711 [fifoor] 1642.37r 2.09u 194.12s 0% 3300k
>

Doing "ps axHl" will show you what the "cp" process is stuck on (WCHAN).
If it is down inside ZFS, then I suspect it is ZFS resource related.
If it is stuck somewhere in NFS or the kernel RPC, then I'd suspect a net
driver issue:
- The number 1 issue for net drivers vs NFS is TSO, so disabling TSO is the
  first thing to try (if the processes aren't stuck inside ZFS).
  (In the machine that is sending data, since it is a transmit segment limit
  problem. If I understood what you were doing, that would be the NFS server,
  but I'd disable it on both server and client.)
- If that doesn't fix it, try rsize=32768,wsize=32768 mount options for the
  NFS mount.
These TSO issues are slowly getting resolved, but some drivers may still be
broken, especially if you aren't running head. (For example, only a very
recent em(4) driver is fixed.)

You can also do things like "netstat -m" to look for mbuf cluster exhaustion
and look at the stats for your net driver (usually a sysctl).

Good luck with it, rick

> There is nothing in the logs on the new system, but the old one has a
> number of entries like:
>
> Nov 28 10:28:45 narawntapu kernel: sonewconn: pcb
> 0xfffff80086231930: Listen queue overflow: 8 already in queue
> awaiting acceptance (62 occurrences)
> Nov 28 10:29:45 narawntapu kernel: sonewconn: pcb
> 0xfffff80086231930: Listen queue overflow: 8 already in queue
> awaiting acceptance (50 occurrences)
> Nov 28 10:30:46 narawntapu kernel: sonewconn: pcb
> 0xfffff80086231930: Listen queue overflow: 8 already in queue
> awaiting acceptance (59 occurrences)
> Nov 28 10:31:46 narawntapu kernel: sonewconn: pcb
> 0xfffff80086231930: Listen queue overflow: 8 already in queue
> awaiting acceptance (57 occurrences)
> Nov 28 10:32:46 narawntapu kernel: sonewconn: pcb
> 0xfffff80086231930: Listen queue overflow: 8 already in queue
> awaiting acceptance (68 occurrences)
>
> Both systems are largely idle now. I'm not in a hurry -- is anybody
> interested in investigating it in situ? What is "fifoor" -- does this
> point to a trouble in the ZFS, the NFS-client, or the NFS-server? Both
> systems run FreeBSD/amd64 of recent 10.x-vintage.
>
> Thanks!
>
> -mi
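
For reference, the checks suggested in the reply above (look at WCHAN,
disable TSO, retry with smaller rsize/wsize, check mbuf usage) could be
collected into a small dry-run script. This is only a sketch under stated
assumptions: the interface name em0, the function name nfs_hang_checks and
the variable names are illustrative and do not come from the thread. The
script only prints the commands; they would still be run by hand on the
relevant FreeBSD host, and the existing NFS mount would need to be unmounted
before remounting with new options.

```shell
#!/bin/sh
# Dry-run sketch: prints the FreeBSD commands for the checks suggested above.
# IFACE, SERVER_EXPORT and MOUNTPOINT are assumptions for illustration;
# em0 is only an example interface name.
IFACE="${IFACE:-em0}"
SERVER_EXPORT="${SERVER_EXPORT:-narawntapu:/home}"
MOUNTPOINT="${MOUNTPOINT:-/mnt}"

nfs_hang_checks() {
    # 1. Show what the hung cp is sleeping on (WCHAN column).
    echo "ps axHl"
    # 2. Disable TSO on the interface (suggested for both server and client).
    echo "ifconfig $IFACE -tso"
    # 3. If that does not help, mount with smaller NFS I/O sizes.
    echo "mount -t nfs -o ro,intr,rsize=32768,wsize=32768 $SERVER_EXPORT $MOUNTPOINT"
    # 4. Look for mbuf cluster exhaustion.
    echo "netstat -m"
}

nfs_hang_checks
```

Setting IFACE (for example IFACE=igb0) before running the script changes the
interface named in the printed ifconfig command.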