From owner-freebsd-net@FreeBSD.ORG Thu Jul 24 13:11:40 2014
Date: Thu, 24 Jul 2014 15:11:30 +0200
From: Mark Martinec <Mark.Martinec+freebsd@ijs.si>
To: freebsd-net@freebsd.org
Subject: Bumping up a default net.graph.maxdata to avoid "Write failed: Cannot allocate memory"
Organization: J. Stefan Institute
Message-ID: <17db180e1fa763f41dd0051780741b9f@mailbox.ijs.si>
List-Id: Networking and TCP/IP with FreeBSD

Syncing zfs snapshots across the net using 'zfs send' over ssh started
failing one day, with ssh reporting "Write failed: Cannot allocate memory"
after transferring about 15 to 25 GB of data (as it turned out, this
snapshot was larger than usual). Neither of the two hosts was particularly
low on memory; the receiving end was running 10.0-STABLE amd64, and the
network between the two was gigabit ethernet on an em interface. The
problem was repeatable at will.

Simplifying the experiment to a:

  ssh zfs send | wc -c

also ended up with the same "Write failed: Cannot allocate memory" on the
receiving side after transferring about 20 GB. Some further experiments
ruled out 'zfs send' itself as a potential culprit.

As it turned out (luckily for me, after banging my head over it), I'm not
the only one with this problem - it was reported three years ago:

  http://lists.freebsd.org/pipermail/freebsd-emulation/2011-July/008971.html
  http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063322.html

And yes, I too had a smallish virtual host running under VirtualBox on
this receiving machine, which was mostly idling. That virtual host was
not involved in any of these experiments.

So the problem is that NetGraph was running out of space for its data
queue, and net.graph.maxdata needed to be bumped up from its default
of 512.
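For anyone hitting the same failure, the tunable can be inspected and raised from the shell; a minimal sketch (the sysctl name and values are the ones discussed in this thread, and persisting the setting assumes the stock /etc/sysctl.conf mechanism):

```shell
# Show the current NetGraph data-queue limit (default was 512 here):
sysctl net.graph.maxdata

# Raise it on the running system (the value that worked for me):
sysctl net.graph.maxdata=2048

# Persist the change across reboots via the standard sysctl.conf:
echo 'net.graph.maxdata=2048' >> /etc/sysctl.conf
```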
My current setting is net.graph.maxdata=2048, and this suffices for
reliable transfer of huge files (> 60 GB) over ssh. After one such
transfer, 'vmstat -z' reports:

  ITEM                  SIZE  LIMIT  USED  FREE       REQ  FAIL  SLEEP
  [...]
  NetGraph items:         72,  4123,    0,  527,       28,    0,     0
  NetGraph data items:    72,  2077,    0, 1023, 65650892,    0,     0

whereas previously the FAIL count was in the hundreds, just as in the
problem report from July 2011. And the issue is not limited to ssh;
others have reported the same over ftp.

Btw, 'ngctl list' shows:

  There are 5 total nodes:
    Name: ngctl78040      Type: socket       ID: 00000010  Num hooks: 0
    Name: em0             Type: ether        ID: 00000001  Num hooks: 2
    Name: em1             Type: ether        ID: 00000002  Num hooks: 0
    Name: vboxnet0        Type: ether        ID: 00000003  Num hooks: 0
    Name: vboxnetflt_em0  Type: vboxnetflt   ID: 00000004  Num hooks: 2

Unfortunately the 2011 thread was left hanging, with no action taken -
neither documenting the issue nor bumping up the default queue limit.
So, to save more people from running into the same problem, puzzled by
a mysterious "Write failed: Cannot allocate memory" failure, I'd like
to suggest bumping up the default value of net.graph.maxdata, or at
least documenting the fact in the Handbook.

  Mark