From owner-freebsd-current@FreeBSD.ORG  Mon Oct 27 00:22:45 2014
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 51CBD8BA;
 Mon, 27 Oct 2014 00:22:45 +0000 (UTC)
Received: from mail.jrv.org (adsl-70-243-84-11.dsl.austtx.swbell.net
 [70.243.84.11]) by mx1.freebsd.org (Postfix) with ESMTP id 08876B34;
 Mon, 27 Oct 2014 00:22:44 +0000 (UTC)
Received: from localhost (localhost.localdomain [127.0.0.1])
 by mail.jrv.org (Postfix) with ESMTP id D2B0F1B6C41;
 Sun, 26 Oct 2014 19:22:36 -0500 (CDT)
Received: from mail.jrv.org ([127.0.0.1])
 by localhost (zimbra64.housenet.jrv [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id Jy30G9Y-t3Az; Sun, 26 Oct 2014 19:22:26 -0500 (CDT)
Received: from localhost (localhost.localdomain [127.0.0.1])
 by mail.jrv.org (Postfix) with ESMTP id D79A11B6C3C;
 Sun, 26 Oct 2014 19:22:26 -0500 (CDT)
X-Virus-Scanned: amavisd-new at zimbra64.housenet.jrv
Received: from mail.jrv.org ([127.0.0.1])
 by localhost (zimbra64.housenet.jrv [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id Dh7RDba9VwES; Sun, 26 Oct 2014 19:22:26 -0500 (CDT)
Received: from [192.168.138.128] (BMX.housenet.jrv [192.168.3.140])
 by mail.jrv.org (Postfix) with ESMTPSA id B50751B6C39;
 Sun, 26 Oct 2014 19:22:26 -0500 (CDT)
Message-ID: <544D9056.10805@jrv.org>
Date: Sun, 26 Oct 2014 18:22:46 -0600
From: "James R. Van Artsdalen" <james-freebsd-fs2@jrv.org>
User-Agent: Mozilla/5.0 (Windows NT 5.0;
 rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
To: "James R. Van Artsdalen" <james-freebsd-current@jrv.org>
Subject: Re: zfs recv hangs in kmem arena
References: <54250AE9.6070609@jrv.org> <543FAB3C.4090503@jrv.org>
In-Reply-To: <543FAB3C.4090503@jrv.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Mailman-Approved-At: Mon, 27 Oct 2014 01:49:32 +0000
Cc: freebsd-fs@freebsd.org, current@freebsd.org
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
 <freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current/>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
 <mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 27 Oct 2014 00:22:45 -0000

I was able to complete a ZFS replication by intervening manually each
time "zfs recv" blocked on "kmem arena": running the program at the end
of this message was sufficient to unblock zfs on each of the 17
occasions it stalled.

The program is intended to consume about 24GB of the 32GB of physical
RAM, thereby pressuring the ARC and kernel caches to shrink; when the
program exited it would leave plenty of free RAM for zfs or anything
else.  What actually happened is that zfs unblocked every time while the
program below was still growing: it was never necessary to wait for the
program to exit and free its memory before zfs unblocked.

On 10/16/2014 6:25 AM, James R. Van Artsdalen wrote:
> The zfs recv / kmem arena hang happens with -CURRENT as well as
> 10-STABLE, on two different systems, with 16GB or 32GB of RAM, from
> memstick or normal multi-user environments,
>
> Hangs usually seem to happen 1TB to 3TB in, but last night one run hung
> after only 4.35MB.
>
> On 9/26/2014 1:42 AM, James R. Van Artsdalen wrote:
>> FreeBSD BLACKIE.housenet.jrv 10.1-BETA2 FreeBSD 10.1-BETA2 #2 r272070M:
>> Wed Sep 24 17:36:56 CDT 2014    
>> james@BLACKIE.housenet.jrv:/usr/obj/usr/src/sys/GENERIC  amd64
>>
>> With current STABLE10 I am unable to replicate a ZFS pool using zfs
>> send/recv without zfs hanging in state "kmem arena", within the first
>> 4TB or so (of a 23TB Pool).
>>
>> The most recent attempt used this command line
>>
>> SUPERTEX:/root# zfs send -R BIGTEX/UNIX@syssnap | ssh BLACKIE zfs recv
>> -duvF BIGTOX
>>
>> though local replications fail in kmem arena too.
>>
>> The two machines I've been attempting this on have 16GB and 32GB of RAM
>> each and are otherwise idle.
>>
>> Any suggestions on how to get around, or investigate, "kmem arena"?
>>
>> # top
>> last pid:  3272;  load averages:  0.22,  0.22,  0.23                  up
>> 0+08:25:02  01:32:07
>> 34 processes:  1 running, 33 sleeping
>> CPU:  0.0% user,  0.0% nice,  0.1% system,  0.0% interrupt, 99.9% idle
>> Mem: 21M Active, 82M Inact, 15G Wired, 28M Cache, 450M Free
>> ARC: 12G Total, 24M MFU, 12G MRU, 23M Anon, 216M Header, 47M Other
>> Swap: 16G Total, 16G Free
>>
>>   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU
>> COMMAND
>>  1173 root          1  52    0 86476K  7780K select  0 124:33   0.00% sshd
>>  1176 root          1  46    0 87276K 47732K kmem a  3  48:36   0.00% zfs
>>   968 root         32  20    0 12344K  1888K rpcsvc  0   0:13   0.00% nfsd
>>  1009 root          1  20    0 25452K  2864K select  3   0:01   0.00% ntpd
>> ...

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Just under 4GB per allocation; six passes touch about 24GB total. */
static const size_t s = ((size_t) 1 << 32) - 65;

int
main(void)
{
	int i;

	for (i = 0; i < 6; i++) {
		char *p = calloc(s, 1);

		if (p == NULL) {
			perror("calloc");
			return (1);
		}
		/* Touch every page so it is actually faulted in. */
		memset(p, 1, s);
	}
	return (0);
}