From: Grant Gray
Date: Fri, 06 Sep 2013 09:05:39 +1000
To: Andriy Gapon
Cc: freebsd-fs@FreeBSD.org
Subject: Re: ZFS livelock / deadlock on pure SSD pool
Message-ID: <52290E43.1090203@grantgray.id.au>
In-Reply-To: <5226CCEF.7090002@FreeBSD.org>

On 09/04/2013 04:02 PM, Andriy Gapon wrote:
> on 03/09/2013 13:35 Grant Gray said the following:
>> On 3/09/2013 7:27 PM, Andriy Gapon wrote:
>>> arc_lowmem+0x38 kmem_malloc+0xb0
>> Thanks for the feedback. Do you think it may be triggered when the ARC is
>> evicting pages because it is full, or a genuine low-memory case? The system has
>> 32GB of RAM, of which the ARC is typically about 24G (I think).
> Given the kmem_malloc -> arc_lowmem call chain it was a KVA shortage. Probably
> because of KVA fragmentation.
> Setting KVA size to a value larger than your physical memory size (1.5x or 2x)
> may work around this problem. The cost of the workaround is that some memory
> will be used for the additional page table pages.
>
> Some recent changes in head are supposed to help with the KVA fragmentation
> problem in general.

I've had to revert the problem server to spinning disks as my customer
can't bear any more downtime. I'm happy to test any proposed
patches/workarounds on a non-production system.

I have no idea how the FreeBSD allocator works. Does the suggested
increase in KVA size merely defer the problem as it will still
eventually run out of contiguous pages?

PR has been submitted:

http://www.freebsd.org/cgi/query-pr.cgi?pr=181791
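
For a rough sense of the page-table cost of the suggested workaround, here is a back-of-the-envelope sketch in Python. It is not taken from the thread. It assumes amd64 4-level paging with 4 KiB pages and 512 entries per page-table page, and it ignores superpage mappings and the fact that kernel page tables are typically grown on demand, so the result is an upper bound rather than a measurement. The function name page_table_overhead is hypothetical and is not a FreeBSD interface.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
PAGE_SIZE = 4096          # assumption: 4 KiB base pages (amd64)
ENTRIES_PER_TABLE = 512   # assumption: 512 eight-byte entries per 4 KiB table page


def page_table_overhead(kva_bytes):
    """Upper-bound bytes of page-table pages needed to map kva_bytes
    entirely with 4 KiB pages (hypothetical helper, illustration only)."""
    total_table_pages = 0
    span = PAGE_SIZE
    # Three lower levels of the 4-level hierarchy; the single top-level
    # page exists regardless of KVA size and is not counted.
    for _level in range(3):
        span *= ENTRIES_PER_TABLE                    # bytes covered by one table page
        total_table_pages += -(-kva_bytes // span)   # ceiling division
    return total_table_pages * PAGE_SIZE


if __name__ == "__main__":
    ram = 32 * GIB  # physical memory size reported in the thread
    for factor in (1.5, 2.0):
        kva = int(ram * factor)
        overhead = page_table_overhead(kva)
        print(f"{factor}x RAM: KVA {kva // GIB} GiB, "
              f"page tables at most {overhead / MIB:.1f} MiB")
```

Under these assumptions the worst-case overhead for 48 GiB and 64 GiB of KVA is roughly 96 MiB and 128 MiB respectively, which is small next to 32GB of RAM. The sketch says nothing about whether a larger KVA avoids fragmentation or only delays it.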