From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 194513] zfs recv hangs in state kmem arena
Date: Wed, 05 Nov 2014 19:08:38 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.0-STABLE
X-Bugzilla-Severity: Affects Some People

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194513

Andriy Gapon
changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Needs Triage                |In Discussion
                 CC|                            |alc@FreeBSD.org,
                   |                            |avg@FreeBSD.org,
                   |                            |jeffr@FreeBSD.org

--- Comment #4 from Andriy Gapon ---
My personal opinion is that the problem is caused by a bug in the
combination of the new vmem-based code and the changes in the page daemon
code.

When there is not enough KVA, the code wakes up the page daemon with the
expectation that it will make some more KVA available, but the page daemon
may not actually take any action.
Previously, the page daemon code checked the return value from msleep and
made a page out pass if it had been woken up. Now the page daemon code
performs a pass only when it is woken up *and* vm_pages_needed is set.
As the comment before pagedaemon_wakeup() explains, that function is not
guaranteed to actually wake up the page daemon unless
vm_page_queue_free_mtx is held. And kmem_reclaim() does not hold
vm_page_queue_free_mtx when it calls pagedaemon_wakeup().

Additionally, before the switch to vmem, kmem_malloc() used to invoke the
vm_lowmem hook and the uma_reclaim() function directly, as opposed to
trying to wake up the page daemon.
So, the old code would reliably free some KVA if there was any that could
be freed by the vm_lowmem hook and uma_reclaim(). The new code only makes a
lame attempt to wake up the page daemon.

I believe that the above explains why you sometimes see processes stuck in
vmem_xalloc() and why your workaround works: when you really force the page
daemon to make a page out pass, it finally frees some KVA by invoking the
vm_lowmem hook and uma_reclaim().

I see two trivial possible solutions:
- hold vm_page_queue_free_mtx in kmem_reclaim() around the
  pagedaemon_wakeup() call
- directly call the vm_lowmem hook and uma_reclaim() instead of
  pagedaemon_wakeup() in kmem_reclaim()

Not sure which one would be better. Maybe there is an even better solution.
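
To make the difference between the old and the new page daemon behaviour
described above easier to see, here is a toy user-space model in C. It is
only a sketch and not kernel code: daemon_step(), struct daemon_model and
the POLICY_* names are hypothetical, and pages_needed merely stands in for
vm_pages_needed. The model does not try to reproduce why a wakeup can be
lost or why the flag can be unset; it only shows the decision the daemon
makes once its sleep returns.

```c
#include <stdbool.h>

/* Toy user-space model; every name here is hypothetical, not a kernel API. */

enum wake_policy {
    POLICY_OLD, /* pass whenever the sleep was ended by a wakeup */
    POLICY_NEW  /* pass only if woken AND the pages-needed flag is set */
};

struct daemon_model {
    bool pages_needed; /* stands in for vm_pages_needed */
    int  passes;       /* number of page out passes performed so far */
};

/*
 * One trip through the modelled daemon loop after its sleep returns.
 * 'woken' says whether the sleep ended because of a wakeup (true) or
 * a timeout (false). Returns true if a page out pass was made.
 */
static bool
daemon_step(struct daemon_model *d, enum wake_policy policy, bool woken)
{
    bool do_pass;

    if (policy == POLICY_OLD)
        do_pass = woken;
    else
        do_pass = woken && d->pages_needed;

    if (do_pass) {
        d->passes++;
        d->pages_needed = false; /* the request has been serviced */
    }
    return do_pass;
}
```

In this model, a wakeup that arrives while pages_needed is false leads to a
pass under POLICY_OLD but is silently ignored under POLICY_NEW, which is the
situation a thread waiting for KVA would be stuck in.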