Date: Mon, 19 Jan 2015 17:56:24 -0500
Message-ID: <CAFMmRNxz252HMWWBmRf=Z69zh2_w9cD5e1AZGeizyagKezm2Hw@mail.gmail.com>
Subject: Sleeping thread held mutex in vm_pageout_oom()
From: Ryan Stone <rysto32@gmail.com>
To: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Content-Type: text/plain; charset=UTF-8

I recently had a system where a DIMM failed and the OOM killer was
kicking in repeatedly because a memory-hungry daemon kept being
restarted.  This ended up triggering a race condition in the OOM
killer, leading to this panic:

Sleeping thread (tid 100075, pid 8) owns a non-sleepable lock
sched_switch() at 0xffffffff8048386d = sched_switch+0x16d
mi_switch() at 0xffffffff80469dd6 = mi_switch+0x186
sleepq_wait() at 0xffffffff80499204 = sleepq_wait+0x44
__lockmgr_args() at 0xffffffff8044b88b = __lockmgr_args+0x67b
vop_stdlock() at 0xffffffff804d3689 = vop_stdlock+0x39
VOP_LOCK1_APV() at 0xffffffff8069da42 = VOP_LOCK1_APV+0x52
_vn_lock() at 0xffffffff804ed627 = _vn_lock+0x47
vm_object_deallocate() at 0xffffffff8061eef3 = vm_object_deallocate+0x203
vm_map_entry_deallocate() at 0xffffffff80616d2c = vm_map_entry_deallocate+0x4c
vm_map_process_deferred() at 0xffffffff80616d62 = vm_map_process_deferred+0x32
vm_map_remove() at 0xffffffff806183ff = vm_map_remove+0x6f
vmspace_free() at 0xffffffff80619206 = vmspace_free+0x56
vm_pageout_oom() at 0xffffffff806230d1 = vm_pageout_oom+0x181
vm_pageout() at 0xffffffff8062410b = vm_pageout+0x90b
fork_exit() at 0xffffffff8043a382 = fork_exit+0x112
fork_trampoline() at 0xffffffff8063385e = fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff80c3be1d00, rbp = 0 ---
panic: sleeping thread
cpuid = 5
curthread = grep/grep (82989/100544)
cpu_ticks = 1848294656444
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff801e52ba = db_trace_self_wrapper+0x2a
panic() at 0xffffffff80461608 = panic+0x228
propagate_priority() at 0xffffffff8049cbde = propagate_priority+0x15e
turnstile_wait() at 0xffffffff8049d278 = turnstile_wait+0x1b8
_mtx_lock_sleep() at 0xffffffff80451af1 = _mtx_lock_sleep+0xf1
_mtx_lock_flags() at 0xffffffff80451c75 = _mtx_lock_flags+0x75
exit1() at 0xffffffff804367de = exit1+0x36e
sys_exit() at 0xffffffff8043731e = sys_exit+0xe
syscallenter() at 0xffffffff8049b324 = syscallenter+0x104
syscall() at 0xffffffff80649bfc = syscall+0x4c
Xfast_syscall() at 0xffffffff806335f2 = Xfast_syscall+0xe2
--- syscall (1, FreeBSD ELF64, sys_exit), rip = 0x300a2df9c, rsp = 0x7ffffffd40c8, rbp = 0x7ffffffd40e0 ---
Uptime: 7m19s


The root cause is that vm_pageout_oom() acquired a reference on a
process's vmspace while holding its PROC_LOCK(), and then the process
exited.  That left vm_pageout_oom() holding the only reference on the
vmspace, so when it dropped the reference, vmspace_free() called into
vm_map_remove() and wound up sleeping on a vnode lock while still
holding the PROC_LOCK().  A thread in exit1() then blocked on a mutex
held by the sleeping thread, and propagate_priority() panicked when it
found the owner asleep.  This was under FreeBSD 8, but the code in
head does not seem to have changed here.

I'm not very familiar with the locking here, so I'm not sure how to
fix it.  Should vm_pageout_oom() _PHOLD() the process while holding
the PROC_LOCK(), drop the lock, and only then acquire the vmspace
reference?  That appears to be how other callers of
vmspace_acquire_ref() work.
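
To make the two orderings concrete, here is a small userland model of
them.  It is only a sketch: every mock_* name is a hypothetical
stand-in that mirrors a kernel primitive, none of it is the actual
vm_pageout_oom() code, and the "process exits" step is simulated by
hand at the point where the race would hit.  The model counts how
often the last vmspace reference is dropped (which may sleep) while
the mock proc lock is held.

```c
#include <assert.h>
#include <stdbool.h>

/* Userland mocks; names echo the kernel primitives, bodies are stand-ins. */
struct mock_vmspace { int refcnt; };
struct mock_proc {
	bool locked;			/* models PROC_LOCK() held by us */
	int hold;			/* models the count bumped by _PHOLD() */
	struct mock_vmspace *vm;
};

static int sleeps_with_lock_held;	/* violations seen in the current run */

static void mock_proc_lock(struct mock_proc *p)   { assert(!p->locked); p->locked = true; }
static void mock_proc_unlock(struct mock_proc *p) { assert(p->locked); p->locked = false; }
static void mock_phold(struct mock_proc *p) { assert(p->locked); p->hold++; }
static void mock_prele(struct mock_proc *p) { assert(p->locked && p->hold > 0); p->hold--; }

static struct mock_vmspace *mock_vmspace_acquire_ref(struct mock_proc *p)
{
	p->vm->refcnt++;
	return (p->vm);
}

/* Dropping the last reference tears down the map, which may sleep. */
static void mock_vmspace_free(struct mock_proc *p, struct mock_vmspace *vm)
{
	if (--vm->refcnt == 0 && p->locked)
		sleeps_with_lock_held++;
}

/* Models the target process exiting and dropping its own reference. */
static void mock_process_exits(struct mock_proc *p) { p->vm->refcnt--; }

/* Ordering described in the report: reference taken and dropped under the lock. */
int oom_scan_buggy(void)
{
	struct mock_vmspace vm = { .refcnt = 1 };
	struct mock_proc p = { .vm = &vm };
	struct mock_vmspace *ref;

	sleeps_with_lock_held = 0;
	mock_proc_lock(&p);
	ref = mock_vmspace_acquire_ref(&p);
	mock_process_exits(&p);		/* race: we now hold the last reference */
	mock_vmspace_free(&p, ref);	/* may sleep with the lock held */
	mock_proc_unlock(&p);
	return (sleeps_with_lock_held);
}

/* Proposed ordering: hold the process, drop the lock, then take the reference. */
int oom_scan_fixed(void)
{
	struct mock_vmspace vm = { .refcnt = 1 };
	struct mock_proc p = { .vm = &vm };
	struct mock_vmspace *ref;

	sleeps_with_lock_held = 0;
	mock_proc_lock(&p);
	mock_phold(&p);
	mock_proc_unlock(&p);
	ref = mock_vmspace_acquire_ref(&p);
	mock_process_exits(&p);		/* same race as above */
	mock_vmspace_free(&p, ref);	/* may sleep, but no lock is held */
	mock_proc_lock(&p);
	mock_prele(&p);
	mock_proc_unlock(&p);
	return (sleeps_with_lock_held);
}
```

In this model oom_scan_buggy() reports one sleep-with-lock-held
violation and oom_scan_fixed() reports none.  Whether the hold is also
enough to keep the real process from going away in the meantime is
something the model does not check.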