From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 19 22:56:27 2015
Date: Mon, 19 Jan 2015 17:56:24 -0500
Subject: Sleeping thread held mutex in vm_pageout_oom()
From: Ryan Stone
To: "freebsd-hackers@freebsd.org"
List-Id: Technical Discussions relating to FreeBSD

I recently had a system where a DIMM failed, and the OOM killer kept kicking in because a memory-hungry daemon was being constantly restarted. This ended up triggering a race condition in the OOM killer, leading to this panic:

Sleeping thread (tid 100075, pid 8) owns a non-sleepable lock
sched_switch() at 0xffffffff8048386d = sched_switch+0x16d
mi_switch() at 0xffffffff80469dd6 = mi_switch+0x186
sleepq_wait() at 0xffffffff80499204 = sleepq_wait+0x44
__lockmgr_args() at 0xffffffff8044b88b = __lockmgr_args+0x67b
vop_stdlock() at 0xffffffff804d3689 = vop_stdlock+0x39
VOP_LOCK1_APV() at 0xffffffff8069da42 = VOP_LOCK1_APV+0x52
_vn_lock() at 0xffffffff804ed627 = _vn_lock+0x47
vm_object_deallocate() at 0xffffffff8061eef3 = vm_object_deallocate+0x203
vm_map_entry_deallocate() at 0xffffffff80616d2c = vm_map_entry_deallocate+0x4c
vm_map_process_deferred() at 0xffffffff80616d62 = vm_map_process_deferred+0x32
vm_map_remove() at 0xffffffff806183ff = vm_map_remove+0x6f
vmspace_free() at 0xffffffff80619206 = vmspace_free+0x56
vm_pageout_oom() at 0xffffffff806230d1 = vm_pageout_oom+0x181
vm_pageout() at 0xffffffff8062410b = vm_pageout+0x90b
fork_exit() at 0xffffffff8043a382 = fork_exit+0x112
fork_trampoline() at 0xffffffff8063385e = fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff80c3be1d00, rbp = 0 ---
panic: sleeping thread
cpuid = 5
curthread = grep/grep (82989/100544)
cpu_ticks = 1848294656444
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff801e52ba = db_trace_self_wrapper+0x2a
panic() at 0xffffffff80461608 = panic+0x228
propagate_priority() at 0xffffffff8049cbde = propagate_priority+0x15e
turnstile_wait() at 0xffffffff8049d278 = turnstile_wait+0x1b8
_mtx_lock_sleep() at 0xffffffff80451af1 = _mtx_lock_sleep+0xf1
_mtx_lock_flags() at 0xffffffff80451c75 = _mtx_lock_flags+0x75
exit1() at 0xffffffff804367de = exit1+0x36e
sys_exit() at 0xffffffff8043731e = sys_exit+0xe
syscallenter() at 0xffffffff8049b324 = syscallenter+0x104
syscall() at 0xffffffff80649bfc = syscall+0x4c
Xfast_syscall() at 0xffffffff806335f2 = Xfast_syscall+0xe2
--- syscall (1, FreeBSD ELF64, sys_exit), rip = 0x300a2df9c, rsp = 0x7ffffffd40c8, rbp = 0x7ffffffd40e0 ---
Uptime: 7m19s

The root cause is that vm_pageout_oom() acquired a reference on a process's vmspace while holding its PROC_LOCK(), and the process then exited. That left vm_pageout_oom() holding the only reference on the vmspace, so when it dropped the reference it called into vm_map_remove() and wound up sleeping while still holding the PROC_LOCK().

This was under FreeBSD 8, but the code in head does not seem to have changed here. I'm not very familiar with the locking here, so I'm not sure how to fix it. Does vm_pageout_oom() need to _PHOLD() the process while holding the PROC_LOCK(), then drop the lock, and only then acquire the vmspace reference? It appears that's how other places that call vmspace_acquire_ref() work.
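
To make the two orderings concrete, here is a small userland model of the race. It is only a sketch: every model_* name is a hypothetical stand-in invented for illustration, not the kernel's definition, and it assumes that dropping the last vmspace reference is the only point that can sleep. It counts how often that final drop happens while the modeled PROC_LOCK() is held.

```c
#include <assert.h>
#include <stdbool.h>

/* Userland model only. These types and helpers are hypothetical
 * stand-ins for illustration, not FreeBSD kernel code. */
struct model_vmspace {
    int refcnt;                 /* models the vmspace reference count */
};

struct model_proc {
    bool locked;                /* models PROC_LOCK() being held */
    int hold;                   /* models the hold count bumped by _PHOLD() */
    struct model_vmspace *vm;
};

/* Number of times the last reference was dropped with the lock held. */
static int sleeps_with_lock;

static void model_proc_lock(struct model_proc *p)
{
    assert(!p->locked);
    p->locked = true;
}

static void model_proc_unlock(struct model_proc *p)
{
    assert(p->locked);
    p->locked = false;
}

/* The exiting process gives up its own vmspace reference. */
static void model_proc_exit_vm(struct model_proc *p)
{
    p->vm->refcnt--;
}

/* Dropping the last reference tears down the map, which may sleep. */
static void model_vmspace_free(struct model_proc *p, struct model_vmspace *vm)
{
    if (--vm->refcnt == 0 && p->locked)
        sleeps_with_lock++;
}

/* Ordering described in the report: reference taken and dropped
 * while the process lock is held. */
int model_oom_scan_current(void)
{
    struct model_vmspace vm = { .refcnt = 1 };
    struct model_proc p = { .locked = false, .hold = 0, .vm = &vm };

    sleeps_with_lock = 0;
    model_proc_lock(&p);
    vm.refcnt++;                    /* models vmspace_acquire_ref() */
    model_proc_exit_vm(&p);         /* process exits in the meantime */
    model_vmspace_free(&p, &vm);    /* last reference, lock still held */
    model_proc_unlock(&p);
    return sleeps_with_lock;
}

/* Ordering asked about: hold the process, drop the lock, and only
 * then take and release the vmspace reference. */
int model_oom_scan_proposed(void)
{
    struct model_vmspace vm = { .refcnt = 1 };
    struct model_proc p = { .locked = false, .hold = 0, .vm = &vm };

    sleeps_with_lock = 0;
    model_proc_lock(&p);
    p.hold++;                       /* models _PHOLD() */
    model_proc_unlock(&p);
    vm.refcnt++;                    /* models vmspace_acquire_ref() */
    model_proc_exit_vm(&p);         /* process exits in the meantime */
    model_vmspace_free(&p, &vm);    /* last reference, lock not held */
    model_proc_lock(&p);
    p.hold--;                       /* models releasing the hold */
    model_proc_unlock(&p);
    assert(p.hold == 0);
    return sleeps_with_lock;
}
```

In this model, model_oom_scan_current() reports one sleep with the lock held and model_oom_scan_proposed() reports none. Whether the hold-then-unlock ordering is the right fix for the real vm_pageout_oom() is still the open question above; the model only shows why the final vmspace_free() must not run under the process lock.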