Date: Mon, 19 Jan 2015 17:56:24 -0500 From: Ryan Stone <rysto32@gmail.com> To: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org> Subject: Sleeping thread held mutex in vm_pageout_oom() Message-ID: <CAFMmRNxz252HMWWBmRf=Z69zh2_w9cD5e1AZGeizyagKezm2Hw@mail.gmail.com>
next in thread | raw e-mail | index | archive | help
I recently had a system where a DIMM failed and the OOM killer was constantly kicking in due to a memory-hungry daemon being constantly restarted. This ended up triggering a race condition in the OOM killer leading to this panic: Sleeping thread (tid 100075, pid 8) owns a non-sleepable lock sched_switch() at 0xffffffff8048386d = sched_switch+0x16d mi_switch() at 0xffffffff80469dd6 = mi_switch+0x186 sleepq_wait() at 0xffffffff80499204 = sleepq_wait+0x44 __lockmgr_args() at 0xffffffff8044b88b = __lockmgr_args+0x67b vop_stdlock() at 0xffffffff804d3689 = vop_stdlock+0x39 ---Type <return> to continue, or q <return> to quit--- VOP_LOCK1_APV() at 0xffffffff8069da42 = VOP_LOCK1_APV+0x52 _vn_lock() at 0xffffffff804ed627 = _vn_lock+0x47 vm_object_deallocate() at 0xffffffff8061eef3 = vm_object_deallocate+0x203 vm_map_entry_deallocate() at 0xffffffff80616d2c = vm_map_entry_deallocate+0x4c vm_map_process_deferred() at 0xffffffff80616d62 = vm_map_process_deferred+0x32 vm_map_remove() at 0xffffffff806183ff = vm_map_remove+0x6f vmspace_free() at 0xffffffff80619206 = vmspace_free+0x56 vm_pageout_oom() at 0xffffffff806230d1 = vm_pageout_oom+0x181 vm_pageout() at 0xffffffff8062410b = vm_pageout+0x90b fork_exit() at 0xffffffff8043a382 = fork_exit+0x112 fork_trampoline() at 0xffffffff8063385e = fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffff80c3be1d00, rbp = 0 --- panic: sleeping thread cpuid = 5 curthread = grep/grep (82989/100544) cpu_ticks = 1848294656444 KDB: stack backtrace: db_trace_self_wrapper() at 0xffffffff801e52ba = db_trace_self_wrapper+0x2a panic() at 0xffffffff80461608 = panic+0x228 propagate_priority() at 0xffffffff8049cbde = propagate_priority+0x15e turnstile_wait() at 0xffffffff8049d278 = turnstile_wait+0x1b8 _mtx_lock_sleep() at 0xffffffff80451af1 = _mtx_lock_sleep+0xf1 ---Type <return> to continue, or q <return> to quit--- _mtx_lock_flags() at 0xffffffff80451c75 = _mtx_lock_flags+0x75 exit1() at 0xffffffff804367de = exit1+0x36e sys_exit() at 0xffffffff8043731e = sys_exit+0xe syscallenter() at 0xffffffff8049b324 = syscallenter+0x104 syscall() at 0xffffffff80649bfc = syscall+0x4c Xfast_syscall() at 0xffffffff806335f2 = Xfast_syscall+0xe2 --- syscall (1, FreeBSD ELF64, sys_exit), rip = 0x300a2df9c, rsp = 0x7ffffffd40c8, rbp = 0x7ffffffd40e0 --- Uptime: 7m19s The root cause is that vm_pageout_oom() acquires a reference on a process's vmspace while holding its PROC_LOCK(), then the process exited. This left vm_pageout_oom() holding the only reference on the vmspace, so when it dropped the reference it called into vm_map_remove() and wound up sleeping while still holding the PROC_LOCK(). This was under FreeBSD 8 but the code in head does not seem to have changed here. I'm not quite familiar with the lock mechanisms here so I'm not sure how to fix it. Does vm_pageout_oom() need to _PHOLD() the process while holding the PROC_LOCK(), then drop the lock, then acquire the vmspace reference? It appears that's how other places that call vmspace_acquire_ref() work.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAFMmRNxz252HMWWBmRf=Z69zh2_w9cD5e1AZGeizyagKezm2Hw>