Date: Mon, 13 Mar 2017 09:21:20 +0100 From: Peter Holm <peter@holm.cc> To: Mark Johnston <markj@FreeBSD.org> Cc: freebsd-hackers@FreeBSD.org Subject: Re: draining high-frequency callouts Message-ID: <20170313082120.GA44651@x2.osted.lan> In-Reply-To: <20170110205711.GA86449@wkstn-mjohnston.west.isilon.com> References: <20170110205711.GA86449@wkstn-mjohnston.west.isilon.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Jan 10, 2017 at 12:57:12PM -0800, Mark Johnston wrote: > I'm occasionally seeing an assertion failure in softclock_call_cc() when > running DTrace tests on a system with hz=10000. The assertion > (c->c_flags & CALLOUT_ACTIVE) != 0 is failing while a thread is > concurrently draining the callout, which runs at a high frequency. At > the time of the panic, that thread is spinning on the per-CPU callout > lock after having been awoken from "codrain", and CALLOUT_PENDING is > set on the callout. The callout is direct, i.e., it is executed in hard > interrupt context. > > I think this is what's happening: > - callout_drain() is called while the callout is executing but after the > callout has rescheduled itself, and goes to sleep after having cleared > CALLOUT_ACTIVE. > - softclock_call_cc() wakes up the callout_drain() caller, but the > callout fires again before the caller is scheduled. > - the second softclock_call_cc() call sees that CALLOUT_ACTIVE is > cleared and panics. > > Is there anything that prevents this scenario? Is it really correct to > leave CALLOUT_ACTIVE cleared when the per-CPU callout lock must be > dropped in order to acquire a sleepqueue lock? > Is this the same problem? panic: softclock_call_cc: act 0xfffff8000de64800 0 cpuid = 10 time = 1489389893 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0f984c9660 vpanic() at vpanic+0x19c/frame 0xfffffe0f984c96e0 kassert_panic() at kassert_panic+0x126/frame 0xfffffe0f984c9750 softclock_call_cc() at softclock_call_cc+0xae/frame 0xfffffe0f984c98f0 softclock() at softclock+0x12c/frame 0xfffffe0f984c9930 intr_event_execute_handlers() at intr_event_execute_handlers+0x248/frame 0xfffffe0f984c9980 ithread_execute_handlers() at ithread_execute_handlers+0x47/frame 0xfffffe0f984c99b0 ithread_loop() at ithread_loop+0xfc/frame 0xfffffe0f984c9a30 fork_exit() at fork_exit+0x13b/frame 0xfffffe0f984c9ab0 https://people.freebsd.org/~pho/stress/log/kevent7-2.txt I first spotted this @ 12.0-CURRENT #2 r305652M: Fri Sep 9 13:09:03 CEST 2016 It's quite easy to reproduce. - Peter
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20170313082120.GA44651>