Date: Fri, 16 Dec 2016 15:19:09 -0800 From: John Baldwin <jhb@freebsd.org> To: Mark Johnston <markj@freebsd.org> Cc: kargl@uw.edu, freebsd-current@freebsd.org, kib@freebsd.org Subject: Re: Revision 309657 to stack_machdep.c renders unbootable system Message-ID: <1848178.kLUBJlL36D@ralph.baldwin.cx> In-Reply-To: <20161215005012.GA84222@wkstn-mjohnston.west.isilon.com> References: <20161214194848.GA881@troutmask.apl.washington.edu> <20161214234804.GA26443@troutmask.apl.washington.edu> <20161215005012.GA84222@wkstn-mjohnston.west.isilon.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wednesday, December 14, 2016 04:50:21 PM Mark Johnston wrote: > On Wed, Dec 14, 2016 at 03:48:04PM -0800, Steven G. Kargl wrote: > > On Wed, Dec 14, 2016 at 02:10:48PM -0800, Mark Johnston wrote: > > > On Wed, Dec 14, 2016 at 12:14:16PM -0800, Mark Johnston wrote: > > > > On Wed, Dec 14, 2016 at 11:49:26AM -0800, Steven G. Kargl wrote: > > > > > Well, after 3 days of bisection, I finally found the commit > > > > > that renders my system unbootable. The system does not panic. > > > > > It simply gets stuck in some state. Nonfunctional keyboard, > > > > > so can't break into debugger. No serial console available. > > > > > The verbose dmesg.boot for a working kernel from revision > > > > > 309656 is at > > > > > > > > > > http://troutmask.apl.washington.edu/~kargl/freebsd/dmesg.309656.txt > > > > > > > > > > The kernel config file is at > > > > > > > > > > http://troutmask.apl.washington.edu/~kargl/freebsd/SPEW.txt > > > > > > > > > > In looking at /usr/src/UPDATING, there is no warning that one > > > > > can create a boat anchor by upgrading to 309657. If compiling > > > > > a kernel with 'options DDB' is no longer supported, this should > > > > > be stated in UPDATING. Or, UPDATING should state that 'options > > > > > DDB' requires 'options STACK'. Or, 'options DDB' should simply > > > > > to the right thing and pull in whatever 'option STACK' does. > > > > > > > > It is supported though - the point of that change was to fix a problem > > > > that occurred when DDB is configured but STACK isn't. While testing I > > > > tried every combination of the two options, and I just tried and > > > > successfully booted a kernel with DDB and !STACK. > > > > > > > > Does the kernel boot successfully if STACK is added to your > > > > configuration? > > > > > > I tried your config (plus virtio drivers) and was able to reproduce the > > > hang in bhyve. Adding STACK "fixed" the hang, as did reverting part of > > > my change to re-add dead code into the kernel. My VM was always hanging > > > after printing > > > > > > 000.000050 [ 426] vtnet_netmap_attach virtio attached txq=1, txd=1024 rxq=1, rxd=1024 > > > > > > Sure enough, removing "device netmap" from your config also fixes the > > > hang. When the hang occurs, I can see with "bhyvectl --get-rip" that > > > we're stuck in DELAY(), but I can't get a stack at that point. I think > > > my change is an innocent bystander - it just happened to expose a latent > > > issue elsewhere. > > > > > > I don't have much more time to look at this right now, but I'll look > > > into it more tonight. > > > > Yes, adding STACK got me to a booting kernel. I can't remember > > why I added netmap to my config file. Re-adding dead code seems to > > point to some memory corruption issue or a rogue pointer. :( > > It's not quite that bad, as it turns out. The key is that > adding/removing the dead code changes the ordering of the items in the > sysinit linker set. I discovered that if the ctl(4) module is > initialized before the vtnet driver attaches, the hang occurs, and > reverting my commit results in a sysinit order where vtnet comes > _before_ ctl(4). So my change triggers the problem just because it > happens to perturb something in the compile-time linker. > > > > > BTW, I think it would be prudent to add something like > > > > 20161206: > > At revision 309657, 'options STACK' was introduced into > > sys/x86/x86/mstack_machdep.c. Old kernel configuration files > > that included 'options DDB' are now required to include also > > 'options STACK'. > > > > to UPDATING or some such wording. I was jumping from circ Oct 10th world > > to top of tree, and got caught by ~3000 commits. > > The issue actually seems to be in 4BSD, and more specifically in r308564 > and r308565. Switching to ULE or reverting either of those two commits > fixes the hang. Here's what happens: > > 1. ctl_init() runs and creates ctl_thresh_thread. This thread's main > loop cause pause(9) when it has no work to do. During boot, pause(9) > just calls DELAY() and does not yield the CPU. > 2. thread0 attaches the vtnet driver. As part of this, it creates and > starts some high-priority taskqueue threads in > _taskqueue_thread_start(). They're added to the scheduler with: > > thread_lock(); > sched_pri(...); > sched_add(...); > thread_unlock(); > > 4BSD's sched_add() will call maybe_preempt() in this case, which as > of r308564 will unconditionally set td_owepreempt in the current > thread. > 3. thread_unlock() will release the critical section held by the current > thread and because td_owepreempt is set, we'll yield the CPU. The > taskqueue threads have nothing to do, but ctl_thresh_thread runs > and ends up busy-waiting in pause() forever. > > r308565 removes a check in maybe_preempt() that would have stopped > td_owepreempt from being set. Before r308564, maybe_preempt() would have > switched directly to the new thread and apparently always switched back > immediately. > > I'm not sure what the correct fix is - jhb might have an idea. I wonder > if pause() should try to yield periodically when called during boot. So the hack in pause() is probably not as necessary now. In particular, I think we only need it for thread0, not for other threads. The patch below worked for me with SPEW's config: Index: kern_synch.c =================================================================== --- kern_synch.c (revision 310128) +++ kern_synch.c (working copy) @@ -321,7 +321,8 @@ pause_sbt(const char *wmesg, sbintime_t sbt, sbint if (sbt == 0) sbt = tick_sbt; - if (cold || kdb_active || SCHEDULER_STOPPED()) { + if ((cold && curthread == &thread0) || kdb_active || + SCHEDULER_STOPPED()) { /* * We delay one second at a time to avoid overflowing the * system specific DELAY() function(s): -- John Baldwin
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1848178.kLUBJlL36D>