Date: Sat, 6 Jul 2002 16:19:28 -0400 (EDT)
From: Andrew Gallatin <gallatin@cs.duke.edu>
To: Bruce Evans <bde@zeta.org.au>
Cc: Julian Elischer <julian@elischer.org>, <freebsd-current@FreeBSD.ORG>
Subject: Re: more on dumping
Message-ID: <15655.20688.169088.756630@grasshopper.cs.duke.edu>
In-Reply-To: <20020707024114.A5419-100000@gamplex.bde.org>
References: <15654.65479.31155.182179@grasshopper.cs.duke.edu> <20020707024114.A5419-100000@gamplex.bde.org>

Bruce Evans writes:
 > On Sat, 6 Jul 2002, Andrew Gallatin wrote:
 >
 > > Julian Elischer writes:
 > > > On Sat, 6 Jul 2002, Andrew Gallatin wrote:
 > > > > OK, current is really confusing me. When we are panic'ing and syncing
 > > > > disks, how are we supposed to come back to the current thread which
 > > > > caused the dump after we do an mi_switch() to allow an interrupt
 > > > > thread to run?
 > > >
 > > > It depends.
 > > >
 > > > the previous thread should have been put back onto the run queue
 > > > before the interrupt thread was scheduled.
 > >
 > > Could it have anything to do with interrupt preemption being disabled on
 > > alpha & enabled on i386?
 >
 > Very likely.
 >
 > Bruce

Unfortunately, that wasn't it. After reverting all my local hacks, I
see that the system ends up here:

db> tr
siointr1() at siointr1+0x198
siointr() at siointr+0x40
isa_handle_fast_intr() at isa_handle_fast_intr+0x24
alpha_dispatch_intr() at alpha_dispatch_intr+0xd0
interrupt() at interrupt+0x110
XentInt() at XentInt+0x28
--- interrupt (from ipl 0) ---
_mtx_unlock_flags() at _mtx_unlock_flags+0x8c
kthread_suspend_check() at kthread_suspend_check+0xbc
buf_daemon() at buf_daemon+0x80
fork_exit() at fork_exit+0xe0
exception_return() at exception_return
--- root of call graph ---

I think that the buf_daemon just happened to wake up at the wrong
time, and the panicstr hacks in msleep prevent it from ever going back
to sleep again once it is awake. Now that I realize this, I suspect
the same thing happened with the random_kthread that I was talking
about earlier. Perhaps there is something about alpha's hz being 1024
which makes it more likely to lose whatever race is won on i386.

Humorously enough, if I clear panicstr in panic(), then crashes work
(for a loose definition of work, who knows what they mean!), with the
added "benefit" of marking the filesystems clean:

panic: vm_page_wakeup: page not busy!!!
panic
Stopped at Debugger+0x34: zapnot v0,#0xf,v0 <v0=0x0>
db> c
Waiting (max 60 seconds) for system process `vnlru' to stop...stopped
Waiting (max 60 seconds) for system process `bufdaemon' to stop...stopped
Waiting (max 60 seconds) for system process `syncer' to stop...stopped
syncing disks... 1 1 done
Uptime: 3m17s
Dumping 509 MB
pid 569 (scp), uid 1387: exited on signal 4 (core dumped)
pid 539 (tcsh), uid 1387: exited on signal 4 (core dumped)
pid 538 (sshd), uid 1387: exited on signal 4
pid 536 (sshd), uid 0: exited on signal 4
pid 481 (sshd), uid 0: exited on signal 4
pid 442 (ntpd), uid 0: exited on signal 4
 16 32 48 64 80 96 112pid 530 (cron), uid 0: exited on signal 4
 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...

Maybe we need to strengthen the panicstr hacks and only allow the
thread which caused the crash and interrupt threads to be scheduled
once a panic occurs.

Drew
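
To make the suggestion about strengthening the panicstr hacks concrete,
here is a rough user-space sketch of the rule "after a panic, only the
panicking thread and interrupt threads may be scheduled". It is a toy
model under stated assumptions, not kernel code: struct model_thread,
model_panicstr, model_panic_td, model_panic() and model_may_run() are
hypothetical names made up for illustration, and none of them is
FreeBSD's real scheduler or panic interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a kernel thread (not struct thread). */
struct model_thread {
    const char *name;
    bool is_ithread;              /* true for interrupt threads */
};

/* Toy equivalents of the kernel's panic state (names are assumptions). */
static const char *model_panicstr;                /* non-NULL once a panic began */
static const struct model_thread *model_panic_td; /* thread that panicked */

/* Record the panic message and remember which thread caused the panic. */
void model_panic(const struct model_thread *td, const char *msg)
{
    assert(td != NULL && msg != NULL);
    model_panicstr = msg;
    model_panic_td = td;
}

/*
 * Scheduling filter: before a panic every thread may run; afterwards only
 * the panicking thread and interrupt threads are allowed, so a daemon that
 * happens to wake up at the wrong time cannot take over the CPU.
 */
bool model_may_run(const struct model_thread *td)
{
    if (model_panicstr == NULL)
        return true;
    return td == model_panic_td || td->is_ithread;
}
```

In this model, once model_panic() has been called on behalf of some
thread, model_may_run() returns false for an ordinary kernel daemon such
as a stand-in for bufdaemon, and true for the panicking thread and for
any thread marked is_ithread. Where such a check would actually belong
in the kernel (the run queue code, mi_switch(), or elsewhere) is left
open here.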