FreeBSD Mail Archives

Date:      Sat, 6 Jul 2002 16:19:28 -0400 (EDT)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        Julian Elischer <julian@elischer.org>, <freebsd-current@FreeBSD.ORG>
Subject:   Re: more on dumping
Message-ID:  <15655.20688.169088.756630@grasshopper.cs.duke.edu>
In-Reply-To: <20020707024114.A5419-100000@gamplex.bde.org>
References:  <15654.65479.31155.182179@grasshopper.cs.duke.edu> <20020707024114.A5419-100000@gamplex.bde.org>

Bruce Evans writes:
 > On Sat, 6 Jul 2002, Andrew Gallatin wrote:
 > 
 > > Julian Elischer writes:
 > >  > On Sat, 6 Jul 2002, Andrew Gallatin wrote:
 > >  > > OK, current is really confusing me.  When we are panic'ing and syncing
 > >  > > disks, how are we supposed to come back to the current thread which
 > >  > > caused the dump after we do an mi_switch() to allow an interrupt
 > >  > > thread to run?
 > >  >
 > >  > It depends.
 > >  >
 > >  > the previous thread should have been put back onto the run queue
 > >  > before the interrupt thread was scheduled.
 > >
 > > Could it have anything to do with interrupt preemption being disabled on
 > > alpha & enabled on i386?
 > 
 > Very likely.
 > 
 > Bruce

Unfortunately, that wasn't it.

After reverting all my local hacks, I see that the system ends up
here:

db> tr
siointr1() at siointr1+0x198
siointr() at siointr+0x40
isa_handle_fast_intr() at isa_handle_fast_intr+0x24
alpha_dispatch_intr() at alpha_dispatch_intr+0xd0
interrupt() at interrupt+0x110
XentInt() at XentInt+0x28
--- interrupt (from ipl 0) ---
_mtx_unlock_flags() at _mtx_unlock_flags+0x8c
kthread_suspend_check() at kthread_suspend_check+0xbc
buf_daemon() at buf_daemon+0x80
fork_exit() at fork_exit+0xe0
exception_return() at exception_return
--- root of call graph ---

I think that the buf_daemon just happened to wake up at the wrong
time, and the panicstr hacks in msleep prevent it from ever going back
to sleep again once it is awake.  Now that I realize this, I suspect
the same thing happened with the random_kthread that I was talking
about earlier.
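
[A toy model of the behaviour described above, not the actual kernel
code: once panicstr is set, a sleep request returns at once, so any
thread that wakes up stays runnable. The names toy_thread and
toy_msleep are hypothetical and exist only for this sketch.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the kernel's global panic string; NULL means no panic. */
static const char *panicstr = NULL;

/* Hypothetical thread record, for illustration only. */
struct toy_thread {
	const char *name;
	bool asleep;
};

/*
 * Toy model of the short-circuit: after a panic, a sleep request
 * returns immediately and the caller remains runnable.
 * Returns true only if the thread really went to sleep.
 */
static bool
toy_msleep(struct toy_thread *td)
{
	assert(td != NULL);
	if (panicstr != NULL)
		return (false);		/* never block once we have panicked */
	td->asleep = true;
	return (true);
}
```

[In this model a thread such as bufdaemon that happens to be awake when
panicstr is set keeps getting a false return from toy_msleep() and so
keeps competing for the CPU with the thread doing the dump.]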

Perhaps there is something about alpha's hz being 1024 which is making
it more likely to lose whatever race is won on i386.

Humorously enough, if I clear panicstr in panic(), then crashes work
(for a loose definition of work, who knows what they mean!), with the
added "benefit" of marking the filesystems clean:

panic: vm_page_wakeup: page not busy!!!
panic
Stopped at      Debugger+0x34:  zapnot  v0,#0xf,v0      <v0=0x0>
db> c
Waiting (max 60 seconds) for system process `vnlru' to stop...stopped
Waiting (max 60 seconds) for system process `bufdaemon' to stop...stopped
Waiting (max 60 seconds) for system process `syncer' to stop...stopped

syncing disks... 1 1 
done
Uptime: 3m17s
Dumping 509 MB
pid 569 (scp), uid 1387: exited on signal 4 (core dumped)
pid 539 (tcsh), uid 1387: exited on signal 4 (core dumped)
pid 538 (sshd), uid 1387: exited on signal 4
pid 536 (sshd), uid 0: exited on signal 4
pid 481 (sshd), uid 0: exited on signal 4
pid 442 (ntpd), uid 0: exited on signal 4
 16 32 48 64 80 96 112pid 530 (cron), uid 0: exited on signal 4
 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384
 400 416 432 448 464 480 496
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...

Maybe we need to strengthen the panicstr hacks and only allow the
thread which caused the crash and interrupt threads to be
scheduled once a panic occurs.
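
[A minimal sketch of that idea, under stated assumptions: the scheduler
consults a predicate before picking a thread, and there is a record of
which thread panicked. The names sched_thread, panic_thread and
toy_may_run are hypothetical; nothing here is existing kernel API.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thread classes for this sketch. */
enum toy_kind { TOY_NORMAL, TOY_ITHREAD };

struct sched_thread {
	int id;
	enum toy_kind kind;
};

/* Assumed to be set by panic() to the thread that crashed; NULL before. */
static const struct sched_thread *panic_thread = NULL;

/*
 * After a panic, only the panicking thread and interrupt threads
 * may be chosen to run. Before a panic, every thread is eligible.
 */
static bool
toy_may_run(const struct sched_thread *td)
{
	assert(td != NULL);
	if (panic_thread == NULL)
		return (true);
	return (td == panic_thread || td->kind == TOY_ITHREAD);
}
```

[With a filter like this, a normal kernel thread such as bufdaemon would
be skipped after the panic, while interrupt threads could still run to
service the disk during the dump.]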

Drew

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?15655.20688.169088.756630>
