Date: Wed, 17 Jul 2002 11:08:24 +1000 (EST)
From: Bruce Evans
To: Andrew Gallatin
Cc: Andrew Kolchoogin
Subject: Re: VOP_GETATTR panic on Alpha
In-Reply-To: <15668.23528.719956.574605@grasshopper.cs.duke.edu>
Message-ID: <20020717103919.D3087-100000@gamplex.bde.org>

On Tue, 16 Jul 2002, Andrew Gallatin wrote:

> Andrew Kolchoogin writes:
>  > Why "panic" from debugger on i386 gives core dump and reboots the system
>  > and "panic" from debugger on Alpha does not?
>
> Because, as BDE says, that crashdumps work at all is mostly accidental.

Er, I meant that working of syncs in panic() is mostly accidental.  Panic
dumps should not be affected, since they should involve little more than
the driver's dump routine, which should not depend on interrupts or
context switching working.  Dump routines must use polling only, and run
with some sort of lock to prevent context switching.  splhigh() is used
in RELENG_4.  sched_lock should probably be used in -current, but there
seems to be only a (null) splhigh().

This could also be just a driver problem.
I know the old wddump routine worked right but am not sure about any of
the current ones.  Maybe dumps are broken on the alpha only due to driver
problems.

Note that the splhigh() didn't actually lock out interrupts in RELENG_4
for drivers broken enough to call tsleep().  The [un]safepri hack in
tsleep() may permit broken dump routines that call tsleep() to "work".
This hack has been lost in -current except for rotted comments which
still say that it is done.

> On alpha, a random kernel thread is waking up, and is unable to go
> back to sleep because of the panicstr hack in msleep:
>
>     mtx_lock_spin(&sched_lock);
>     if (cold || panicstr) {
>         /*
>          * After a panic, or during autoconfiguration,
>          * just give interrupts a chance, then just return;
                  ^^^^^^^^^^^^^^^^^^^^^^^^

This is the rotted comment.  No chance is given here.

>          * don't run any other procs or panic below,
>          * in case this is the idle process and already asleep.
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Looks like more bitrot.  We've learned that the idle process can't call
here.

>          */
>         if (mtx != NULL && priority & PDROP)
>             mtx_unlock(mtx);
>         mtx_unlock_spin(&sched_lock);

The safepri hack (splx(safepri); splx(origpri);) was here instead of these
mtx operations.

>         return (0);
>     }
>
> We need to somehow let only interrupt threads and the panic'ed process
> run after a panic.  I have no idea how to do this in a clean,
> low-impact way.

I don't want to do this since I think there is no clean way to do it.
But crash dumps must work without using interrupt threads, etc.

I think the "right" way to do the sync is to always do a crash dump and
have fsck_*fs recover buffers from it rather than let the panicking kernel
possibly create further damage.  But changing fsck_*fs to do this would
be a lot of work.

Bruce
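
The "polling only" requirement for dump routines can be illustrated with
a minimal user-space sketch in C.  This is not FreeBSD driver code, and it
is not the old wddump routine: struct fake_disk, fake_disk_ready(),
fake_disk_start_write(), polled_dump(), NBLOCKS and POLL_LIMIT are all
hypothetical names invented for the illustration.  The only point it
tries to show is that the writer never sleeps and never waits for an
interrupt; it spins on a status check with a bound and gives up on
timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 512
#define NBLOCKS    8      /* size of the pretend dump partition */
#define POLL_LIMIT 1000   /* hypothetical bound on status polls per wait */

/* Hypothetical device model standing in for real controller registers. */
struct fake_disk {
	uint8_t  media[NBLOCKS][BLOCK_SIZE]; /* pretend dump partition */
	unsigned busy_polls;  /* polls the device stays busy after a write */
	unsigned busy_left;   /* busy polls remaining for the current write */
};

/* "Read the status register": true once the device is idle. */
static bool
fake_disk_ready(struct fake_disk *d)
{
	if (d->busy_left > 0) {
		d->busy_left--;
		return (false);
	}
	return (true);
}

/* Start a one-block write; the device then reports busy for a while. */
static void
fake_disk_start_write(struct fake_disk *d, size_t blkno, const uint8_t *buf)
{
	memcpy(d->media[blkno], buf, BLOCK_SIZE);
	d->busy_left = d->busy_polls;
}

/* Spin until the device is idle; -1 if it stays busy for POLL_LIMIT polls. */
static int
poll_until_ready(struct fake_disk *d)
{
	unsigned polls = 0;

	while (!fake_disk_ready(d)) {
		if (++polls >= POLL_LIMIT)
			return (-1);
	}
	return (0);
}

/*
 * Write nblocks blocks from mem to the device using polling only.
 * Nothing here sleeps or depends on an interrupt being delivered.
 * Returns 0 on success, -1 on a bad request or a poll timeout.
 */
int
polled_dump(struct fake_disk *d, const uint8_t *mem, size_t nblocks)
{
	assert(d != NULL && mem != NULL);
	if (nblocks > NBLOCKS)
		return (-1);
	for (size_t blk = 0; blk < nblocks; blk++) {
		if (poll_until_ready(d) != 0)
			return (-1);
		fake_disk_start_write(d, blk, mem + blk * BLOCK_SIZE);
	}
	/* Wait for the last write to finish before reporting success. */
	return (poll_until_ready(d));
}
```

A caller would zero a struct fake_disk, set busy_polls, and pass a buffer
of nblocks * BLOCK_SIZE bytes to polled_dump().  A real dump routine would
additionally have to hold whatever lock prevents context switching, which
this sketch does not model.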