Date: Sat, 25 Oct 1997 22:01:39 -0400 (EDT)
From: Thomas David Rivers <rivers@dignus.com>
To: dillon@best.net, Tor.Egge@idi.ntnu.no
Cc: freebsd-bugs@hub.freebsd.org
Subject: re: kern/4844: vm lookup, endless loop in vm_map_lookup_entry()
Message-ID: <199710260201.WAA18841@lakes.dignus.com>
>
> > hmm (looking at pr4630). this looks like a rather serious problem
> > considering the core nature of brelse(). this may be responsible
> > for several other crashes we have had involving "biodone: buffer already
> > done" panics. we've had four or five of those.
>
> I've also experienced some panics of the form
>
> biodone: buffer already done
> biodone: buffer already done
> biodone: buffer already done
> biodone: buffer already done
> biodone: buffer already done
> panic: biodone: buffer not busy
Yes - I've seen the 'already done' messages on my "reliable reproduction"
machine.
>
> The last panic was on a kernel where extra sanity checks should cause
> an earlier panic if vm_map_entry_create or vm_map_entry_dispose was
> called during an interrupt. Thus this is a different problem.
>
> biodone seems to be called several times on the same buffer, probably
> due to a bug in the low level device driver (ahc).
>
> Use of the ccd device makes debugging more difficult, as the buffer
> has been freed and reused for other purposes in the meantime.
>
> 254 operations in progress is clearly *wrong* (in the scsi_link structure).
>
> IdlePTD 22b000
> current pcb at 20e21c
> panic: biodone: buffer not busy
> (kgdb) where
> #0 boot (howto=260) at ../../kern/kern_shutdown.c:266
> #1 0xe0117676 in panic (fmt=0xe0131d6d "biodone: buffer not busy")
> at ../../kern/kern_shutdown.c:393
> #2 0xe0131ec8 in biodone (bp=0xe4098000) at ../../kern/vfs_bio.c:1754
> #3 0xe019849c in scsi_done (xs=0xe3071d00) at ../../scsi/scsi_base.c:450
> #4 0xe01f295c in ahc_done (ahc=0xe2fa5000, scb=0xe259f880)
> at ../../i386/scsi/aic7xxx.c:1969
> #5 0xe01f04c7 in ahc_intr (arg=0xe2fa5000) at ../../i386/scsi/aic7xxx.c:843
> #6 0xe01ccffc in splx (ipl=0) at ../../i386/isa/ipl_funcs.c:93
> #7 0xe011915d in tsleep (ident=0xe01fb1d0, priority=22,
> wmesg=0xe011704b "cpu0wt", timo=10) at ../../kern/kern_synch.c:329
> #8 0xe01171d0 in boot (howto=256) at ../../kern/kern_shutdown.c:186
> #9 0xe0117676 in panic (fmt=0xe0131d6d "biodone: buffer not busy")
> at ../../kern/kern_shutdown.c:393
> #10 0xe0131ec8 in biodone (bp=0xe3d6db00) at ../../kern/vfs_bio.c:1754
> #11 0xe019849c in scsi_done (xs=0xe304c800) at ../../scsi/scsi_base.c:450
> #12 0xe01f295c in ahc_done (ahc=0xe2fa5000, scb=0xe259f9e0)
> at ../../i386/scsi/aic7xxx.c:1969
> #13 0xe01f04c7 in ahc_intr (arg=0xe2fa5000) at ../../i386/scsi/aic7xxx.c:843
> #14 0xed66 in ?? ()
> #15 0xa1f9 in ?? ()
> #16 0xc3f3 in ?? ()
> #17 0x1095 in ?? ()
> (kgdb) up 2
> #2 0xe0131ec8 in biodone (bp=0xe4098000) at ../../kern/vfs_bio.c:1754
> (kgdb) up 1
> #3 0xe019849c in scsi_done (xs=0xe3071d00) at ../../scsi/scsi_base.c:450
> (kgdb) print *xs
> $1 = {next = 0xe304c800, flags = 2097, sc_link = 0xe25a1900,
> retries = 4 '\004', spare = "Ç\001À", timeout = 10000, cmd = 0xe3071d58,
> cmdlen = 10, data = 0xe702d000 "", datalen = 8192, resid = 0, error = 0,
> bp = 0xe4098000, sense = {error_code = 80 'P', ext = {unextended = {
> blockhi = 48 '0', blockmed = 194 'Â', blocklow = 0 '\000'},
> extended = {segment = 48 '0', flags = 194 'Â', info = "\000`\020\020",
> extra_len = 64 '@',
> extra_bytes = "O\000^\b¢\000\\*\211[\000\002\230ñ0\b-ÝÓ\000rÝ \004"}}},
> req_sense_length = -2147483638, status = 0, cmdstore = {opcode = 42 '*',
> bytes = "\000\000\025?ô\000\000\020\000\001\001"}}
> (kgdb) print *xs->sc_link
> $2 = {target = 3 '\003', lun = 0 '\000', adapter_targ = 7 '\a',
> adapter_unit = 1 '\001', adapter_bus = 0 '\000', scsibus = 1 '\001',
> dev_unit = 7 '\a', opennings = 6 '\006', active = 254 'þ', flags = 4101,
> quirks = 0, adapter = 0xe020c474, device = 0xe01ffd88, active_xs = 0x0,
> fordriver = 0x0, devmodes = 0x0, dev = 3384, sd = 0xe259fe40, inqbuf = {
> device = 0 '\000', dev_qual2 = 0 '\000', version = 2 '\002',
> response_format = 2 '\002', additional_length = 91 '[', unused = "\000",
> flags = 62 '>', vendor = "QUANTUM ", product = "XP34550W ",
> revision = "LXY4", extra = "1847051"}, adapter_softc = 0xe2fa5000}
> (kgdb)
>
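A note on the 'active = 254' value in the dump above: kgdb prints it as
an 8-bit unsigned quantity. One possible reading (an assumption, not
something confirmed in this thread) is that extra completions decremented
the counter past zero. Below is a minimal C sketch of that arithmetic.
The helpers sketch_complete() and sketch_complete_n() are hypothetical
and are not taken from the SCSI code.

```c
#include <stdint.h>

/*
 * Hypothetical model of an 8-bit "operations in progress" counter.
 * This is not the scsi_link code; it only shows the wraparound.
 */
static uint8_t
sketch_complete(uint8_t active)
{
	/* Unsigned 8-bit arithmetic wraps modulo 256. */
	return (uint8_t)(active - 1);
}

/* Apply n completions to a starting count. */
static uint8_t
sketch_complete_n(uint8_t active, int n)
{
	while (n-- > 0)
		active = sketch_complete(active);
	return active;
}
```

In this model, two completions against an idle counter (0) leave 254,
which matches the dump, although other histories would produce the same
value.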
> > it sounds to me that a slight modification to the pr4630 suggestion
> > would work. rather than call bfreekva(), brelse() puts the bp on a
> > deferred free list, yes, but why not clear out this list from
> > getnewbuf()? i don't particularly see the need for a high priority
> > kernel process or other complexity.
>
> I agree. The last suggested patch in PR#4630 does not even have a
> deferred list. Using a deferred list is a more robust (and more complex)
> solution.
>
> > if getnewbuf() (called by getblk()) is not called from an interrupt,
> > we are home free. i don't think anyone else allocates out of the
> > buffer_map so the deferred frees would not create a secondary effect
> > anywhere else.
>
> If getnewbuf() is called from an interrupt, something is seriously
> broken.
>
> - Tor Egge
>
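The deferred-free idea discussed above can be sketched in user-space C
as follows. This is only an illustration under stated assumptions, not
the patch from PR#4630 and not kernel code: struct sketch_buf,
sketch_brelse(), sketch_bfreekva() and sketch_drain_deferred() are
hypothetical stand-ins, malloc()/free() stand in for buffer_map
allocations, and the interrupt blocking that a real kernel would need
around the list is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the kernel's struct buf. */
struct sketch_buf {
	void *kva;				/* stands in for the buffer's KVA */
	struct sketch_buf *deferred_next;	/* link on the deferred list */
};

static struct sketch_buf *deferred_head;	/* frees still pending */
static int kva_frees;				/* frees actually performed */

/* Stand-in for bfreekva(): the only place that releases the mapping. */
static void
sketch_bfreekva(struct sketch_buf *bp)
{
	assert(bp->kva != NULL);	/* catch a double free early */
	free(bp->kva);
	bp->kva = NULL;
	kva_frees++;
}

/*
 * Stand-in for brelse(): it may run at interrupt time, so it never
 * touches the map and only queues the buffer.
 */
static void
sketch_brelse(struct sketch_buf *bp)
{
	bp->deferred_next = deferred_head;
	deferred_head = bp;
}

/*
 * Stand-in for the start of getnewbuf(): it runs outside interrupt
 * context and drains the queue. Returns the number of buffers freed.
 */
static int
sketch_drain_deferred(void)
{
	int n = 0;

	while (deferred_head != NULL) {
		struct sketch_buf *bp = deferred_head;

		deferred_head = bp->deferred_next;
		bp->deferred_next = NULL;
		sketch_bfreekva(bp);
		n++;
	}
	return n;
}
```

The point of the split is that the release path only performs a list
insertion, and all map manipulation happens in the one caller that is
assumed never to run from an interrupt.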
