Date: Sat, 25 Oct 1997 22:01:39 -0400 (EDT)
From: Thomas David Rivers <rivers@dignus.com>
To: dillon@best.net, Tor.Egge@idi.ntnu.no
Cc: freebsd-bugs@hub.freebsd.org
Subject: re: kern/4844: vm lookup, endless loop in vm_map_lookup_entry()
Message-ID: <199710260201.WAA18841@lakes.dignus.com>

>
> > hmm (looking at pr4630). this looks like a rather serious problem
> > considering the core nature of brelse(). this may be responsible
> > for several other crashes we have had involving "biodone: buffer already
> > done" panics. we've had four or five of those.
>
> I've also experienced some panics of the form
>
>     biodone: buffer already done
>     biodone: buffer already done
>     biodone: buffer already done
>     biodone: buffer already done
>     biodone: buffer already done
>     panic: biodone: buffer not busy

 Yes - I've seen these messages on my "reliable reproduction" machine...
 (the 'already done' messages).

>
> The last panic was on a kernel where extra sanity checks should cause
> an earlier panic if vm_map_entry_create or vm_map_entry_dispose was
> called during an interrupt. thus this is a different problem.
>
> biodone seems to be called several times on the same buffer, probably
> due to a bug in the low level device driver (ahc).
>
> Use of the ccd device makes debugging more difficult, as the buffer
> has been freed and reused for other purposes in the meantime.
>
> 254 operations in progress is clearly *wrong* (in the scsi_link structure).
>
> IdlePTD 22b000
> current pcb at 20e21c
> panic: biodone: buffer not busy
> (kgdb) where
> #0  boot (howto=260) at ../../kern/kern_shutdown.c:266
> #1  0xe0117676 in panic (fmt=0xe0131d6d "biodone: buffer not busy")
>     at ../../kern/kern_shutdown.c:393
> #2  0xe0131ec8 in biodone (bp=0xe4098000) at ../../kern/vfs_bio.c:1754
> #3  0xe019849c in scsi_done (xs=0xe3071d00) at ../../scsi/scsi_base.c:450
> #4  0xe01f295c in ahc_done (ahc=0xe2fa5000, scb=0xe259f880)
>     at ../../i386/scsi/aic7xxx.c:1969
> #5  0xe01f04c7 in ahc_intr (arg=0xe2fa5000) at ../../i386/scsi/aic7xxx.c:843
> #6  0xe01ccffc in splx (ipl=0) at ../../i386/isa/ipl_funcs.c:93
> #7  0xe011915d in tsleep (ident=0xe01fb1d0, priority=22,
>     wmesg=0xe011704b "cpu0wt", timo=10) at ../../kern/kern_synch.c:329
> #8  0xe01171d0 in boot (howto=256) at ../../kern/kern_shutdown.c:186
> #9  0xe0117676 in panic (fmt=0xe0131d6d "biodone: buffer not busy")
>     at ../../kern/kern_shutdown.c:393
> #10 0xe0131ec8 in biodone (bp=0xe3d6db00) at ../../kern/vfs_bio.c:1754
> #11 0xe019849c in scsi_done (xs=0xe304c800) at ../../scsi/scsi_base.c:450
> #12 0xe01f295c in ahc_done (ahc=0xe2fa5000, scb=0xe259f9e0)
>     at ../../i386/scsi/aic7xxx.c:1969
> #13 0xe01f04c7 in ahc_intr (arg=0xe2fa5000) at ../../i386/scsi/aic7xxx.c:843
> #14 0xed66 in ?? ()
> #15 0xa1f9 in ?? ()
> #16 0xc3f3 in ?? ()
> #17 0x1095 in ?? ()
> (kgdb) up 2
> #2  0xe0131ec8 in biodone (bp=0xe4098000) at ../../kern/vfs_bio.c:1754
> (kgdb) up 1
> #3  0xe019849c in scsi_done (xs=0xe3071d00) at ../../scsi/scsi_base.c:450
> (kgdb) print *xs
> $1 = {next = 0xe304c800, flags = 2097, sc_link = 0xe25a1900,
>   retries = 4 '\004', spare = "Ç\001À", timeout = 10000, cmd = 0xe3071d58,
>   cmdlen = 10, data = 0xe702d000 "", datalen = 8192, resid = 0, error = 0,
>   bp = 0xe4098000, sense = {error_code = 80 'P', ext = {unextended = {
>         blockhi = 48 '0', blockmed = 194 'Â', blocklow = 0 '\000'},
>       extended = {segment = 48 '0', flags = 194 'Â', info = "\000`\020\020",
>         extra_len = 64 '@',
>         extra_bytes = "O\000^\b¢\000\\*\211[\000\002\230ñ0\b-ÝÓ\000rÝ \004"}}}, req_sense_length = -2147483638, status = 0, cmdstore = {opcode = 42 '*',
>     bytes = "\000\000\025?ô\000\000\020\000\001\001"}}
> (kgdb) print *xs->sc_link
> $2 = {target = 3 '\003', lun = 0 '\000', adapter_targ = 7 '\a',
>   adapter_unit = 1 '\001', adapter_bus = 0 '\000', scsibus = 1 '\001',
>   dev_unit = 7 '\a', opennings = 6 '\006', active = 254 'þ', flags = 4101,
>   quirks = 0, adapter = 0xe020c474, device = 0xe01ffd88, active_xs = 0x0,
>   fordriver = 0x0, devmodes = 0x0, dev = 3384, sd = 0xe259fe40, inqbuf = {
>     device = 0 '\000', dev_qual2 = 0 '\000', version = 2 '\002',
>     response_format = 2 '\002', additional_length = 91 '[', unused = "\000",
>     flags = 62 '>', vendor = "QUANTUM ", product = "XP34550W ",
>     revision = "LXY4", extra = "1847051"}, adapter_softc = 0xe2fa5000}
> (kgdb)
>
> > it sounds to me that a slight modification to the pr4630 suggestion
> > would work. rather than call bfreekva(), brelse() puts the bp on a
> > deferred free list, yes, but why not clear out this list from
> > getnewbuf()? i don't particularly see the need for a high priority
> > kernel process or other complexity.
>
> I agree. The last suggested patch in PR#4630 does not even have a
> deferred list. Using a deferred list is a more robust (and more complex)
> solution.
>
> > if getnewbuf() (called by getblk()) is not called from an interrupt,
> > we are home free. i don't think anyone else allocates out of the
> > buffer_map so the deferred frees would not create a secondary effect
> > anywhere else.
>
> If getnewbuf() is called from an interrupt, something is seriously
> broken.
>
> - Tor Egge
>
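
 For reference, the deferred free list discussed above could look roughly
 like the userland sketch below. This is not the PR#4630 patch and not
 kernel code: struct sketch_buf, the sketch_* functions, and the
 in_interrupt flag are hypothetical stand-ins for struct buf, brelse(),
 getnewbuf(), bfreekva(), and real interrupt context. It only shows the
 shape of the idea: the release path queues the buffer, and the
 allocation path, which is assumed never to run from an interrupt, does
 the actual freeing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct buf; only what the sketch needs. */
struct sketch_buf {
    struct sketch_buf *next_deferred; /* link on the deferred-free list */
    int has_kva;                      /* nonzero while KVA is still held */
};

static struct sketch_buf *deferred_head; /* buffers awaiting a KVA free */
static int in_interrupt;                 /* simulated interrupt context */
static int kva_frees;                    /* number of real frees performed */

/* Plays the role of bfreekva(): must never run at interrupt time. */
static void sketch_free_kva(struct sketch_buf *bp)
{
    assert(!in_interrupt);
    bp->has_kva = 0;
    kva_frees++;
}

/*
 * Plays the role of brelse(): safe from any context because it only
 * links the buffer onto the list. A real kernel would also have to
 * block interrupts around the list update; that is omitted here.
 */
static void sketch_brelse(struct sketch_buf *bp)
{
    bp->next_deferred = deferred_head;
    deferred_head = bp;
}

/* Plays the role of the start of getnewbuf(): drain the deferred list. */
static void sketch_getnewbuf_drain(void)
{
    assert(!in_interrupt);
    while (deferred_head != NULL) {
        struct sketch_buf *bp = deferred_head;
        deferred_head = bp->next_deferred;
        bp->next_deferred = NULL;
        sketch_free_kva(bp);
    }
}

/* Release two buffers "from an interrupt", then drain from normal context. */
static int sketch_demo(void)
{
    struct sketch_buf a = { NULL, 1 }, b = { NULL, 1 };

    kva_frees = 0;
    in_interrupt = 1;
    sketch_brelse(&a);
    sketch_brelse(&b);
    in_interrupt = 0;
    sketch_getnewbuf_drain();
    return kva_frees;
}
```

 With this arrangement no extra kernel process is needed; the cost is
 that KVA stays reserved until the next allocation drains the list.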