From owner-freebsd-current@FreeBSD.ORG Fri Jan 16 20:42:06 2009 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3A6821065670 for ; Fri, 16 Jan 2009 20:42:06 +0000 (UTC) (envelope-from minimarmot@gmail.com) Received: from wf-out-1314.google.com (wf-out-1314.google.com [209.85.200.175]) by mx1.freebsd.org (Postfix) with ESMTP id F34FD8FC1B for ; Fri, 16 Jan 2009 20:42:05 +0000 (UTC) (envelope-from minimarmot@gmail.com) Received: by wf-out-1314.google.com with SMTP id 24so1951452wfg.7 for ; Fri, 16 Jan 2009 12:42:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=9C9F5isxbd0RSuYL58fAQwXXpXRpoydBmZ3tsGfosPA=; b=Y/gdB8Z4U/BaycJwuvLTGDgDYXNS4WDSJLL7a1OV2cmJO41zsJ1+pdhbN9x7N04ztr FsOlCoxLaGrbyMYkqUnr6seVzvBdPzJB0+zF2GP3MmtnmTVkX4zgnk5yKk7sper6+MQ1 anSbZCQIMpTzpMYxSRq5eSybCmFD+t/7lteGI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=tNtJzW1fofVhJXyxk03OcwrHx4fLESaNcYJwNK8Be560SA9IXk63md7RDLGmhtvzuB HcKVTCY31hgika3g1cBOjo9lor9Xlg48kFMfANLXi03aWdOwrDDgfQ8jgocH2PUcdHIq d/u457DfVgtYH8f14ZFPTxKIyBVmFTV4noSGM= MIME-Version: 1.0 Received: by 10.143.28.8 with SMTP id f8mr1182569wfj.204.1232138525185; Fri, 16 Jan 2009 12:42:05 -0800 (PST) In-Reply-To: <49613379.8040901@vwsoft.com> References: <49613379.8040901@vwsoft.com> Date: Fri, 16 Jan 2009 15:42:05 -0500 Message-ID: <47d0403c0901161242ifa1f121p5a3583caddd40483@mail.gmail.com> From: Ben Kaduk To: Volker Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: attilio@freebsd.org, uwe@grohnwaldt.eu, freebsd-current@freebsd.org, kib@freebsd.org, Martin Wilke Subject: Re: panic in bundirty X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Jan 2009 20:42:06 -0000 On Sun, Jan 4, 2009 at 5:08 PM, Volker wrote: > Gentlemen, > > I've had the pleasure to inspect miwi's new tinderbox machine in a panic > situation. > > The debugging info we're able to get out of the box is the following: > > panic: bundirty: buffer 0xffffffff9a475438 still on queue 1 > > #9 0xffffffff8049f196 in panic (fmt=Variable "fmt" is not available. > ) at /usr/src/sys/kern/kern_shutdown.c:556 > #10 0xffffffff8050b540 in bundirty (bp=Variable "bp" is not available. > ) at /usr/src/sys/kern/vfs_bio.c:1068 > #11 0xffffffff8050d684 in brelse (bp=0xffffffff9a475438) > at /usr/src/sys/kern/vfs_bio.c:1388 > #12 0xffffffff8050f07c in bufdone (bp=0xffffffff9a475438) > at /usr/src/sys/kern/vfs_bio.c:3157 > #13 0xffffffff8069cb6c in ffs_backgroundwritedone (bp=0xffffffff9a475438) > ---Type to continue, or q to quit--- > at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1679 > #14 0xffffffff8050f051 in bufdone (bp=0xffffffff9a475438) > at /usr/src/sys/kern/vfs_bio.c:3151 > #15 0xffffffff80452b51 in g_io_schedule_up (tp=Variable "tp" is not > available. > ) > at /usr/src/sys/geom/geom_io.c:587 > #16 0xffffffff8045323f in g_up_procbody () at > /usr/src/sys/geom/geom_kern.c:95 > #17 0xffffffff8048037a in fork_exit ( > callout=0xffffffff804531d0 , arg=0x0, > frame=0xffffffff80e63c80) at /usr/src/sys/kern/kern_fork.c:789 > > in frame 11, 'p *bp' gives: > $1 = {b_bufobj = 0xffffff00034fe958, b_bcount = 16384, b_caller1 = 0x0, > b_data = 0xffffffff9f4b7000 "", b_error = 5, b_iocmd = 2 '\002', > b_ioflags = 2 '\002', b_iooffset = 7127891968, b_resid = 16384, > b_iodone = 0, b_blkno = 13921664, b_offset = 7127891968, b_bobufs = { > tqe_next = 0xffffffff9a505a38, tqe_prev = 0xffffff00034fe9a8}, > b_left = 0x0, b_right = 0xffffffff9a505a38, b_vflags = 0, b_freelist = { > tqe_next = 0xffffffff9a472db8, tqe_prev = 0xffffffff80b56210}, > b_qindex = 1, b_flags = 41092, b_xflags = 33 '!', b_lock = > {lock_object = { > lo_name = 0xffffffff8083ce18 "bufwait", > lo_type = 0xffffffff8083ce18 "bufwait", lo_flags = 91947008, > lo_witness_data = {lod_list = {stqe_next = 0xffffffff80ae6f80}, > lod_witness = 0xffffffff80ae6f80}}, lk_lock = 18446744073709551608, > lk_recurse = 0, lk_timo = 0, lk_pri = 80}, b_bufsize = 16384, > b_runningbufspace = 0, b_kvabase = 0xffffffff9f4b7000 "", b_kvasize = > 16384, > b_lblkno = 13921664, b_vp = 0xffffff00034fe820, b_dirtyoff = 0, > b_dirtyend = 0, b_rcred = 0x0, b_wcred = 0x0, > b_saveaddr = 0xffffffff9f4b7000, b_pager = {pg_reqpage = 0}, b_cluster = { > cluster_head = {tqh_first = 0xffffffff9a479c68, > tqh_last = 0xffffffff9a4901b0}, cluster_entry = { > tqe_next = 0xffffffff9a479c68, tqe_prev = 0xffffffff9a4901b0}}, > b_pages = {0xffffff00de0eafa0, 0xffffff00de0eb010, 0xffffff00de0eb080, > 0xffffff00de0eb0f0, 0x0 }, b_npages = 4, b_dep = { > lh_first = 0x0}, b_fsprivate1 = 0x0, b_fsprivate2 = 0x0, > b_fsprivate3 = 0x0, b_pin_count = 0} > > This panic does not occur before commit 176708. > > The panic is in sys/kern/vfs_bio.c bundirty 1055: > KASSERT( bp->b_flags & B_REMFREE || bp->b_qindex == QUEUE_NONE ... > > bp->b_qindex = 0x01 (QUEUE_FREE) > bp->b_flags = 0xa084 (B_ASYNC | B_DELWRI | B_NOCACHE | B_INVAL) > > My assumption is: > > brele knows about the QUEUE_FREE but bundirty only wants to operate on > QUEUE_NONE. Either this is a race condition, brele should not call > bundirty or bundirty should operate not just on QUEUE_NONE buffers. > > As there have been some locking related commits in the changes in > question, this might also be caused by an unlocked operation. > > right before the panic, we're seeing DMA errors: > > ad6: setting up DMA failed > _vfs_done():ad6[WRITE(offset=5412487168, length=131072)]error = 5 > ad6: FAILURE - load data > ad6: setting up DMA failed > g_vfs_done():ad6[WRITE(offset=5044109312, length=131072)]error = 5 > g_vfs_done():ad6[WRITE(offset=5230985216, length=131072)]error = 5 > g_vfs_done():ad6[WRITE(offset=5231116288, length=131072)]error = 5 > g_vfs_done():ad6[WRITE(offset=5044240384, length=131072)]error = 5 > g_vfs_done():ad6[WRITE(offset=5412618240, length=131072)]error = 5 > g_vfs_done():ad4s1d[WRITE(offset=7127891968, length=16384)]error = 5 > I am seeing this sort of output on a machine here as well, though I have not been able to get a crash dump (it complains about not being able to set up DMA ...). I saw a bundirty panic today, but I've also seen panic: initiate_write_inodeblock_ufs2: already started a number of times. I think what I'm seeing happening is that something (probably the actual bug) is causing the bio to get into a weird state (I'm not entirely sure about this), which then cascades to GEOM_MIRROR dropping the disk from the array. I think the initiate_write_inodeblock_ufs2 panics only happen when I have lost both disks from the array in such a fashion and/or am trying to rebuild one of them. In any case, I have been seeing this as a regression from 7.1-prerelease to current of a few days ago, and am trying to back out possibly-relevant changes to locate the problem. (I had originally suspected GEOM, but will switch my focus to vfs_bio having seen this.) Such feeble notes as I have been able to make are at http://stuff.mit.edu/afs/sipb.mit.edu/user/kaduk/freebsd/periphrasis/20090107/panic.txt and I have also placed `ident /boot/kernel.old/kernel` and friends in that directory. (kernel.old is 7.1-prerelease, and kernel and kernel.updated are two current snapshots) Thanks, Ben Kaduk > kgdb output can be found at: > http://www.bsdmeat.net/~lando/miwi/kgdb.txt > > Suggestions what the correct fix to this issue is? Change bundirty or > brele? Clean up locking? > > CC'ing those who committed to vfs_bio.c in the relevant time frame. > > Thanks > > Volker > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" >