From owner-freebsd-hackers@FreeBSD.ORG  Sun Jul 31 19:47:42 2005
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
X-Original-To: freebsd-hackers@freebsd.org
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 05D7B16A41F
	for <freebsd-hackers@freebsd.org>; Sun, 31 Jul 2005 19:47:42 +0000 (GMT)
	(envelope-from ups@tree.com)
Received: from smtp.speedfactory.net (talon.speedfactory.net [66.23.216.215])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 4B02443D48
	for <freebsd-hackers@freebsd.org>; Sun, 31 Jul 2005 19:47:41 +0000 (GMT)
	(envelope-from ups@tree.com)
Received: (qmail 25020 invoked by uid 210); 31 Jul 2005 19:47:40 +0000
Received: from 66.23.216.49 by talon (envelope-from <ups@tree.com>,
	uid 201) with qmail-scanner-1.25st 
	(clamdscan: 0.85.1/998. spamassassin: 3.0.2. perlscan: 1.25st.  
	Clear:RC:1(66.23.216.49):. 
	Processed in 0.030537 secs); 31 Jul 2005 19:47:40 -0000
X-Qmail-Scanner-Mail-From: ups@tree.com via talon
X-Qmail-Scanner: 1.25st (Clear:RC:1(66.23.216.49):. Processed in 0.030537 secs
	Process 25013)
Received: from 66-23-216-49.clients.speedfactory.net (HELO palm.tree.com)
	(66.23.216.49)
	by smtp.speedfactory.net with AES256-SHA encrypted SMTP;
	31 Jul 2005 19:47:40 +0000
Received: from [127.0.0.1] (ups@localhost.tree.com [127.0.0.1])
	by palm.tree.com (8.12.10/8.12.10) with ESMTP id j6VJlbrK002603;
	Sun, 31 Jul 2005 15:47:37 -0400 (EDT) (envelope-from ups@tree.com)
From: Stephan Uphoff <ups@tree.com>
To: Matthew Dillon <dillon@apollo.backplane.com>
In-Reply-To: <200507311840.j6VIeWGv028507@apollo.backplane.com>
References: <200507311840.j6VIeWGv028507@apollo.backplane.com>
Content-Type: text/plain
Message-Id: <1122839257.1360.5.camel@palm>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6 
Date: Sun, 31 Jul 2005 15:47:37 -0400
Content-Transfer-Encoding: 7bit
Cc: Kirk McKusick <mckusick@mckusick.com>, freebsd-hackers@freebsd.org
Subject: Re: Possible softupdates bug when a indirect block buffer is reused
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 31 Jul 2005 19:47:42 -0000

Hi Matthew,

We have been testing a fix for this for a few weeks.
I will check it in today.

Stephan

On Sun, 2005-07-31 at 14:40, Matthew Dillon wrote:
>     Hi Kirk, hackers!
> 
>     I'm trying to track down a bug that is causing a buffer to be left
>     in a locked state and then causes the filesystem to lock up because
>     of that.
> 
>     The symptoms are that a heavily used filesystem suddenly starts running
>     out of space.  It isn't due to deleted files with open descriptors, it's
>     due to the syncer getting stuck in a getblk state.  This is in DragonFly,
>     but I can't find anything DFlyish wrong so I'm beginning to think it may
>     be an actual bug in softupdates.
> 
>     I have wound up with a situation where a getblk()'d bp has been
>     associated with a indirdep dependancy, i.e. stored in
>     indirdep->ir_savebp, but is never released.  When something like
>     the syncer comes along and tries to access it, it locks up, and this
>     of course leads to inodes not getting cleared and the filesystem
>     eventually runs out of space when a lot of files are being created and
>     deleted.
> 
>     What has got me really confused is that the buffer in question seems to
>     wind up with a D_INDIRDEP dependancy that points back to itself.
> 
>     Here's the situation from a live gdb.  Here is where the syncer is 
>     stuck:
> 
> (kgdb) back
> #0  lwkt_switch () at thread2.h:95
> #1  0xc02a8a79 in tsleep (ident=0x0, flags=0, wmesg=0xc04eadb0 "getblk", 
>     timo=0) at /usr/src-125beta/sys/kern/kern_synch.c:428
> #2  0xc02956bb in acquire (lkp=0xc758b4e0, extflags=33554464, wanted=1536)
>     at /usr/src-125beta/sys/kern/kern_lock.c:127
> #3  0xc0295a92 in lockmgr (lkp=0xc758b4e0, flags=33620002, interlkp=0x0, 
>     td=0xd68f6400) at /usr/src-125beta/sys/kern/kern_lock.c:354
> #4  0xc02d6828 in getblk (vp=0xc71b3058, blkno=94440240, size=8192, slpflag=0, 
>     slptimeo=0) at thread.h:79
> #5  0xc02d4404 in bread (vp=0xc71b3058, blkno=0, size=0, bpp=0x0)
>     at /usr/src-125beta/sys/kern/vfs_bio.c:567
> #6  0xc03f24fe in indir_trunc (ip=0xe048fc0c, dbn=94440240, level=1, lbn=2060, 
>     countp=0xe048fbf8) at /usr/src-125beta/sys/vfs/ufs/ffs_softdep.c:2221
> #7  0xc03f22df in handle_workitem_freeblocks (freeblks=0xe2fcef98)
>     at /usr/src-125beta/sys/vfs/ufs/ffs_softdep.c:2138
> #8  0xc03f0462 in process_worklist_item (matchmnt=0x0, flags=0)
>     at /usr/src-125beta/sys/vfs/ufs/ffs_softdep.c:726
> #9  0xc03f026c in softdep_process_worklist (matchmnt=0x0)
>     at /usr/src-125beta/sys/vfs/ufs/ffs_softdep.c:625
> #10 0xc02e5ff3 in sched_sync () at /usr/src-125beta/sys/kern/vfs_sync.c:244
> #11 0xc0294863 in kthread_create_stk (func=0, arg=0x0, tdp=0xff800000, 
>     stksize=0, fmt=0x0) at /usr/src-125beta/sys/kern/kern_kthread.c:104
> (kgdb) 
> 
>     The buffer it is stuck on:
> 
> (kgdb) print bp
> $62 = (struct buf *) 0xc758b4b8
> (kgdb) print *bp
> $63 = {
>   b_hash = {
>     le_next = 0x0, 
>     le_prev = 0xc7391348
>   }, 
>   b_vnbufs = {
>     tqe_next = 0xc739b890, 
>     tqe_prev = 0xc76f32b8
>   }, 
>   b_freelist = {
>     tqe_next = 0xc768d610, 
>     tqe_prev = 0xc0565ac0
>   }, 
>   b_act = {
>     tqe_next = 0x0, 
>     tqe_prev = 0x0
>   }, 
>   b_flags = 536870912, 	<<<<<<<<< 0x20000000 (getblk with no bread, etc)
>   b_qindex = 0, 
>   b_xflags = 2 '\002', 
>   b_lock = {
>     lk_interlock = {
>       t_cpu = 0xff800000, 
>       t_reqcpu = 0xff800000, 
>       t_unused01 = 0
>     }, 
>     lk_flags = 2098176, 
>     lk_sharecount = 0, 
>     lk_waitcount = 1, 
>     lk_exclusivecount = 1, 
>     lk_prio = 0, 
>     lk_wmesg = 0xc04eadb0 "getblk", 
>     lk_timo = 0, 
>     lk_lockholder = 0xfffffffe
>   }, 
>   b_error = 0, 
>   b_bufsize = 8192, 
>   b_runningbufspace = 0, 
>   b_bcount = 8192, 
>   b_resid = 0, 
>   b_dev = 0xde0f0e38, 
>   b_data = 0xcf824000 "\205\002", 
>   b_kvabase = 0xcf824000 "\205\002", 
>   b_kvasize = 16384, 
>   b_lblkno = 94440240, 
>   b_blkno = 94440240, 
>   b_offset = 48353402880, 
>   b_iodone = 0, 
>   b_iodone_chain = 0x0, 
>   b_vp = 0xc71b3058, 
>   b_dirtyoff = 0, 
>   b_dirtyend = 0, 
>   b_pblkno = 87503631, 
>   b_saveaddr = 0x0, 
>   b_driver1 = 0x0, 
>   b_caller1 = 0x0, 
>   b_pager = {
>     pg_spc = 0x0, 
>     pg_reqpage = 0
>   }, 
>   b_cluster = {
>     cluster_head = {
>       tqh_first = 0x0, 
>       tqh_last = 0xc768d6bc
> ---Type <return> to continue, or q <return> to quit--- 
>     }, 
>     cluster_entry = {
>       tqe_next = 0x0, 
>       tqe_prev = 0xc768d6bc
>     }
>   }, 
>   b_xio = {
>     xio_pages = 0xc758b584, 
>     xio_npages = 2, 
>     xio_offset = 0, 
>     xio_bytes = 0, 
>     xio_flags = 0, 
>     xio_error = 0, 
>     xio_internal_pages = {0xc34e5870, 0xc4aeb2b4, 0x0 <repeats 30 times>}
>   }, 
>   b_dep = {
>     lh_first = 0xc7045040
>   }, 
>   b_chain = {
>     parent = 0x0, 
>     count = 0
>   }
> }
> 
>     As you can see from b_flags, which is 0x20000000, the buffer has been
>     getblk()'d but not bread() or anything else.  It is the typical state
>     that occurs when a buffer is placed in an indirdep->ir_savebp state
>     in setup_allocindir_phase2().
> 
>     The buffer's dependancy looks like this:
> 
> (kgdb) print bp
> $65 = (struct buf *) 0xc758b4b8
> (kgdb) print *(struct indirdep *)bp->b_dep.lh_first
> $66 = {
>   ir_list = {
>     wk_list = {
>       le_next = 0x0, 
>       le_prev = 0xc758b604
>     }, 
>     wk_type = 5, 
>     wk_state = 33025		<<<<<<<<<<<<<  ATTACHED|GOINGAWAY|ONWORKLIST
>   }, 
>   ir_saveddata = 0xdeadc0de "", 
>   ir_savebp = 0xc758b4b8, 	<<<<<<<<<<<<< points back to itself
>   ir_donehd = {
>     lh_first = 0x0
>   }, 
>   ir_deplisthd = {
>     lh_first = 0x0
>   }
> }
> (kgdb) 
> 
>     As you can see, the buffer's dependancy seems to point to itself.
>     As you may know, ir_savebp buffers are left in a locked state, so a
>     buffer that has gotten into this situation winds up being permanently
>     deadlocked.
> 
>     This is on DragonFly, but I can't find anything else wrong.  This is 
>     on a filesystem running a myrid of jails which is constantly creating
>     and deleting files in parallel... so buffers are being reused quite often.
>     It takes about a week of this heavy activity for the bug to rear its
>     ugly head.  It's fairly difficult to reproduce (takes about a week to
>     reproduce).
> 
>     When the problem does occur the system remains functional... the disk
>     device continues to work, the filesystem continues to work, except for
>     any blockages that chain down to the particular block that has deadlocked
>     (usually the syncer, but as time goes by more parts of the system will
>     deadlock).  It IS possible to run a live gdb when the situation is 
>     caught early enough.
> 
>     I am trying to figure out how the buffer manages to get into this 
>     self-locked state.  I'm pretty much stuck once I get to the
>     deallocate_dependancies() procedure.  This procedure seems to be trying
>     to move a D_INDIRDEP dependancy from the passed bp into the buffer 
>     pointed to by ir_savebp.  As far as I can tell this is what is creating
>     the situation that makes the buffer's dependancy self-referential
>     and deadlocks the syncer.
> 
>     I have noticed a number of FreeBSD bug reports related to blocking in
>     getblk or to softupdates.  I don't know if this is a similar problem
>     but it's worth Ccing freebsd-hackers on it as well.
> 
> 						-Matt
> 
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
> 
>