From owner-freebsd-fs@FreeBSD.ORG Fri Jun 30 18:31:21 2006
Message-ID: <44A56E0F.1070904@rogers.com>
Date: Fri, 30 Jun 2006 14:31:43 -0400
From: Mike Jakubik
To: Kostik Belousov
In-Reply-To: <20060630092829.GE1258@deviant.kiev.zoral.com.ua>
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: md deadlocks on wdrain.
    Was: [Re: quota and snapshots in 6.1-RELEASE]

Kostik Belousov wrote:
> First, I set the followup to the right mailing list.
>
> Second, I am really curious what you do. My understanding follows: you
> have set up a vnode-backed md device (md0a) on a sparse file, created ufs2
> on it, mounted it with quotas, and run background fsck on that fs. At
> the same time, you did rm for the snapshot file created by fsck. Right?
>

This is the procedure I followed. While I have quota enabled, it was not set on the test filesystem.

1) dd if=/dev/zero of=/usr/bigfile bs=1024 seek=209715200 count=0
2) mdconfig -a -t vnode -f /usr/bigfile
3) bsdlabel -w md0 auto
4) newfs -U md0a
5) fsck -v /dev/md0a   # ^C this after a second or so; this makes the FS dirty
6) mount /dev/md0a /mnt
7) fsck -v -B /dev/md0a

In another window:

8) while true; do ls -al /mnt/.snap; sleep 1; done

> Anyway, the problem seems to be related to neither snapshots nor
> quotas. In your trace, process 35 (syncer) tries to sync the vnode
> 0xc363c414, that is inode 1515 on aacd0s1f, that is used for md0. That
> vnode is already locked by process 515 (md0 kthread). Process 515 is
> stuck in the wdrain state, waiting for buffers to be flushed. It seems
> that there is a huge number of dirty buffers going to be written to md0,
> caused by snapshotting the fs. As a result, the system deadlocks due to md0
> hung waiting for buffer runspace, which is occupied by pending write
> requests to md0.
>
> Do -fs@ readers agree with this analysis?
>
> I propose to set the TDP_NORUNNINGBUF thread flag for both swap- and file-
> backed md threads to prevent such deadlocks. That i/o is already
> accounted for in the upper layer. Moreover, those already accounted
> requests do not really differ from requests (re)issued by md.
>
> Please, comment.
>

FYI, -CURRENT passes this test without locking up, so the fix is already there somewhere.
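
For anyone who wants to retry this, the numbered steps in this message could be collected into a script roughly like the one below. It is only a sketch, not something that was run as part of the original report. The DRY_RUN, BACKING and MNT variables and the run/repro_steps helpers are made up for illustration, and the script assumes mdconfig hands out md0 as in the steps above. By default it only prints the commands; set DRY_RUN=0 as root on a scratch machine to actually run them. Note that the dd line creates a sparse file with an apparent size of 1024 * 209715200 bytes, i.e. 200 GiB.

```shell
#!/bin/sh
# Sketch of the reproduction steps from this message.
# DRY_RUN, BACKING and MNT are hypothetical knobs added for illustration.
# With DRY_RUN=1 (the default) commands are printed but not executed.
DRY_RUN="${DRY_RUN:-1}"
BACKING="${BACKING:-/usr/bigfile}"
MNT="${MNT:-/mnt}"

# Print a command, and execute it only when dry-run mode is off.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

repro_steps() {
    # Steps 1-4: sparse backing file, md device (assumed to be md0), label, UFS2
    run dd if=/dev/zero of="$BACKING" bs=1024 seek=209715200 count=0
    run mdconfig -a -t vnode -f "$BACKING"
    run bsdlabel -w md0 auto
    run newfs -U md0a
    # Step 5 is manual: interrupting a foreground fsck leaves the fs dirty
    echo "# manual: fsck -v /dev/md0a, then ^C after a second or so"
    # Steps 6-7: mount the dirty fs and start background fsck (takes a snapshot)
    run mount /dev/md0a "$MNT"
    run fsck -v -B /dev/md0a
    # Step 8 runs in a second terminal while the background fsck is active
    echo "# in another window: while true; do ls -al $MNT/.snap; sleep 1; done"
}

repro_steps
```

Step 5 is left as a manual step because it depends on interrupting fsck by hand, and the loop in step 8 has to run in a separate terminal while the background fsck is in progress.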