Date:      Fri, 30 Jun 2006 14:31:43 -0400
From:      Mike Jakubik <mikej@rogers.com>
To:        Kostik Belousov <kostikbel@gmail.com>
Cc:        freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: md deadlocks on wdrain. Was: [Re: quota and snapshots in 6.1-RELEASE]
Message-ID:  <44A56E0F.1070904@rogers.com>
In-Reply-To: <20060630092829.GE1258@deviant.kiev.zoral.com.ua>
References:  <20060523181638.GC767@dimma.mow.oilspace.com> <6eb82e0605231120q37224c6r3b25982f556bed72@mail.gmail.com> <447366AD.30203@rogers.com> <44736E11.6060104@mkproductions.org> <20060523203521.GA48061@xor.obsecurity.org> <20060524062118.GA766@dimma.mow.oilspace.com> <447400BB.9060603@samsco.org> <4485C010.9040402@rogers.com> <20060606182234.GB72368@deviant.kiev.zoral.com.ua> <44A490E6.1000502@rogers.com> <20060630092829.GE1258@deviant.kiev.zoral.com.ua>

Kostik Belousov wrote:
> First, I set the followup to the right mailing list.
>
> Second, I am really curious what you do. My understanding follows: you
> have set up a vnode-backed md device (md0a) on a sparse file, created ufs2
> on it, mounted it with quotas, and run background fsck on that fs. At
> the same time, you did rm for the snapshot file created by fsck. Right?
>   

This is the procedure I followed. While I have quotas enabled, they were
not set on the test filesystem.

1) dd if=/dev/zero of=/usr/bigfile bs=1024 seek=209715200 count=0
2) mdconfig -a -t vnode -f /usr/bigfile
3) bsdlabel -w md0 auto
4) newfs -U md0a
5) fsck -v /dev/md0a # ^C this after a second or so; this leaves the FS dirty
6) mount /dev/md0a /mnt
7) fsck -v -B /dev/md0a

In another window:
8) while true; do ls -al /mnt/.snap; sleep 1; done
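
As a sanity check on step 1: with count=0, dd copies no data and only
extends the output file to the seek offset, so the size of /usr/bigfile is
bs times seek. A minimal sketch of that arithmetic in sh follows. The
variable names are illustrative and not part of the procedure above, and it
assumes a shell with 64-bit arithmetic.

```shell
# Values taken from the dd command in step 1.
bs=1024
seek=209715200

# count=0 copies no blocks, so the file ends at the seek offset.
size_bytes=$((bs * seek))
size_gib=$((size_bytes / 1024 / 1024 / 1024))

echo "sparse file size: ${size_bytes} bytes (${size_gib} GiB)"
```

So md0 is backed by a 200 GiB file that initially occupies almost no disk
space.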


> Anyway, the problem seems to be related to neither snapshots nor
> quotas. In your trace, process 35 (syncer) tries to sync the vnode
> 0xc363c414, that is inode 1515 on aacd0s1f, which is used for md0. That
> vnode is already locked by process 515 (md0 kthread). Process 515 is
> stuck in the wdrain state, waiting for buffers to be flushed. It seems
> that there is a huge amount of dirty buffers going to be written to md0,
> caused by snapshotting the fs. As a result, the system deadlocks because
> md0 hangs waiting for buffer runspace that is occupied by pending write
> requests to md0.
>
> Do -fs@ readers agree with this analysis?
>
> I propose to set the TDP_NORUNNINGBUF thread flag for both swap- and
> file-backed md threads to prevent such deadlocks. That i/o is already
> accounted for in the upper layer. Moreover, those already-accounted
> requests do not really differ from the requests (re)issued by md.
>
> Please comment.
>   

FYI, -CURRENT passes this test without locking up, so the fix is already 
there somewhere.



