Date: Thu, 1 May 2014 10:09:51 -0700
From: David Wolfskill <david@catwhisker.org>
To: Kirk McKusick <mckusick@mckusick.com>
Cc: fs@freebsd.org
Subject: Re: SU+J: 185 processes in state "suspfs" for >8 hrs. ... not good, right?
Message-ID: <20140501170951.GI1120@albert.catwhisker.org>
In-Reply-To: <201405011651.s41GphgX089174@chez.mckusick.com>
References: <20140501161856.GH1120@albert.catwhisker.org> <201405011651.s41GphgX089174@chez.mckusick.com>

On Thu, May 01, 2014 at 09:51:43AM -0700, Kirk McKusick wrote:
>>...
>
> The following fix for related problems was made to head and MFC'ed
> to stable/10 but not stable/9.
>
> *** stable/9/sys/ufs/ffs/ffs_vnops.c 2014-03-05 08:51:48.000000000 -0800
> --- stable/9/sys/ufs/ffs/ffs_vnops.c 2014-05-01 09:41:35.000000000 -0700
> ***************
> *** 258,266 ****
>                           continue;
>                   if (bp->b_lblkno > lbn)
>                           panic("ffs_syncvnode: syncing truncated data.");
> !                 if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL))
>                           continue;
> -                 BO_UNLOCK(bo);
>                   if ((bp->b_flags & B_DELWRI) == 0)
>                           panic("ffs_fsync: not dirty");
>                   /*
> --- 258,274 ----
>                           continue;
>                   if (bp->b_lblkno > lbn)
>                           panic("ffs_syncvnode: syncing truncated data.");
> !                 if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL) == 0) {
> !                         BO_UNLOCK(bo);
> !                 } else if (wait != 0) {
> !                         if (BUF_LOCK(bp,
> !                             LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK,
> !                             BO_LOCKPTR(bo)) != 0) {
> !                                 bp->b_vflags &= ~BV_SCANNED;
> !                                 goto next;
> !                         }
> !                 } else
>                           continue;
>                   if ((bp->b_flags & B_DELWRI) == 0)
>                           panic("ffs_fsync: not dirty");
>                   /*
>
> The associated comment is:
>
> If we fail to do a non-blocking acquire of a buf lock while doing a
> waiting sync pass we need to do a blocking acquire and restart.
> Another thread, typically the buf daemon, may have this buf locked and
> if we don't wait we can fail to sync the file. This led to a great
> variety of softdep panics and deadlocks because we rely on all
> dependencies being flushed before proceeding in several cases.
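
The locking pattern that comment describes (try a non-blocking acquire
first, and fall back to a blocking acquire only when the pass is a
waiting one) can be sketched in userspace C. This is an illustration
only, assuming POSIX threads; demo_buf and demo_sync_pass are
hypothetical names, not FreeBSD kernel code. Unlike the patch, which
uses LK_SLEEPFAIL and jumps to "next" so the scan is redone, the sketch
simply carries on once the blocking lock is obtained.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel buf; NOT the FreeBSD struct buf. */
struct demo_buf {
	pthread_mutex_t lock;
	bool dirty;
};

/*
 * Flush every dirty buffer in bufs[0..n-1]; return how many were flushed.
 * wait == false: skip any buffer whose lock is currently held.
 * wait == true:  block until the holder releases it, so none is missed.
 */
static size_t
demo_sync_pass(struct demo_buf *bufs, size_t n, bool wait)
{
	size_t flushed = 0;

	assert(bufs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct demo_buf *bp = &bufs[i];

		if (pthread_mutex_trylock(&bp->lock) != 0) {
			if (!wait)
				continue;	/* best-effort pass: skip it */
			/* Waiting pass: sleep for the lock instead. */
			if (pthread_mutex_lock(&bp->lock) != 0)
				continue;
		}
		if (bp->dirty) {
			bp->dirty = false;	/* stands in for the write */
			flushed++;
		}
		pthread_mutex_unlock(&bp->lock);
	}
	return (flushed);
}
```

With the old behaviour (wait ignored on a failed try-lock), a buffer held
by another thread would be silently skipped even on a waiting pass, which
is the failure mode the comment describes.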
Cool -- thanks!
> Let me know if it helps your problem. If it does, I will MFC it to 9.
> There have been several other fixes made to SU+J that are more likely
> to be the cause of your problem, but they are not easily back-ported
> to stable/9. So if this does not fix your problem my only suggestions
> are to turn off journaling or move to running on stable/10.
>
> Kirk McKusick
Roger that. And yes, stable/10 is a goal -- but I *just* finally managed
to get the machines migrated from 8.2-ish to 9.2. :-) (Note: I do not
have direct control -- merely a measure of influence. :-})
Peace,
david
--
David H. Wolfskill david@catwhisker.org
Taliban: Evil cowards with guns afraid of truth from a 14-year old girl.
See http://www.catwhisker.org/~david/publickey.gpg for my public key.
