Date:      Wed, 12 Nov 2008 21:02:50 -0800
From:      David Wolfskill <david@catwhisker.org>
To:        Jeremy Chadwick <koitsu@freebsd.org>
Cc:        Kostik Belousov <kostikbel@gmail.com>, freebsd-stable@freebsd.org
Subject:   Re: System deadlock when using mksnap_ffs
Message-ID:  <20081113050250.GR69155@bunrab.catwhisker.org>
In-Reply-To: <20081113044200.GA10419@icarus.home.lan>
References:  <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> <20081113044200.GA10419@icarus.home.lan>


On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote:
> ...
> > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
> > > > I've been playing around with snapshots lately but I've got a problem on
> > > > one of my servers running 7-STABLE amd64:
> > > >
> > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN  amd64
> > > >
> > > > I run the mksnap_ffs command to take the snapshot and some time later
> > > > the system completely freezes up:
> > > >
> > > > paladin# cd /u2/.snap/
> > > > paladin# mksnap_ffs /u2 test.1
> > > >
> > > > It only happens on this one filesystem, though, which might be to do
> > > > with its size. It's not over the 2TB marker, but it's pretty close. It's
> > > > also backed by a hardware RAID system, although a smaller filesystem on
> > > > the same RAID has no issues.
> ...
> Then in my book, the patch didn't fix anything.  :-)  The system is
> still "deadlocking"; snapshot generation **should not** wedge the system
> hard like this.
>
> Also, during my own testing, I am always able to use Ctrl-T to get
> SIGINFO from the running process (mksnap_ffs).  That behaviour does not
> change for me.
>
> The rest of the below information is good -- but I'm confused about
> something: is there anyone out there who can use mksnap_ffs on a
> filesystem (/usr is a good test source) and NOT experience this
> deadlocking problem?

I hadn't ever tried until I saw your message.  Granted, I'm using a
smaller file system (I doubt that I have a total of as much as 2 TB in
all my machines combined), and I'm running i386, vs. amd64.  But it ran
just fine.  I wasn't able to test SIGINFO; it finished before I had a
chance.  (I ran it under time(1); wall clock time was 0.91 sec.)

> Literally *every* FreeBSD box I have root access
> to suffers from this problem, so I'm a little baffled why we end-users
> need to keep providing debugging output when it should be easy as pie
> for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch
> their system wedge.

Well, I routinely use dump/restore pipelines to copy file systems
around; never had a problem with it.

> ...

For reference:

freebeast(7.1-P)[9] uname -a
FreeBSD freebeast.catwhisker.org 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #127: Wed Nov 12 05:16:20 PST 2008     root@freebeast.catwhisker.org:/common/S3/obj/usr/src/sys/FREEBEAST  i386
freebeast(7.1-P)[10] ls -la
total 4
drwxrwxr-x   2 root  operator  512 Nov 12 20:53 .
drwxr-xr-x  14 root  wheel     512 Jan 22  2008 ..
freebeast(7.1-P)[11] /usr/bin/time -l mksnap_ffs /S2/usr test.1
        0.91 real         0.00 user         0.05 sys
       976  maximum resident set size
         3  average shared memory size
       627  average unshared data size
       109  average unshared stack size
       104  page reclaims
         0  page faults
         0  swaps
         1  block input operations
       230  block output operations
         0  messages sent
         0  messages received
         0  signals received
       101  voluntary context switches
        34  involuntary context switches
freebeast(7.1-P)[12] ls -la
total 1460
drwxrwxr-x   2 root  operator         512 Nov 12 20:54 .
drwxr-xr-x  14 root  wheel            512 Jan 22  2008 ..
-r--r-----   1 root  operator  2410791056 Nov 12 20:54 test.1
freebeast(7.1-P)[13]
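
A note on reading the listing above: the snapshot file shows a size of
2410791056 bytes, yet the directory "total" only grew from 4 to 1460.
The rough sketch below works out how much space that implies is really
allocated.  It assumes ls is reporting "total" in 512-byte blocks (the
FreeBSD default when BLOCKSIZE is unset) and that the 4 blocks seen
before the snapshot belong to the directory entries; the variable names
are made up for illustration.

```shell
#!/bin/sh
# Figures copied from the two "ls -la" listings in the transcript.
apparent=2410791056     # size of test.1 as shown by ls, in bytes
total_after=1460        # "total" after mksnap_ffs
total_before=4          # "total" before mksnap_ffs (just . and ..)

# Assumption: "total" is counted in 512-byte blocks.
block_size=512

# Blocks attributable to the snapshot file, and the bytes they occupy.
snap_blocks=$((total_after - total_before))
allocated=$((snap_blocks * block_size))

echo "apparent size:  $apparent bytes"
echo "allocated size: $allocated bytes"
echo "ratio:          about $((apparent / allocated)) to 1"
```

If those assumptions hold, the snapshot occupies well under a megabyte
on disk even though it appears to be over 2 GB, which would fit with how
quickly it was created here.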

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.



