Date: Wed, 4 Jun 2008 18:23:32 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Andriy Gapon <avg@icyb.net.ua>
Cc: freebsd-stable@freebsd.org
Subject: Re: mystery: lock up after fs dump
Message-ID: <20080604152332.GE63348@deviant.kiev.zoral.com.ua>
In-Reply-To: <4846AFC3.3050101@icyb.net.ua>
References: <4846AFC3.3050101@icyb.net.ua>
On Wed, Jun 04, 2008 at 06:07:47PM +0300, Andriy Gapon wrote:
>
> I wouldn't report this if not for one coincidence (which is described
> below). I have too few facts, so this is more of a mystery problem
> tale than a real problem report.
>
> There are two systems:
> 1. old, slow, i386, UP, 7-STABLE
> 2. new, fast, amd64, MP, 6.3-RELEASE
>
> Systems are located at different physical locations.
>
> What is common between them:
> 1. they both have the same backup strategy where dumps of certain
> levels are performed on certain days; there are monthly dumps of
> level 2 (on first day of each month), weekly dumps of level 4 (each
> Sunday) and daily dumps of levels > 5 (each day except for Sunday -
> but including the firsts).
> dumps are done on live filesystems using -L.
> dumps are initially done to the same disk and only later are
> transferred to archive media.
> 2. both kernels are compiled with softupdates support but there are no
> filesystems with it enabled
> 3. both systems have root partition gmirror-ed, it is dumped
> 4. both systems have gjournal support (on 6.X it is added via a
> "non-official" patch), there are gjournaled filesystems on both
> systems and they are dumped.
>
> On June 1 (Sunday) exactly the same thing happened on both systems.
> At 4AM monthly level 2 dump was started and successfully performed.
> At 5AM weekly level 4 dump was started.
> Somewhere in the process of it system locked up.
> When I physically accessed the systems I found the following: keyboard
> didn't respond[*], screen froze, no pings. After reset I found that
> logs stopped being updated at some time shortly after 5AM.
> [*] - although on amd64 system I was able to switch exactly once
> between virtual terminals (actually from X terminal to console
> terminal). But that's all, no led responses, no special combinations
> (like break to debugger - it is compiled in / enabled).
>
> This coincidence in details (and that one successful VT switch) led me
> to believe that this was some lock up in kernel rather than a hardware
> problem. Also, I use that backup scheme for almost a year and never
> had such a problem before. I just checked and this was the first time
> that the 1st of a month fell on Sunday, so this was the first time
> when level 2 dump was followed by level 4 dump. In previous months it
> was followed by level > 6 dumps.
>
> All in all, quite strange.

Do you use snapshots on the gjournaled fs? I believe this is
problematic.
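
The schedule in the quoted report, and the claim that June 1 was the first time a level 2 dump was followed directly by a level 4 dump, can be checked with a short sketch. This is only an illustration, not the cron setup used on either machine: the names `dumps_for` and `sunday_firsts` are hypothetical, the start month (August 2007) is an assumption read from "almost a year", and the sketch assumes the daily dump is skipped when the 1st falls on a Sunday.

```python
from __future__ import annotations

from datetime import date

SUNDAY = 6  # date.weekday() counts Monday as 0, so Sunday is 6


def dumps_for(day: date) -> list[str]:
    """Return the dumps that would run on `day` under the described schedule."""
    runs = []
    if day.day == 1:
        runs.append("monthly level 2")
    if day.weekday() == SUNDAY:
        runs.append("weekly level 4")
    else:
        # The report only says "levels > 5"; the exact daily level is not given.
        runs.append("daily level > 5")
    return runs


def sunday_firsts(start: date, end: date) -> list[date]:
    """List every 1st of a month between start and end that falls on a Sunday."""
    hits = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        first = date(year, month, 1)
        if first.weekday() == SUNDAY:
            hits.append(first)
        # Advance to the next month, rolling over the year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return hits


if __name__ == "__main__":
    print(dumps_for(date(2008, 6, 1)))
    # Assumed window: the scheme has been in use since about August 2007.
    print(sunday_firsts(date(2007, 8, 1), date(2008, 6, 1)))
```

Under those assumptions, June 1, 2008 is the only first-of-month Sunday in the window, which matches the observation that the two dumps had never run back to back before.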