From owner-freebsd-stable@FreeBSD.ORG Mon Sep 1 23:51:47 2008
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 042BF1065678
    for ; Mon, 1 Sep 2008 23:51:47 +0000 (UTC)
    (envelope-from oberman@es.net)
Received: from postal1.es.net (postal4.es.net [198.124.252.66])
    by mx1.freebsd.org (Postfix) with ESMTP id AE7F88FC17
    for ; Mon, 1 Sep 2008 23:51:46 +0000 (UTC)
    (envelope-from oberman@es.net)
Received: from postal1.es.net (postal3.es.net [198.128.3.207])
    by postal4.es.net (Postal Node 4) with ESMTP (SSL) id HGA01646;
    Mon, 01 Sep 2008 16:51:46 -0700
Received: from ptavv.es.net (ptavv.es.net [198.128.4.29])
    by postal3.es.net (Postal Node 3) with ESMTP (SSL) id HGA54944;
    Mon, 01 Sep 2008 16:51:44 -0700
Received: from ptavv.es.net (ptavv.es.net [127.0.0.1])
    by ptavv.es.net (Tachyon Server) with ESMTP id 4B53B4501A;
    Mon, 1 Sep 2008 16:51:44 -0700 (PDT)
To: Jeremy Chadwick
In-Reply-To: Your message of "Mon, 01 Sep 2008 14:38:56 PDT."
    <20080901213856.GA17155@icarus.home.lan>
Mime-Version: 1.0
Content-Type: multipart/signed; boundary="==_Exmh_1220313104_45719P";
    micalg=pgp-sha1; protocol="application/pgp-signature"
Content-Transfer-Encoding: 7bit
Date: Mon, 01 Sep 2008 16:51:44 -0700
From: "Kevin Oberman"
Message-Id: <20080901235144.4B53B4501A@ptavv.es.net>
X-Sender-IP: 198.128.3.207
X-Sender-Domain: es.net
X-Recipent: ; ; ; ; ;
X-Sender:
X-To_Name: Jeremy Chadwick
X-To_Domain: freebsd.org
X-To: Jeremy Chadwick
X-To_Email: koitsu@FreeBSD.org
X-To_Alias: koitsu
Cc: Derek =?iso-8859-1?B?S3VsacU/c2tp?= , Michael ,
    freebsd-stable@freebsd.org
Subject: Re: bin/121684: : dump(8) frequently hangs
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 01 Sep 2008 23:51:47 -0000

--==_Exmh_1220313104_45719P
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

> Date: Mon, 1 Sep 2008 14:38:56 -0700
> From: Jeremy Chadwick
>
> On Mon, Sep 01, 2008 at 09:00:12AM -0700, Kevin Oberman wrote:
> > > Date: Mon, 01 Sep 2008 09:36:11 -0400
> > > From: Mike Tancsa
> > > Sender: owner-freebsd-stable@freebsd.org
> > >
> > > At 05:07 AM 9/1/2008, Derek Kuliński wrote:
> > >
> > > >Now I'm honestly a bit scared about it (even if it will be fixed
> > > >before 7.1, I'm not sure I'll hurry with the update).
> > >
> > > There have been a number of commits to releng_7
> > > that fixed dump issues for me. A box that used
> > > to regularly exhibit hung dump processes has
> > > been working fine since April. e.g.
> > > a kernel from
> > > 7.0-STABLE FreeBSD 7.0-STABLE #4: Wed Apr 30
> > >
> > > does weekly level 0 dumps and daily differential
> > > dumps on the file systems below without issue
> > > % df -i
> > > Filesystem    1K-blocks      Used    Avail Capacity   iused    ifree %iused  Mounted on
> > > /dev/twed0s1a   2026030    284346  1579602      15%    2937   279685     1%  /
> > > devfs                 1         1        0     100%       0        0   100%  /dev
> > > /dev/twed0s1d   5077038    575828  4095048      12%    1197   658257     0%  /tmp
> > > /dev/twed0s1e  20308398  11072840  7610888      59% 1065406  1572416    40%  /usr
> > > /dev/twed0s1f  20308398  13275050  5408678      71%   13750  2624072     1%  /var
> > > /dev/twed0s1g 246875258 186393906 40731332      82% 9118036 22794922    29%  /zoo
> > >
> > > However, you should test and make sure it works for you.
> >
> > I have a 7-Stable system which has not been able to successfully dump(8)
> > for about 2 months. Since it contains almost no important data that is
> > subject to change, it's not too big a deal, but I worry that other
> > systems might start showing the same problems.
> >
> > I have no idea why it's failing, though, and I have spent little effort
> > in troubleshooting it. I'm running 3 week old stable and I'll be
> > updating to today's RELENG_7 later today.
>
> Can someone explain what "dump frequently hangs" actually means?
>
> Does it lock up the entire machine indefinitely (and if so, how long did
> you wait for it to (hopefully) recover)?
>
> Or does it more or less "deadlock" the machine, making it generally
> unusable, until the dump is completely finished?
>
> If the latter, I can confirm this problem -- which is why we moved all
> of our production systems away from using dump on UFS2 to simply using
> rsnapshot[1]. I'll try to find the thread (it was a year or so ago)
> where a developer told me more or less what was going on.
> The problem was that UFS2 snapshot generation, over time, becomes slower
> and slower to generate (this is what dump does on UFS2 systems, with or
> without the -L flag), and is a known design issue.
>
> If anything, this issue makes ZFS incredibly important with regards to
> -STABLE, where its snapshot generation for backups does not behave this
> way; fast and very easily manageable.
>
> [1]: rsync is great for backups, and very fast, but there's the issue of
> modifying atimes. I committed a patch to ports/net/rsync which adds an
> --atimes flag, except its behaviour is not what you'd expect: the file
> which was copied, at the destination, has the correct atime (of the
> source), but the source itself ends up getting its atime modified, so
> you're essentially destroying the atime data on the source.
>
> This is a problem when it comes to programs which use atime to discern
> things, such as classic UNIX mailboxes/mbox. "Um, why does mutt say I
> don't have any new mail when I do??" In our case, the only person using
> classic UNIX mboxes with a mail client local to the machine was me, so I
> ended up migrating my procmail rules and data to Maildir using mutt,
> solving the problem entirely.
>
> --
> | Jeremy Chadwick                                jdc at parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.              PGP: 4BD6C0CB |
>

In my case the dump deadlocks, but the system is unaffected. The dump
just freezes. I need to look at it more closely, but I simply have not
had time. I don't even recall what state it is in when frozen, but it
can be 'kill -9'ed. The problem has persisted through at least one
system upgrade. I'll try to track down more tomorrow.
--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O.
Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman@es.net Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751

--==_Exmh_1220313104_45719P
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (FreeBSD)
Comment: Exmh version 2.5 06/03/2002

iD8DBQFIvIAQkn3rs5h7N1ERAhzxAJ9pu9Gs5lhOhFq6ctb9lziLcPU2qgCgkJZT
HQZDqFDz+ZrfGJ8aLRfUnMU=
=uhMF
-----END PGP SIGNATURE-----

--==_Exmh_1220313104_45719P--