From owner-freebsd-stable Tue Jan 29 18:43:13 2002
Delivered-To: freebsd-stable@freebsd.org
Received: from dev.nethouse.com (what.ifelse.org [208.171.40.202])
    by hub.freebsd.org (Postfix) with ESMTP id DE69B37B404
    for ; Tue, 29 Jan 2002 18:43:05 -0800 (PST)
Received: from fourier.mat (242829hfc118.tampabay.rr.com [24.28.29.118])
    by dev.nethouse.com (8.11.6/8.11.6) with ESMTP id g0U2bDH11211
    for ; Tue, 29 Jan 2002 21:37:13 -0500 (EST)
    (envelope-from btt@nethouse.com)
Received: from fourier.mat (localhost.mat [127.0.0.1])
    by fourier.mat (8.12.1/8.12.1/Debian -5) with ESMTP id g0U2faix001253
    for ; Tue, 29 Jan 2002 21:41:36 -0500
Received: (from billt@localhost)
    by fourier.mat (8.12.1/8.12.1/Debian -5) id g0U2faDF001251
    for stable@FreeBSD.org; Tue, 29 Jan 2002 21:41:36 -0500
Date: Tue, 29 Jan 2002 21:41:36 -0500
From: Bill Triplett
To: stable@FreeBSD.org
Subject: Re: FS corruption w/softupdates on 4.5-RC ?
Message-ID: <20020130024136.GB1150@fourier.mat>
References: <20020127064250.GA333@moreton.com.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20020127064250.GA333@moreton.com.au>
User-Agent: Mutt/1.3.25i
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

Hi all,

I can reproduce this using the MAKEDEV method as well as creating one
large file with something like yes > testfile. Appended is my dmesg as
well.

On Sun, Jan 27, 2002 at 04:42:50PM +1000, Phil Homewood wrote:
> A system recently upgraded from a 6 month old 4.3-STABLE image
> appears to be suddenly experiencing massive FS corruption (or
> fsck is very confused when checking a readonly-mounted FS.)
>
> I've been following what seem to be a lot of dead-ends on this,
> but seem to have tracked it down to the following specifics:
>
> * Only softupdate filesystems appear to have the problem. I
>   first thought I saw it on a non-softupdate FS, but may have
>   been mistaken.
>
> * 4.5-RC kernel (as of Jan 26, also reupdated today) breaks,
>   4.3-STABLE does not.
>   (Tried 4.4-REL, no breakage, but the kernel I had didn't have
>   softupdates, so inconclusive.)
>
> * GENERIC kernel is sufficient to reproduce
>
> * Unmounting the FS and fscking it doesn't *seem* to show
>   the problem up. fscking the filesystem after remounting
>   readonly does cause breakage. Unmounting, remounting
>   readonly, and fscking seems safe.
>
> * dd'ing a mounted fs to another identical device (I've been
>   backing up the root fs to a twin partition like this forever)
>   and then fscking the backup device exhibits the breakage.
>   (Maybe I should change that to dump|restore, like I thought
>   I'd been doing all along :-)

**> I didn't try this.

> * The bug seems to be most easily tickled using MAKEDEV. I can
>   reproduce the problem reliably by doing:
>
>   # newfs /dev/da0s2g
>   # tunefs -n enable /dev/da0s2g
>   # mount /dev/da0s2g /tmp
>   # cd /tmp
>   # mkdir dev
>   # cd dev
>   # cp /dev/MAKEDEV .
>   # sh MAKEDEV all
>   # cd /
>   # mount -u -r /tmp
>   # fsck /tmp
>
> * Occasionally the fsck or (if fsck comes up clean) subsequent mount
>   will panic: so far I've seen
>
>   panic: handle_workitem_remove: bad file delta
>
>   softdep_deallocate_dependencies: dangling deps
>   (that one was interspersed with fsck output, so I could have got it wrong)
>
>   and one that may have involved a dup alloc; I unfortunately didn't
>   copy it down.
>
> Attached is a copy of my dmesg and kernel config.
>
> Any clues greatfully appreciated...

Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 4.5-RC #0: Thu Jan 24 17:48:50 EST 2002
    root@dev.nethouse.com:/usr/obj/usr/src/sys/DEV
Timecounter "i8254" frequency 1193182 Hz
Timecounter "TSC" frequency 501139598 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (501.14-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x673 Stepping = 3
  Features=0x383f9ff
real memory = 335478784 (327616K bytes)
avail memory = 323174400 (315600K bytes)
Preloaded elf kernel "kernel" at 0xc02ae000.
Preloaded userconfig_script "/boot/kernel.conf" at 0xc02ae09c.
Pentium Pro MTRR support enabled
Using $PIR table, 7 entries at 0xc00fdf00
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
pci0: on pcib0
pcib2: at device 1.0 on pci0
pci1: on pcib2
isab0: at device 7.0 on pci0
isa0: on isab0
atapci0: port 0xe000-0xe00f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: at 7.2 irq 11
pci0: at 15.0 irq 10
xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xec00-0xec7f mem 0xd4001000-0xd400107f irq 7 at device 17.0 on pci0
xl0: Ethernet address: 00:01:03:d1:ae:8f
miibus0: on xl0
ukphy0: on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pcib1: on motherboard
pci2: on pcib1
orm0: