From: Nate Eldredge
To: freebsd-gnats-submit@FreeBSD.org
Date: Fri, 16 Dec 2005 11:16:10 -0800 (PST)
Subject:
kern/90512: Snapshot corruption after fs activity

>Number:         90512
>Category:       kern
>Synopsis:       Snapshot corruption after fs activity
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Dec 16 19:20:03 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator:     Nate Eldredge
>Release:        FreeBSD 6.0-RELEASE amd64
>Organization:
>Environment:
System: FreeBSD vulcan.lan 6.0-RELEASE FreeBSD 6.0-RELEASE #0: Wed Dec 14 20:08:57 PST 2005 nate@vulcan.lan:/usr/obj/usr/src/sys/VULCAN amd64

>Description:
When you use mksnap_ffs to make a snapshot of a filesystem, and a lot of
files on that filesystem are then deleted and re-created, the snapshot
becomes corrupt.

I think this is fairly serious, since snapshots may be used for backup
purposes. That's how I originally discovered the problem: I made a snapshot
of /usr before making a bunch of changes, during which I accidentally moved
most of /usr/local to another partition :). I moved it back but wanted to
verify that everything was back as it was, which is when I discovered my
snapshot was no good.

Note this is on amd64. I have not tried i386.
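A minimal sketch of the kind of check described above (comparing a mounted snapshot against the live tree). The helper name compare_trees and the paths in the usage note are hypothetical; the report does not say which tool was actually used for the comparison. Note that diff -r compares file contents only, not ownership or modes.

```shell
#!/bin/sh
# compare_trees SNAPSHOT_DIR LIVE_DIR
# Hypothetical helper: lists files that differ or exist on one side only.
# Returns 0 when the two trees match, non-zero otherwise (including when
# diff itself cannot read an entry, as on a damaged snapshot).
compare_trees() {
    snap_dir=$1
    live_dir=$2
    # -r recurses into subdirectories; -q reports only which files differ.
    diff -rq "$snap_dir" "$live_dir"
}
```

For example, with the snapshot mounted read-only on an assumed mount point /mnt/snap, one might run `compare_trees /mnt/snap/local /usr/local` and treat any output or a non-zero exit status as a mismatch.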
>How-To-Repeat:
# dd if=/dev/zero of=snaptest.img bs=1024k count=1000
# mdconfig -a -t vnode -f snaptest.img
md0
# newfs /dev/md0
# mount /dev/md0 /mnt/md0
# cd /mnt/md0
# tar xjf /usr/ports/distfiles/gap/gap4r4p6.tar.bz2
# mksnap_ffs /mnt/md0 /mnt/md0/.snap/snap1
# mdconfig -a -t vnode -f .snap/snap1
WARNING: opening backing store: /mnt/md0/.snap/snap1 readonly
md1
# mount -r /dev/md1 /mnt/md1
###### inspecting /mnt/md1 reveals the snapshot is apparently okay
# rm -r gap4r4
###### snapshot still apparently okay
# !tar
tar xjf /usr/ports/distfiles/gap/gap4r4p6.tar.bz2
# ls -l /mnt/md1/gap4r4
ls: Makefile.in: Bad file descriptor
ls: bin: Bad file descriptor
ls: cnf: Bad file descriptor
ls: configure: Bad file descriptor
ls: doc: Bad file descriptor
ls: etc: Bad file descriptor
ls: gap.shi: Bad file descriptor
ls: grp: Bad file descriptor
ls: pkg: Bad file descriptor
ls: prim: Bad file descriptor
ls: small: Bad file descriptor
ls: src: Bad file descriptor
ls: sysinfo.in: Bad file descriptor
ls: trans: Bad file descriptor
ls: tst: Bad file descriptor
total 38
-rw-r--r--  1 nate  nate   4782 Aug 29 06:19 README
-rw-r--r--  1 nate  nate   9725 May 11  2005 description4r4p5
-rw-r--r--  1 nate  nate  11660 Aug 29 06:05 description4r4p6
drwxr-xr-x  2 nate  nate   9728 Aug 30 06:27 lib

Doing truss on ls reveals that lstat() is returning EBADF on the offending
files (which doesn't make any sense as there is no file descriptor involved;
EIO might be better).
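The session above could be wrapped in a script roughly as follows. This is a sketch, not a tested reproducer: by default it only prints the commands, and it runs them only when RUN=1 is set (as root, on a scratch machine). It assumes mdconfig hands out md0 and then md1 as in the session, adds a mkdir for the mount points, and uses tar -C in place of the cd; the names run, repro, TARBALL, IMG and RUN are made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps from the report.
# Prints each command; executes it only when RUN=1.
# Assumption: mdconfig allocates md0 first and md1 second.

TARBALL=${TARBALL:-/usr/ports/distfiles/gap/gap4r4p6.tar.bz2}
IMG=${IMG:-snaptest.img}

# run CMD...: echo the command, and run it only if RUN=1.
run() {
    printf '# %s\n' "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# repro: the steps in order; stops at the first failing command.
# The mkdir is an addition (the report assumes the mount points exist).
# A non-zero status from the final ls indicates the snapshot is damaged.
repro() {
    run mkdir -p /mnt/md0 /mnt/md1 &&
    run dd if=/dev/zero of="$IMG" bs=1024k count=1000 &&
    run mdconfig -a -t vnode -f "$IMG" &&
    run newfs /dev/md0 &&
    run mount /dev/md0 /mnt/md0 &&
    run tar xjf "$TARBALL" -C /mnt/md0 &&
    run mksnap_ffs /mnt/md0 /mnt/md0/.snap/snap1 &&
    run mdconfig -a -t vnode -f /mnt/md0/.snap/snap1 &&
    run mount -r /dev/md1 /mnt/md1 &&
    run rm -r /mnt/md0/gap4r4 &&
    run tar xjf "$TARBALL" -C /mnt/md0 &&
    run ls -l /mnt/md1/gap4r4
}

repro
```

Printing by default is deliberate, since the real commands create memory disks and mount filesystems.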
Also, umounting and then fscking /dev/md1 produces a cornucopia of errors,
including as a representative sample:

PARTIALLY TRUNCATED INODE I=70662
3689066227402421815 BAD I=70662
4121129229942796344 BAD I=70662
3833180345978203193 BAD I=70662
4051046384641915184 BAD I=70662
3688509874569295664 BAD I=70662
3472592161990062385 BAD I=70662
3906084542581519160 BAD I=70662
4049637910162848049 BAD I=70662
4123381021216356400 BAD I=70662
3979273551213759020 BAD I=70662
4051327820913194809 BAD I=70662
EXCESSIVE BAD BLKS I=70662
INCORRECT BLOCK COUNT I=70662 (960 should be 736)
PARTIALLY TRUNCATED INODE I=70719
UNALLOCATED I=23552 OWNER=nate MODE=0
DIRECTORY CORRUPTED I=70660 OWNER=nate MODE=40755
MISSING '.' I=71129 OWNER=nate MODE=40755 SIZE=1536 MTIME=Aug 30 06:27 2005
UNREF DIR I=117760 OWNER=nate MODE=40755 SIZE=512 MTIME=Aug 30 06:27 2005
LINK COUNT DIR I=2 OWNER=root MODE=40755 SIZE=512 MTIME=Dec 16 10:34 2005 COUNT 4 SHOULD BE 3

The original filesystem /dev/md0 apparently remains okay and fsck reports no
errors for it.

There are no kernel error messages this time, though a previous attempt
(when the snapshot was on /dev/md0) yielded

/mnt/md0: bad dir ino 3182535 at offset 0: mangled entry
/mnt/md0: bad dir ino 2953 at offset 0: mangled entry
...4 or 5 more...

Also at that time there were directories which changed to files of size 1
which dumped many, many bytes of garbage when cat'ted.

>Fix:
Unknown. Thanks!
>Release-Note:
>Audit-Trail:
>Unformatted: