From owner-freebsd-hackers Sat Sep 28 02:23:52 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id CAA07874 for hackers-outgoing; Sat, 28 Sep 1996 02:23:52 -0700 (PDT)
Received: from nike.efn.org (resnet.uoregon.edu [128.223.170.28]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id CAA07777 for ; Sat, 28 Sep 1996 02:23:34 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1]) by nike.efn.org (8.7.5/8.7.3) with SMTP id CAA27035; Sat, 28 Sep 1996 02:21:41 -0700 (PDT)
Date: Sat, 28 Sep 1996 02:21:41 -0700 (PDT)
From: John-Mark Gurney
X-Sender: gurney_j@nike
Reply-To: John-Mark Gurney
To: Terry Lambert
cc: FreeBSD-hackers@FreeBSD.org
Subject: Re: cvs commit: src/sbin/fsdb fsdb.c
In-Reply-To: <199609272124.OAA10412@phaeton.artisoft.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-hackers@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk

On Fri, 27 Sep 1996, Terry Lambert wrote:

> > The strange thing is that this should be impossible to happen. Anyway,
> > the problem is that sometimes a filesystem passes the fsck but still makes
> > the kernel panic with a bad dir: mangled entry (or something like that).
> > The reason is that the size of the directory is beyond the last datablock,
> > thus effectively making a sparse directory file (at least in my case).
> > Fsck doesn't find anything because it only examines the present datablocks.
> > The kernel does see such a non-present block as a bunch of zeros. And
> > that causes the panic because a non-used directory chunk should have a
> > reclen field of 255. The fix (until fsck is fixed) is to fsdb the filesystem,
> > chdir to the bad dir and do an ls. You will then see the last entry and you
> > can reset the size of the directory until just after that entry.

well.. I just ended up unlinking the dir...
and I didn't know about fsdb at the time of my problem :)

> This FS *was* fsck'ed after a crash, or it *wasn't* fsck'ed after a
> crash?
>
> If it *wasn't*, then the loop was created in the FS code.
>
> If it *was*, then the fsck code is faulty.

I had the same problem as above on 0323-SNAP and the fsck code DIDN'T
detect it when I ran it in single user mode... the way I fixed this was
to write a small C program to call unlink on the directory, which when
run removed it; then fsck was able to recover and the disk was back to
normal...

one quick question... any reason why FreeBSD doesn't have an unlink
command? at least one only accessible to root?

> The correct mechanism for recovery would be for the fsck to traverse the
> last directory block in a directory to make sure it has at least one
> valid entry, and perform a full traversal with a file truncation if
> otherwise, to complete the directory "shrink".
>
>
> Since the lost+found and the truncate back were the two major fsck
> impactful semantic changes (all other operations *should* be idempotent),
> then this should be the last one lurking in the "4.4 semantic changes"
> for fsck.
>
>
> So: can you tell me if the condition resulted from fsck not catching
> it after a crash, or if it resulted from normal operation of the FS?

my problem was from a system crash with the fs mounted async... I know
this isn't good but I've had a large number of crashes where I didn't
lose any data... and it does improve speed (i.e. config kernname
completes in about 1 second :) )... the errors that I got when trying to
access the directory were:

/usr:bad dir ino 85312 at offset 0: mangled entry
panic:bad dir

which was repeated with the offset increasing by 512 every time.. until
I power cycled the machine when it hit 706048... when I ran fsck on it,
in phase 2 it would report:

MISSING '..' I=85312 OWNER=root MODE=46572
SIZE=1768385024 TIME=Jun 30 03:18 1961
DIR=/X11R6/lib/X11/locale/iso8859-7/XLC_LOCALE

I would fix it, but it never really got fixed (as it's invalid)... and
after a reboot fsck again reported the above error... so I made a small
C prog to call unlink with the dir's name, then I ran fsck on the drive,
which then found a dir with improper link count, and then I wouldn't have
the problem again...

I hope this helps... I was using a serial console at the time so I can
send you the log file of everything I did if you need to know..

ttyl..

John-Mark

gurney_j@efn.org
http://resnet.uoregon.edu/~gurney_j/
Modem/FAX: (541) 683-6954 (FreeBSD Box)

Live in Peace, destroy Micro$oft, support free software, run FreeBSD (unix)
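
The small C program mentioned in the message is not included in the post. A minimal sketch of what such a helper might look like follows. The function name remove_entry is hypothetical, and so is the error reporting. It relies on the behaviour described in the message, where root could call unlink(2) on a directory. Many current systems refuse that (typically with EPERM or EISDIR), so on such systems the sketch only removes non-directory entries and reports the error otherwise.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the original post.
 * Calls unlink(2) on the given path without first checking what
 * kind of object it names.
 * Returns 0 on success, or the errno value reported by unlink(2).
 */
int remove_entry(const char *path)
{
    if (unlink(path) == -1) {
        int saved = errno;  /* save before fprintf can clobber it */
        fprintf(stderr, "unlink %s: %s\n", path, strerror(saved));
        return saved;
    }
    return 0;
}
```

A main() wrapper would pass argv[1] to remove_entry() and exit non-zero on failure. As in the message, fsck would still have to be run afterwards to repair the link counts.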