Date: Thu, 14 Feb 2008 20:58:12 +0100 (CET)
Message-Id: <200802141958.m1EJwCoZ078516@lurza.secnetix.de>
From: Oliver Fromme
To: freebsd-fs@FreeBSD.ORG
Subject: UFS2 corruption (RELENG_7, amd64)

Hi,

We have a problem with a large file system (1.5 TB). It is a UFS2
file system that was newfs'ed on FreeBSD/amd64 RELENG_7 from
December 17. It was formatted with bsize 64K and fsize 8K for
performance reasons; fsck in particular is significantly faster
with these settings. There's no disklabel; the whole raw disk is used.
Also, the inode density was reduced to one inode per 256 KB, so
there are 5.7 million inodes available, of which 33,000 are
currently in use.

The file system contains a busy news spool area. It is mounted
with soft-updates and noatime.

After a while of usage, some corruption seems to occur, and df(1)
output becomes very strange; see below. This has already happened
several times.

================== snip ==================
Filesystem  1K-blocks  Used                  Avail                Capacity
/dev/da45   1463107704 -1202122007974910440  1202122009320969528  -89306778483%

# umount /dev/da45
# fsck -y /dev/da45
** /dev/da45
** Last Mounted on [...]
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
UNALLOCATED I=2107428 OWNER=2283830233 MODE=0
SIZE=0 MTIME=Feb 5 08:43 2008
NAME=/D.0131ae2e/B.0786

UNEXPECTED SOFT UPDATE INCONSISTENCY

REMOVE? yes

** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes

SUMMARY INFORMATION BAD
SALVAGE? yes

BLK(S) MISSING IN BIT MAPS
SALVAGE? yes

24667 files, 116050264 used, 66838199 free (5135 frags, 8354133 blocks, 0.0% fragmentation)

***** FILE SYSTEM WAS MODIFIED *****
================== snip ==================

After a while, the same thing happens again.

In dmesg I see the following messages, but I have no idea whether
they are related to the above problem:

bad block 724405012343388976, ino 759812
pid 48 (softdepflush), uid 0 inumber 759812 on /news2/spool/news/22: bad block
bad block -4916881576150019921, ino 759812
pid 48 (softdepflush), uid 0 inumber 759812 on /news2/spool/news/22: bad block
bad block -131736542334903744, ino 759812
pid 48 (softdepflush), uid 0 inumber 759812 on /news2/spool/news/22: bad block
bad block -6421737673234919641, ino 759812
pid 48 (softdepflush), uid 0 inumber 759812 on /news2/spool/news/22: bad block
handle_workitem_freeblocks: block count

I do not see any SCSI error messages, so I don't think it is a
hardware problem.
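
As a rough cross-check of the df(1) and fsck numbers quoted above, here
is a minimal Python sketch. All constants are copied from that output.
Two things are assumptions and not stated in the report: that df derives
Used as total minus the free count from the superblock summary, and that
the reserve is the newfs default minfree of 8%. The variable names are
only illustrative.

```python
# Cross-check of the df(1) and fsck figures quoted in this report.
# Constants are copied from the output; nothing here touches a disk.

FRAG_KB = 8             # fsize 8K
FRAGS_PER_BLOCK = 8     # bsize 64K / fsize 8K

# df(1) line for /dev/da45 (1K-blocks)
total_kb = 1463107704
used_kb = -1202122007974910440
avail_kb = 1202122009320969528

# fsck summary line (counts are in 8K fragments)
fsck_used_frags = 116050264
fsck_free_frags = 66838199
fsck_loose_frags = 5135
fsck_free_blocks = 8354133

# fsck's used + free should cover the whole file system.
fsck_total_kb = (fsck_used_frags + fsck_free_frags) * FRAG_KB

# fsck's free count should equal loose frags plus whole free blocks.
fsck_free_check = fsck_loose_frags + fsck_free_blocks * FRAGS_PER_BLOCK

# Used + Avail is what df treats as usable space (total minus reserve).
usable_kb = used_kb + avail_kb
usable_ratio = usable_kb / total_kb

# Capacity as a percentage of usable space.
capacity_pct = used_kb * 100 / usable_kb

# ASSUMPTION: Used = total - free, so the free count df saw was:
implied_free_kb = total_kb - used_kb
fsck_free_kb = fsck_free_frags * FRAG_KB

print("fsck total KB      :", fsck_total_kb, "(df says", total_kb, ")")
print("fsck free frags    :", fsck_free_check, "(fsck says", fsck_free_frags, ")")
print("usable / total     :", round(usable_ratio, 4))
print("capacity percent   :", round(capacity_pct))
print("free KB seen by df :", implied_free_kb)
print("free KB after fsck :", fsck_free_kb)
```

If I read this right, the fsck totals are self-consistent and add up to
exactly the df total, and df's Used + Avail is still about 92% of the
total. The only wild value is the free-space count implied by df, which
is many orders of magnitude larger than the disk. That would fit the
"FREE BLK COUNT(S) WRONG IN SUPERBLK" message, but it does not tell us
what corrupts the count in the first place.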

These are the related boot messages:

mpt0:
da49 at mpt0 bus 0 target 10 lun 4
da49: Fixed Direct Access SCSI-4 device
da49: 300.000MB/s transfers
da49: Command Queueing Enabled
da49: 1430288MB (2929229824 512 byte sectors: 255H 63S/T 182336C)

Did anyone encounter similar problems? Could it be caused by
the nonstandard bsize/fsize settings? Are there any known
problems, especially on amd64?

Any hint or advice is very much appreciated!

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Commercial register: Registergericht München, HRA 74606, Management:
secnetix Verwaltungsgesellsch. mbH, commercial register: Registergericht
München, HRB 125758, Managing directors: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD services, products and more: http://www.secnetix.de/bsd

"Perl will consistently give you what you want,
unless what you want is consistency."
        -- Larry Wall