From owner-freebsd-current@FreeBSD.ORG Mon Jul 7 16:30:20 2008
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 37268106567E
	for ; Mon, 7 Jul 2008 16:30:20 +0000 (UTC)
	(envelope-from freebsd-current@m.gmane.org)
Received: from ciao.gmane.org (main.gmane.org [80.91.229.2])
	by mx1.freebsd.org (Postfix) with ESMTP id BA7858FC23
	for ; Mon, 7 Jul 2008 16:30:19 +0000 (UTC)
	(envelope-from freebsd-current@m.gmane.org)
Received: from list by ciao.gmane.org with local (Exim 4.43)
	id 1KFtbW-0000jU-Op
	for freebsd-current@freebsd.org; Mon, 07 Jul 2008 16:30:18 +0000
Received: from mulderlab.f5.com ([205.229.151.151])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for ; Mon, 07 Jul 2008 16:30:18 +0000
Received: from atkin901 by mulderlab.f5.com with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for ; Mon, 07 Jul 2008 16:30:18 +0000
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-current@freebsd.org
From: Mark Atkinson
Date: Mon, 07 Jul 2008 09:30:07 -0700
Lines: 52
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7Bit
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: mulderlab.f5.com
User-Agent: KNode/0.10.5
Sender: news
Subject: Re: Has anyone else seen any form of in memory or on disk corruption?
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Mon, 07 Jul 2008 16:30:20 -0000

gnn@freebsd.org wrote:

> Hi,
>
> I've been working on the following brain teasing (breaking?) problem
> for about a week now.  What I'm seeing is that on large memory
> machines, those with more than 4G of RAM, the ungzipping/untarring of
> files fails due to gzip thinking the file is corrupt.
> The way to reproduce this is:
>
> 1) Create a bunch of gzip/tar balls in the 1-20MB range.
> 2) Reboot FreeBSD 7.0 release
> 3) Run gzip -t over all the files.
>
> I have hundreds of these files to run this over, and a full check
> takes about 3 hours, but I usually see some form of corruption within
> the first 20 minutes.
>
> Other important factors:
>
> 1) This is on very modern, 2P/4Core (8 cores total) hardware
> 2) The disks are 1TB SATA set up in JBOD.
> 3) The machines have 16G of RAM.
> 4) Corruption is seen only after a reboot; if the machines continue to
>    run, corruption is never seen again until another reboot.
> 5) The systems are all Xeon running amd64
> 6) The disk controller is an AMCC 9650, but we do see this very rarely
>    with the on board controller.
> 7) All boards are
>
>    http://www.supermicro.com/products/motherboard/Xeon1333/5400/X7DWU.cfm
>
> 8) All machines have 3 1TB drives.
> 9) The corruption is in 4K chunks.  That is N x 4K.
> 10) Files are not normally corrupted on disk, but this can happen.
>
> I have already tried a few of the obvious things, such as making sure
> that we sync pages before we shutdown the twa driver.
>
> Given what I have seen I believe this is something that happens from
> startup, and not at shutdown.
>
> Thoughts?

Have you tried turning off background fsck on boot to see if the problem
goes away?

-- 
Mark Atkinson
atkin901@yahoo.com
(!wired)?(coffee++):(wired);
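
For anyone wanting to script the check described in the quoted message
(run an integrity test over every archive, then see whether damage falls
on 4K boundaries), here is a minimal sketch in Python. It is not the
original poster's tooling. The function names, the *.gz glob and the
idea of comparing against a known-good copy are assumptions made for
illustration, and gzip_ok() only approximates what gzip -t does.

```python
import gzip
import zlib
from pathlib import Path

BLOCK = 4096  # the "N x 4K" chunk size reported in the quoted message


def gzip_ok(path):
    """Return True if the file decompresses cleanly (roughly what `gzip -t` checks)."""
    try:
        with gzip.open(path, "rb") as f:
            # Read to EOF so the CRC and length trailer get verified.
            while f.read(1 << 20):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        return False


def check_tree(root):
    """Return the sorted list of *.gz files under root that fail the check."""
    return sorted(p for p in Path(root).rglob("*.gz") if not gzip_ok(p))


def differing_blocks(a, b, block=BLOCK):
    """Return indices of block-sized chunks that differ between two byte strings.

    Hypothetical helper: compare a suspect read against a known-good copy
    to see whether the damage lines up with 4K boundaries.
    """
    n = max(len(a), len(b))
    return [i // block for i in range(0, n, block)
            if a[i:i + block] != b[i:i + block]]
```

Running check_tree() on the archive directory right after a reboot and
again later would show whether the same files keep failing. If a failing
file can be re-read and compared with a reference copy, differing_blocks()
lists which 4K chunks changed.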