From: Astrodog
To: "gnn@freebsd.org"
Cc: current@freebsd.org
Date: Sun, 6 Jul 2008 21:45:12 +0800
Message-ID: <2fd864e0807060645j65b67f97s5dc1e81145660c9d@mail.gmail.com>
Subject: Re: Has anyone else seen any form of in memory or on disk corruption?
List-Id: Discussions about the use of FreeBSD-current

On 7/5/08, gnn@freebsd.org wrote:
> Hi,
>
> I've been working on the following brain teasing (breaking?) problem
> for about a week now. What I'm seeing is that on large memory
> machines, those with more than 4G of RAM, the ungzipping/untarring of
> files fails due to gzip thinking the file is corrupt. The way to
> reproduce this is:
>
> 1) Create a bunch of gzip/tar balls in the 1-20MB range.
> 2) Reboot FreeBSD 7.0 release
> 3) Run gzip -t over all the files.
>
> I have hundreds of these files to run this over, and a full check
> takes about 3 hours, but I usually see some form of corruption within
> the first 20 minutes.
>
> Other important factors:
>
> 1) This is on very modern, 2P/4Core (8 cores total) hardware
> 2) The disks are 1TB SATA set up in JBOD.
> 3) The machines have 16G of RAM.
> 4) Corruption is seen only after a reboot; if the machines continue to
> run, corruption is never seen again until another reboot.
> 5) The systems are all Xeon running amd64
> 6) The disk controller is an AMCC 9650, but we do see this very rarely
> with the on board controller.
> 7) All boards are
>
> http://www.supermicro.com/products/motherboard/Xeon1333/5400/X7DWU.cfm
>
> 8) All machines have 3 1TB drives.
> 9) The corruption is in 4K chunks. That is N x 4K.
> 10) Files are not normally corrupted on disk, but this can happen.
>
> I have already tried a few of the obvious things, such as making sure
> that we sync pages before we shutdown the twa driver.
>
> Given what I have seen I believe this is something that happens from
> startup, and not at shutdown.
>
> Thoughts?
>
> Best,
> George
>

As a datapoint for you, I use a number of the 9650s with 15 750GB or 1TB
drives, on a Supermicro motherboard with Opteron processors and 4GB of
memory. With this configuration I have not experienced any data
corruption.

--- Harrison
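
For anyone who wants to repeat the check described in the quoted report on their own hardware, the sketch below is one possible way to script it. It is not taken from the thread: the function names (gzip_ok, corrupt_files, differing_blocks) and the 4096-byte block size constant are assumptions made for illustration. The first two functions approximate running gzip -t over a directory tree; the third compares a known-good copy of some data against a suspect copy and reports which 4K blocks differ, which may help confirm whether damage falls in N x 4K chunks.

```python
import gzip
import zlib
from pathlib import Path

# Assumption: 4096 bytes, matching the "N x 4K" chunks in the quoted report.
BLOCK_SIZE = 4096


def gzip_ok(path):
    """Return True if the file decompresses cleanly, roughly like `gzip -t`."""
    try:
        with gzip.open(path, "rb") as fh:
            # Read in 1 MiB chunks so large archives need not fit in memory.
            while fh.read(1 << 20):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # Bad header or CRC (OSError), truncation (EOFError),
        # or a damaged deflate stream (zlib.error).
        return False


def corrupt_files(directory, pattern="*.gz"):
    """Return the sorted paths under `directory` that fail the check."""
    return sorted(p for p in Path(directory).rglob(pattern) if not gzip_ok(p))


def differing_blocks(good, suspect, block_size=BLOCK_SIZE):
    """Return indices of block_size-sized blocks that differ between two byte strings."""
    length = max(len(good), len(suspect))
    return [
        offset // block_size
        for offset in range(0, length, block_size)
        if good[offset:offset + block_size] != suspect[offset:offset + block_size]
    ]
```

Calling corrupt_files on the directory holding the archives shortly after a reboot, and again later, would show whether the failures only appear on the first pass, as the report describes.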