Message-ID: <486E69CC.50205@FreeBSD.org>
Date: Fri, 04 Jul 2008 20:19:56 +0200
From: Kris Kennaway
To: gnn@freebsd.org
Cc: current@freebsd.org
Subject: Re: Has anyone else seen any form of in memory or on disk corruption?

gnn@freebsd.org wrote:
> Hi,
>
> I've been working on the following brain teasing (breaking?) problem
> for about a week now. What I'm seeing is that on large memory
> machines, those with more than 4G of RAM, the ungzipping/untarring of
> files fails due to gzip thinking the file is corrupt. The way to
> reproduce this is:
>
> 1) Create a bunch of gzip/tar balls in the 1-20MB range.
> 2) Reboot FreeBSD 7.0 release
> 3) Run gzip -t over all the files.
>
> I have hundreds of these files to run this over, and a full check
> takes about 3 hours, but I usually see some form of corruption within
> the first 20 minutes.
>
> Other important factors:
>
> 1) This is on very modern, 2P/4Core (8 cores total) hardware
> 2) The disks are 1TB SATA set up in JBOD.
> 3) The machines have 16G of RAM.
> 4) Corruption is seen only after a reboot, if the machines continue to
> run corruption is never seen again, until another reboot.
> 5) The systems are all Xeon running amd64
> 6) The disk controller is an AMCC 9650, but we do see this very rarely
> with the on board controller.
> 7) All boards are

As a negative data point, I have a number of 2*4 core amd64 systems with
8GB of RAM and ATA disks that do not see data corruption (at boot or
after it).

Kris