From owner-freebsd-questions Wed Mar 12 18:32:31 2003
Message-ID: <1047522742.3e6fedb6d7f9c@www.nexusmail.uwaterloo.ca>
Date: Wed, 12 Mar 2003 21:32:22 -0500
From: Bruce Campbell
To: Simon
Cc: "freebsd-hardware@freebsd.org", "freebsd-questions@freebsd.org"
Subject: Re: problem on 1TB filesystem RAID 5 3ware
References: <200303130144.h2D1iO706855@engmail.uwaterloo.ca>
In-Reply-To: <200303130144.h2D1iO706855@engmail.uwaterloo.ca>

Quoting Simon:

> I can only hope I don't have the same issue. I'm currently building a 1.75TB
> NAS to do daily backups using 3ware 7500-8 and maxtor drives.

Tiny bit more info:

- NFS was starting to be implicated, but on one of my backup servers
  I had let it run 2 dumps of our Network Appliance, basically:

      rsh netapp dump ... | gzip > file

  and I tried "gunzip -t" to test the files, and both were corrupt.

My backup system, which I've been running with vinum for a long time,
does a weekly "gunzip -t" on all files, and I've not seen a problem before.

This also removes the network card from suspicion: if it were the
problem, the .gz file would still be valid (it would just be
compressed garbage, but it would not be corrupt itself).

Here is the program I wrote to test the partitions:

http://www.freebsd.uwaterloo.ca/twiki/bin/view/Freebsd/BurnInProcedure

(obviously not an outstanding test, since it passed my system)

>
> -Simon
>
> On Wed, 12 Mar 2003 20:38:13 -0500, Bruce Campbell wrote:
>
> >
> >File corruption on 2 identical systems, designed to be backup
> >servers to contain dumps of other systems:
> >
> >FreeBSD ecserv18.uwaterloo.ca 4.7-RELEASE FreeBSD 4.7-RELEASE #0: Wed Oct 9 15:08:34 GMT 2002 root@builder.freebsdmall.com:/usr/obj/usr/src/sys/GENERIC i386
> >
> >with 1TB /backup partition, on a 3ware 7500-8 ATA RAID card, RAID 5:
> >
> >Filesystem     1K-blocks      Used     Avail Capacity  Mounted on
> >/dev/twed0s1a   20644846    906552  18086708     5%    /
> >procfs                 4         4         0   100%    /proc
> >/dev/twed0s1e  938819776 279031856 584682338    32%    /backup
> >
> >disks are 6 x Western Digital 2000JB (200GB)
> >
> >I ran tests on /backup for 10 days on each system (fill disk with
> >50GB files of pseudo random data, then reading them all back and
> >verify contents, then erase, then start over). Tests ran perfectly.
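
The fill / read back / verify / erase cycle described in the paragraph above can be sketched roughly as follows. This is not the BurnInProcedure program linked earlier, only a minimal illustration under stated assumptions: the names `write_file`, `verify_file` and `burn_in_pass`, the 64K chunk size, and the use of a seeded PRNG as the data source are all hypothetical choices made for the example.

```python
import os
import random
import tempfile

CHUNK = 64 * 1024  # assumed chunk size, picked to match the 64K boundary reported in this thread


def pattern(seed, nchunks):
    """Yield nchunks of reproducible pseudo-random data for a given seed."""
    rng = random.Random(seed)
    for _ in range(nchunks):
        yield rng.getrandbits(8 * CHUNK).to_bytes(CHUNK, "little")


def write_file(path, seed, nchunks):
    """Write the pattern for `seed` to `path` and push it out to the device."""
    with open(path, "wb") as f:
        for chunk in pattern(seed, nchunks):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())


def verify_file(path, seed, nchunks):
    """Return the offset of the first bad chunk, or None if the file is intact."""
    with open(path, "rb") as f:
        for i, expected in enumerate(pattern(seed, nchunks)):
            if f.read(CHUNK) != expected:
                return i * CHUNK
        if f.read(1):
            return nchunks * CHUNK  # unexpected trailing data
    return None


def burn_in_pass(directory, nfiles, nchunks):
    """One fill / verify / erase pass; returns a list of (path, bad_offset)."""
    paths = [os.path.join(directory, f"burnin_{i}.dat") for i in range(nfiles)]
    for seed, path in enumerate(paths):
        write_file(path, seed, nchunks)
    failures = []
    for seed, path in enumerate(paths):
        bad = verify_file(path, seed, nchunks)
        if bad is not None:
            failures.append((path, bad))
    for path in paths:
        os.remove(path)
    return failures


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(burn_in_pass(d, nfiles=3, nchunks=4))
```

One caveat with any test of this shape: a read that follows a write closely may be served from the buffer cache rather than the disk, so files much larger than RAM (as with the 50GB files above) are needed for the read-back to exercise the array. A local write-then-read test also does not reproduce the NFS copy path under which the corruption appeared.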
> >
> >details on hardware config at:
> >
> >http://www.freebsd.uwaterloo.ca/twiki/bin/view/Freebsd/BackupServerHardware
> >
> >Then, I was ready to put the systems into production, so I copied
> >data from my 2 older backup servers (which have 360GB vinum partitions)
> >and after copying the data (approx 250GB in 325 files) about a dozen
> >files were corrupt after the copy. I copied via an NFS mount.
> >
> >All corruption started on a 64K boundary, except one which was on a 16K
> >boundary. Recopied the dozen corrupt files, and then only 6 were corrupt.
> >Same problem on both systems, each which copied from a different source
> >server.
> >
> >File seems corrupt to the end after first corruption starts, I have
> >not looked for a pattern to see if it is another files contents,
> >or misplaced contents from the same file.
> >
> >fsck shows no problems
> >
> >Restarted my test filling with 50GB files again, has run perfectly.
> >
> >I plan to try:
> >
> > - turn off soft updates
> > - RAID 10 instead of 5
> > - different file system parameters, for example I don't need
> >   100 million inodes.
> > - rcp'ing the files
> > - staring at computer screen
> >
> >By the way, 3ware has not officially approved the WD 200GB drive last
> >time I checked.
> >
> >Lots of good experience with the motherboard (ASUS P4S533) and
> >network card (Intel Pro/100). Lots of good experience with
> >vinum striped partitions of smaller size (360GB)
> >
> >Does anyone have any suggestions ?
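
For anyone wanting to repeat the "corruption starts on a 64K boundary" observation quoted above, here is a minimal sketch that compares a source file with its copy, reports the offset of the first differing byte, and gives the largest power of two that divides that offset. The function names `first_difference` and `largest_alignment` are hypothetical, and how the original comparison was actually done is not stated in the message.

```python
def first_difference(path_a, path_b, bufsize=1 << 20):
    """Return the offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            ca, cb = a.read(bufsize), b.read(bufsize)
            if ca != cb:
                for i, (x, y) in enumerate(zip(ca, cb)):
                    if x != y:
                        return offset + i
                # Common prefix matches, so one file is simply shorter.
                return offset + min(len(ca), len(cb))
            if not ca:
                return None  # both files ended without a difference
            offset += len(ca)


def largest_alignment(offset, max_power=20):
    """Largest power of two (capped at 2**max_power) that divides offset."""
    align = 1
    while align < (1 << max_power) and offset % (align * 2) == 0:
        align *= 2
    return align
```

A first difference at offset 196608, for example, gives `largest_alignment(196608) == 65536`, i.e. a 64K boundary. Running the same comparison on the tail of the file could also help answer the open question of whether the bad region is another file's contents or misplaced data from the same file.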
> >
> >--
> >Bruce Campbell
> >Engineering Computing
> >CPH-2374B
> >University of Waterloo
> >(519)888-4567 ext 5889

--
Bruce Campbell
Engineering Computing
CPH-2374B
University of Waterloo
(519)888-4567 ext 5889
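
The "gunzip -t" check mentioned in the message can be approximated in Python as a rough sketch, for cases where the check needs to run from a script rather than the shell. The function name `gzip_ok` is hypothetical; the sketch relies on Python's standard `gzip` module verifying the CRC-32 and length stored in the gzip trailer once the stream has been read to the end.

```python
import gzip
import zlib


def gzip_ok(path, bufsize=1 << 20):
    """Decompress the whole file, discarding the output.

    Returns False on a bad header, a damaged deflate stream, a CRC or
    length mismatch, or a truncated file; True otherwise.
    """
    try:
        with gzip.open(path, "rb") as f:
            while f.read(bufsize):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True
```

This also illustrates the reasoning about the network card: the checksum is computed by gzip on whatever bytes it receives, so data damaged before compression still produces a file that passes this test. A failure therefore points at something that happened to the .gz file after gzip wrote it.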