Date: Mon, 27 Oct 2008 19:41:43 -0700
From: Jeremy Chadwick <koitsu@FreeBSD.org>
To: Carl Voth <cvoth@telus.net>
Cc: freebsd-questions@freebsd.org
Subject: Re: gmirror slice insertion, "FAILURE - READ_DMA status=51<READY, DSC, ERROR>"
Message-ID: <20081028024143.GA37131@icarus.home.lan>
In-Reply-To: <49067148.6080307@telus.net>
References: <49067148.6080307@telus.net>
On Mon, Oct 27, 2008 at 06:56:24PM -0700, Carl Voth wrote:
> I'm setting up a dual-disk server and am trying to bring it up with
> gmirror and gjournal. One slice per disk, the goal being to create a
> single mirror from said slices with some of the partitions journaled.
> Installed FreeBSD-7.0RELEASE to ad4, then used technique from here to
> create single-disk mirror/gm0 on ad6:
>
> http://people.freebsd.org/~rse/mirror/
>
> Modified ad4s1a /boot.config to pass control to boot stage 3 on ad6. So
> far, so good. Began Ralf's procedure for inserting ad4s1 into
> mirror/gm0. The synchronization began and reached 6% when this little
> horror appeared:
>
> ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR>
> error=40<UNCORRECTABLE> LBA=134802751

Are you sure you don't have a bad hard disk? This looks like a classic
block/sector failure.

This does not appear to be the infamous "DMA timeout" problem,
especially if this is the only error you're getting.

> I reinstalled FB7 to ad4, redid the /boot.config modification to make
> ad6/gm0 bootable again and retried the insertion of ad4 into gm0. Exact
> same error messages at exactly the same point with same consequences.

So you're saying that the *exact* same READ_DMA error, at the *exact*
same LBA, is reported on ad4? If so, that's very bizarre.

> Now, I see that other folks are having unexplained DMA problems too,
> albeit in different contexts. What should I be concluding here? Those
> other folks don't seem to be concluding it's bad drives. If there were
> bad sectors, I'd get different error messages, yes?

The "error=40<UNCORRECTABLE>" part of what you're seeing implies that
a read hit an uncorrectable error. What other people see are DMA
timeouts, with no actual sign of uncorrectable errors.

The problem with the "DMA timeout" issue is that it manifests itself in
hundreds of different ways. Each case so far has to be handled on an
individual basis.
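To make the status/error reading above concrete, here is a minimal sketch that decodes the two hex values from the quoted log line and converts the LBA to a byte offset. Only the READY, DSC, ERROR and UNCORRECTABLE names are confirmed by the log; the names for the remaining bits follow common ATA register naming and are assumptions, as are the 512-byte sector size and the total sector count (a figure commonly seen on 1 TB drives, not reported anywhere in this thread).

```python
# Illustrative decoder for the values in the quoted log line.
# Bits not shown in the log are named after common ATA register
# conventions (an assumption, not taken from the thread).
STATUS_BITS = {
    0x80: "BUSY",
    0x40: "READY",
    0x20: "DEVICE_FAULT",
    0x10: "DSC",
    0x08: "DRQ",
    0x04: "CORRECTABLE",
    0x02: "INDEX",
    0x01: "ERROR",
}

ERROR_BITS = {
    0x80: "ICRC",
    0x40: "UNCORRECTABLE",
    0x20: "MEDIA_CHANGED",
    0x10: "ID_NOT_FOUND",
    0x08: "MEDIA_CHANGE_REQUEST",
    0x04: "ABORTED",
    0x02: "NO_MEDIA",
    0x01: "ILLEGAL_LENGTH",
}

SECTOR_SIZE = 512                       # assumption: 512-byte sectors
ASSUMED_TOTAL_SECTORS = 1_953_525_168   # assumption: typical 1 TB drive


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if value & bit]


def lba_to_bytes(lba, sector_size=SECTOR_SIZE):
    """Byte offset of the first byte of the given sector."""
    return lba * sector_size


if __name__ == "__main__":
    lba = 134802751
    print("status:", decode(0x51, STATUS_BITS))
    print("error: ", decode(0x40, ERROR_BITS))
    print("offset:", lba_to_bytes(lba), "bytes")
    print("position: %.1f%% of the assumed disk size"
          % (100.0 * lba / ASSUMED_TOTAL_SECTORS))
```

Under those assumptions the failing LBA sits a little under 7% of the way into the disk, which is at least consistent with a sequential synchronization stopping at "6%". The set ERROR bit in the status value means the error value is meaningful, and the only bit set there is UNCORRECTABLE.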
> FWIW, I'm using gjournal on 3 partitions in mirror/gm0.
>
> Here's my server's parts list:
> - Seagate ST31000340AS Barracuda 7200.11, 1TB, SATA (x2).

Can you please provide the output from the following commands?

dmesg
vmstat -i
atacontrol list
atacontrol cap ad4
atacontrol cap ad6
smartctl -a /dev/ad4
smartctl -a /dev/ad6

Thanks.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
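The requested diagnostics could be gathered in one pass with something like the sketch below. The command list is copied from the message; the `collect` helper, its timeout, and the placeholder strings for missing tools are hypothetical conveniences, not anything the poster prescribed. Note that `smartctl` comes from the smartmontools package rather than the base system, so it may need to be installed first, and most of these commands need root.

```python
import shutil
import subprocess

# Commands copied from the request above. ad4 and ad6 are the device
# names from this thread and would differ on another machine.
COMMANDS = [
    ["dmesg"],
    ["vmstat", "-i"],
    ["atacontrol", "list"],
    ["atacontrol", "cap", "ad4"],
    ["atacontrol", "cap", "ad6"],
    ["smartctl", "-a", "/dev/ad4"],
    ["smartctl", "-a", "/dev/ad6"],
]


def collect(commands, timeout=60):
    """Run each command and map its command line to its combined output.

    Missing tools and failures are recorded as short notes instead of
    raising, so one absent utility does not lose the other results.
    """
    results = {}
    for argv in commands:
        label = " ".join(argv)
        if shutil.which(argv[0]) is None:
            results[label] = "(not installed: %s)" % argv[0]
            continue
        try:
            proc = subprocess.run(
                argv,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,  # keep error text with the output
                text=True,
                timeout=timeout,
                check=False,               # a non-zero exit is still useful
            )
            results[label] = proc.stdout
        except subprocess.TimeoutExpired:
            results[label] = "(timed out after %d s)" % timeout
        except OSError as exc:
            results[label] = "(could not run: %s)" % exc
    return results


if __name__ == "__main__":
    for label, output in collect(COMMANDS).items():
        print("=== %s ===" % label)
        print(output)
```

Redirecting the script's output to a file gives a single attachment that can be pasted into a reply to the list.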