Date: Fri, 18 Jun 2010 17:15:41 +0300
From: Alexander Motin <mav@FreeBSD.org>
To: Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc: Matthew Lear <matt@bubblegen.co.uk>, freebsd-stable@freebsd.org, Miroslav Lachman <000.fbsd@quip.cz>
Subject: Re: 7.2-RELEASE-p4, IO errors & RAID1 failure
Message-ID: <4C1B7F8D.3010809@FreeBSD.org>
In-Reply-To: <20100618120247.GA40782@icarus.home.lan>
References: <1276844904.7519.19.camel@almscliff.bubblegen.co.uk> <20100618082127.GA34578@icarus.home.lan> <4C1B5A55.1040608@quip.cz> <20100618120247.GA40782@icarus.home.lan>

Jeremy Chadwick wrote:
> On Fri, Jun 18, 2010 at 01:36:53PM +0200, Miroslav Lachman wrote:
>> Jeremy Chadwick wrote:
>>> On Fri, Jun 18, 2010 at 08:08:24AM +0100, Matthew Lear wrote:
>> [...]
>>
>>>> The drives in the RAID exist on two seperate ATA channels:
>>>> [root@meshuga /home/matt]# atacontrol list
>>>> ATA channel 0:
>>>>     Master: ad0<WDC WD3200AAKS-00VYA0/12.01B02> SATA revision 2.x
>>>>     Slave: ad1<FB160C4081/HPF0> SATA revision 1.x
>>>> ATA channel 1:
>>>>     Master: ad2<WDC WD3200AAKS-00VYA0/12.01B02> SATA revision 2.x
>>>>     Slave: no device present
>>>> ATA channel 2:
>>>>     Master: acd0<HL-DT-ST DVDRAM GH22NS40/NL01> SATA revision 1.x
>>>>     Slave: no device present
>>>> ATA channel 3:
>>>>     Master: no device present
>>>>     Slave: no device present
>>>>
>>>> ad1 is a third 160G drive that I periodically back up to using cron.
>>> So your RAID-1 array consists of ad0 and ad2? You didn't provide
>>> "atacontrol status" output so I'm going to assume that's the case.
>>>
>>> What's odd to me is that you somehow have two disks on a single ATA
>>> channel -- look closely at channel 0. SATA has a 1:1 device-to-channel
>>> mapping, so I'm a little surprised to see there's two devices on channel
>>> 0. To me, this indicates your system BIOS is configured to run in
>>> "Emulation" mode -- where the ATA controller pretends to be a PATA/IDE
>>> controller, thus SATA-0 and SATA-1 devices appear as primary master and
>>> primary slave, respectively.
>>>
>>> What motherboard is this? Can you change the setting to either
>>> "Native", "Enhanced", or (even better) "AHCI"? I've seen some systems
>>> where the Serial ATA option in the BIOS has an "Auto" option, which does
>>> totally bizarre things at times.
>>>
>>> But before changing the setting, I would recommend dealing with the disk
>>> problem first. Changing the SATA controller operation mode will almost
>>> certainly change all of your device names (you'll have to go into
>>> single-user mode, mount filesystems by hand, fix /etc/fstab, etc.).
>> [...]
>>
>> It is "normal" on HP G5 series. I have ProLiant ML 110 G5. I tried
>> all type of settings in BIOS, but all of them shows two disks on one
>> ATA channel:
>>
>> HP ProLiant ML 110 G5
>>
>> FreeBSD 7.2-RELEASE-p4 amd64 GENERIC
>>
>> root@kiwi ~/# atacontrol list
>> ATA channel 0:
>>     Master: ad0 <SAMSUNG HD103UJ/1AA01113> SATA revision 2.x
>>     Slave: ad1 <SAMSUNG HD103UJ/1AA01113> SATA revision 2.x
>> ATA channel 1:
>>     Master: ad2 <SAMSUNG HD103UJ/1AA01113> SATA revision 2.x
>>     Slave: ad3 <SAMSUNG HD103UJ/1AA01113> SATA revision 2.x
>> ATA channel 2:
>>     Master: acd0 <HL-DT-ST DVD-RAM GH15L/FA01> SATA revision 1.x
>>     Slave: no device present
>> ATA channel 3:
>>     Master: no device present
>>     Slave: no device present
>>
>> atapci0: <Intel ICH9 SATA300 controller> port
>> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1c10-0x1c1f,0x1c00-0x1c0f at
>> device 31.2 on pci0
>> ata0: <ATA channel 0> on atapci0
>> ata0: [ITHREAD]
>> ata1: <ATA channel 1> on atapci0
>> ata1: [ITHREAD]
>> pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
>> atapci1: <Intel ICH9 SATA300 controller> port 0x1c68-0x1c6f,0x1c5c-0x1c5f,0x1c60-0x1c67,0x1c58-0x1c5b,0x1c30-0x1c3f,0x1c20-0x1c2f
>> irq 18 at device 31.5 on pci0
>> atapci1: [ITHREAD]
>> ata2: <ATA channel 0> on atapci1
>> ata2: [ITHREAD]
>> ata3: <ATA channel 1> on atapci1
>> ata3: [ITHREAD]
>>
>> pciconf -lv
>> atapci0@pci0:0:31:2: class=0x01018a card=0x31f4103c
>> chip=0x29208086 rev=0x02 hdr=0x00
>>     vendor = 'Intel Corporation'
>>     device = '82801IB/IR/IH (ICH9 Family) 4 port Serial ATA
>> Storage Controller 1'
>>     class = mass storage
>>     subclass = ATA
>>
>> atapci1@pci0:0:31:5: class=0x010185 card=0x31f4103c
>> chip=0x29268086 rev=0x02 hdr=0x00
>>     vendor = 'Intel Corporation'
>>     device = '82801IB/IR/IH (ICH9 Family) 2 port Serial ATA
>> Storage Controller 2'
>>     class = mass storage
>>     subclass = ATA
>>
>> ad0: 953869MB <SAMSUNG HD103UJ 1AA01113> at ata0-master SATA300
>> ad1: 953869MB <SAMSUNG HD103UJ 1AA01113> at ata0-slave SATA300
>> ad2: 953869MB <SAMSUNG HD103UJ 1AA01113> at ata1-master SATA300
>> ad3: 953869MB <SAMSUNG HD103UJ 1AA01113> at ata1-slave SATA300
>> da0 at umass-sim0 bus 0 target 0 lun 0
>> da0: <USB 2.0 USB Flash Drive 0.00> Removable Direct Access SCSI-2 device
>> da0: 40.000MB/s transfers
>> da0: 1928MB (3948544 512 byte sectors: 255H 63S/T 245C)
>> acd0: DVDR <HL-DT-ST DVD-RAM GH15L/FA01> at ata2-master SATA150
>>
>> I am using this machine as storage for backups with ZFS RAIDZ
>> without any timeouts so I think that two disks on one channel is not
>> causing the timeouts (only little slowdown)
>
> Wow, that's really... interesting. :-) What this indicates is that the
> controller is running in Native/Enhanced mode yet devices attached to
> SATA ports #0/#1 are master/slave on channel 0, and ports #2/#3 are
> master/slave on channel 1.

Except for AHCI, all other modes are just variations of PATA emulation.
"subclass = ATA" means that AHCI is not enabled. PATA emulation itself
should not be a problem, but it is definitely not good from the
performance and hot-swap points of view.

As already mentioned, ata(4) has very strict timeout values. It may
happen that, due to medium errors, the drive needs too much time to
complete I/O. It is theoretically possible that SMART may complete the
test because of its higher timeout values. A better test would be to run
the MHDD tool on the disk to find and remap pre-failure sectors, if any.

--
Alexander Motin
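
The "subclass = ATA" check above can also be read straight from the numeric class= field that pciconf -lv prints. The following is only a rough sketch, not part of pciconf or ata(4): the function names controller_mode and parse_pciconf are made up for illustration, and it assumes the usual PCI class-code layout (base class 0x01 for mass storage, subclass 0x01 for an IDE-style interface, subclass 0x06 with programming interface 0x01 for AHCI). The sample lines are the two controllers quoted earlier in this message.

```python
import re

# Assumed PCI class-code values (base class 0x01 = mass storage).
BASE_MASS_STORAGE = 0x01
SUBCLASS_IDE = 0x01    # IDE-style interface, i.e. PATA emulation
SUBCLASS_SATA = 0x06   # Serial ATA controller
PROGIF_AHCI = 0x01     # AHCI 1.0, meaningful only with SUBCLASS_SATA


def controller_mode(class_code):
    """Classify a 24-bit PCI class code (hypothetical helper)."""
    base = (class_code >> 16) & 0xFF
    sub = (class_code >> 8) & 0xFF
    progif = class_code & 0xFF
    if base != BASE_MASS_STORAGE:
        return "not mass storage"
    if sub == SUBCLASS_SATA and progif == PROGIF_AHCI:
        return "AHCI"
    if sub == SUBCLASS_IDE:
        return "IDE/PATA emulation"
    return "other"


def parse_pciconf(text):
    """Map each device name in pciconf -l style text to its mode."""
    pattern = re.compile(r"^(\w+)@pci\S*\s+class=0x([0-9a-fA-F]{6})", re.M)
    return {m.group(1): controller_mode(int(m.group(2), 16))
            for m in pattern.finditer(text)}


# The two controller lines quoted in this thread.
SAMPLE = (
    "atapci0@pci0:0:31:2: class=0x01018a card=0x31f4103c "
    "chip=0x29208086 rev=0x02 hdr=0x00\n"
    "atapci1@pci0:0:31:5: class=0x010185 card=0x31f4103c "
    "chip=0x29268086 rev=0x02 hdr=0x00\n"
)

if __name__ == "__main__":
    for name, mode in parse_pciconf(SAMPLE).items():
        print(name, mode)
```

Both quoted controllers decode to subclass 0x01, which matches the "subclass = ATA" lines in the pciconf output. A controller switched to AHCI would be expected to report class=0x010601 instead.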
