Date: Wed, 19 Jan 2005 15:06:04 -0800
From: Jon Simola <jsimola@gmail.com>
To: freebsd-stable@freebsd.org
Subject: Re: Bad disk or kernel (ATA Driver) problem?
Message-ID: <8eea04080501191506237fc762@mail.gmail.com>
In-Reply-To: <8eea040805011913334b140af6@mail.gmail.com>
References: <20050119151301.A22310@Denninger.Net> <8eea040805011913334b140af6@mail.gmail.com>

On Wed, 19 Jan 2005 13:33:12 -0800, Jon Simola <jsimola@gmail.com> wrote:
> I've got a few 1U Supermicro boxes running dual SATA drives:
> I've run into all sorts of problems with every one, and changing the
> IDE channel settings in the BIOS always fixes it. Which really annoys
> me, because I setup a new box, run it for a couple weeks, then the
> drives start getting flaky under load. Then I go change the setting in
> the BIOS (that I always forget to do on initial setup) and it's dead
> stable for months at a time.

I was politely asked to actually dig up the settings, which cut through my lack of sleep. I should have done this earlier :)

On this one box (Supermicro SuperServer 5013C-T, P4SCE BIOS v1.2c):

5.2.1-RELEASE-p4
atapci0: <Intel ICH5 SATA150 controller> port 0xf000-0xf00f,0-0x3,0-0x7,0-0x3,0-0x7 irq 16 at device 31.2 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
GEOM: create disk ad0 dp=0xc671a560
ad0: 70911MB <WDC WD740GD-00FLA0> [144073/16/63] at ata0-master UDMA100
GEOM: create disk ad1 dp=0xc671a460
ad1: 70911MB <WDC WD740GD-00FLA0> [144073/16/63] at ata0-slave UDMA100
acd0: CDROM <CD-224E> at ata1-master PIO4

That's a pair of SATA 74GB WD Raptors. The BIOS IDE setting is for "Combined" - SATA drives will appear on the Primary IDE channel.

On a different box (Supermicro SuperServer 5013C-T, P4SCE BIOS v1.2c):

5.3-STABLE-20050107
atapci0: <Intel ICH5 UDMA100 controller> port 0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
atapci1: <Intel ICH5 SATA150 controller> port 0xd000-0xd00f,0xcc00-0xcc03,0xc800-0xc807,0xc400-0xc403,0xc000-0xc007 irq 18 at device 31.2 on pci0
ata2: channel #0 on atapci1
ata3: channel #1 on atapci1
acd0: CDROM <CD-224E/1.9A> at ata1-master UDMA33
ad4: 78167MB <Maxtor 6Y080M0/YAR51HW0> [158816/16/63] at ata2-master SATA150
ad6: 78167MB <Maxtor 6Y080M0/YAR51HW0> [158816/16/63] at ata3-master SATA150

A pair of Maxtor 80GBs. The BIOS is set for "Enhanced", up to 6 drives (4 IDE + 2 SATA).

Crazy as it seems, I wasn't kidding about changing the BIOS. The other two settings are "SATA only" and "Auto". When the drives started flaking out (timeouts on reads) I would go into the BIOS and cycle through the IDE settings. After changing it once or twice, things would be fine for months at a time.

My best suspicion is that "something" makes the ICH5 a little flaky, and twiddling the BIOS clears it somehow. My only supporting evidence is that twice the BIOS stalled on probing the drives once this error had happened, and I had to physically remove the drives, twiddle the BIOS settings, and replace the drives before it would work again.

On OpenBSD, this problem on the same hardware manifests as a read timeout failure during the initial boot probes. Same fix: play with the BIOS and it suddenly works. There's a term in the Jargon file for this, but I can't recall it at the moment.

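For anyone comparing their own boxes against the two listings above, here is a rough Python sketch that pulls the "adN: ..." disk probe lines out of dmesg text and reports which ata channel each disk attached to and what transfer mode it reported. The function names (parse_disks, reported_as_legacy_ide) are made up for this example, and the regex is only fitted to the probe lines quoted in this message. The check itself is a heuristic resting on an assumption: for a drive you already know is SATA, a reported mode of UDMA100 rather than SATA150 suggests the BIOS is presenting it through a legacy IDE channel ("Combined"). It says nothing about whether the ICH5 is about to get flaky.

```python
import re

# Fitted to the ata(4) disk probe lines quoted in this message, e.g.
#   ad0: 70911MB <WDC WD740GD-00FLA0> [144073/16/63] at ata0-master UDMA100
# Other FreeBSD releases may format these lines differently.
DISK_LINE = re.compile(
    r"^(?P<disk>ad\d+): (?P<size_mb>\d+)MB <(?P<model>[^>]+)> "
    r"\[\d+/\d+/\d+\] at (?P<channel>ata\d+)-(?P<role>master|slave) "
    r"(?P<mode>\S+)$"
)


def parse_disks(dmesg_text):
    """Return one dict per 'adN:' disk probe line found in dmesg_text."""
    disks = []
    for line in dmesg_text.splitlines():
        match = DISK_LINE.match(line.strip())
        if match:
            disk = match.groupdict()
            disk["size_mb"] = int(disk["size_mb"])
            disks.append(disk)
    return disks


def reported_as_legacy_ide(disk):
    """Heuristic only (an assumption, not a driver rule): True when the
    disk reports a non-SATA transfer mode such as UDMA100. This is only
    meaningful for drives already known to be SATA."""
    return not disk["mode"].startswith("SATA")


# Disk and CD-ROM probe lines copied from the two boxes above.
COMBINED_BOX = """\
ad0: 70911MB <WDC WD740GD-00FLA0> [144073/16/63] at ata0-master UDMA100
ad1: 70911MB <WDC WD740GD-00FLA0> [144073/16/63] at ata0-slave UDMA100
acd0: CDROM <CD-224E> at ata1-master PIO4
"""

ENHANCED_BOX = """\
acd0: CDROM <CD-224E/1.9A> at ata1-master UDMA33
ad4: 78167MB <Maxtor 6Y080M0/YAR51HW0> [158816/16/63] at ata2-master SATA150
ad6: 78167MB <Maxtor 6Y080M0/YAR51HW0> [158816/16/63] at ata3-master SATA150
"""

if __name__ == "__main__":
    for label, text in (("Combined", COMBINED_BOX), ("Enhanced", ENHANCED_BOX)):
        for disk in parse_disks(text):
            print(label, disk["disk"], disk["channel"], disk["role"],
                  disk["mode"], reported_as_legacy_ide(disk))
```

To try it on a live machine, feed it the saved boot messages (for example the contents of /var/run/dmesg.boot) instead of the sample strings.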