Date: Tue, 31 Dec 2002 16:23:30 -0500 From: "Matthew Emmerton" <matt@gsicomp.on.ca> To: "Bruce Campbell" <bruce@engmail.uwaterloo.ca>, <freebsd-hardware@FreeBSD.ORG>, <freebsd-questions@FreeBSD.ORG> Cc: <sos@FreeBSD.ORG> Subject: Re: ata "fallback to PIO mode" on dual processor AMD systems Message-ID: <025701c2b112$ddfbf580$1200a8c0@gsicomp.on.ca> References: <1041368236.3e1204ac45da5@www.nexusmail.uwaterloo.ca>
next in thread | previous in thread | raw e-mail | index | archive | help
[ cc'ing Soren since he's the ATA guru ] > I am seeing a problem with ata disks on 4 new systems, which > I believe is either a bug in the ata driver, or a problem with > the onboard IDE controller, or something else. Systems are as follows: > > Motherboard: ASUS A7M266-D > CPUs : 2 x 2000+ AMD MP > Memory : 2 x 512MB Crucial part: CT6472Y265 > > Disks (all UDMA100): > > Master Slave > System 1: WDC WD400BB WDC WD1000BB > System 2: WDC WD400BB WDC WD1000BB > System 3: WDC WD400BB WDC WD800BB > System 4: WDC WD400BB Maxtor 98196H8 > > Kernel : 4.7-RELEASE, custom kernel (compared to GENERIC): > > commented out: > > cpu I386_CPU > cpu I486_CPU > > enabled > > options SMP # Symmetric MultiProcessor Kernel > options APIC_IO # Symmetric (APIC) I/O > > > I am running a test with "dbench" (/usr/ports/benchmarks/dbench) > with a script which runs: > > dbench 1 > sleep for 5 minutes > dbench 2 > sleep for 5 minutes > dbench 3 > ... > > to simulate 1,2,3... clients. > > The following has happened on systems 2,3 and 4, after about 15 hours > of running the test: > > Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 - > resetting > Dec 30 23:26:59 ecserv13 /kernel: ata0: resetting devices .. done > Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 > resetting > Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done > Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 > resetting > Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done > Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 > resetting > Dec 30 23:27:00 ecserv13 /kernel: ad0: timeout waiting for cmd=ef s=d0 e=00 > Dec 30 23:27:00 ecserv13 /kernel: ad0: trying fallback to PIO mode > Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done > > The test continues to run with the ata controller in PIO mode, with > slower performance, and higher load average. > > Once the master drops to PIO, attempts to access the slave then cause > it to drop to PIO. > > If I run: > > atacontrol mode 0 UDMA100 UDMA100 > > attempts to access either drive result in a delay until the controller > drops to PIO, and then operations resume. A soft reboot and things > work in UDMA mode again. Also tried UDMA33 and UDMA66 with no change. > I also tried "atacontrol reinit 0" with no help. > > Theories when I search the web for "fallback to PIO mode" include: > > - bad disks > - something to do with thermal recalibration > > I don't believe the problems are bad disks, as the slave drops to PIO > after the master does, and I can't get in back to UDMA, other than by > soft reboot. Plus I see the problem on 6 of 8 disks. > > The problem is very repeatable. > > Can anyone offer any ideas, or suggest investigative steps ? I have a system > in PIO mode right now. The reason the slave drops to PIO after the master does is by design - the master and slave have to use the same signalling mode since they're on the same cable. (People often report lackluster performance of fast UDMA hard drives with non-UDMA CD-ROMs on the same channel.) Are you using 80-conductor cables on all your drives? These are required to get consistent high throughput, and running without them may cause the problems you're seeing. -- Matt Emmerton To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hardware" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?025701c2b112$ddfbf580$1200a8c0>