Date: Tue, 30 Sep 2008 14:20:28 -0700
From: Jeremy Chadwick <koitsu@FreeBSD.org>
To: Mel <fbsd.questions@rachie.is-a-geek.net>
Cc: Reid Linnemann <lreid@cs.okstate.edu>, freebsd-questions@freebsd.org
Subject: Re: SATA READ_DMA timeouts - SOLVED?
Message-ID: <20080930212028.GA56646@icarus.home.lan>
In-Reply-To: <200809301929.27126.fbsd.questions@rachie.is-a-geek.net>
References: <48E1465A.5040903@cs.okstate.edu> <20080930023736.GA22907@icarus.home.lan> <48E259B4.3040100@cs.okstate.edu> <200809301929.27126.fbsd.questions@rachie.is-a-geek.net>
On Tue, Sep 30, 2008 at 07:29:26PM +0200, Mel wrote:
> On Tuesday 30 September 2008 18:54:12 Reid Linnemann wrote:
> > Jeremy Chadwick wrote:
> > > (I'm not subscribed to freebsd-questions, so please CC me on replies.
> > > I'm also not sure how I ended up getting this mail in the first place;
> > > it looks like someone BCC'd my koitsu@freebsd.org address).
> >
> > Yes, I BCC'd you since you are maintaining a page on the wiki
> > documenting SATA DMA problems.
> >
> > > Furthermore, one of the most common reports on the FreeBSD lists is the
> > > exact opposite -- users complaining that "their disks are SATA300 but
> > > only operate at SATA150" (caused by that jumper). Users are told to
> > > remove the jumper, and are reminded that the reason the jumper is
> > > enabled by default is said chipset incompatibilities.
> > >
> > > That said, your mail confuses me for one reason:
> > >
> > > Were you receiving DMA errors with the jumper REMOVED (e.g. SATA300
> > > operation), or with the jumper ENABLED (SATA150 operation)? Your below
> > > description does not state what exactly you did with the jumper to make
> > > your drives work reliably, only "that the jumper capability on your
> > > disks was available".
> >
> > I should have been more clear.
> >
> > My disks came with no cap on the SATA150 jumper, although FreeBSD
> > reported that they were in SATA150 mode. The system would be unusable
> > from READ_DMA timeouts if the system was ever powered off and brought
> > back up. I had to do some voodoo of booting in single user mode with
> > ACPI turned off to repair filesystems and rebuild my gmirror, then load
> > ACPI and drop back into multi-user mode. I even had to do this if the
> > system was powered off gracefully. So far, since I capped the jumpers
> > this has not been the case. I still get them periodically if I do
> > something like rebuild a gmirror component, so I can no longer say my
> > problem is completely resolved.
>
> Is this on 7.x? Sounds very similar to my experience described in:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=122572&cat=kern
>
> The machine is now operational and working in UDMA33 mode with two gmirror'ed
> SATA, using 6.3-p4. Unfortunately, I can't risk "trying 7.x" anymore, since
> it's emergency storage for the main fileserver, so dataloss is
> unacceptable :/. I do not know about the jumper state at the moment. I will
> inform if there will be a window real soon now, to check for jumpers.
>
> Ata info:
> # atacontrol list
> ATA channel 0:
>     Master: acd0 <HL-DT-STDVD-ROM GDR-T10N/1.02> ATA/ATAPI revision 5
>     Slave:       no device present
> ATA channel 1:
>     Master:      no device present
>     Slave:       no device present
> ATA channel 2:
>     Master:  ad4 <WDC WD6400AAKS-65A7B0/01.03B01> Serial ATA II
>     Slave:       no device present
> ATA channel 3:
>     Master:  ad6 <WDC WD6400AAKS-65A7B0/01.03B01> Serial ATA II
>     Slave:       no device present
>
> # atacontrol cap ad4
>
> Protocol              Serial ATA II
> device model          WDC WD6400AAKS-65A7B0
> serial number         WD-WMASY1885186
> firmware revision     01.03B01
> cylinders             16383
> heads                 16
> sectors/track         63
> lba supported         268435455 sectors
> lba48 supported       1250263728 sectors
> dma supported
> overlap not supported
>
> Feature                        Support  Enable  Value      Vendor
> write cache                    yes      yes
> read ahead                     yes      yes
> Native Command Queuing (NCQ)   yes      -       31/0x1F
> Tagged Command Queuing (TCQ)   no       no      31/0x1F
> SMART                          yes      yes
> microcode download             yes      yes
> security                       no       no
> power management               yes      yes
> advanced power management      no       no      0/0x00
> automatic acoustic management  yes      yes     128/0x80   128/0x80
>
> # atacontrol mode ad4
> current mode = UDMA33

No -- what Reid is reporting is very different. His problem is that his
disks came out-of-the-box operating at SATA300 speeds, and his SATA
chipset does not work reliably with SATA300. He found that by setting
the SATA150-limiting jumper, he achieved stability.

What you're seeing here (a SATA drive being limited to ATA33 speed)
could be due to one of the following things:

1) BIOS options have set the SATA ports to "Compatible" or "Emulated".
   What this does is tell your southbridge to emulate the SATA disks as
   old PATA disks, and I believe the emulation layer does use ATA33 (not
   ATA66/100/133). This is available so you can use SATA disks on very
   old operating systems (possibly things like MS-DOS). "Enhanced" means
   to run the disks and controller in a standard SATA fashion.
   "Enhanced" can also provide you extra functionality, such as
   "Enhanced IDE", "Enhanced AHCI", or "Enhanced RAID". It depends
   greatly on the chip being used, and what features it has.

2) Board is using a SATA chipset which lacks a PCI ID table entry in
   FreeBSD, yet is somehow operating in a "generic" fashion (I'm not
   referring to generic AHCI either, although that could also apply
   here, as ata(4) has "generic AHCI" support).

3) Board is using a SATA chipset which has a PCI ID entry in the table,
   but no actual code that interfaces with it in ata(4).

In the case of items #2 and #3, the results are mixed. Some people have
reported that when "UDMA33" is shown with SATA disks, it's purely
cosmetic -- that is to say, the actual transfer speed can exceed
33MByte/sec. A series of "dd" tests reading/writing to the disk should
be sufficient to determine this.

If "UDMA33" is printed and the actual transfer speed *is* in fact
limited to ATA33, that is a strong indicator that FreeBSD lacks the code
to initialise/handle your SATA chipset correctly, and is defaulting to
UDMA33. If that's the case, I'd recommend working with the ata(4) folks
(I can point you to them) to get support added for your chip.
Otherwise, support will either be added many years from now when someone
else points it out, or will never get added at all.

You didn't provide any dmesg output so I can't tell what SATA chipset or
motherboard you're using.

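As a rough illustration of the "dd"-style read test mentioned above,
here is a minimal sketch in Python rather than dd itself. It times a
sequential read and compares the rate against the nominal 33MByte/sec
UDMA33 ceiling. The function names, the 256MB read size, and the
/dev/ad4 path in the usage note are assumptions for illustration only,
and the number it prints is only a coarse indicator, not a benchmark.

```python
import sys
import time

# Nominal UDMA33 ceiling in MByte/sec (assumed threshold for this sketch).
UDMA33_LIMIT_MB_S = 33.0


def read_throughput_mb_s(path, total_mb=256, block_size=1024 * 1024):
    """Sequentially read up to total_mb MB from path; return MByte/sec."""
    remaining = total_mb * 1000 * 1000
    done = 0
    start = time.monotonic()
    # buffering=0 avoids Python-level buffering; OS caching may still apply
    # when the path is a regular file rather than a raw disk device.
    with open(path, "rb", buffering=0) as dev:
        while remaining > 0:
            chunk = dev.read(min(block_size, remaining))
            if not chunk:
                break  # reached end of file/device early
            done += len(chunk)
            remaining -= len(chunk)
    elapsed = max(time.monotonic() - start, 1e-9)
    return done / 1e6 / elapsed


def exceeds_udma33(rate_mb_s, limit=UDMA33_LIMIT_MB_S):
    """True if the measured rate is above the UDMA33 ceiling."""
    return rate_mb_s > limit


if __name__ == "__main__" and len(sys.argv) > 1:
    rate = read_throughput_mb_s(sys.argv[1])
    verdict = "exceeds" if exceeds_udma33(rate) else "does not exceed"
    print("%.1f MByte/sec (%s UDMA33)" % (rate, verdict))
```

Run as root against the raw disk, e.g. with /dev/ad4 as the only
argument. A sustained rate well above 33MByte/sec would suggest the
"UDMA33" label is only cosmetic; a rate pinned near it would point to
the chipset really being driven at ATA33.
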
Many ATA and SATA chips have been added to RELENG_7, and I doubt the
changes will be backported to RELENG_6. It would be worthwhile if you
could consider booting a RELENG_7 LiveCD ISO and see if your disks are
seen -- and if so, if they show up at SATA speeds.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |