Date: Sun, 12 Jan 2003 19:07:49 +1100 (EST)
From: Bruce Evans
To: jjramsey@pobox.com
Cc: freebsd-bugs@FreeBSD.ORG
Subject: Re: Semirandom bug in FreeBSD's ATA querying
In-Reply-To: <20030112005620.10542.qmail@web10703.mail.yahoo.com>
Message-ID: <20030112183638.I5726-100000@gamplex.bde.org>

On Sat, 11 Jan 2003, James J. Ramsey wrote:

> ...
> I have been repeatedly told that this is a hardware
> problem. I disagree for two reasons:
>
> 1) The problem has never occurred under Linux, Windows,
> or an OpenBSD 3.1 snapshot (although the last one had
> other problems).
>
> 2) The kernel messages from Linux indicate that it is
> correctly querying the hard drive. Here's what I mean.
> The following kernel message is from a FreeBSD
> install. I captured it by pressing Scroll Lock and
> pressing the Page Up key to get to the text:
>
> ad0: 8866663634010175MB
>
> [16955114026566160/17/63] at ata0-master PIO4
>
> The garbage between the angle brackets should be the
> name of the hard drive, "QUANTUM FIREBALLP".
> Compare this with the kernel message from Linux:
>
> hda: QUANTUM FIREBALLP LM20.5, ATA DISK drive
>
> The name of the disk drive is reported correctly.
>
> Considering that the disk drive name only comes as a
> result of a particular ATA command, it is clear that
> Linux is reading the results of the command correctly.
> It is simply doing something correctly that FreeBSD is
> not. That is not indicative of a hidden hardware bug.

Actually, it is indicative of a timing bug, which may be in either the
hardware or the driver but is most likely in the hardware.  Linux and
the old FreeBSD driver (wd) have more compatibility cruft, including
delays to support broken drives.

Try adding some delays near the broken command.  From ata-all.c:

%%%
    /* apparently some devices needs this repeated */
    do {
        if (ata_command(atadev, command, 0, 0, 0, ATA_WAIT_INTR)) {
            ata_prtdev(atadev, "%s identify failed\n",
                       command == ATA_C_ATAPI_IDENTIFY ? "ATAPI" : "ATA");
            free(ata_parm, M_ATA);
            return -1;
        }
        if (retry++ > 4) {
            ata_prtdev(atadev, "%s identify retries exceeded\n",
                       command == ATA_C_ATAPI_IDENTIFY ? "ATAPI" : "ATA");
            free(ata_parm, M_ATA);
            return -1;
        }
    } while (ata_wait(atadev, ((command == ATA_C_ATAPI_IDENTIFY) ?
                               ATA_S_DRQ : (ATA_S_READY|ATA_S_DSC|ATA_S_DRQ))));
    ATA_INSW(atadev->channel->r_io, ATA_DATA, (int16_t *)ata_parm,
             sizeof(struct ata_params)/sizeof(int16_t));
%%%

Try adding delays before ata_command() and ata_wait(), and before
ATA_INSW().  The Linux driver (16 Jan 2001 version at least) has 50 msec
delays near here.  At least in old versions of ATA, delays of 400 nsec
are supposed to work, but the Linux driver uses much larger delays for
superstitious and/or historical reasons.  The ata driver has
corresponding 1-10 usec delays in its command and wait functions, but
seems to have these slightly misplaced (no wait after getting
!ATA_S_BUSY before checking the other status bits for the first time).
Many (most?) drives don't need any delays.
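
To make the "misplaced delay" point concrete, here is a small userspace
simulation, not driver code.  Everything named sim_* is hypothetical,
and the drive model is an assumption for illustration only: a slow
drive drops BUSY first and only presents valid READY/DSC/DRQ bits
400 nsec later.  The status bit values follow the ATA status register
layout.  In the real driver the settle step would be a kernel delay
such as DELAY() placed after BUSY is seen clear; this sketch only shows
why the position of that delay matters.

```c
#include <stdint.h>

/* Status bits, using the ATA status register layout. */
#define SIM_S_BUSY   0x80
#define SIM_S_READY  0x40
#define SIM_S_DSC    0x10
#define SIM_S_DRQ    0x08

/* Assumed settle window of the simulated slow drive, in nsec. */
#define SIM_SETTLE_NS 400

/* Hypothetical drive model driven by a virtual clock (nsec). */
struct sim_drive {
    uint64_t now_ns;      /* virtual time since the command was issued */
    uint64_t busy_until;  /* BUSY is set before this time */
    uint64_t valid_from;  /* the other status bits are valid from here */
};

/* Advance the virtual clock; stands in for a real delay. */
static void sim_delay(struct sim_drive *d, uint64_t ns)
{
    d->now_ns += ns;
}

/* Read the status register; each read is assumed to cost 100 nsec. */
static uint8_t sim_status(struct sim_drive *d)
{
    d->now_ns += 100;
    if (d->now_ns < d->busy_until)
        return SIM_S_BUSY;
    if (d->now_ns < d->valid_from)
        return 0;                 /* BUSY clear, other bits not yet valid */
    return SIM_S_READY | SIM_S_DSC | SIM_S_DRQ;
}

/*
 * Poll until BUSY clears, then check that all bits in mask are set.
 * settle_ns is the delay inserted after first seeing BUSY clear and
 * before trusting the other bits; 0 means check them immediately.
 * Returns 0 on success, -1 on failure or timeout.
 */
static int sim_wait(struct sim_drive *d, uint8_t mask, uint64_t settle_ns)
{
    int tries;

    for (tries = 0; tries < 1000; tries++) {
        uint8_t st = sim_status(d);

        if (st & SIM_S_BUSY) {
            sim_delay(d, 1000);   /* 1 usec between polls */
            continue;
        }
        if (settle_ns != 0) {
            sim_delay(d, settle_ns);
            st = sim_status(d);   /* re-read after the settle delay */
        }
        return ((st & mask) == mask) ? 0 : -1;
    }
    return -1;
}

/* Issue a simulated identify against a drive that is busy for busy_ns. */
static int sim_try(uint64_t busy_ns, uint64_t settle_ns)
{
    struct sim_drive d = { 0, busy_ns, busy_ns + SIM_SETTLE_NS };

    return sim_wait(&d, SIM_S_READY | SIM_S_DSC | SIM_S_DRQ, settle_ns);
}
```

With these assumed numbers, sim_try(5500, 0) fails because the first
non-BUSY read lands inside the settle window, sim_try(5500, 400)
succeeds, and sim_try(5000, 0) succeeds by luck because the poll
interval happens to skip the window.  That dependence on exactly when
the poll lands is the kind of behaviour that looks semirandom from the
outside.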

Bruce