From: Hans Ottevanger
Date: Sat, 20 Jun 2015 08:55:52 +0200
To: Brandon Valentine
CC: freebsd-hackers@freebsd.org
Subject: Re: debugging ATA command timeouts on 10.1-RELEASE
Message-ID: <55850E78.3040600@beastielabs.net>

On 06/19/15 18:03, Brandon Valentine wrote:
> [ Starting with -hackers before a possible PR. If there's a better place
> for this thread please advise. ]
>
> Howdy,
>
> I have an older Soekris net4801 with a NatSemi SC1100 ATA chipset. Runs
> great under FreeBSD 8.3, but 10.1-RELEASE-p13 spews the following error, in
> a loop, upon boot:
>
> (aprobe0:ata0:0:1:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
> (aprobe0:ata0:0:1:0): CAM status: Command timeout
> (aprobe0:ata0:0:1:0): Retrying command
>
> The atapci driver recognizes it as:
>
> atapci0: port
> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe000-0xe00f at device 18.2 on pci0
>
> And eventually, after a lot of command timeouts, I will get:
>
> ada0: 16.700MB/s transfers (PIO4, PIO 512bytes)
> ada0: 3825MB (7835184 512 byte sectors: 16H 63S/T 7773C)
> ada0: Previously was known as ad0
>
> However, the system continues issuing failed ATA_IDENTIFY commands after
> this and never succeeds in booting. That ada device being detected there is
> a 4GB CompactFlash card. In order to eliminate the possibility that the
> hardware or card do not support UDMA, write caching, etc, I am booting this
> 10.1-RELEASE-p13 kernel with the following kernel hints in loader.conf:
>
> hint.ata.0.mode="PIO4"
> hint.ata.1.mode="PIO4"
> hint.ahci.0.msi="0"
> hint.atapci.0.msi="0"
> hint.acpi.0.disabled="1"
> kern.cam.ada.write_cache="0"
>
> Removing these hints does not make any difference in the outcome.
>
> It has been a while but I'm no stranger to -hackers or this sort of
> debugging, but I'm wholly unfamiliar with the CAM subsystem. I've compiled
> and booted a 10.1-RELEASE-p13 kernel with all CAM debug flags enabled and
> the complete debug log of a boot attempt can be seen here:
>
> https://gist.github.com/bval/0ab616a57b2846f633ab
>
> Is there a developer more familiar with CAM who can take a look at this and
> advise me on what might be happening or where to go next in debugging this?
> I'm willing to do the legwork just need some guidance.
>

The oldest Soekris 4801 boards that I have indeed do not support UDMA.
I have always used this in /boot/loader.conf:

hw.ata.ata_dma="0"

to prevent issues like you describe. My systems are now at 10.1-STABLE.
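
As a rough sketch, the relevant part of /boot/loader.conf might then look
like the fragment below. The comment lines are illustrative additions, and
keeping the PIO4 hints from the quoted message next to the tunable is an
assumption, not something tested in this thread.

```
# /boot/loader.conf (sketch)

# Tunable from the reply above: turn off DMA for ATA disks.
hw.ata.ata_dma="0"

# Carried over from the quoted message (assumption: harmless to keep).
hint.ata.0.mode="PIO4"
hint.ata.1.mode="PIO4"
```

Settings in loader.conf are only read at boot, so a reboot is needed. After
that, running "kenv hw.ata.ata_dma" should show whether the loader picked up
the value.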

Kind regards,

Hans Ottevanger

Eindhoven, Netherlands
www.beastielabs.net