From: Alexander Motin
Date: Thu, 02 Jun 2011 11:37:33 +0300
To: Jeremy Chadwick
Cc: "stable@freebsd.org"
Subject: Re: 8-STABLE won't boot with ZFSv28
Message-ID: <4DE74BCD.8080002@FreeBSD.org>
In-Reply-To: <20110602075118.GA42026@icarus.home.lan>

Jeremy Chadwick wrote:
> On Thu, Jun 02, 2011 at 09:53:58AM +0300, Alexander Motin wrote:
>> Holger Kipp wrote:
>>> got the same messages over and over again - panic took some time:
>>>
>>> unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
>>> ata0: reinit done ..
>>> ata0: reiniting channel ..
>>> ata0: DISCONNECT requested
>>>
>>>
>>>
>>> ata0: p0: SATA connect time=0ms status=00000113
>>> ata0: p1: SATA connect timeout status=00000000
>>> ata0: reset tp1 mask=03 ostat0=00 ostat1=00
>>> ata0: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
>>> ata0: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb
>>> ata0: reset tp2 stat0=00 stat1=00 devices=0x30000
>>> unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
>>> ata0: reinit done ..
>>> ata0: reiniting channel ..
>>> ata0: DISCONNECT requested
>> I see two problems here:
>> 1. "devices=0x30000" means that two ATAPI devices were detected instead
>> of one. I can reproduce it also with other Intel chipsets. It looks like
>> a hardware bug to me.
>> It can be workarounded by reconnecting ATAPI
>> device to even (2 or 4) SATA port, or connecting any other device there.
>> 2. "DISCONNECT requested" means that controller reported PHY status
>> change for some device on channel, triggering infinite retry. Unluckily
>> I have no ICH9 board, while I can't reproduce it with ICH10 or above.
>>
>> This patch should workaround the first problem in software:
>> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/chipsets/ata-intel.c.diff?r1=1.25;r2=1.26
>> Try it please and let's see if with some luck it do something about the
>> second problem.
>
> With regards to item #1: I don't see anything in the ICH9 errata that
> indicates a silicon bug if the only device attached to the controller is
> an ATAPI device and connected to SATA port 0 (presumably), or an
> odd-numbered port? If this problem exists on other ICHxx and/or ESBxx
> chips, I sure would hope it'd be documented.
>
> I haven't tried confirming it myself, but if need be I can set up a test
> box with a SATA-based DVD drive hooked up to it + provide remote serial
> console/etc. if it'd be of any help. I don't think it would be (sounds
> like you have lots of hardware :-) ), but I'm willing to help in any way
> I can.

Intel probably doesn't see an issue there, since the same behavior can
be found even on the latest chipsets. But according to my understanding
of the ATA specs and my analysis of how real PATA devices behave, this
behavior is not correct. When an ATAPI device is connected to the first
of two SATA ports routed to the same legacy/PATA-emulated ATA channel
(that is, as the master device), the soft-reset sequence falsely reports
that a slave ATAPI device is present. The problem does not show up with
ATA disk devices, or if some other device is really attached to the
slave port. The problem seems to have always been there, but before
ATA_CAM it usually went unnoticed because of the very short IDENTIFY
command timeouts in ata(4).
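
As an illustration of the "devices=0x30000" reading discussed above, here
is a minimal sketch that decodes such a mask. The bit values are recalled
from sys/dev/ata/ata-all.h and should be treated as assumptions to verify
against your own source tree; the helper names are hypothetical. The
0x14/0xeb pair is the ATAPI device signature that both stat0 and stat1
lines in the quoted log report.

```python
# Device-mask bits as recalled from sys/dev/ata/ata-all.h (assumption:
# verify against your own source tree before relying on them).
ATA_ATA_MASTER = 0x00000001
ATA_ATA_SLAVE = 0x00000002
ATA_ATAPI_MASTER = 0x00010000
ATA_ATAPI_SLAVE = 0x00020000

DEVICE_BITS = [
    (ATA_ATA_MASTER, "ATA master"),
    (ATA_ATA_SLAVE, "ATA slave"),
    (ATA_ATAPI_MASTER, "ATAPI master"),
    (ATA_ATAPI_SLAVE, "ATAPI slave"),
]


def decode_devices(mask):
    """Return the names of all known device bits set in a 'devices=' value."""
    return [name for bit, name in DEVICE_BITS if mask & bit]


def is_atapi_signature(lsb, msb):
    """True if the post-reset lsb/msb pair is the ATAPI signature."""
    return (lsb, msb) == (0x14, 0xEB)
```

With these assumed values, decode_devices(0x30000) reports both an ATAPI
master and an ATAPI slave, which matches the false-positive slave
described above when only one drive is actually attached.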
If somebody can give a better explanation or propose a better
workaround, that would be welcome, as I do not like this solution very
much.

> With regards to item #2: could this be at all related to OOB (bit 15)
> somehow being set in PCS (SATA register offset 0x92)? I'm doubting it
> but I thought I'd ask. My thought process, which is probably wrong
> (consider it an educational discussion :-) ):
>
> The ICH9 specification states that the default value for this register
> is 0x0000, and b15=0 means "SATA controller will not retry after an OOB
> failure", while b15=1 causes the controller to indefinitely retry after
> OOB failure. I imagine system BIOSes and other things can change this
> default value, but we don't seem to print it anywhere in
> ata_intel_chipinit() during a verbose boot.
>
> Looking at chipsets/ata-intel.c, it looks like we only touch PCS in
> ata_intel_chipinit() and ata_intel_reset(). In the former, we avoid
> touching bits 4 through 15, and in the latter we mask out only what we
> want to adjust (e.g. the SATA port per ch variable).

As far as I can see, ata-intel.c should not change that bit if it was
set for some reason. Theoretically, OOB (Out-of-Band signaling) is a
function of the same state machine that sets the PHY status change flag.
But frankly speaking, I have no idea what the result of setting this bit
would be. In this legacy/PATA emulation mode there are too many
undocumented things to be sure of anything.

-- 
Alexander Motin
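
The PCS discussion above can be sketched as follows. This is only an
illustration under stated assumptions: the offset 0x92 and the meaning of
bit 15 are taken from the quoted message, the "low four bits" mask
reflects the statement that the driver avoids bits 4 through 15, and the
function names are hypothetical rather than anything in ata-intel.c.

```python
PCS_OFFSET = 0x92        # PCS register offset, as quoted in the message
PCS_OOB_RETRY = 1 << 15  # bit 15: retry indefinitely after an OOB failure
LOW_BITS = 0x000F        # bits 0-3: the only bits the driver is said to touch


def oob_retry_enabled(pcs):
    """True if bit 15 (OOB retry) is set in a 16-bit PCS value."""
    return bool(pcs & PCS_OOB_RETRY)


def update_low_bits(pcs, mask, enable):
    """Hypothetical read-modify-write restricted to bits 0-3.

    Any requested bits outside 0-3 are ignored, so bits 4 through 15,
    including the OOB retry bit, keep whatever value they already had.
    """
    mask &= LOW_BITS
    new = (pcs | mask) if enable else (pcs & ~mask)
    return new & 0xFFFF
```

A read-modify-write of this shape leaves bit 15 as the BIOS or hardware
left it, which is consistent with the remark that the driver should not
change that bit if it was already set.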