Date:      Tue, 02 Oct 2012 18:40:17 +0300
From:      Alexander Motin <mav@FreeBSD.org>
To:        geoffroy desvernay <dgeo@centrale-marseille.fr>
Cc:        freebsd-stable@FreeBSD.org, Andriy Gapon <avg@FreeBSD.org>
Subject:   Re: ahcich reset -> cannot mount  zfs root in 9.1-PRE
Message-ID:  <506B0AE1.5050303@FreeBSD.org>
In-Reply-To: <506AF15D.1010707@FreeBSD.org>
References:  <506AE944.3020806@centrale-marseille.fr> <506AF15D.1010707@FreeBSD.org>

On 02.10.2012 16:51, Andriy Gapon wrote:
> on 02/10/2012 16:16 geoffroy desvernay said the following:
>> Hi all,
>>
>> Trying to upgrade a system from 9.0-RELEASE to 9.1-PRE from yesterday on
>> my machine (GEOM+ZFS mirror setup on ada[01]p3), the new kernel becomes
>> unable to mount root... The only way to recover is to boot from 9.0 kernel.
>> The disks were already named ada[01] in 9.0, so I suspect nothing there...
>>
>> I tried
>>   - disabling AHCI in bios (no change seen)
>>   - change cables, check PSU, test disks with smartctl
>>
>> Here are some bits (via serial console):
>> ahci0: <ATI IXP600 AHCI SATA controller> port
>> 0xc000-0xc007,0xb000-0xb003,0xa000-0xa007,0x9000-0x9003,0x8000-0x800f
>> mem 0xfe9ff800-0xfe9ffbff irq 22 at device 18.0 on pci0
>> ahci0: AHCI v1.10 with 4 3Gbps ports, Port Multiplier supported
>> ahci0: Caps: 64bit NCQ SNTF MPS AL CLO 3Gbps PM PMD SSC PSC 32cmd CCC 4ports
>> ahcich0: <AHCI channel> at channel 0 on ahci0
>> ahcich0: Caps: HPCP
>> ahcich1: <AHCI channel> at channel 1 on ahci0
>> ahcich1: Caps: HPCP
>> ahcich2: <AHCI channel> at channel 2 on ahci0
>> ahcich2: Caps: HPCP
>> ahcich3: <AHCI channel> at channel 3 on ahci0
>> ahcich3: Caps: HPCP
>> ahcich0: AHCI reset...
>> ahcich0: SATA connect time=100us status=00000123
>> ahcich0: AHCI reset: device found
>> ahcich0: AHCI reset: device ready after 0ms
>>
>> The difference from 9.0 comes after that. Here are 9.0's next lines (same
>> for ahcich1):
>> (aprobe0:ahcich0:0:15:0): Command timed out
>> (aprobe0:ahcich0:0:15:0): Error 5, Retries exhausted
>> (aprobe0:ahcich0:0:0:0): SIGNATURE: 0000
>>
>> And 9.1-PRE's:
>> (aprobe0:ahcich0:0:15:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
>> (aprobe0:ahcich0:0:15:0): CAM status: Command timeout
>> (aprobe0:ahcich0:0:15:0): Error 5, Retries exhausted
>>
>> In both cases ada[01] are detected and available, but with 9.1-PRE I see:
>> GEOM_RAID: Promise: Disk ada0 state changed from NONE to SPARE.
>> GEOM_RAID: Promise: Disk ada1 state changed from NONE to SPARE.
>>
>> (I see the same when I run `kldload geom_raid` on a running 9.0; it doesn't
>> break anything...)
>>
>> I attach the full boot log with 9.1-PRE (bios with NO-raid nor AHCI
>> enabled, but this changes nothing in the output)
>>
>> I could test patches or try any command required to debug this… But for
>> the moment I don't know where to search (and kernel code is far away
>> from my current skills in debugging…)
>
> You probably need to clear RAID metadata on the disks as I think that disabling
> geom_raid is not possible in 9.1-PRE.
> I think that Alexander can help you more here.

The right way is to clear the RAID metadata on the disks. If you can
boot from any other source, you can just run `graid delete Promise` and
then reboot.

Alternatively, you can disable the geom_raid module with the recently
added loader tunable kern.geom.raid.enable=0. After that your system
should boot and run fine. I would still recommend erasing the metadata,
but once that tunable is set it can no longer be done with the graid
tool, only with manual dd surgery. The Promise format keeps its metadata
in up to the last 63 sectors of the disk. You can identify the sectors
to erase by the signature "Promise Technology, Inc." at the beginning
of the sector.
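
To find which sectors carry that signature before touching anything with
dd, a read-only scan along the following lines may help. This is only a
sketch: the function name find_promise_sectors, the 512-byte sector size
and the optional total_sectors argument are assumptions made for
illustration, not part of graid. It has only been reasoned about against
a plain image file; on a raw disk device the size may have to be passed
in explicitly (for example as reported by diskinfo).

```python
import os

SECTOR_SIZE = 512                        # assumption: 512-byte logical sectors
TAIL_SECTORS = 63                        # Promise metadata uses up to the last 63 sectors
SIGNATURE = b"Promise Technology, Inc."  # expected at the very start of a metadata sector


def find_promise_sectors(path, total_sectors=None,
                         sector_size=SECTOR_SIZE, tail=TAIL_SECTORS):
    """Return the sector numbers, within the last `tail` sectors of `path`,
    that begin with the Promise signature. Opens the target read-only."""
    hits = []
    with open(path, "rb") as dev:
        if total_sectors is None:
            # Works for regular files; may not report the media size on
            # every raw device, hence the optional total_sectors argument.
            total_sectors = dev.seek(0, os.SEEK_END) // sector_size
        first = max(0, total_sectors - tail)
        for lba in range(first, total_sectors):
            dev.seek(lba * sector_size)
            sector = dev.read(sector_size)   # whole, aligned sector reads only
            if sector.startswith(SIGNATURE):
                hits.append(lba)
    return hits
```

Something like find_promise_sectors("/dev/ada0") (device path given as an
example only) would then list the candidate sectors. The scan never
writes; zeroing the reported sectors is still a separate, manual step.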

-- 
Alexander Motin


