Date: Wed, 14 Dec 2011 01:26:24 -0800
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: "Patrick M. Hausen" <hausen@punkt.de>
Cc: FreeBSD Stable <freebsd-stable@freebsd.org>
Subject: Re: Hot-changing a failed HDD with ahci.ko
Message-ID: <20111214092624.GA96153@icarus.home.lan>
In-Reply-To: <B0A139EC-F6A3-48DA-A347-21A5ED0507BF@punkt.de>
References: <B0A139EC-F6A3-48DA-A347-21A5ED0507BF@punkt.de>

On Wed, Dec 14, 2011 at 09:29:52AM +0100, Patrick M. Hausen wrote:
> Hi, all,
>
> while most cheap servers with SATA disks are not really hot-plug
> capable, changing a failed disk (either gmirror or zfs) was possible
> without a reboot by executing e.g. if ad4 failed:
>
>     atacontrol detach ata2
>     <change disks>
>     atacontrol attach ata2
>
> What is the proper equivalent for ahci, ada0 and camcontrol?

None is needed: yank the disk, reinsert, wait a few seconds, done.

Validation, with full output, hardware, etc:

http://koitsu.wordpress.com/2010/07/22/freebsd-and-zfs-hot-swapping-sata-disks-with-ahci/

I've made videos to demonstrate this as well, but need to edit them and
upload them.

> Stop unit commands seem not to work with SATA disks, so I
> tried:
>
>     <forcefully unplug "broken" disk>
>     -> system logs about lost device, so far so good
>     <insert new disk>
>     camcontrol reset 1
>     camcontrol devlist
>     -> disk still not there
>     camcontrol rescan 1
>     -> command hangs
>     <login to a second session, system still responsive>
>     shutdown -r now
>     -> system panics, eventually reboots

Before you yanked the disk, were any non-ZFS filesystems mounted?

This sounds similar to what happens if you were to yank a classic SATA
disk from a non-AHCI system, or under ata(4), without detaching first.
Or, on some systems, when SATA disks are yanked without use of a
hot-swap backplane.

> I can provide details about the panic if someone is interested,
> but maybe there is a proper procedure already, which I simply missed.
>
> System is RELENG_8_2 amd64.
>
> ahci0: <Intel Cougar Point AHCI SATA controller> port 0xf090-0xf097,0xf080-0xf083,0xf070-0xf077,0xf060-0xf063,0xf020-0xf03f mem 0xfb921000-0xfb9217ff irq 19 at device 31.2 on pci0
> ada0 at ahcich0 bus 0 scbus1 target 0 lun 0
> ada0: <ST31000340NS SN05> ATA-8 SATA 1.x device
> ada0: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
> ada0: Command Queueing enabled
> ada0: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
> ada1 at ahcich1 bus 0 scbus2 target 0 lun 0
> ada1: <ST31000340NS SN05> ATA-8 SATA 1.x device
> ada1: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
> ada1: Command Queueing enabled
> ada1: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)

You might try booting RELENG_9 (which has ahci.ko as the default, so no
need to mess about) on a LiveCD or equivalent and attempt the same
thing.

I'm left wondering if there's some stuff in RELENG_8 (not a typo
compared to the above RELENG_9 reference) that you do not have in
RELENG_8_2.

-- 
| Jeremy Chadwick                jdc at parodius.com |
| Parodius Networking       http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |
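
The "reinsert, wait a few seconds" step described in the reply can be scripted so the wait is bounded. What follows is a minimal sketch and not part of the original message: wait_for_dev is a hypothetical helper name, and /dev/ada0 in the usage comment is only an example taken from the dmesg output quoted in the message. The helper just polls for the device node to appear; it issues no reset or rescan commands.

```shell
#!/bin/sh
# Hypothetical helper (not from the original message):
# wait until a device node exists, or give up after a timeout.
# Usage: wait_for_dev <device-node> [timeout-seconds]
wait_for_dev() {
    dev=$1
    timeout=${2:-30}    # assumed default of 30 seconds
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if [ -e "$dev" ]; then
            return 0    # node is present
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # One last check after the timeout; the exit status is the result.
    [ -e "$dev" ]
}

# Example (commented out so sourcing this file touches no hardware):
#   wait_for_dev /dev/ada0 30 && camcontrol devlist
```

If the node never shows up within the timeout, that is a hint the controller or backplane did not register the insertion, which is the situation the quoted report describes.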