From: Andriy Gapon
To: Miroslav Lachman <000.fbsd@quip.cz>, freebsd-geom@FreeBSD.org
Subject: Re: gmirror and a flaky member
Date: Mon, 30 Jan 2017 18:06:05 +0200

On 06/01/2017 17:09, Miroslav Lachman wrote:
> Andriy Gapon wrote on 2017/01/06 11:12:
>> On 06/01/2017 11:54, Andriy Gapon wrote:
>>>
>>> Can a geom mirror handle a member that gets disconnected and then reappears
>>> again?
>>>
>>> What I am seeing right now is that the mirror does not pick up the member when
>>> it reappears.  I have to add it back manually.
>
> It is intentional to mark disappeared device as broken.
> If you want to remove working device from gmirror, you must use gmirror remove
> command (or gmirror deactivate).
>
>> To add more substance, here is what gets logged when the disk disappears:
>>
>> GEOM_MIRROR: Request failed (error=6). ada0p2[READ(offset=2517700608,
>> length=4096)]
>> GEOM_MIRROR: Device swap: provider ada0p2 disconnected.
>>
>> And here's what gets logged when the disk reappears:
>> GEOM_MIRROR: Component ada0p2 (device swap) broken, skipping.
>> GEOM_MIRROR: Cannot add disk ada0p2 to swap (error=22).
>
> Was the disk removed by user or was it by some bad event?

The latter.  I suspect some problem with an SSD's controller or firmware.
Basically, the disk disappeared and then re-appeared half a minute later.

>
>>> Even worse, the commands I have
>>> to execute are:
>>> $ gmirror forget ...
>>> $ gmirror insert ...
>>>
>>> This does not appear to be a graceful way of reactivating the member.
>
> You can re-activate only member which was correctly deactivated. Not one yanked
> out without any "graceful" command.

I see.

> And as gmirror doesn't work as ZFS mirror (cannot do resilver) the re-added
> device is always fully rebuilt.

Indeed.
But it would be nice if gmirror was able to handle my situation automatically:
- when a disk disappears, mark it broken (missing to be more precise)
- when it re-appears, rebuild it and add back to the mirror

It seems that the main problem at the moment is that gmirror doesn't distinguish
between a disk getting many errors (e.g. disk going bad) and a disk disappearing
(maybe permanently, maybe temporarily).
But I could be mistaken about this.

-- 
Andriy Gapon
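
For illustration only, the behaviour asked for above (mark a vanished disk as
missing rather than broken, and rebuild it automatically when it comes back)
could be sketched as a small state machine. This is not how gmirror is
implemented: the Member class, its state names, the event methods and the
max_errors threshold are all hypothetical. The one thing taken from the log is
error=6, which is ENXIO ("Device not configured") on FreeBSD; treating it as
"disk went away" is an assumption of the sketch.

```python
from enum import Enum

ENXIO = 6  # errno seen in the log above; assumed here to mean "disk vanished"


class State(Enum):
    ACTIVE = "active"
    MISSING = "missing"        # provider went away; it may come back
    BROKEN = "broken"          # too many I/O errors; needs manual action
    REBUILDING = "rebuilding"


class Member:
    """Hypothetical model of one mirror component (not gmirror's real structure)."""

    def __init__(self, name, max_errors=3):
        self.name = name
        self.state = State.ACTIVE
        self.errors = 0
        self.max_errors = max_errors  # assumed threshold, purely illustrative

    def on_io_error(self, error):
        """Classify a failed request: a vanished disk versus a disk going bad."""
        if self.state is not State.ACTIVE:
            return
        if error == ENXIO:
            self.state = State.MISSING
            return
        self.errors += 1
        if self.errors >= self.max_errors:
            self.state = State.BROKEN

    def on_provider_gone(self):
        """The provider was disconnected underneath the mirror."""
        if self.state in (State.ACTIVE, State.REBUILDING):
            self.state = State.MISSING

    def on_provider_arrived(self):
        """Return True if the member is taken back automatically."""
        if self.state is State.MISSING:
            # gmirror cannot resilver, so a full rebuild is always needed.
            self.state = State.REBUILDING
            return True
        return False  # a BROKEN member still needs manual intervention

    def on_rebuild_done(self):
        if self.state is State.REBUILDING:
            self.state = State.ACTIVE
            self.errors = 0
```

With this split, a disk that drops off the bus and returns half a minute later
goes ACTIVE -> MISSING -> REBUILDING -> ACTIVE without operator action, while a
disk that accumulates ordinary I/O errors ends up BROKEN and is skipped when it
is tasted again, as happens today.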