From owner-freebsd-geom@freebsd.org Mon Jan 30 09:34:14 2017
Subject: Re: g_disk_done() vs a destroyed disk
To: Poul-Henning Kamp, freebsd-geom@FreeBSD.org
References: <31395.1485554104@critter.freebsd.dk> <8de79017-f0b0-c86a-93c5-65be4d97b21c@FreeBSD.org> <33960.1485609820@critter.freebsd.dk>
From: Andriy Gapon
Message-ID: <8d0093d9-759b-7674-650d-4caff1bc29a6@FreeBSD.org>
Date: Mon, 30 Jan 2017 11:33:08 +0200
In-Reply-To: <33960.1485609820@critter.freebsd.dk>

On 28/01/2017 15:23, Poul-Henning Kamp wrote:
> --------
> In message <8de79017-f0b0-c86a-93c5-65be4d97b21c@FreeBSD.org>, Andriy Gapon writes:
>
>> So, the correct sequence should be:
>> - call disk_gone() to prevent new
>>   I/O
>> - handle all in-flight I/O
>> - call disk_destroy()
>> Is that right?
>
> exactly!

Thank you!

And, just in case, I am seeing this problem with mfi driver.
It uses disk(9) API directly to represent disks behind the controller.
It seems that we have a class of such drivers and probably all of them are
affected.
Here is a list of files where I see disk_destroy, but no disk_gone:
/usr/src/sys/dev/mlx/mlx_disk.c
/usr/src/sys/dev/aac/aac_disk.c
/usr/src/sys/dev/twe/twe_freebsd.c
/usr/src/sys/dev/ips/ips_disk.c
/usr/src/sys/dev/ida/ida_disk.c
/usr/src/sys/dev/mfi/mfi_syspd.c
/usr/src/sys/dev/mfi/mfi_disk.c
/usr/src/sys/dev/cfi/cfi_disk.c
/usr/src/sys/dev/amr/amr_disk.c

The list is not complete.

--
Andriy Gapon

From owner-freebsd-geom@freebsd.org Mon Jan 30 13:59:05 2017
Subject: Re: AHCI Issues with WDC Black drives and GEOM_MIRROR
To: freebsd-geom@freebsd.org
From: Jan Bramkamp
Message-ID: <2cee9128-4234-5fd5-463c-c51b47592220@rlwinm.de>
Date: Mon, 30 Jan 2017 14:59:01 +0100

Enabling this knob exposes an older interface to the OS over PCI. It was
required to install Windows XP without workarounds, because Windows XP
lacked AHCI drivers on the install disks.

On 27/01/2017 06:05, Octavian Hornoiu wrote:
> Thank Andriy,
>
> it appears my motherboard had a "IDE/SATA" compatibility mode which was
> enabled by default and i had no idea what the option meant but once i
> turned it off everything is showing up as AHCI.
>
> Thanks for your suggestion!
>
>
> Octavian
>
> On Thu, Jan 26, 2017 at 12:46 AM, Andriy Gapon wrote:
>
>> On 26/01/2017 00:59, Octavian Hornoiu wrote:
>>> OS: 10.3-RELEASE-p12
>>> Motherboard: ASRock FM2A85X Extreme6 FM2 AMD A85X (Hudson D4) SATA 6Gb/s
>>> USB 3.0 HDMI ATX AMD Motherboard
>>> Memory: 16 GB RAM
>>> 6 Drives: 6x 1TB WD1001FALS (Caviar BLACK)
>>>
>>> I am having a strange issue where I have 6 drives attached to my
>>> motherboard and 4 of them are coming up as SATA 2.x and the others are
>>> coming up in a strange downgraded mode. The first 4 disks are always shown
>>> as being normal and ada4/5 always have the strange configuration. I know
>>> the motherboard chipset is good and supports 7 drives of SATA 2/3 in any
>>> combination so I'm perplexed as to what the issue is.
>>>
>>> Why are drives 4 and 5 listed as being on bus ata0 and ata1 instead of
>>> ahcich4 and 5?
>>
>> Check your BIOS settings. Sometimes they have a separate IDE/AHCI knob for the
>> last two channels.
>>
>>
>> --
>> Andriy Gapon
>>

From owner-freebsd-geom@freebsd.org Mon Jan 30 16:07:03 2017
Subject: Re: gmirror and a flaky member
To: Miroslav Lachman <000.fbsd@quip.cz>, freebsd-geom@FreeBSD.org
References: <7e4164bd-9804-02d5-5990-bc15354989e9@FreeBSD.org> <77c40117-35ab-2430-07f8-e1df6b87fe1c@FreeBSD.org> <586FB32D.7050902@quip.cz>
From: Andriy Gapon
Date: Mon, 30 Jan 2017 18:06:05 +0200
In-Reply-To: <586FB32D.7050902@quip.cz>

On 06/01/2017 17:09,
Miroslav Lachman wrote:
> Andriy Gapon wrote on 2017/01/06 11:12:
>> On 06/01/2017 11:54, Andriy Gapon wrote:
>>>
>>> Can a geom mirror handle a member that gets disconnected and then reappears
>>> again?
>>>
>>> What I am seeing right now is that the mirror does not pick up the member when
>>> it reappears. I have to add it back manually.
>
> It is intentional to mark disappeared device as broken.
> If you want to remove working device from gmirror, you must use gmirror remove
> command (or gmirror deactivate).
>
>> To add more substance, here is what gets logged when the disk disappears:
>>
>> GEOM_MIRROR: Request failed (error=6). ada0p2[READ(offset=2517700608, length=4096)]
>> GEOM_MIRROR: Device swap: provider ada0p2 disconnected.
>>
>> And here's what gets logged when the disk reappears:
>> GEOM_MIRROR: Component ada0p2 (device swap) broken, skipping.
>> GEOM_MIRROR: Cannot add disk ada0p2 to swap (error=22).
>
> Was the disk removed by user or was it by some bad event?

The latter. I suspect some problem with an SSD's controller or firmware.
Basically, the disk disappeared and then re-appeared half a minute later.

>
>>> Even worse, the commands I have
>>> to execute are:
>>> $ gmirror forget ...
>>> $ gmirror insert ...
>>>
>>> This does not appear to be a graceful way of reactivating the member.
>
> You can re-activate only member which was correctly deactivated. Not one yanked
> out without any "graceful" command.

I see.

> And as gmirror doesn't work as ZFS mirror (cannot do resilver) the re-added
> device is always fully rebuilt.

Indeed. But it would be nice if gmirror was able to handle my situation
automatically:
- when a disk disappears, mark it broken (missing to be more precise)
- when it re-appears, rebuild it and add back to the mirror

It seems that the main problem at the moment is that gmirror doesn't distinguish
between a disk getting many errors (e.g. disk going bad) and a disk disappearing
(maybe permanently, maybe temporarily).
But I could be mistaken about this.

--
Andriy Gapon

From owner-freebsd-geom@freebsd.org Mon Jan 30 19:16:36 2017
Subject: Re: gmirror and a flaky member
To: freebsd-geom@FreeBSD.org
References: <7e4164bd-9804-02d5-5990-bc15354989e9@FreeBSD.org> <77c40117-35ab-2430-07f8-e1df6b87fe1c@FreeBSD.org>
Cc: Miroslav Lachman <000.fbsd@quip.cz>, Alexander Motin
From: Andriy Gapon
Message-ID: <3952383e-e03a-1b27-f798-bfb1cf0b6007@FreeBSD.org>
Date: Mon, 30 Jan 2017 21:15:12 +0200
In-Reply-To: <77c40117-35ab-2430-07f8-e1df6b87fe1c@FreeBSD.org>

On 06/01/2017 12:12, Andriy Gapon wrote:
> To add more substance, here is what gets logged when the disk disappears:
>
> GEOM_MIRROR: Request failed (error=6).
> ada0p2[READ(offset=2517700608, length=4096)]
> GEOM_MIRROR: Device swap: provider ada0p2 disconnected.
>
> And here's what gets logged when the disk reappears:
> GEOM_MIRROR: Component ada0p2 (device swap) broken, skipping.
> GEOM_MIRROR: Cannot add disk ada0p2 to swap (error=22).

I think I see a problem.
There are four places where the G_MIRROR_DISK_STATE_DISCONNECTED event is posted:
1. g_mirror_orphan(), which is called when GEOM notifies us that a disk is gone
2. g_mirror_regular_request(), when we get an error writing or reading data
3. g_mirror_sync_request(), when we get an error writing data to a disk being
   synchronized
4. g_mirror_write_metadata(), when we get an error while writing (updating) the
   metadata to a member's label

#1 is called when the disk disappears while there is no I/O.
If the disk disappears while there is some I/O, then we can get either #1 or #2.
We can get #3 during disk re-synchronization.
We can get #4 in "rare" cases when we update the metadata (e.g. change the
mirror configuration).

In case #1 the code sets the G_MIRROR_BUMP_SYNCID flag before posting the event.
In cases #2, #3 and #4 the code sets the G_MIRROR_BUMP_GENID flag.

I believe that the code should set G_MIRROR_BUMP_GENID only in case #4.
In that case the metadata becomes different between the mirror members and,
thus, there is no way for the code to automatically rebuild the mirror.

In cases #1, #2 and #3 only the data becomes stale on a member and, thus, there
should be a chance to re-synchronize that member. In fact, in case #3 the
member is already being synchronized.

I could be missing something, of course.
So, any comments and corrections are very welcome.
Thanks!
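
To make the classification above easier to discuss, here is a small userland
sketch of it. It is only a model: the enum and function names are invented for
this example and are not identifiers from g_mirror.c, and the "can be
resynchronized automatically" rule is my assumption based on the "broken,
skipping" log message.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userland model only. The names below are invented for illustration;
 * they are not identifiers from g_mirror.c.
 */
enum disconnect_source {
	SRC_ORPHAN,		/* #1: g_mirror_orphan() */
	SRC_REGULAR_REQUEST,	/* #2: g_mirror_regular_request() */
	SRC_SYNC_REQUEST,	/* #3: g_mirror_sync_request() */
	SRC_WRITE_METADATA	/* #4: g_mirror_write_metadata() */
};

enum bump_kind { BUMP_SYNCID, BUMP_GENID };

/* Which ID is bumped today, as described in the message. */
enum bump_kind
bump_current(enum disconnect_source src)
{
	return (src == SRC_ORPHAN ? BUMP_SYNCID : BUMP_GENID);
}

/* Which ID would be bumped under the proposal: genid only for #4. */
enum bump_kind
bump_proposed(enum disconnect_source src)
{
	return (src == SRC_WRITE_METADATA ? BUMP_GENID : BUMP_SYNCID);
}

/*
 * Assumption: a member that only missed a syncid bump can be
 * resynchronized automatically, while a genid mismatch leads to the
 * "broken, skipping" rejection.
 */
bool
can_auto_resync(enum bump_kind kind)
{
	return (kind == BUMP_SYNCID);
}

/* Sanity check: only #1 recovers today; #1, #2 and #3 would with the proposal. */
void
check_model(void)
{
	assert(can_auto_resync(bump_current(SRC_ORPHAN)));
	assert(!can_auto_resync(bump_current(SRC_REGULAR_REQUEST)));
	assert(can_auto_resync(bump_proposed(SRC_REGULAR_REQUEST)));
	assert(can_auto_resync(bump_proposed(SRC_SYNC_REQUEST)));
	assert(!can_auto_resync(bump_proposed(SRC_WRITE_METADATA)));
}
```

In this model my disk, which failed with an I/O error (case #2), is the one
that moves from "rejected" to "resynchronized" under the proposal.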

--
Andriy Gapon

From owner-freebsd-geom@freebsd.org Mon Jan 30 19:23:55 2017
Subject: Re: gmirror and a flaky member
To: freebsd-geom@FreeBSD.org
References: <7e4164bd-9804-02d5-5990-bc15354989e9@FreeBSD.org> <77c40117-35ab-2430-07f8-e1df6b87fe1c@FreeBSD.org> <3952383e-e03a-1b27-f798-bfb1cf0b6007@FreeBSD.org>
Cc: Miroslav Lachman <000.fbsd@quip.cz>, Alexander Motin
From: Andriy Gapon
Message-ID: <69d4d61c-dcb3-a7ac-ecd3-e47facd19b2e@FreeBSD.org>
Date: Mon, 30 Jan 2017 21:22:56 +0200
In-Reply-To: <3952383e-e03a-1b27-f798-bfb1cf0b6007@FreeBSD.org>

On 30/01/2017 21:15, Andriy Gapon wrote:
> On 06/01/2017 12:12, Andriy Gapon wrote:
>> To add more substance, here is what gets logged when the disk
>> disappears:
>>
>> GEOM_MIRROR: Request failed (error=6). ada0p2[READ(offset=2517700608, length=4096)]
>> GEOM_MIRROR: Device swap: provider ada0p2 disconnected.
>>
>> And here's what gets logged when the disk reappears:
>> GEOM_MIRROR: Component ada0p2 (device swap) broken, skipping.
>> GEOM_MIRROR: Cannot add disk ada0p2 to swap (error=22).
>
> I think I see a problem.
> There are four places where G_MIRROR_DISK_STATE_DISCONNECTED event is posted:
> 1. g_mirror_orphan() that is called when GEOM notifies us that a disk is gone
> 2. g_mirror_regular_request(), when we get an error writing or reading data
> 3. g_mirror_sync_request(), when we get an error writing data to a disk being
> synchronized
> 4. g_mirror_write_metadata() when we get an error while writing (updating) the
> metadata to a member's label
>
> #1 is called when the disk disappears when there is no I/O.
> If the disk disappears while there is some I/O, then we can get either #1 or #2.
> We can get #3 during disk re-synchronization.
> We can get #4 in "rare" cases when we update the metadata (e.g. change the
> mirror configuration).
>
> In case #1 the code sets G_MIRROR_BUMP_SYNCID flag before posting the event.
> In cases #2, #3 and #4 the code sets G_MIRROR_BUMP_GENID flag.
>
> I believe that the code should set G_MIRROR_BUMP_GENID only in case #4.
> In that case the metadata becomes different between the mirror members and,
> thus, there is no way for the code to automatically rebuild the mirror.
>
> In cases #1, #2 and #3 only the data becomes stale on a member and, thus, there
> should be a chance to re-synchronize that member. In fact, in case #3 the
> member is already being synchronized.
>
> I could be missing something, of course.
> So, any comments and corrections are very welcome.
> Thanks!
>

At the very minimum I would like to change G_MIRROR_BUMP_GENID to
G_MIRROR_BUMP_SYNCID in g_mirror_regular_request() for ENXIO.

--
Andriy Gapon
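
The minimal change for ENXIO could look roughly like the following. This is a
userland sketch of the decision only, not a patch against
g_mirror_regular_request(); the enum and function names are invented for the
example. ENXIO is the error=6 seen in the "Request failed" log line.

```c
#include <assert.h>
#include <errno.h>

/* Invented names for illustration; not identifiers from g_mirror.c. */
enum io_bump { IO_BUMP_NONE, IO_BUMP_SYNCID, IO_BUMP_GENID };

/*
 * Pick the ID to bump after a regular read/write completes with `error`.
 * ENXIO is taken to mean that the provider went away, so only the data
 * on that member is stale and a resynchronization should be enough.
 */
enum io_bump
bump_for_io_error(int error)
{
	if (error == 0)
		return (IO_BUMP_NONE);
	if (error == ENXIO)
		return (IO_BUMP_SYNCID);
	return (IO_BUMP_GENID);	/* any other error: behaviour unchanged */
}

/* Sanity check of the three branches. */
void
check_io_bump(void)
{
	assert(bump_for_io_error(0) == IO_BUMP_NONE);
	assert(bump_for_io_error(ENXIO) == IO_BUMP_SYNCID);
	assert(bump_for_io_error(EIO) == IO_BUMP_GENID);
}
```

All other errors keep the genid bump here, so a disk that is really going bad
would still be treated as broken.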