From: Łukasz Wąsikowski
To: Stephen McKay
Cc: freebsd-fs@freebsd.org
Date: Sat, 17 Jun 2017 22:52:41 +0200
Subject: Re: Problem with zpool remove of log device

On 2017-06-15 at 13:54, Stephen McKay wrote:
> On Friday, 9th June 2017, lukasz@wasikowski.net wrote:
>
>> On 2017-06-09 at 05:18, Stephen McKay wrote:
>>
>>> while :
>>> do
>>>     date > foo
>>>     fsync foo
>>> done
>>>
>>> With this running, my system does 600 writes per second to the log
>>> according to zpool iostat. That drops to zero once I kill the script.
>>
>> Zero, so no writes to log are performed during execution of this script.
>
> OK. I believe this means your log is in a "pending removal" state and
> has not been finally removed because ZFS thinks there's still data
> stored on it. I'm happy for true ZFS experts to confirm or deny this
> theory. Anybody?
>
>> I applied this patch to 11.1-PRERELEASE, nothing changed. Still zpool
>> remove exits with errcode 0, but log device is still attached to pool.
>
> Thanks for trying this out, but I managed to leave out essential
> information.
>
> In my rush to do things before going away (see above), I didn't read my
> notes on this event. The patch has the safety feature of requiring the
> log to be offline. This means you will have to break the mirror (by
> detaching one disk from it), then offline the remaining disk, and finally,
> trigger the hack by attempting to remove the remaining disk.
>
> When I was stuck in this situation, I had already reduced my log to
> a single disk before discovering the accounting error problem, so I
> don't know if you can activate the patch code without first breaking
> the mirror. I don't think you can offline a mirror. I've not tried.
>
> I've now had time to review my notes (sorry I didn't do that first up).
> My pool had a data mirror-0 (gpt/data0 and gpt/data1) and a log mirror-1
> (gpt/log0 and gpt/log1). The sequence I did, minus most of the failed
> attempts at random stuff, status checks, and so forth, was:
>
> # zpool remove pool mirror-1     #Did nothing but should have worked.
> # zpool detach pool gpt/log1     #Broke the log mirror.
> # zpool remove pool gpt/log0     #Did nothing but should have worked.
> # zpool offline pool gpt/log0    #Just fiddling to change the state.
> # zpool remove pool gpt/log0     #Still nothing.
> ... Discovered plausible hack. Built and booted hacked kernel. ...
> # zpool remove pool gpt/log0     #Glorious success!
>
> So, my log was already down to one offline disk before I got hacky.
> That's why I forgot to mention it.
>
> You could break your mirror or you could modify the hack to remove the
> VDEV_STATE_OFFLINE check. If you have already saved all your important
> data then either would be fine as an experiment.

My tests were done after zpool remove pool mirror-1 and zpool detach pool
gpt/log1. I don't remember whether I offlined gpt/log0; it's possible that
I didn't. I built and booted the hacked kernel, but zpool remove pool
gpt/log0 still didn't work.

Unfortunately, the lease on this box ended on 14 June and I zeroed the
drives on 13 June :(

Again, thank you for your help!
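
For anyone retracing the steps quoted above, here is a minimal sketch of the sequence as a shell script. It assumes a pool laid out like Stephen's (a log mirror-1 made of gpt/log0 and gpt/log1). The variable names, the run helper and the DRY_RUN switch are inventions of this sketch, not zpool features, and by default it only prints the commands. According to the thread, the final remove only took effect on the patched kernel.

```sh
#!/bin/sh
# Sketch only: replays the log-removal sequence from the quoted message.
# POOL, LOG_MIRROR, LOG_KEEP and LOG_DROP are placeholder names.
# DRY_RUN is a hypothetical switch of this script, not part of zpool.

POOL=${POOL:-pool}
LOG_MIRROR=${LOG_MIRROR:-mirror-1}
LOG_KEEP=${LOG_KEEP:-gpt/log0}
LOG_DROP=${LOG_DROP:-gpt/log1}
DRY_RUN=${DRY_RUN:-1}

run() {
    # Always print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

remove_log() {
    run zpool remove "$POOL" "$LOG_MIRROR"   # should normally be enough on its own
    run zpool detach "$POOL" "$LOG_DROP"     # break the log mirror
    run zpool offline "$POOL" "$LOG_KEEP"    # the patch requires the log to be offline
    run zpool remove "$POOL" "$LOG_KEEP"     # retry removal of the last log disk
    run zpool status "$POOL"                 # check whether the log vdev is gone
}

remove_log
```

Run as is, it prints the five commands without touching the pool; setting DRY_RUN=0 executes them. While the fsync loop from the earlier message is running, zpool iostat -v on the pool shows whether the log device still receives writes.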

-- 
best regards,
Lukasz Wasikowski