From: Stephen McKay
To: Łukasz Wąsikowski
Cc: Stephen McKay, freebsd-fs@freebsd.org
Subject: Re: Problem with zpool remove of log device
Date: Thu, 15 Jun 2017 21:54:18 +1000

Sorry for the slow response.  I've been away (without email) for a
few days.
On Friday, 9th June 2017, lukasz@wasikowski.net wrote:

>On 2017-06-09 at 05:18, Stephen McKay wrote:
>
>> while :
>> do
>>     date > foo
>>     fsync foo
>> done
>>
>> With this running, my system does 600 writes per second to the log
>> according to zpool iostat.  That drops to zero once I kill the script.
>
>Zero, so no writes to log are performed during execution of this script.

OK.  I believe this means your log is in a "pending removal" state and
has not been finally removed because ZFS thinks there's still data
stored on it.  I'm happy for true ZFS experts to confirm or deny this
theory.  Anybody?

>I applied this patch to 11.1-PRERELEASE, nothing changed. Still zpool
>remove exits with errcode 0, but log device is still attached to pool.

Thanks for trying this out, but I managed to leave out essential
information.  In my rush to do things before going away (see above),
I didn't read my notes on this event.

The patch has the safety feature of requiring the log to be offline.
This means you will have to break the mirror (by detaching one disk
from it), then offline the remaining disk, and finally trigger the
hack by attempting to remove the remaining disk.

When I was stuck in this situation, I had already reduced my log to a
single disk before discovering the accounting error problem, so I
don't know if you can activate the patch code without first breaking
the mirror.  I don't think you can offline a mirror, but I've not
tried.

I've now had time to review my notes (sorry I didn't do that first up).
My pool had a data mirror-0 (gpt/data0 and gpt/data1) and a log
mirror-1 (gpt/log0 and gpt/log1).  The sequence I did, minus most of
the failed attempts at random stuff, status checks, and so forth, was:

# zpool remove pool mirror-1    #Did nothing but should have worked.
# zpool detach pool gpt/log1    #Broke the log mirror.
# zpool remove pool gpt/log0    #Did nothing but should have worked.
# zpool offline pool gpt/log0   #Just fiddling to change the state.
# zpool remove pool gpt/log0    #Still nothing.
... Discovered plausible hack.  Built and booted hacked kernel. ...
# zpool remove pool gpt/log0    #Glorious success!

So, my log was already down to one offline disk before I got hacky.
That's why I forgot to mention it.

You could break your mirror, or you could modify the hack to remove
the VDEV_STATE_OFFLINE check.  If you have already saved all your
important data then either would be fine as an experiment.

Cheers,

Stephen.
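
Since zpool remove returned 0 in this thread even when the log stayed
attached, the exit code alone is not a reliable signal.  The following
is only a rough sketch of one way to check, assuming that zpool status
prints a bare "logs" heading line above any log vdevs.  The function
name has_log_vdev is made up for illustration and is not part of any
ZFS tooling.

```shell
# Hypothetical helper (not from the thread): reads `zpool status` output
# on stdin and succeeds (exit 0) if a bare "logs" heading line is present,
# i.e. the pool still lists at least one log vdev.
has_log_vdev() {
    # A log section starts with a line whose only field is "logs".
    awk 'NF == 1 && $1 == "logs" { found = 1 } END { exit !found }'
}
```

It could be used after a removal attempt along the lines of
`zpool status pool | has_log_vdev && echo "log vdev still attached"`,
where "pool" is the pool name used in the sequence above.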