From: Alan Somers <asomers@freebsd.org>
Date: Fri, 6 Sep 2024 11:02:14 -0600
Subject: Re: Unable to replace drive in raidz1
To: Chris Ross
Cc: mike tancsa, FreeBSD Filesystems <freebsd-fs@freebsd.org>
List-Id: Filesystems
List-Archive: https://lists.freebsd.org/archives/freebsd-fs
On Fri, Sep 6, 2024 at 10:51 AM Chris Ross wrote:
>
> > On Sep 6, 2024, at 11:32, Alan Somers wrote:
> >
> > "zpool replace" is indeed the correct command.  There's no need to run
> > "zpool offline" first, and "zpool remove" is wrong.  Since "zpool
> > replace" is still failing, are you sure that da10 is still the correct
> > device name after all disks got renumbered?  If you're sure, then you
> > might run "zdb -l /dev/da10" to see what ZFS thinks is on that disk.
>
> I can confirm that da10 is still the new disk I put in place of the
> prior da3.
>
> > On Sep 6, 2024, at 11:43, mike tancsa wrote:
> >
> > I would triple-check which devices are actually part of the pool.  I
> > wish there was a way to tell zfs to display only one or the other.  So
> > list out what diskid/DISK-K1GMBN9D, diskid/DISK-K1GMEDMD... to
> > diskid/DISK-3WJ7ZMMJ actually are in terms of /dev/da*.  I have some
> > controllers that will re-order the disks on every reboot.  "glabel
> > status" and "camcontrol devlist" should help verify.
>
> camcontrol devlist lets me know that the three HGST drives making up
> zraid1-1 are da3, da4, da5, and the three WD drives making up
> zraid1-2 are da6, da7, da8.  So, like before, everything just moved down
> a number because the prior da3 went away and a new disk in that
> physical slot became da10.  (da9 is a loose JBOD single with UFS on it,
> previously da10, in slot 12 of 12.)
>
> da10 is in fact still the disk in slot 3 of the chassis; zdb -l shows
> the output below.  I did add and remove it as a spare while trying
> things; that may be why it shows up this way.
>
> - Chris
>
> % sudo zdb -l /dev/da10
> ------------------------------------
> LABEL 0
> ------------------------------------
>     version: 5000
>     name: 'tank'
>     state: 0
>     txg: 0
>     pool_guid: 3456317866677065800
>     errata: 0
>     hostid: 2747523522
>     hostname: 'frizzen02.devit.ciscolabs.com'
>     top_guid: 2495145666029787532
>     guid: 2495145666029787532
>     vdev_children: 3
>     vdev_tree:
>         type: 'disk'
>         id: 0
>         guid: 2495145666029787532
>         path: '/dev/da10'
>         phys_path: 'id1,enc@n584b2612f2c321bd/type@0/slot@3/elmdesc@ArrayDevice03'
>         whole_disk: 1
>         metaslab_array: 0
>         metaslab_shift: 0
>         ashift: 12
>         asize: 22000965255168
>         is_log: 0
>         create_txg: 18008413
>     features_for_read:
>         com.delphix:hole_birth
>         com.delphix:embedded_data
>     create_txg: 18008413
>     labels = 0 1 2 3

This looks like you got into a split-brain situation where the disks
have inconsistent labels.  Most disks think that da10 is not a member
of the pool, but da10 thinks that it is.  Perhaps you added it as a
spare, then physically removed it, and then did a "zpool remove" to
remove the spare from the configuration?  If you're very very very
sure that there is no data on da10 that you care about, you can do
"zpool labelclear -f /dev/da10".
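For anyone scripting a sanity check before labelclear: the telltale
sign in the label above is "txg: 0", meaning the label was written but
the disk never participated in a pool transaction.  A minimal sketch
of such a check follows; the "is_stale_label" helper is my own name,
not a ZFS or FreeBSD command, and this only inspects one heuristic
field, so verify the full label by eye before destroying anything.

```shell
# Heuristic check for a stale/never-activated ZFS label, based on the
# `zdb -l` output shown above.  "is_stale_label" is a made-up helper,
# not part of ZFS.
is_stale_label() {
    # Reads `zdb -l` output on stdin; succeeds (exit 0) if the label's
    # top-level "txg:" field is 0.  Matches only the exact "txg:" key,
    # so "create_txg:" lines are ignored.
    awk '$1 == "txg:" { if ($2 == 0) found = 1 } END { exit !found }'
}

# Intended usage (needs root and the real disk; not run here):
#   sudo zdb -l /dev/da10 | is_stale_label && echo "label looks stale"
```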