From nobody Sun Jan 7 18:34:06 2024
X-Original-To: freebsd-stable@mlmmj.nyi.freebsd.org
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable
Sender: owner-freebsd-stable@freebsd.org
MIME-Version: 1.0
References: <065f4f5c-f38b-45f4-b7e7-5248f871f7e6@FreeBSD.org>
In-Reply-To: <065f4f5c-f38b-45f4-b7e7-5248f871f7e6@FreeBSD.org>
From: Warner Losh
Date: Sun, 7 Jan 2024 11:34:06 -0700
Subject: Re: FreeBSD 13.2-STABLE can not boot from damaged mirror AND pool stuck in "resilver" state even without new devices.
To: lev@freebsd.org
Cc: Miroslav Lachman <000.fbsd@quip.cz>, freebsd-fs, freebsd-stable
Content-Type: multipart/alternative; boundary="00000000000036f5f2060e5f548d"

--00000000000036f5f2060e5f548d
Content-Type: text/plain; charset="UTF-8"

On Sun, Jan 7, 2024 at 10:57 AM Lev Serebryakov <lev@freebsd.org> wrote:

> On 07.01.2024 16:38, Miroslav Lachman wrote:
>
> >>> ZFS: i/o error - all block copies unavailable
> >>> ZFS: can't read MOS of pool zroot
> >>>
> >>>     after that.
> >>   I've re-created pool from scratch
> >>
> >>   zpool create znewroot ada0p3 && zfs send zroot | zfs receive znewroot && zpool destroy zroot && zpool attach znewroot ada0p3 ada1p3
> >>
> >>   but gptzfsboot still can not boot from it with same diagnostics :-(
>

I must have missed it. What were the diagnostics?

> > How large are your disks in a question?
>    2TB
>
> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
> ada0: <HGST HUS726020ALE610 APGNTD05> ACS-2 ATA SATA 3.x device
> ada0: Serial Number K5HPZZLD
> ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
> ada0: Command Queueing enabled
> ada0: 1907729MB (3907029168 512 byte sectors)
> ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
> ada1: <WDC WD2000FYYZ-01UL1B1 01.01K02> ATA8-ACS SATA 3.x device
> ada1: Serial Number WD-WMC1P0504169
> ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
> ada1: Command Queueing enabled
> ada1: 1907729MB (3907029168 512 byte sectors)
>

< 4294967296 sectors should be good. So these drives shouldn't see this
problem. The BIOS interfaces should have no trouble here.
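
The sector check above can be sketched as follows. This is only an illustration, assuming the limit in question is a 32-bit sector address (4294967296 = 2^32) and the 512-byte sectors reported in the dmesg output; the helper name `fits_in_lba32` is hypothetical and is not part of any FreeBSD tool.

```python
# Sketch only: checks a sector count against an assumed 32-bit LBA limit.
SECTOR_SIZE = 512      # bytes per sector, as reported for ada0 and ada1
LBA32_LIMIT = 2**32    # 4294967296 sectors addressable with a 32-bit LBA


def fits_in_lba32(sector_count: int) -> bool:
    """Return True when every sector has an address below 2**32."""
    return sector_count <= LBA32_LIMIT


disk_sectors = 3907029168  # both drives report this sector count
print(fits_in_lba32(disk_sectors))                  # True
print(LBA32_LIMIT * SECTOR_SIZE // 2**40, "TiB")    # 2 TiB
```

Under the same assumption, 2^32 sectors of 512 bytes come to 2 TiB, which matches the 2TB figure used in this thread.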

> > As far as I search the internet it is caused by the boot code (later
> > stage which is in a file in /boot directory) was moved too far from the
> > beginning of the disk and some old BIOS cannot allow the system to
> > continue booting.
>    Oh, it is good hypothesis. It is Haswell-time MSI board (old Hetzner
> EX40 instance)...
>

Yes. If the drives are > 2TB you lose. BIOS is not for you... Unless you
make special partitions that are in the first 2TB of the drive and only
boot off of those. Also, if the drives are 4k, you likely lose, though
it's hit or miss. Those are the hard limits of the BIOS ABI.

> > It can also be avoided if your machine supports EFI boot, but my HP
> > Microserver Gen 8 does not support it.
>    I'll try to switch to EFI, but it needs some luck to get to BIOS with
> provided KVM, it is very unstable :-)
>

BIOS booting is dying. It will be unsupportable in not too many more years
and the code removed. The rapid proliferation of ZFS crypto and compression
types is hastening the race to see who can use up the most space in the
boot loader. We can do marginal things to make it better wrt the 640k
limit, sure, but then we hit other limits like the 2TB address space, like
not being able to reliably support 4k drives, etc. BIOS booting likely will
support an increasingly small subset of all possible booting methods as we
go forward. The current crazy mix of different alternative firmwares makes
it hard to know what will survive, but as we hit these limitations, it will
make it harder and harder to configure, deploy and manage these systems.

The Linux on ZFS root pages, btw, recommend having two pools on two
partitions on the disk. One that's a few GB that's the bpool that has the
kernel in it, and the other, rest of the disk, that's rpool for the root
pool.
If people want to continue to support BIOS booting (or rather, booting
using the CSM interfaces), then somebody is going to need to step up to
the plate and implement a similar option in bsdinstall, bectl,
freebsd-update, etc.

Warner


--00000000000036f5f2060e5f548d--