Date:      Mon, 25 Sep 2023 13:58:35 +0200
From:      Dimitry Andric <dim@FreeBSD.org>
To:        Frank Behrens <frank@harz2023.behrens.de>
Cc:        stable@freebsd.org, Warner Losh <imp@FreeBSD.org>
Subject:   Re: nvd->nda switch and blocksize changes for ZFS
Message-ID:  <E16E9C54-A552-4D86-9E59-71E0C68AC483@FreeBSD.org>
In-Reply-To: <bae9c711-5cc9-7dca-f6aa-445166cc540e@harz2023.behrens.de>
References:  <1b6190d1-1d42-6c99-bef6-c6b77edd386a@harz2023.behrens.de> <D20AFDEE-45F4-40AF-A401-023E69A5C8A6@FreeBSD.org> <779546e4-1135-c808-372f-e77d347ecf65@aetern.org> <bae9c711-5cc9-7dca-f6aa-445166cc540e@harz2023.behrens.de>



On 25 Sep 2023, at 08:42, Frank Behrens <frank@harz2023.behrens.de> wrote:
>
> Hi Dimitry, Yuri and also Mark, thanks for your fast responses!
>
> On 23.09.2023 at 20:58, Yuri Pankov wrote:
...
> # smartctl -a /dev/nvme0
> Namespace 1 Formatted LBA Size:     512
> ...
> Supported LBA Sizes (NSID 0x1)
> Id Fmt  Data  Metadt  Rel_Perf
>  0 +     512       0         0

This is the default compatibility sector size of 512 bytes, so it says
nothing about the optimal block size.


> # nvmecontrol identify nda0 and # nvmecontrol identify nvd0 (after
> hw.nvme.use_nvd="1" and reboot) give the same result:
> Number of LBA Formats:       1
> Current LBA Format:          LBA Format #00
> LBA Format #00: Data Size:   512  Metadata Size:     0  Performance: Best
> ...
> Optimal I/O Boundary:        0 blocks
> NVM Capacity:                1000204886016 bytes
> Preferred Write Granularity: 32 blocks
> Preferred Write Alignment:   8 blocks
> Preferred Deallocate Granul: 9600 blocks
> Preferred Deallocate Align:  9600 blocks
> Optimal Write Size:          256 blocks

My guess is that the "Preferred Write Granularity" is the optimal size,
in this case 32 'blocks' of 512 bytes, so 16 KiB. This also matches the
stripe size reported by GEOM, as you showed.

The "Preferred Write Alignment" is 8 * 512 = 4 KiB, so you should align
partitions etc. to at least this. However, it cannot hurt to align
everything to 16 KiB either, which is an integer multiple of 4 KiB.
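
The arithmetic above can be sketched in a few lines of Python. This is
only an illustration: the helper names are made up for this example, and
the two start LBAs (40 and 2048) are sample values chosen to show one
offset that is 4 KiB-aligned but not 16 KiB-aligned and one that
satisfies both.

```python
SECTOR_SIZE = 512  # bytes per LBA, from "LBA Format #00: Data Size: 512"


def blocks_to_bytes(blocks, sector_size=SECTOR_SIZE):
    """Convert a count of LBAs into bytes."""
    return blocks * sector_size


def is_aligned(start_lba, alignment_bytes, sector_size=SECTOR_SIZE):
    """Return True if the byte offset of start_lba is a multiple of alignment_bytes."""
    return (start_lba * sector_size) % alignment_bytes == 0


granularity = blocks_to_bytes(32)  # Preferred Write Granularity: 32 blocks
alignment = blocks_to_bytes(8)     # Preferred Write Alignment:   8 blocks

print(f"granularity = {granularity} bytes ({granularity // 1024} KiB)")
print(f"alignment   = {alignment} bytes ({alignment // 1024} KiB)")

# Sample start LBAs (illustrative values, not a recommendation).
for lba in (40, 2048):
    print(lba, is_aligned(lba, alignment), is_aligned(lba, granularity))
```

Any start offset that passes the 16 KiB check also passes the 4 KiB
check, which is why aligning to the larger value is the safe choice.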


> The recommended blocksize for ZFS is GEOM's stripesize and there I see
> a difference:
>
> # diff -w -U 10  gpart_list_nvd.txt gpart_list_nda.txt
> -Geom name: nvd0
> +Geom name: nda0
>  modified: false
>  state: OK
>  fwheads: 255
>  fwsectors: 63
>  last: 1953525127
>  first: 40
>  entries: 128
>  scheme: GPT
>  Providers:
> -1. Name: nvd0p1
> +1. Name: nda0p1
>     Mediasize: 272629760 (260M)
>     Sectorsize: 512
> -   Stripesize: 4096
> -   Stripeoffset: 0
> +   Stripesize: 16384
> +   Stripeoffset: 4096

Yeah, I suspect that nda reports the "stripesize" from the NVMe
"Preferred Write Granularity" and the "stripeoffset" from the NVMe
"Preferred Write Alignment". I think Warner's the resident expert on
NVMe drivers, so maybe he's got some clue. :)
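
To make that guess checkable against the numbers quoted above, here is a
small Python sketch. It models only the guess, not the actual nda driver
code, and every name in it is hypothetical. It also computes one
alternative for comparison, under the unverified assumption that the
partition starts at LBA 40 (the "first: 40" value in the gpart output):
the offset of that start within a stripe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamespaceHints:
    """Values as printed by nvmecontrol identify (hypothetical container)."""
    sector_size: int
    write_granularity_blocks: int
    write_alignment_blocks: int


def guessed_stripe(hints):
    """The guess: stripesize from granularity, stripeoffset from alignment."""
    stripesize = hints.write_granularity_blocks * hints.sector_size
    stripeoffset = hints.write_alignment_blocks * hints.sector_size
    return stripesize, stripeoffset


hints = NamespaceHints(sector_size=512,
                       write_granularity_blocks=32,
                       write_alignment_blocks=8)

reported_nda = (16384, 4096)  # Stripesize, Stripeoffset of nda0p1
print("guess matches nda0p1:", guessed_stripe(hints) == reported_nda)

# Alternative reading, ASSUMING the partition starts at LBA 40:
# the stripe offset is just the start offset modulo the stripe size.
ASSUMED_START_LBA = 40
start_bytes = ASSUMED_START_LBA * hints.sector_size
offset_in_nda_stripe = start_bytes % 16384  # nda stripesize
offset_in_nvd_stripe = start_bytes % 4096   # nvd stripesize
print("offset within 16 KiB stripe:", offset_in_nda_stripe)
print("offset within 4 KiB stripe: ", offset_in_nvd_stripe)
```

Both readings reproduce the nda figures, so the numbers in this thread do
not by themselves settle where the stripeoffset comes from.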

-Dimitry





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?E16E9C54-A552-4D86-9E59-71E0C68AC483>