From nobody Tue Jun 6 12:59:53 2023 X-Original-To: freebsd-hackers@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4Qb9YS1bRxz4bW1j; Tue, 6 Jun 2023 12:59:56 +0000 (UTC) (envelope-from corvink@FreeBSD.org) Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Qb9YS0nmmz3Mvp; Tue, 6 Jun 2023 12:59:56 +0000 (UTC) (envelope-from corvink@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1686056396; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=1vN98Fh3Nq2DBVmYQPjrAHHKOWlDEDcMwncl7e2Jg5Y=; b=f6CUoyc/OBbnFvWWfC2+fAvDmqls7qX4SkVfUTk+U4hBXwc/15X329WztJu9yAEaDh1MMH SSsBTmnoScxTIuQS5wJX96L8Z9IbkUHO4pfyjiBmHctxwlW1GJWFg66oIRT1VqTrgLqdWC 78zBbo2p0b0DRD7B42Muzdy3xdUu+xcAG2YmK74SHw6/qWVOsy6lCq0eRhB9wOSxHwJqm9 ZMBLPG6RgbFTMkgRMa76/X4T7Lm50qdV0EHkD9AHEjjTeqOQlf7X5UTBQIox4RpSoIigm+ WzcwaxhMbrhTGlgdBjx35NQx6TJZmw6e5WCzSQlKGC9bhcs/ackDaPHnE/xSxg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1686056396; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=1vN98Fh3Nq2DBVmYQPjrAHHKOWlDEDcMwncl7e2Jg5Y=; b=REPd6aNXwtXL2nCWhc2ISveRmzhFfkumKspLqqGGg1TqwZY11ZLvCFyDELcu0cUFgQTmPv ko/hwCHag+4HQf939ly8uHxGh2hUlsFRZgQRt/w90X7eiV3dv9UdAB/ndMr6ZL/wCkBXOX eiq8y3p3j/gL1wd0/ET5gq7Mq/MMGHZnAoH7T0EozI2uCDr/tSLwpu5PoBro+HJGEbLetE QpCr69VhnRCHBAhkjWAjD+6wM87Je0KAACCzoolA7yH2GBS4p6TR1+jUB01ACHIlyFxWjf iXa5HwzZchyfkmN0OCDosIgDlgoyJgy+V5O+VCZVQhuR+B/4TBEx5lPZdrDbKw== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1686056396; a=rsa-sha256; cv=none; b=sb0gn9VFSOZvnSFshPQqEg+Y1CLknJtcWZNR9a+ORKBbTYg8eaSqrCUKcD3dEW4s1oFS6E FskZkZHd13R6Y7D1bVbJRahw9MqketO3X16k9zVnh/0lkxpb+bbnggX3rqFIB5WS3FbHNx wUqZa5ZLCCfNiYECKx1HB1BrAwSqwNAbSKQgVwPwl5R9ePwP1uevrJVGpo8POpqqhb14O5 uQ+tu/9a1Zqn/oX03MvfoEpPrJPfRU/EjbT4xN3als+f315ICP+5+e3cCsvu17VCe+ZhwG MBE6pvpqps+UdgFmlRPVGkc8Zyy4mUjRIxkkDaYbnmbRYSXb4HmgUoEWrEjCgA== Received: from [IPv6:2001:9e8:da59:8e00:1c7e:7163:67b2:a7d8] (unknown [IPv6:2001:9e8:da59:8e00:1c7e:7163:67b2:a7d8]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: corvink) by smtp.freebsd.org (Postfix) with ESMTPSA id 4Qb9YR3hydzkrr; Tue, 6 Jun 2023 12:59:55 +0000 (UTC) (envelope-from corvink@FreeBSD.org) Message-ID: Subject: Re: BHYVE SNAPSHOT image format proposal From: Corvin =?ISO-8859-1?Q?K=F6hne?= To: Vitaliy Gusev Cc: virtualization@freebsd.org, freebsd-hackers@freebsd.org Date: Tue, 06 Jun 2023 14:59:53 +0200 In-Reply-To: <8387AC83-6667-48E5-A3FA-11475EA96A5F@gmail.com> References: <67FDC8A8-86A6-4AE4-85F0-FF7BEF9F2F06@gmail.com> <6b98da58a5bd8e83bc466efa20b5a900298210aa.camel@FreeBSD.org> <8387AC83-6667-48E5-A3FA-11475EA96A5F@gmail.com> Content-Type: multipart/signed; micalg="pgp-sha256"; protocol="application/pgp-signature"; boundary="=-EnCgH2AdnOZbtGH9VMEd" User-Agent: Evolution 3.46.4 List-Id: Technical discussions relating to FreeBSD List-Archive: https://lists.freebsd.org/archives/freebsd-hackers List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-hackers@freebsd.org MIME-Version: 1.0 X-ThisMailContainsUnwantedMimeParts: N --=-EnCgH2AdnOZbtGH9VMEd Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Vitaliy, thanks for your answers. See comments below. On Tue, 2023-06-06 at 14:25 +0300, Vitaliy Gusev wrote: > Hi Corvin,=C2=A0 >=20 > Thanks for your comments and advices.=C2=A0 >=20 > Answers are below, >=20 > > On 5 Jun 2023, at 18:32, Corvin K=C3=B6hne wrote: > >=20 > > On Tue, 2023-05-23 at 19:05 +0300, Vitaliy Gusev wrote: > > > 2. HEADER PHYS format:=C2=A0 > > >=20 > > > =C2=A00 =C2=A0 =C2=A0+=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94= =E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2= =80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80= =94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94= =E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2= =80=94=E2=80=94=E2=80=94+=C2=A0 > > > =C2=A0 =C2=A0 =C2=A0=C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0IDENT STRING = =C2=A0- 64 BYTES =C2=A0 =C2=A0 =C2=A0 =C2=A0 | > > > =C2=A064 =C2=A0 +=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80= =94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94= =E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2= =80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80= =94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94= =E2=80=94=E2=80=94+ =C2=A0=C2=A0 > > > =C2=A0 =C2=A0 =C2=A0=C2=A0| NVLIST SIZE =C2=A0- 4 BYTES =C2=A0 | =C2= =A0NVLIST DATA | > > > =C2=A072 =C2=A0 +=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80= =94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94= =E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2= =80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80= =94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94= =E2=80=94=E2=80=94+ > > > =C2=A0 =C2=A0 =C2=A0=C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 | > > > =C2=A0 =C2=A0 =C2=A0=C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 NVLIST DATA =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | > > > =C2=A0 =C2=A0 =C2=A0=C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 | > > > =C2=A04096 +=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2= =80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80= =94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94= =E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2= =80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80= =94=E2=80=94+ > > >=20 > > > >=20 > > > > IDENT STRING - Each producer can set its own value to specify > > > > image. > > > > NVLIST SIZE =C2=A0- The following packed header nvlist data size. > > > > NVLIST DATA - Packed nvlist header data. > > > >=20 > > > > 4KB should be enough for the HEADER to keep basic information > > > > about > > > > Sections. However, it can > > > > be enlarged lately, without breaking backward compatibility.=C2=A0 > > > >=20 > >=20 > > I can't see an advantage of using a fixed sized header of 4KB. You > > have > > to parse the offset and size of every section anyways. If it's for > > alignment requirements you can still align all sections on save and > > set > > the offset accordingly. So, why complicating things by using a > > fixed > > header size? >=20 >=20 > You are right about 4KB restriction. I will correct it in updated > format proposal. Idea is > to reserve enough space for HEADER and write it after all finished > stages at the beginning=C2=A0 > of a snapshot file. >=20 > Implementation (snapshot path) should know estimated maximum size of > the header and can > use the possible maximum. Currently 4KB is enough and easily can be > increased in the bhyve=E2=80=99s code without any problem.=C2=A0 >=20 > Alignment is useful to debug and looking into snapshot image file. >=20 Ah okay, I see the issue. Bhyve has to estimate the size. It's fine to use a fixed header size in the code. However, our image format shouldn't be defined that way. > >=20 > > The IDENT STRING seems to be very large. Even a GUID which should > > be a > > global unique identifier uses just 16 Bytes. Additionally, it might > > be > > worth using a dedicated ident and version field for an easier > > version > > parsing. E.g.: >=20 >=20 > Intention is to add enough space for the future version (as > reservation) and other producers > and companies to specify it=E2=80=99s own ID string with possible add-on > information. So adding =C2=A064 bytes > for the future is not so huge pay, but can be very useful. > During resume, if IDENT string is not the same as in bhyve, resume > can fail before parsing > other data, because it could be that internal format is not as > expected. >=20 > I would not to fix IDENT string format and just apply rule: >=20 > > During resume, bhyve compares its own IDENT string and IDENT string > > from an > > Snapshot image. If it is not the same, further assumption about > > format cannot be done, > > and resume should fail. >=20 We may have different version of the format from the same produce. IMHO, it makes sense to have a dedicated IDENT and VERSION field to easily figure out 1) if the producer of the image is known 2) if we support that version of the producer Even if you allocated a huge amount of free space, someone would need more. So, what do you think about this format: +---------------------------------------------------------------------+ | IDENT - 16 BYTES | +-------------------+-----------------------+-------------------------+ | VERSION - 4 BYTES | NVLIST SIZE - 4 BYTES | NVLIST OFFSET - 8 BYTES | +-------------------+-----------------------+-------------------------+ | POSSIBLE FREE SPACE (e.g. for custom data, alignment etc.) | +---------------------------------------------------------------------+ | NVLIST DATA | +---------------------------------------------------------------------+ | POSSIBLE FREE SPACE (for whatever reason) | +---------------------------------------------------------------------+ | SNAPSHOT DATA | +---------------------------------------------------------------------+ > >=20 > > +------------------+-------------------+ > > | IDENT - 56 BYTES | VERSION - 8 BYTES | > > +------------------+-------------------+ > >=20 > > IDENT - "BHYVE CHECKPOINT IMAGE" > > VERSION - 1 (as uint64_t) > >=20 > > Btw: I don't care but here we could leave some free space for > > possible > > forward compatibility. E.g.: > >=20 > > +------------------+-------------------+-------------------------+ > > | IDENT - 16 BYTES | VERSION - 8 BYTES | _FREE_SPACE_ - 40 BYTES | > > +------------------+-------------------+=E2=80=94=E2=80=94=E2=80=94=E2= =80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80= =94+ > ... > > > 4. EXAMPLE: > > >=20 > > >=20 > > > =C2=A0IDENT STRING: > > >=20 > > > =C2=A0 =C2=A0 =C2=A0 =C2=A0"BHYVE CHECKPOINT IMAGE VERSION 1" > > >=20 > > > =C2=A0NVLIST HEADER:=C2=A0 > > >=20 > > > =C2=A0=C2=A0[config] > > > =C2=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0config.offset =3D 0x1000 (4096) > > > =C2=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0config.size =3D 0x1f6 (502) > > > =C2=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0config.type =3D "text" > > >=20 > >=20 > > Not sure if it's just an example for the "text" type. bhyve > > converts it > > into a nvlist, so it could be saved directly as nvlist. > > Btw: I would only implement the "text" type if there's an usecase > > that > > can't be solved by one of the other types. >=20 >=20 >=20 > Intention is to use current engine to dump bhyve=E2=80=99s config and rea= d > config > from a file (-k option). >=20 > Advantage of using =E2=80=9Ctext=E2=80=9D type - simple implementation an= d as an > example > of flexibility of proposed image format. Image file can keep any > types that > a producer would like to use: text, nvlist, binary, diff-pages, etc. >=20 nvlist is bhyve's internal format. So, implementing it should be even more simple. We don't need an example of how flexible the image format is in our final implementation. Adding it as an example for this thread is fine. > >=20 > > All in all, it looks good. Keep on your work! > >=20 > > Regards checksum feature: > > We should focus on enabling this feature by default before adding > > advanced features. So, keep it simple and small. >=20 >=20 > Could you give a more example what you meant about =E2=80=9Cchecksum=E2= =80=9D > feature? Did you mean as > TAR=E2=80=99s checksum, i.e. only header? >=20 Sry for the confusion. I was referring to the discussion about checksums (like checksums for section data) in this thread. Just ignore my comment and focus on the rest. >=20 > >=20 > > Regards forward compatibility: > > Backward compatibility is way more important than forward > > compatibility. Nevertheless, forward compatibility would be nice to > > have. So, we should keep it in mind when modifying the layout. For > > the > > moment, just focus on a format which is backward compatible. > >=20 >=20 >=20 > It seems that having information about forward compatibility could be > very > useful, at least to get it in advance if it is impossible to restore. > I will add it during > implementing this format. >=20 Forward compatibility would be great but please just ignore it at the moment to speed up the development. We have versioning. So, we can add forward compatibility in a future version. --=20 Kind regards, Corvin --=-EnCgH2AdnOZbtGH9VMEd Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEgvRSla3m2t/H2U9G2FTaVjFeAmoFAmR/LckACgkQ2FTaVjFe AmrMLw//WL+go44XpMM39BYieAgn709rpTE1qMfR7M592bmT3iztxgSDwBkRQamU jbBmP1GE2jcY9/iuaHxTWtv9/iah87IyUZ+j7tm9I3i7aktvd0sRmQaC84GWXdvV Yu/MGpitkLtUoCrg3uU0l+FtaYhyK8I8bP5T1xrMQoj47dlsb9rrKqoMjznYaSPW 2SjmKpSjYKfN14dHGi3HRJ9WmyDtTyUvipb9E4Uma5SfZyXv9TcM7QjTmI0CFHVu 8Dhq+Q4jjoYLPLPk/c0xaZmlaL4QhKE4n9m9gNuH7vkGU4Sm01QU8XLG7VrY/J5J +vmqH1l/3xfIYcJbZ4qQhccG8OznERLQL/eHd7Bj/9fYLNmTjGQ21eCSWl0Lm0bo 8F9wPCOSZ/i7R3w0iBqXluF7ZMe9JmZot8nb3mUKCaChESbb8HExuRHpnnc0VpNr qJFQ2YRmNhwqP69mQtCxf0pM1f4OSCKB/Db6Xs43xbArG9xOEbEgVlr8E57ov+Vr 2Ad/eIBTknZQ93oMGDsB2KrcbfAYVF6egp3fwdYc/m1tyQ22YHS99A/vm30qYywW LfIYUHzhiw5MMpf9mDM+GmIf0YkaQCtMxNn3zBjOCsmR4afDo/DZ/SMRnZgjcZzV Mi/Yrth5DfMdMOZjEANU0Bxp6Wh0tjwB31IiKl5fuFp32+HNX6c= =k+Ep -----END PGP SIGNATURE----- --=-EnCgH2AdnOZbtGH9VMEd--