Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 06 Jun 2023 14:59:53 +0200
From:      Corvin =?ISO-8859-1?Q?K=F6hne?= <corvink@FreeBSD.org>
To:        Vitaliy Gusev <gusev.vitaliy@gmail.com>
Cc:        virtualization@freebsd.org, freebsd-hackers@freebsd.org
Subject:   Re: BHYVE SNAPSHOT image format proposal
Message-ID:  <d92db9bfbea181d6eb9d57b579d67e8e118ef4de.camel@FreeBSD.org>
In-Reply-To: <8387AC83-6667-48E5-A3FA-11475EA96A5F@gmail.com>
References:  <67FDC8A8-86A6-4AE4-85F0-FF7BEF9F2F06@gmail.com> <6b98da58a5bd8e83bc466efa20b5a900298210aa.camel@FreeBSD.org> <8387AC83-6667-48E5-A3FA-11475EA96A5F@gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help

--=-EnCgH2AdnOZbtGH9VMEd
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi Vitaliy,

thanks for your answers. See comments below.

On Tue, 2023-06-06 at 14:25 +0300, Vitaliy Gusev wrote:
> Hi Corvin,=C2=A0
>=20
> Thanks for your comments and advices.=C2=A0
>=20
> Answers are below,
>=20
> > On 5 Jun 2023, at 18:32, Corvin K=C3=B6hne <corvink@FreeBSD.org> wrote:
> >=20
> > On Tue, 2023-05-23 at 19:05 +0300, Vitaliy Gusev wrote:
> > > 2. HEADER PHYS format:=C2=A0
> > >=20
> > > =C2=A00 =C2=A0 =C2=A0+=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=
=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=
=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=
=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=
=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=
=80=94=E2=80=94=E2=80=94+=C2=A0
> > > =C2=A0 =C2=A0 =C2=A0=C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0IDENT STRING =
=C2=A0- 64 BYTES =C2=A0 =C2=A0 =C2=A0 =C2=A0 |
> > > =C2=A064 =C2=A0 +=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=
=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=
=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=
=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=
=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=
=E2=80=94=E2=80=94+ =C2=A0=C2=A0
> > > =C2=A0 =C2=A0 =C2=A0=C2=A0| NVLIST SIZE =C2=A0- 4 BYTES =C2=A0 | =C2=
=A0NVLIST DATA |
> > > =C2=A072 =C2=A0 +=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=
=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=
=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=
=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=
=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=
=E2=80=94=E2=80=94+
> > > =C2=A0 =C2=A0 =C2=A0=C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 |
> > > =C2=A0 =C2=A0 =C2=A0=C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 NVLIST DATA =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 |
> > > =C2=A0 =C2=A0 =C2=A0=C2=A0| =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 |
> > > =C2=A04096 +=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=
=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=
=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=
=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=
=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=
=94=E2=80=94+
> > >=20
> > > >=20
> > > > IDENT STRING - Each producer can set its own value to specify
> > > > image.
> > > > NVLIST SIZE =C2=A0- The following packed header nvlist data size.
> > > > NVLIST DATA - Packed nvlist header data.
> > > >=20
> > > > 4KB should be enough for the HEADER to keep basic information
> > > > about
> > > > Sections. However, it can
> > > > be enlarged lately, without breaking backward compatibility.=C2=A0
> > > >=20
> >=20
> > I can't see an advantage of using a fixed sized header of 4KB. You
> > have
> > to parse the offset and size of every section anyways. If it's for
> > alignment requirements you can still align all sections on save and
> > set
> > the offset accordingly. So, why complicating things by using a
> > fixed
> > header size?
>=20
>=20
> You are right about 4KB restriction. I will correct it in updated
> format proposal. Idea is
> to reserve enough space for HEADER and write it after all finished
> stages at the beginning=C2=A0
> of a snapshot file.
>=20
> Implementation (snapshot path) should know estimated maximum size of
> the header and can
> use the possible maximum. Currently 4KB is enough and easily can be
> increased in the bhyve=E2=80=99s code without any problem.=C2=A0
>=20
> Alignment is useful to debug and looking into snapshot image file.
>=20

Ah okay, I see the issue. Bhyve has to estimate the size. It's fine to
use a fixed header size in the code. However, our image format
shouldn't be defined that way.

> >=20
> > The IDENT STRING seems to be very large. Even a GUID which should
> > be a
> > global unique identifier uses just 16 Bytes. Additionally, it might
> > be
> > worth using a dedicated ident and version field for an easier
> > version
> > parsing. E.g.:
>=20
>=20
> Intention is to add enough space for the future version (as
> reservation) and other producers
> and companies to specify it=E2=80=99s own ID string with possible add-on
> information. So adding =C2=A064 bytes
> for the future is not so huge pay, but can be very useful.

> During resume, if IDENT string is not the same as in bhyve, resume
> can fail before parsing
> other data, because it could be that internal format is not as
> expected.
>=20
> I would not to fix IDENT string format and just apply rule:
>=20
> > During resume, bhyve compares its own IDENT string and IDENT string
> > from an
> > Snapshot image. If it is not the same, further assumption about
> > format cannot be done,
> > and resume should fail.
>=20

We may have different version of the format from the same produce.
IMHO, it makes sense to have a dedicated IDENT and VERSION field to
easily figure out

1) if the producer of the image is known
2) if we support that version of the producer

Even if you allocated a huge amount of free space, someone would need
more. So, what do you think about this format:

+---------------------------------------------------------------------+
| IDENT - 16 BYTES                                                    |
+-------------------+-----------------------+-------------------------+
| VERSION - 4 BYTES | NVLIST SIZE - 4 BYTES | NVLIST OFFSET - 8 BYTES |
+-------------------+-----------------------+-------------------------+
| POSSIBLE FREE SPACE (e.g. for custom data, alignment etc.)          |
+---------------------------------------------------------------------+
| NVLIST DATA                                                         |
+---------------------------------------------------------------------+
| POSSIBLE FREE SPACE (for whatever reason)                           |
+---------------------------------------------------------------------+
| SNAPSHOT DATA                                                       |
+---------------------------------------------------------------------+

> >=20
> > +------------------+-------------------+
> > | IDENT - 56 BYTES | VERSION - 8 BYTES |
> > +------------------+-------------------+
> >=20
> > IDENT - "BHYVE CHECKPOINT IMAGE"
> > VERSION - 1 (as uint64_t)
> >=20
> > Btw: I don't care but here we could leave some free space for
> > possible
> > forward compatibility. E.g.:
> >=20
> > +------------------+-------------------+-------------------------+
> > | IDENT - 16 BYTES | VERSION - 8 BYTES | _FREE_SPACE_ - 40 BYTES |
> > +------------------+-------------------+=E2=80=94=E2=80=94=E2=80=94=E2=
=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=94=E2=80=
=94+
> ...
> > > 4. EXAMPLE:
> > >=20
> > >=20
> > > =C2=A0IDENT STRING:
> > >=20
> > > =C2=A0 =C2=A0 =C2=A0 =C2=A0"BHYVE CHECKPOINT IMAGE VERSION 1"
> > >=20
> > > =C2=A0NVLIST HEADER:=C2=A0
> > >=20
> > > =C2=A0=C2=A0[config]
> > > =C2=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0config.offset =3D 0x1000 (4096)
> > > =C2=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0config.size =3D 0x1f6 (502)
> > > =C2=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0config.type =3D "text"
> > >=20
> >=20
> > Not sure if it's just an example for the "text" type. bhyve
> > converts it
> > into a nvlist, so it could be saved directly as nvlist.
> > Btw: I would only implement the "text" type if there's an usecase
> > that
> > can't be solved by one of the other types.
>=20
>=20
>=20
> Intention is to use current engine to dump bhyve=E2=80=99s config and rea=
d
> config
> from a file (-k option).
>=20
> Advantage of using =E2=80=9Ctext=E2=80=9D type - simple implementation an=
d as an
> example
> of flexibility of proposed image format. Image file can keep any
> types that
> a producer would like to use: text, nvlist, binary, diff-pages, etc.
>=20

nvlist is bhyve's internal format. So, implementing it should be even
more simple.
We don't need an example of how flexible the image format is in our
final implementation. Adding it as an example for this thread is fine.

> >=20
> > All in all, it looks good. Keep on your work!
> >=20
> > Regards checksum feature:
> > We should focus on enabling this feature by default before adding
> > advanced features. So, keep it simple and small.
>=20
>=20
> Could you give a more example what you meant about =E2=80=9Cchecksum=E2=
=80=9D
> feature? Did you mean as
> TAR=E2=80=99s checksum, i.e. only header?
>=20

Sry for the confusion. I was referring to the discussion about
checksums (like checksums for section data) in this thread. Just ignore
my comment and focus on the rest.

>=20
> >=20
> > Regards forward compatibility:
> > Backward compatibility is way more important than forward
> > compatibility. Nevertheless, forward compatibility would be nice to
> > have. So, we should keep it in mind when modifying the layout. For
> > the
> > moment, just focus on a format which is backward compatible.
> >=20
>=20
>=20
> It seems that having information about forward compatibility could be
> very
> useful, at least to get it in advance if it is impossible to restore.
> I will add it during
> implementing this format.
>=20

Forward compatibility would be great but please just ignore it at the
moment to speed up the development. We have versioning. So, we can add
forward compatibility in a future version.


--=20
Kind regards,
Corvin

--=-EnCgH2AdnOZbtGH9VMEd
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part

-----BEGIN PGP SIGNATURE-----

iQIzBAABCAAdFiEEgvRSla3m2t/H2U9G2FTaVjFeAmoFAmR/LckACgkQ2FTaVjFe
AmrMLw//WL+go44XpMM39BYieAgn709rpTE1qMfR7M592bmT3iztxgSDwBkRQamU
jbBmP1GE2jcY9/iuaHxTWtv9/iah87IyUZ+j7tm9I3i7aktvd0sRmQaC84GWXdvV
Yu/MGpitkLtUoCrg3uU0l+FtaYhyK8I8bP5T1xrMQoj47dlsb9rrKqoMjznYaSPW
2SjmKpSjYKfN14dHGi3HRJ9WmyDtTyUvipb9E4Uma5SfZyXv9TcM7QjTmI0CFHVu
8Dhq+Q4jjoYLPLPk/c0xaZmlaL4QhKE4n9m9gNuH7vkGU4Sm01QU8XLG7VrY/J5J
+vmqH1l/3xfIYcJbZ4qQhccG8OznERLQL/eHd7Bj/9fYLNmTjGQ21eCSWl0Lm0bo
8F9wPCOSZ/i7R3w0iBqXluF7ZMe9JmZot8nb3mUKCaChESbb8HExuRHpnnc0VpNr
qJFQ2YRmNhwqP69mQtCxf0pM1f4OSCKB/Db6Xs43xbArG9xOEbEgVlr8E57ov+Vr
2Ad/eIBTknZQ93oMGDsB2KrcbfAYVF6egp3fwdYc/m1tyQ22YHS99A/vm30qYywW
LfIYUHzhiw5MMpf9mDM+GmIf0YkaQCtMxNn3zBjOCsmR4afDo/DZ/SMRnZgjcZzV
Mi/Yrth5DfMdMOZjEANU0Bxp6Wh0tjwB31IiKl5fuFp32+HNX6c=
=k+Ep
-----END PGP SIGNATURE-----

--=-EnCgH2AdnOZbtGH9VMEd--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?d92db9bfbea181d6eb9d57b579d67e8e118ef4de.camel>