Date: Sun, 6 Nov 2022 15:10:34 -0500 From: Mark Johnston <markj@freebsd.org> To: Corvin =?iso-8859-1?Q?K=F6hne?= <corvink@freebsd.org> Cc: freebsd-virtualization@freebsd.org Subject: Re: bhyve nvlist constness bug Message-ID: <Y2gUujd9V2r5BO0C@nuc> In-Reply-To: <80881e5b-74d8-3831-85ac-cb8b8d3d0413@FreeBSD.org> References: <Y1WFjpZ8zzVOrhTK@nuc> <d158aa00-2639-a409-1fce-33538ab7eee0@FreeBSD.org> <924528458d3b4265a9bf36050c949c3f@beckhoff.com> <80881e5b-74d8-3831-85ac-cb8b8d3d0413@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Nov 04, 2022 at 11:22:55AM +0100, Corvin Köhne wrote: > > On 10/23/22 11:18 AM, Mark Johnston wrote: > >> I'm going through compiler warnings in the bhyve code with the aim of > >> bumping WARNS, since the current setting hides bugs. There's one in the > >> config code that looks a bit tricky to resolve and I was hoping for some > >> guidance on the right way to do that. > >> > >> The basic problem is _lookup_config_node() may return an nvlist via > >> nvlist_get_nvlist(), but nvlist_get_nvlist() returns a const nvlist_t > >> *, so _lookup_config_node() is discarding the const qualifier. And > >> indeed, some callers will modify the returned node. The use of > >> nvlist_move_nvlist() in _lookup_config_node() has a similar problem: > >> nvlist_move_* "consumes" the passed value and is not supposed to be > >> mutated after. > >> > >> I'm pretty sure this is actually a non-issue with our nvlist > >> implementation; when adding an nvlist value to an nvlist, it doesn't > >> store anything except for the pointer itself, so you can mutate it > >> safely. Note, this is not true for all value types, specifically > >> arrays. However, there are multiple nvlist implementations out there, > >> and ours might change such that bhyve's config code becomes incorrect. > >> > >> The bug seems annoying to fix because consumers can do this: > >> > >> nvlist_t *nvl = find_config_node(path); > >> set_config_value_node(nvl, "foo", "bar"); > >> > >> So, if we naively changed find_config_node() to return a copy of the > >> nvlist at "path", set_config_value_node() would have to somehow figure > >> out where in the config tree to insert the updated nvlist, but it > >> doesn't have the path anymore. > >> > >> To fix it properly could change find_config_node() to return some opaque > >> type which contains the source path of the node so that we can DTRT when > >> mutating the node. IMO it's better if the config node type is opaque to > >> consumers. Or, make each nvlist node store its source path by storing > >> it in the nvlist itself. e.g., the nvlist node describing the device > >> model at "pci.0.1.2" would have a "__path" key with value "pci.0.1.2", > >> so that set_config_value_node() can figure out where in the tree to > >> replace an existing copy of "pci.0.1.2". Or is there a simpler way to > >> fix this that I missed? > >> > >> Any thoughts? I'm happy to implement the suggestions above but didn't > >> want to do that work without some buy-in. > > The config stuff uses nvlist on the premise that it is a useful way to > > store the data. However, it is true that nvlists seem to not be as > > useful for storing mutable data. Rather, they are useful as a way to > > pass structured data across domain boundaries (e.g. serialized over > > a socket, or between kernel and userland). If nvlist doesn't work well > > I'd rather just change the data structures we use in bhyve rather than > > add hidden nodes with path names, etc. To that end, if nvlist is too > > limiting then we can do what my first intution was which was to turn > > bhyve into a C++ program and use something like std::map<> to represent > > the tree instead. > > > > -- > > John Baldwin > > > I do agree with John. It looks like nvlist isn't made for mutable data. > So, when using nvlist we have to make sure that we don't modify the data > which is already stored in it or we have to use nvlist_take, modify the > data and write it back with nvlist_move/nvlist_add. This requires that > we patch the consumers to follow these rules. The consumers don't need to be patched with my "__path" proposal above since they use config.c accessors to get and set config values. Though, I admit I didn't implement it though so might be missing something. > I don't know how much work would be required to turn bhyve into a C++ > program. If it's doable, I vote for this approach. Ok, I'll try that. The first step is probably to make config nodes an opaque type; today all consumers have to pass nvlist_t *s around even though they use accessors in config.c for everything. Since my immediate goal was to raise WARNS for bhyve, and I'm quite close to that, I posted a patch which simply disarms the warning for the time being: https://reviews.freebsd.org/D37293
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Y2gUujd9V2r5BO0C>