From owner-freebsd-hackers@FreeBSD.ORG Sat Dec 13 11:59:29 2014 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A57C1C9; Sat, 13 Dec 2014 11:59:29 +0000 (UTC) Received: from mailout06.t-online.de (mailout06.t-online.de [194.25.134.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mailout00.t-online.de", Issuer "TeleSec ServerPass DE-1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 51E28EA0; Sat, 13 Dec 2014 11:59:28 +0000 (UTC) Received: from fwd27.aul.t-online.de (fwd27.aul.t-online.de [172.20.26.132]) by mailout06.t-online.de (Postfix) with SMTP id 9EE0328E1F6; Sat, 13 Dec 2014 12:59:19 +0100 (CET) Received: from [192.168.119.11] (Vah5x2ZHZhYwE2r1G1Zlgi2jNilRoZpwoMcQ-NlpPxvFl8dz7B1f0YKKLXUvT-kQ7H@[84.154.99.91]) by fwd27.t-online.de with (TLSv1.2:ECDHE-RSA-AES256-SHA encrypted) esmtp id 1XzlLs-4GWDsu0; Sat, 13 Dec 2014 12:59:12 +0100 Message-ID: <548C2A0A.8090908@freebsd.org> Date: Sat, 13 Dec 2014 12:59:06 +0100 From: Stefan Esser User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: Phil Shafer , Konstantin Belousov Subject: Suggestion of higher level libxo API (in addition to current one) References: <201412101935.sBAJZL5I076079@idle.juniper.net> In-Reply-To: <201412101935.sBAJZL5I076079@idle.juniper.net> Content-Type: text/plain; charset=iso-8859-15 Content-Transfer-Encoding: 7bit X-ID: Vah5x2ZHZhYwE2r1G1Zlgi2jNilRoZpwoMcQ-NlpPxvFl8dz7B1f0YKKLXUvT-kQ7H X-TOI-MSGID: 67f0a246-b9f8-4e52-bd02-a70aaffbe872 Cc: "freebsd-hackers@freebsd.org" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Dec 2014 11:59:29 -0000 Am 10.12.2014 um 20:35 schrieb Phil Shafer: > Konstantin Belousov writes: >> If it is required that the name be identical, it makes more sense >> to require some handle to be passed to functions instead of the string. >> I.e. either there should be some atom-like calls which convert string >> to identifier, or first call should return handle passed to consequent >> calls. > > The library supports "do the right thing" mode, where it records > the open tag value and doesn't need the user to provide it: > > http://juniper.github.io/libxo/libxo-manual.html#dtrt-mode > > I could enable this by default and use it when NULL is passed for > an instance name. Hi Phil, I've thought a lot about this over the last days. Based on my experience with the conversion of pciconf, which was not written with XML or JSON output in mind and which outputs quite complex data structures, I have started to wonder whether a different concept might be easier to use for the programmer. I think it is possible to provide a different API that can co-exist with the current functions. First what I think the current libxo API provides: The API is very low-level in the sense, that every call generates (at most) one XML or JSON element, and it generates it the moment the call is made. xo_open_list: writes JSON label and starts array with "[{", no output for XML xo_open_instance: ignored for JSON, starts sub-tree within given element for XML xo_open_container: writes label and opening "{", starts sub-tree within given element for XML xo_emit: writes label and value for XML, creates element with given value I'm not describing the xo_close_* functions, but they just emit the syntactically required closing tags or brackets/braces. Use of this API leads to the following code to print error information (which is suppressed in the output, if there are no errors to report, which is the normal case). --------------------------------------------------------------------- static int errors; static void print_header(const char *header) { if (errors++ == 0) xo_open_list("error-category"); xo_open_instance("error-category"); xo_emit("{:category/%14s} = ", header); } static void print_bits(const char *header, struct bit_table *table, uint32_t mask) { int first; first = 1; for (; table->desc != NULL; table++) if (mask & table->mask) { if (first) { print_header(header); xo_open_list("detected-error"); first = 0; } else xo_emit(" "); xo_open_instance("detected-error"); xo_emit("{:description/%s}\n", table->desc); xo_close_instance("error"); mask &= ~table->mask; } if (!first) xo_close_list("detected-error"); if (mask != 0) { if (first) { print_header(header); first = 0; } else xo_emit(" "); xo_emit("Unknown: {:unknown-errors-bitmask/0x%08x}\n", mask); } if (!first) xo_close_instance("error-category"); if (errors != 0) xo_close_list("error-category"); } --------------------------------------------------------------------- Maybe I'm doing things wrong, but I had to use two state variables to control the correct printing of opening and closing tags ("first" was in the non-libxo version, "errors" was added just for libxo). I have to be careful to explicitly record state and to close any of the containers/lists that I open. This could have been much easier to write, if the libxo API was more abstract ("higher level") and printing of XML elements was decoupled from xo_XXX calls. The semantics of xo_open_{list,instance,container} is, that a new level in the tree structure is created, and everything following that call is in the new level, until the corresponding xo_close_*() is called. I'd rather want to describe the structure in a different way, for which I'm using an example syntax: xo_start_instance(instance_name): Creates a new instance *at the same level* as the previous one, the previous one is automatically closed (if any is open). If this is the first list/array item, an xo_open_list with the same name as that of the instance is implied. This call is thus equivalent to (pseudo-code using Python style indenting): if IN_SOME_CONTAINER() xo_close_container_d() if NOT_IN_LIST(instance_name) if IN_SOME_LIST() xo_close_list_d() xo_open_list(instance_name) if INSTANCE_IS_OPEN(instance_name) xo_close_instance(instance_name) xo_open_instance(instance_name) xo_start_container(container_name): Starts a new container at the same level as the previous one, closing any currently open container or list. if IN_SOME_LIST() if IN_SOME_INSTANCE() xo_close_instance_d() xo_close_list_d() if IN_SOME_CONTAINER() xo_close_container_d() xo_open_container() xo_start_level(debug_name): Starts a new sub-tree (which is implicitly done by xo_open_*). No output is generated, except that a previously open container or instance is implicitly closed. The previous state is kept on a stack and will become visible after the corresponding xo_end_level() is called. The debug_name is optional and just there to be checked to match the one passed to a corresponding xo_finish_level(debug_name). if IN_SOME_LIST() or IN_SOME_CONTAINER() close_* xo_push_state() xo_finish_level(): Close all open instances and containers and finish the sub-tree, restore the state before the previous xo_start_level(). if IN_SOME_LIST() or IN_SOME_CONTAINER() close_* xo_pop_state() The following call sequences should generate identical output, with indentation reflecting "pretty" XML: xo_open_container("group1"); xo_open_container("subgroup1"); xo_emit("{:label1/value}"); xo_emit("{:label2/value}"); xo_close_container("subgroup1"); xo_open_list("array1") xo_open_instance("array1"); xo_emit("{:array1-label1/value}"); xo_emit("{:array1-label2/value}"); xo_close_instance("array1"); xo_open_instance("array1"); xo_emit("{:array1-label1/value}"); xo_emit("{:array1-label2/value}"); xo_close_instance("array1"); xo_close_list("array1"); xo_emit("{:label3/value}"); xo_close_container("group1"); value value value value value value value With the xo_start_* functions I'd use the following for the same output: xo_start_container("group1"); xo_start_group(""); // (*1*) xo_start_container("subgroup1"); xo_emit("{:label1/value}"); xo_emit("{:label2/value}"); xo_start_instance("array1"); xo_emit("{:array1-label1/value}"); xo_emit("{:array1-label2/value}"); xo_start_instance("array1"); xo_emit("{:array1-label1/value}"); xo_emit("{:array1-label2/value}"); xo_finish_group(""); xo_emit("{:label3/value}"); (*1*) If no new "group" was started, then the next container would be at the same level in the tree, i.e. would be closed before is opened. All the xo_close can be implied, and I do not have to remember, if an opening tag has been written. The closing tags ore braces are always generated as required. This syntax can express everything that is formally correct with the xo_open/xo_close calls. E.g. it automatically places xo_open_list() around instances - and the names are required to match for the well formed cases anyway. The only "complication" is, that I'd need to put xo_start_group() and xo_finish_group() calls around the output for a sub-tree, else the generated data structure will be "flat". There is also no risk of run-away data structures, due to the omitting of closing tags. This can easily be taken care of by start/finish_group. If there is a subroutine that emits XML/JSON data into a sub-tree, then just wrap the commands in this sub-routine in xo_start_group(func_name) and xo_stop_group(func_name) - the arguments are not used to generate any output, just to generate debug output if XOF_WARN is set. The calls I suggest are more similar to the JSON output generated than to the XML output (which is similar structured to xo_open/close_*). I am convinced, that the libxo API that I propose simplifies use of libxo. It is higher level in the sense, that it is more decoupled from the generated output. In fact, it allows for semantic extensions, that could further simplify use of libxo: 1) If there is no call to xo_emit() within a container or instance, the tags for the container/instance could be omitted. That would allow to always call xo_start_container, before testing for any data to actually put into that container. In the example code from pciconf that I quoted above, that could remove the need for the helper function, if xo_optional_emit() was introduced as a variant of xo_emit(), that is only written if any regular xo_emit() is also printed into the same container. 2) I have seen cases, where list instances are created out of order by the non-libxo code, and to adapt that code to libxo, the loop has to be completely rewritten (leading to different output than before for the XO_TEXT case). This could be fixed with xo_start_instance(), if all the instances were collected between xo_start_group() to xo_finish_group() and all the JSON array structures were only written with all collected instances when the xo_close_container() implied by xo_finish_group() is called. This would allow the following code: xo_start_instance("array1"); xo_emit(array1-label/value1); xo_start_instance("array2"); xo_emit(array2-label/value2); xo_start_instance("array1"); xo_emit(array1-label/value3); leading to the following JSON output: array1 [ { array1-label: value1 } { array1-label: value3 } ], array2 [ { array2-label: value2 } ] This is of course also possible with xo_open/xo_close, but as soon you start to dissociate these calls from the output (by delaying the output), you could as well use the xo_start_* variants I suggest ... Anyway, except for the badly chosen name "xo_optional_emit()" (which as described above is for data, that should be omitted if there is no xo_emit() within the same container/instance, I think this is a workable API for libxo, which just spares the programmer from working as if he was directly writing XML output (with the addition of xo_open_list(), which in fact is ignored for XML and just required for JSON). The two APIs could be mixed with each other, if the semantics of xo_open_* and xo_start_* were aligned. E.g. any xo_open_list() after a xo_start_container/instance would just be interpreted as xo_start_group(); xo_start_list(); xo_finish_group(); and similar for containers (instances are only allowed inside lists and are thus covered by the wrapping of xo_start_list() shown above). It is obvious that it is not possible to specify xo_open*() via xo_start and xo_start via xo_open without generating cyclic dependencies, but the real implementation could be either way (with emulation of xo_open via xo_start being the simpler case, since it only requires the wrapping with xo_start_group). Here is the example code from above with the xo_start API: --------------------------------------------------------------------- static void print_bits(const char *header, struct bit_table *table, uint32_t mask) { int first; xo_start_group("print_bits1"); // just a debug label xo_start_instance("error-category"); first = 1; for (; table->desc != NULL; table++) if (mask & table->mask) { if (first) { xo_emit("{:category/%14s} = ", header); xo_start_group("print_bits2"); first = 0; } else xo_emit(" "); xo_start_instance("detected-error"); xo_emit("{:description/%s}\n", table->desc); mask &= ~table->mask; } if (mask != 0) { if (first) { xo_emit("{:category/%14s} = ", header); xo_start_group("print_bits2"); first = 0; } else xo_emit(" "); xo_emit("Unknown: {:unknown-errors-bitmask/0x%08x}\n", mask); } if (!first) xo_finish_group("print_bits2"); xo_finish_group("print_bits1"); } --------------------------------------------------------------------- (While doing the conversion, I noticed that there might be a bug in the call sequence of my code quoted in the first example with xo_open API, which I had not spotted before.) The above example is not only shorter, the diff against the non-libxo version is also significantly smaller and easier to understand and review. I'm open to further discussion and I understand, that it is important to not void effort put in code that uses xo_open*(). But I'd really love to be able to use a higher-level API and I think what I suggested fits the bill ;-) Best regards, STefan