From owner-freebsd-arch@FreeBSD.ORG Sun Jun 30 09:49:30 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [8.8.178.116]) by hub.freebsd.org (Postfix) with ESMTP id 992059AB; Sun, 30 Jun 2013 09:49:30 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from butcher-nb.yandex.net (hub.freebsd.org [IPv6:2001:1900:2254:206c::16:88]) by mx2.freebsd.org (Postfix) with ESMTP id 9039D4E35; Sun, 30 Jun 2013 09:49:28 +0000 (UTC) Message-ID: <51CFFE71.9080802@FreeBSD.org> Date: Sun, 30 Jun 2013 13:46:25 +0400 From: "Andrey V. Elsukov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: freebsd-arch@freebsd.org Subject: [RFC] Migrate network statistics to PCPU counters X-Enigmail-Version: 1.4.6 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigB504481EDEFC3CE9845874F7" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Jun 2013 09:49:30 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigB504481EDEFC3CE9845874F7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Hi All, A quite time we have in the tree implementation of the percpu counters. The usage of PCPU counters shows better performance and precision, most notably on the machines with many CPUs. I prepared several patches, that moves most of statistics structures in the network stack to the PCPU counters. These patches can be divided to the three parts: 1. Prepare all structures to migration. In most cases this is similar to s/some_type_t/uint64_t/ 2. Add several macros to reduce the code rewriting. 3. Migrate all structures to PCPU counters. Since first part breaks ABI, I want to commit these patches to the head/ before stable/10 will be created. This is the list of structures: ahstat, espstat, ipcompstat, ipipstat, ipsec4stat, ipsec6stat, pfkeystat, mrtstat, pimstat, arpstat, ip6stat, icmp6stat, in6_ifstat, icmp6_ifstat, rip6stat, pim6stat, mrt6stat, udpstat. ipstat and tcpstat already moved to PCPU counters, I just updated implementation. Also there is several structures I didn't touch, e.g. igmpstat, sctpstat. The second part of patches adds several helper macros to sys/counter.h and net/vnet.h. The main idea is that we declare array of counter_u64_t with number of elements equal to number of uint64_t elements in the stats structure and use offsetof() to get access to needed counter. In the same time, the stats structure used as interface with userland. All patches can be found here: http://people.freebsd.org/~ae/stats/ I didn't do any benchmarks yet, for now I just want to discuss implementation and acceptability of these patches. --=20 WBR, Andrey V. Elsukov --------------enigB504481EDEFC3CE9845874F7 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQEcBAEBAgAGBQJRz/52AAoJEAHF6gQQyKF6S+sH/3FIsf4kBoGQHXDnweKm4Sev OhvHIGc4OsKuW24NUdPqNORafjCXzPOQc4Hk761c5t6cRBnrFvtM+1ulS0K8gqmD AN1U6XNnwyKDyvmO/8zQBEGCROeSV4Lb6ejmFuIfm9zixPSekOJJmZkbtevxIwal NOAxY5ghK+yOrX3hz/WJqo2t51Dcw1pfGS/CV1MvwRMvw/L5ASMZ6mhv277MezXe AByOB//lZo3SS1R1bHWkv2yhkZNbSuTKplSGeGS9+QYUrU2ioJ9wFKEQCtJFBhUT +67kOm1bTErH4qpXrI65YjbS5fsFna2A5QmXxzQQ6QxnsMUX5kSpk5O/OTLUB9U= =6000 -----END PGP SIGNATURE----- --------------enigB504481EDEFC3CE9845874F7-- From owner-freebsd-arch@FreeBSD.ORG Sun Jun 30 17:54:02 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 7C24271D; Sun, 30 Jun 2013 17:54:02 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) by mx1.freebsd.org (Postfix) with ESMTP id E6B8516E0; Sun, 30 Jun 2013 17:54:01 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.7/8.14.7) with ESMTP id r5UHrviO076656; Sun, 30 Jun 2013 20:53:57 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.8.3 kib.kiev.ua r5UHrviO076656 Received: (from kostik@localhost) by tom.home (8.14.7/8.14.7/Submit) id r5UHrvcJ076655; Sun, 30 Jun 2013 20:53:57 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sun, 30 Jun 2013 20:53:57 +0300 From: Konstantin Belousov To: "Andrey V. Elsukov" Subject: Re: [RFC] Migrate network statistics to PCPU counters Message-ID: <20130630175357.GJ91021@kib.kiev.ua> References: <51CFFE71.9080802@FreeBSD.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="8vsJbyf2sKRbOxj1" Content-Disposition: inline In-Reply-To: <51CFFE71.9080802@FreeBSD.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Jun 2013 17:54:02 -0000 --8vsJbyf2sKRbOxj1 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Jun 30, 2013 at 01:46:25PM +0400, Andrey V. Elsukov wrote: > Hi All, >=20 > A quite time we have in the tree implementation of the percpu counters. > The usage of PCPU counters shows better performance and precision, most > notably on the machines with many CPUs. >=20 > I prepared several patches, that moves most of statistics structures in > the network stack to the PCPU counters. These patches can be divided to > the three parts: >=20 > 1. Prepare all structures to migration. In most cases this is similar > to s/some_type_t/uint64_t/ > 2. Add several macros to reduce the code rewriting. > 3. Migrate all structures to PCPU counters. >=20 > Since first part breaks ABI, I want to commit these patches to the head/ > before stable/10 will be created. This is the list of structures: >=20 > ahstat, espstat, ipcompstat, ipipstat, ipsec4stat, ipsec6stat, > pfkeystat, mrtstat, pimstat, arpstat, ip6stat, icmp6stat, in6_ifstat, > icmp6_ifstat, rip6stat, pim6stat, mrt6stat, udpstat. >=20 > ipstat and tcpstat already moved to PCPU counters, I just updated > implementation. Also there is several structures I didn't touch, e.g. > igmpstat, sctpstat. >=20 > The second part of patches adds several helper macros to sys/counter.h > and net/vnet.h. The main idea is that we declare array of counter_u64_t > with number of elements equal to number of uint64_t elements in the > stats structure and use offsetof() to get access to needed counter. In > the same time, the stats structure used as interface with userland. Only reacting to the last sentence. I think this is notoriously bad, the in-kernel interfaces and usermode ABI should be split. The reuse of the structures for kernel and for interfacing to usermode is the reason why we cannot ensure the stability of the management ABI even on stable. I suggest to not doing this, instead having the dedicated definitions for user mode, and also to add padding to the usermode interface for future extensions. >=20 > All patches can be found here: >=20 > http://people.freebsd.org/~ae/stats/ >=20 > I didn't do any benchmarks yet, for now I just want to discuss > implementation and acceptability of these patches. >=20 > --=20 > WBR, Andrey V. Elsukov >=20 --8vsJbyf2sKRbOxj1 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (FreeBSD) iQIcBAEBAgAGBQJR0HC0AAoJEJDCuSvBvK1BjCoP/jcrZLSk6krGkjKxZq8+3oZV tdCK6L4IIDfCY7f7PnNI6+2hbgCGRX9hulKILgDXCqN2sPv0Wgk/zI1dOuPhQ6ei LkAG9Mc7kcdmvq8rs7iSgDPZO5NmQHPwAzx3fxGOQHu9xg5IGhAg/UsbnlGPkXcM sUJlV26++mWQX4V+qtxxuYWmkVuTsYSuPQpuOrKvIPAfCBnjphY/dXpYGA/7rAfs uiqhfBHJ2+Zf3m/ajgewa+iWjzIOfMCwQeqCPkkll/tj2il8NdnwX0wY5omSNxy3 qEJ6TGLluP5i7jhq/atEQaMxX+08ZY3mTRYbjg942P+iqi+Z9WivB8xyUQMKofLV waNIwZDqrk9Z/QAEfb2mIwRhoyeFVuQY2gDwrvHacIW5dRzwwGTjKGFthYKoGRQI +2f3jARq/FYyAkNSzl1N+w4AFTXZjIoKGKBLK0+b5x46gviVsTOH/PBPdDfLdK03 R/+7l90KdoUnJxw7vf3o8UeZNAD7abbvG1ck1pAlb0djKFA8tSK5iMLrxMMO7be0 xQazPxgtS37C30125/W5RqNftOuWKEVqGlZoNiSimSV4HjC1DtocUTvf3WaxJIg7 /TKXtbKYP0qPZcvADhrFDjLi0exqbqDl2nzcJJeARmZEPKdr96jKw8KnZI/XZsnk 9sSaOwYIcVEoIksqg3LA =SotU -----END PGP SIGNATURE----- --8vsJbyf2sKRbOxj1-- From owner-freebsd-arch@FreeBSD.ORG Sun Jun 30 19:20:40 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [IPv6:2001:1900:2254:206a::19:2]) by hub.freebsd.org (Postfix) with ESMTP id D92E39F4 for ; Sun, 30 Jun 2013 19:20:40 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from butcher-nb.yandex.net (hub.freebsd.org [IPv6:2001:1900:2254:206c::16:88]) by mx2.freebsd.org (Postfix) with ESMTP id 249F320CE; Sun, 30 Jun 2013 19:20:39 +0000 (UTC) Message-ID: <51D0845C.7060508@FreeBSD.org> Date: Sun, 30 Jun 2013 23:17:48 +0400 From: "Andrey V. Elsukov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: Konstantin Belousov Subject: Re: [RFC] Migrate network statistics to PCPU counters References: <51CFFE71.9080802@FreeBSD.org> <20130630175357.GJ91021@kib.kiev.ua> In-Reply-To: <20130630175357.GJ91021@kib.kiev.ua> X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Jun 2013 19:20:40 -0000 On 30.06.2013 21:53, Konstantin Belousov wrote: >> The second part of patches adds several helper macros to sys/counter.h >> and net/vnet.h. The main idea is that we declare array of counter_u64_t >> with number of elements equal to number of uint64_t elements in the >> stats structure and use offsetof() to get access to needed counter. In >> the same time, the stats structure used as interface with userland. > Only reacting to the last sentence. > > I think this is notoriously bad, the in-kernel interfaces and usermode ABI > should be split. The reuse of the structures for kernel and for interfacing > to usermode is the reason why we cannot ensure the stability of the > management ABI even on stable. > > I suggest to not doing this, instead having the dedicated definitions for > user mode, and also to add padding to the usermode interface for future > extensions. Actually each structure described above only used in the userland. In the kernel we use arrays of counters and do fetch values for each PCPU counter, then copy them into userland via stats structures. -- WBR, Andrey V. Elsukov From owner-freebsd-arch@FreeBSD.ORG Wed Jul 3 13:19:29 2013 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 8798923F for ; Wed, 3 Jul 2013 13:19:29 +0000 (UTC) (envelope-from luiz.souza@ad.com.br) Received: from tx2outboundpool.messaging.microsoft.com (tx2ehsobe001.messaging.microsoft.com [65.55.88.11]) by mx1.freebsd.org (Postfix) with ESMTP id 4A21D1C2D for ; Wed, 3 Jul 2013 13:19:28 +0000 (UTC) Received: from mail139-tx2-R.bigfish.com (10.9.14.254) by TX2EHSOBE014.bigfish.com (10.9.40.34) with Microsoft SMTP Server id 14.1.225.23; Wed, 3 Jul 2013 13:19:21 +0000 Received: from mail139-tx2 (localhost [127.0.0.1]) by mail139-tx2-R.bigfish.com (Postfix) with ESMTP id 3D42B600AF; Wed, 3 Jul 2013 13:19:21 +0000 (UTC) X-Forefront-Antispam-Report: CIP:207.46.4.203; KIP:(null); UIP:(null); IPV:NLI; H:SN2PRD8002HT001.lamprd80.prod.outlook.com; RD:none; EFVD:NLI X-SpamScore: -4 X-BigFish: PS-4(zz98dI9371Ic85fh148cI1432Ide40hzz1f42h1ee6h1de0h1d18h1fdah2073h1202h1e76h1d1ah1d2ah1fc6hzzz2fh2a8h668h839hd25he5bhf0ah1288h12a5h12bdh137ah139eh1441h1504h1537h162dh1631h1662h1758h1898h18e1h1946h19b5h19ceh1ad9h1b0ah1bceh1d0ch1d2eh1d3fh1dfeh1dffh1e23h34h1155h) Received-SPF: pass (mail139-tx2: domain of ad.com.br designates 207.46.4.203 as permitted sender) client-ip=207.46.4.203; envelope-from=luiz.souza@ad.com.br; helo=SN2PRD8002HT001.lamprd80.prod.outlook.com ; .outlook.com ; Received: from mail139-tx2 (localhost.localdomain [127.0.0.1]) by mail139-tx2 (MessageSwitch) id 1372857559181984_22565; Wed, 3 Jul 2013 13:19:19 +0000 (UTC) Received: from TX2EHSMHS018.bigfish.com (unknown [10.9.14.246]) by mail139-tx2.bigfish.com (Postfix) with ESMTP id 1E5AD160CB3; Wed, 3 Jul 2013 13:19:19 +0000 (UTC) Received: from SN2PRD8002HT001.lamprd80.prod.outlook.com (207.46.4.203) by TX2EHSMHS018.bigfish.com (10.9.99.118) with Microsoft SMTP Server (TLS) id 14.1.225.23; Wed, 3 Jul 2013 13:19:18 +0000 Received: from [10.10.1.8] (201.72.203.67) by pod51028.outlook.com (10.27.50.185) with Microsoft SMTP Server (TLS) id 14.15.129.14; Wed, 3 Jul 2013 13:19:15 +0000 Subject: Re: FDT Support for GPIO (gpiobus and friends) MIME-Version: 1.0 (Apple Message framework v1085) Content-Type: multipart/mixed; boundary="Apple-Mail-13--586638903" From: Luiz Otavio O Souza In-Reply-To: <1142ABEB-7FDA-4CFE-9D12-F8FD2D4C85D6@bsdimp.com> Date: Wed, 3 Jul 2013 10:19:12 -0300 Message-ID: References: <20121205.060056.592894859995638978.hrs@allbsd.org> <04AEF097-025D-4B3F-A345-98878AE4A822@ad.com.br> <1142ABEB-7FDA-4CFE-9D12-F8FD2D4C85D6@bsdimp To: Warner Losh X-Mailer: Apple Mail (2.1085) X-Originating-IP: [201.72.203.67] X-OriginatorOrg: ad.com.br Cc: freebsd-arch@FreeBSD.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Jul 2013 13:19:29 -0000 --Apple-Mail-13--586638903 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="us-ascii" On Jun 24, 2013, at 11:52 AM, Warner Losh wrote: > I'm loving this patch, mostly. >=20 > But I don't see where the ivar gets freed... Perhaps this is because = we dont' support detaching GPIOs? >=20 No, you are right. Even if most (if not all) gpio drivers don't allow = the detach, gpiobus can be built as a module and as such it has a detach = routine which IMO should do the right thing. Thanks for noticing this. Here is a patch with a fix (i hope it's correct). Luiz --Apple-Mail-13--586638903 Content-Disposition: attachment; filename="gpioled-fdt.diff" Content-Type: application/octet-stream; name="gpioled-fdt.diff" Content-Transfer-Encoding: 7bit Index: dev/gpio/gpiobus.c =================================================================== --- dev/gpio/gpiobus.c (revision 251700) +++ dev/gpio/gpiobus.c (working copy) @@ -27,6 +27,8 @@ #include __FBSDID("$FreeBSD$"); +#include "opt_platform.h" + #include #include #include @@ -41,6 +43,12 @@ #include #include +#ifdef FDT +#include +#include +#include +#endif + #include #include #include "gpio_if.h" @@ -83,6 +91,130 @@ #define GPIOBUS_ASSERT_UNLOCKED(_sc) mtx_assert(&_sc->sc_mtx, MA_NOTOWNED); +#ifdef FDT +static int +gpiobus_fdt_parse_pins(device_t dev) +{ + struct gpiobus_ivar *devi; + struct gpiobus_softc *sc; + int i, len; + pcell_t *gpios; + phandle_t gpio, node; + + /* Retrieve the FDT node and check for gpios property. */ + node = ofw_bus_get_node(dev); + if ((len = OF_getproplen(node, "gpios")) < 0) + return (EINVAL); + + /* Retrieve the gpios property. */ + gpios = malloc(len, M_DEVBUF, M_NOWAIT | M_ZERO); + if (gpios == NULL) + return (ENOMEM); + if (OF_getprop(node, "gpios", gpios, len) < 0) { + free(gpios, M_DEVBUF); + return (EINVAL); + } + + /* + * The OF_getprop() is returning 4 pcells. + * The first one is the GPIO controller phandler. + * The last three are GPIO pin, GPIO pin direction and GPIO pin flags. + */ + if ((len / sizeof(pcell_t)) % 4) { + free(gpios, M_DEVBUF); + return (EINVAL); + } + devi = GPIOBUS_IVAR(dev); + devi->npins = len / (sizeof(pcell_t) * 4); + devi->pins = malloc(sizeof(uint32_t) * devi->npins, M_DEVBUF, + M_NOWAIT | M_ZERO); + if (devi->pins == NULL) { + free(gpios, M_DEVBUF); + return (ENOMEM); + } + + for (i = 0; i < devi->npins; i++) { + + /* Verify if we're attaching to the correct gpio controller. */ + gpio = OF_instance_to_package(fdt32_to_cpu(gpios[i * 4 + 0])); + if (!OF_hasprop(gpio, "gpio-controller") || + gpio != ofw_bus_get_node(device_get_parent( + device_get_parent(dev)))) { + free(devi->pins, M_DEVBUF); + free(gpios, M_DEVBUF); + return (EINVAL); + } + + /* Attach the child device to gpiobus. */ + sc = device_get_softc(device_get_parent(dev)); + + devi->pins[i] = fdt32_to_cpu(gpios[i * 4 + 1]); + /* (void)gpios[i * 4 + 2] - GPIO pin direction */ + /* (void)gpios[i * 4 + 3] - GPIO pin flags */ + + if (devi->pins[i] > sc->sc_npins) { + device_printf(dev, "invalid pin %d, max: %d\n", + devi->pins[i], sc->sc_npins - 1); + free(devi->pins, M_DEVBUF); + free(gpios, M_DEVBUF); + return (EINVAL); + } + + /* + * Mark pin as mapped and give warning if it's already mapped. + */ + if (sc->sc_pins_mapped[devi->pins[i]]) { + device_printf(dev, + "warning: pin %d is already mapped\n", + devi->pins[i]); + free(devi->pins, M_DEVBUF); + free(gpios, M_DEVBUF); + return (EINVAL); + } + sc->sc_pins_mapped[devi->pins[i]] = 1; + } + + free(gpios, M_DEVBUF); + return (0); +} + +int +gpiobus_fdt_add_child(driver_t *driver, device_t bus, phandle_t childnode) +{ + struct gpiobus_ivar *devi; + device_t child; + + devi = malloc(sizeof(*devi), M_DEVBUF, M_NOWAIT | M_ZERO); + if (devi == NULL) + return (-1); + + if (ofw_bus_gen_setup_devinfo(&devi->ofw, childnode) != 0) { + device_printf(bus, "could not set up devinfo\n"); + free(devi, M_DEVBUF); + return (-1); + } + + /* Add newbus device for the child. */ + child = device_add_child(bus, driver->name, -1); + if (child == NULL) { + device_printf(bus, "could not add child: %s\n", + devi->ofw.obd_name); + /* XXX should unmap */ + ofw_bus_gen_destroy_devinfo(&devi->ofw); + free(devi, M_DEVBUF); + return (-1); + } + device_set_ivars(child, devi); + if (gpiobus_fdt_parse_pins(child) != 0) { + device_delete_child(bus, child); + ofw_bus_gen_destroy_devinfo(&devi->ofw); + free(devi, M_DEVBUF); + return (-1); + } + return (0); +} +#endif + static void gpiobus_print_pins(struct gpiobus_ivar *devi) { @@ -151,6 +283,7 @@ if (i >= sc->sc_npins) { device_printf(child, "invalid pin %d, max: %d\n", i, sc->sc_npins - 1); + free(devi->pins, M_DEVBUF); return (EINVAL); } @@ -161,6 +294,7 @@ if (sc->sc_pins_mapped[i]) { device_printf(child, "warning: pin %d is already mapped\n", i); + free(devi->pins, M_DEVBUF); return (EINVAL); } sc->sc_pins_mapped[i] = 1; @@ -207,6 +341,7 @@ /* * Get parent's pins and mark them as unmapped */ + bus_generic_probe(dev); bus_enumerate_hinted_children(dev); return (bus_generic_attach(dev)); } @@ -218,19 +353,30 @@ static int gpiobus_detach(device_t dev) { - struct gpiobus_softc *sc = GPIOBUS_SOFTC(dev); - int err; + struct gpiobus_softc *sc; + struct gpiobus_ivar *devi; + device_t *devlist; + int i, err, ndevs; + sc = GPIOBUS_SOFTC(dev); KASSERT(mtx_initialized(&sc->sc_mtx), ("gpiobus mutex not initialized")); GPIOBUS_LOCK_DESTROY(sc); if ((err = bus_generic_detach(dev)) != 0) return (err); + if ((err = device_get_children(dev, &devlist, &ndevs)) != 0) + return (err); + for (i = 0; i < ndevs; i++) { + device_delete_child(dev, devlist[i]); + devi = GPIOBUS_IVAR(devlist[i]); + if (devi->pins) { + free(devi->pins, M_DEVBUF); + devi->pins = NULL; + } + } + free(devlist, M_TEMP); - /* detach and delete all children */ - device_delete_children(dev); - if (sc->sc_pins_mapped) { free(sc->sc_pins_mapped, M_DEVBUF); sc->sc_pins_mapped = NULL; @@ -445,6 +591,17 @@ return GPIO_PIN_TOGGLE(sc->sc_dev, devi->pins[pin]); } +#ifdef FDT +static const struct ofw_bus_devinfo * +gpiobus_get_devinfo(device_t bus, device_t child) +{ + struct gpiobus_ivar *devi; + + devi = device_get_ivars(child); + return (&devi->ofw); +} +#endif + static device_method_t gpiobus_methods[] = { /* Device interface */ DEVMETHOD(device_probe, gpiobus_probe), @@ -473,6 +630,16 @@ DEVMETHOD(gpiobus_pin_set, gpiobus_pin_set), DEVMETHOD(gpiobus_pin_toggle, gpiobus_pin_toggle), +#ifdef FDT + /* OFW bus interface */ + DEVMETHOD(ofw_bus_get_devinfo, gpiobus_get_devinfo), + DEVMETHOD(ofw_bus_get_compat, ofw_bus_gen_get_compat), + DEVMETHOD(ofw_bus_get_model, ofw_bus_gen_get_model), + DEVMETHOD(ofw_bus_get_name, ofw_bus_gen_get_name), + DEVMETHOD(ofw_bus_get_node, ofw_bus_gen_get_node), + DEVMETHOD(ofw_bus_get_type, ofw_bus_gen_get_type), +#endif + DEVMETHOD_END }; Index: dev/gpio/gpiobusvar.h =================================================================== --- dev/gpio/gpiobusvar.h (revision 251700) +++ dev/gpio/gpiobusvar.h (working copy) @@ -30,10 +30,16 @@ #ifndef __GPIOBUS_H__ #define __GPIOBUS_H__ +#include "opt_platform.h" + #include #include #include +#ifdef FDT +#include +#endif + #define GPIOBUS_IVAR(d) (struct gpiobus_ivar *) device_get_ivars(d) #define GPIOBUS_SOFTC(d) (struct gpiobus_softc *) device_get_softc(d) @@ -50,8 +56,15 @@ struct gpiobus_ivar { +#ifdef FDT + struct ofw_bus_devinfo ofw; /* FDT device info */ +#endif uint32_t npins; /* pins total */ uint32_t *pins; /* pins map */ }; +#ifdef FDT +int gpiobus_fdt_add_child(driver_t *, device_t, phandle_t); +#endif + #endif /* __GPIOBUS_H__ */ Index: dev/gpio/gpioled.c =================================================================== --- dev/gpio/gpioled.c (revision 251700) +++ dev/gpio/gpioled.c (working copy) @@ -27,6 +27,8 @@ #include __FBSDID("$FreeBSD$"); +#include "opt_platform.h" + #include #include #include @@ -43,11 +45,20 @@ #include #include "gpiobus_if.h" +#ifdef FDT +#include +#include +#include +#include +#endif + /* * Only one pin for led */ #define GPIOLED_PIN 0 +#define GPIOLED_MAXBUF 32 + #define GPIOLED_LOCK(_sc) mtx_lock(&(_sc)->sc_mtx) #define GPIOLED_UNLOCK(_sc) mtx_unlock(&(_sc)->sc_mtx) #define GPIOLED_LOCK_INIT(_sc) \ @@ -68,6 +79,35 @@ static int gpioled_attach(device_t); static int gpioled_detach(device_t); +#ifdef FDT +static void +gpioled_identify(driver_t *driver, device_t bus) +{ + phandle_t child, leds, root; + + root = OF_finddevice("/"); + if (root == 0) + return; + leds = fdt_find_compatible(root, "gpio-leds", 1); + if (leds == 0) + return; + for (child = OF_child(leds); child != 0; child = OF_peer(child)) { + + /* Check and process 'status' property. */ + if (!(fdt_is_enabled(child))) + continue; + + /* Property gpios must exist. */ + if (!OF_hasprop(child, "gpios")) + continue; + + /* Add the gpiobus child. */ + if (gpiobus_fdt_add_child(driver, bus, child) != 0) + continue; + } +} +#endif + static void gpioled_control(void *priv, int onoff) { @@ -93,15 +133,27 @@ gpioled_attach(device_t dev) { struct gpioled_softc *sc; - const char *name; + char *name; +#ifdef FDT + phandle_t node; +#endif sc = device_get_softc(dev); sc->sc_dev = dev; sc->sc_busdev = device_get_parent(dev); GPIOLED_LOCK_INIT(sc); +#ifdef FDT + name = malloc(GPIOLED_MAXBUF + 1, M_DEVBUF, M_NOWAIT | M_ZERO); + if (name == NULL) + return (ENOMEM); + node = ofw_bus_get_node(dev); + if (OF_getprop(node, "label", name, GPIOLED_MAXBUF) == -1) + OF_getprop(node, "name", name, GPIOLED_MAXBUF); +#else if (resource_string_value(device_get_name(dev), device_get_unit(dev), "name", &name)) name = NULL; +#endif GPIOBUS_PIN_SETFLAGS(sc->sc_busdev, sc->sc_dev, GPIOLED_PIN, GPIO_PIN_OUTPUT); @@ -109,6 +161,9 @@ sc->sc_leddev = led_create(gpioled_control, sc, name ? name : device_get_nameunit(dev)); +#ifdef FDT + free(name, M_DEVBUF); +#endif return (0); } @@ -130,6 +185,9 @@ static device_method_t gpioled_methods[] = { /* Device interface */ +#ifdef FDT + DEVMETHOD(device_identify, gpioled_identify), +#endif DEVMETHOD(device_probe, gpioled_probe), DEVMETHOD(device_attach, gpioled_attach), DEVMETHOD(device_detach, gpioled_detach), --Apple-Mail-13--586638903-- From owner-freebsd-arch@FreeBSD.ORG Thu Jul 4 21:53:05 2013 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 0E1A5483 for ; Thu, 4 Jul 2013 21:53:05 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.dawidek.net (garage.dawidek.net [91.121.88.72]) by mx1.freebsd.org (Postfix) with ESMTP id 91EAA1089 for ; Thu, 4 Jul 2013 21:53:03 +0000 (UTC) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) by mail.dawidek.net (Postfix) with ESMTPSA id 73BADF40 for ; Thu, 4 Jul 2013 23:48:22 +0200 (CEST) Date: Thu, 4 Jul 2013 23:53:30 +0200 From: Pawel Jakub Dawidek To: arch@FreeBSD.org Subject: General purpose library for name/value pairs. Message-ID: <20130704215329.GG1402@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="pe+tqlI1iYzVj1X/" Content-Disposition: inline X-OS: FreeBSD 10.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Jul 2013 21:53:05 -0000 --pe+tqlI1iYzVj1X/ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi. During my last project I took time to implement general purpose name/value pair library similar to what exists in Solaris/IllumOS. Currently we have plenty of random methods to pass data around: - nmount(2) implements its own name/value pair approach, - GEOM uses XML to pass data to userland, - various sysctls and ioctls use binary structures to export data to userland. The libnv library I implemented operates around two types: - nvpair_t that describes single name/value pair, - nvlist_t that describes a list of name/value pairs. In most cases you can do without nvpair_t. Let's try an example: nvlist_t *nvl; nvl =3D nvlist_create(0); nvlist_add_string(nvl, "filename", "/tmp/foo"); nvlist_add_int32(nvl, "flags", O_CREAT | O_WRONLY); nvlist_add_uint16(nvl, "mode", 0600); if (nvlist_send(sock, nvl) < 0) { nvlist_destroy(nvl); warn(1, "nvlist_send() failed"); return (-1); } nvlist_destroy(nvl); return (0); As you can see we first allocate nvlist and add three elements to it. Note that we neither check if allocation succeeded nor individual additions. The library is designed so that write-like operations can gracefully handle previous failures. For example if nvlist_create() fails, it returns NULL, but nvlist_add_string() detects if nvl is NULL and does nothing. If nvlist_create() succeeded, but nvlist_add_string() failed, nvlist_add_string() will set an error within nvlist structure. Once we build out nvlist we can either call nvlist_send(), which will fail (if nvl is NULL or has an error set) or we can call nvlist_error() to check for error (if nvl is NULL, ENOMEM will be returned). nvlist_destroy() also accepts NULL nvl and it never modifies global errno value for convenience (that's why warn() above will show correct error message). The API doesn't support types like 'int', 'long' or any other type with arch-dependent size. This is on purpose. I want this API to be used for communication between userland and kernel, even if userland is 32bit and kernel is 64bit, as well as for network communication. nvlist_send() internally uses nvlist_pack() function that produce binary blob to send over the network or to the kernel. It implements adaptive endianess concept known from ZFS - it doesn't convert all values to the network byte order, but it records machine's order in the header, so if receiver uses different byte order, only then swapping will be done. If both machines are either little or big endian no extra swapping will occur. The library allows to send and receive descriptors, of course only over UNIX domain sockets: nvlist_t *nvl; int fd; fd =3D open("/etc/passwd", O_RDONLY); if (fd < 0) err(1, "open(/etc/passwd) failed"); nvl =3D nvlist_create(0); nvlist_add_string(nvl, "filename", "/etc/passwd"); nvlist_move_descriptor(nvl, "fd", fd); if (nvlist_send(sock, nvl) < 0) err(1, "nvlist_send() failed"); nvlist_destroy(nvl); Also note that I used nvlist_move_descriptor() function and not nvlist_add_descriptor(). The former will allow nvlist to consume the given descriptor, so we don't have to close it, the latter will dup(2) the given descriptor and then add it to the nvlist. So, to add elements to the list one can use either nvlist_add_() or nvlist_move_() functions. To get the values one also have two choices: nvlist_get_() and nvlist_take_() - the former just returns the value (but the value is still part of the nvlist) and the latter removes associated nvpair from nvlist and returns its value. For example: nvlist_t *nvl; const char *command; char *filename; int fd; nvl =3D nvlist_recv(sock); if (nvl =3D=3D NULL) err(1, "nvlist_recv() failed"); command =3D nvlist_get_string(nvl, "command"); filename =3D nvlist_take_string(nvl, "filename"); fd =3D nvlist_take_descriptor(nvl, "fd"); printf("command=3D%s filename=3D%s fd=3D%d\n", command, filename, fd); nvlist_destroy(nvl); free(filename); close(fd); /* command was freed by nvlist_destroy() */ Of course nvlist_move_() and nvlist_take_() functions are only available for types that require some resources (memory or descriptor), so there is nvlist_move_string(), nvlist_move_int8_array(), but there is no nvlist_move_int8(), as it doesn't make sense. Also note that if there is no element matching the given name and type, the function will fail on internal assertion. This is caller's responsibility to verify if the given element is on the list before trying to get it using nvlist_exists_() API: if (nvlist_exists_string(nvl, "command")) command =3D nvlist_get_string(nvl, "command"); else command =3D NULL; To iterate over all elements one can do the following: nvlist_t *nvl; nvpair_t *nvp; nvl =3D nvlist_recv(sock); if (nvl =3D=3D NULL) err(1, "nvlist_recv() failed"); for (nvp =3D nvlist_first_nvpair(nvl); nvp !=3D NULL; nvp =3D nvlist_next_nvpair(nvl, nvp)) { printf("name=3D%s type=3D%d\n", nvpair_name(nvp), nvpair_type(nvp)); } For every function that takes name as an argument there are two more functions that allow to provide name as format string and arguments or as va_list, for example: int16_t nvlist_get_int16(const nvlist_t *nvl, const char *name); int16_t nvlist_getf_int16(const nvlist_t *nvl, const char *namefmt, ...) _= _printflike(2, 3); int16_t nvlist_getv_int16(const nvlist_t *nvl, const char *namefmt, va_lis= t nameap) __printflike(2, 0); The whole implementation can be found here: http://people.freebsd.org/~pjd/libnv.tgz Only the nv.h header file is public, so yes, nvlist and nvpair structures are not exposed to callers. I'd be grateful for opinions. This work was sponsored by the FreeBSD Foundation. --=20 Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://mobter.com --pe+tqlI1iYzVj1X/ Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (FreeBSD) iEYEARECAAYFAlHV7tkACgkQForvXbEpPzQ/0gCgscOMG5L/qrK0k+cVIvr4wguR OYEAoOPt682UHpnFIUfkeTrRJ99sciAK =gw5Q -----END PGP SIGNATURE----- --pe+tqlI1iYzVj1X/-- From owner-freebsd-arch@FreeBSD.ORG Fri Jul 5 07:16:26 2013 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2BE6B79B; Fri, 5 Jul 2013 07:16:26 +0000 (UTC) (envelope-from phk@phk.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id 8EBE41807; Fri, 5 Jul 2013 07:16:25 +0000 (UTC) Received: from critter.freebsd.dk (unknown [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id 49C873EB32; Fri, 5 Jul 2013 07:07:54 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.7/8.14.7) with ESMTP id r6577rMM004819; Fri, 5 Jul 2013 07:07:53 GMT (envelope-from phk@phk.freebsd.dk) To: Pawel Jakub Dawidek Subject: Re: General purpose library for name/value pairs. In-reply-to: <20130704215329.GG1402@garage.freebsd.pl> From: "Poul-Henning Kamp" References: <20130704215329.GG1402@garage.freebsd.pl> Content-Type: text/plain; charset=ISO-8859-1 Date: Fri, 05 Jul 2013 07:07:53 +0000 Message-ID: <4818.1373008073@critter.freebsd.dk> Cc: arch@FreeBSD.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Jul 2013 07:16:26 -0000 In message <20130704215329.GG1402@garage.freebsd.pl>, Pawel Jakub Dawidek write s: >Currently we have plenty of random methods to pass data around: >- nmount(2) implements its own name/value pair approach, >- GEOM uses XML to pass data to userland, >- various sysctls and ioctls use binary structures to export data to > userland. You even overlooked a number of other marshalling implementations, NETGRAPH for instance. As should be obvious from the above list, where I'm guilty of most of it, I think this is a fundamentally a good and overdue idea. I am not entirely convinced that one-size-fits-all is possible, but a name/value list is basically what I ended up with both in nmount() and GEOM::gctl. Exporting complex state with XML is very hard to beat when it comes to code complexity in the kernel. For one thing, you'd need nested n/v lists which is where I decided in GEOM that XML was the way to go. Also, exporting performance counters via mmap(2) (see: devstat) is far superior to anything else, because it means no-syscall high-speed monitoring is possible. ifconfig(8) and netstat(1) could really use a total rewrite along those lines. So, yes I think your n/v idea has a lot of merits, but it's not going to be The Final Solution, and you should not force it on problems where we have better solutions. Implementation issues: Are duplicate names allowed, and if so, do they keep stable order ? Otherwise we will need some other API-convenient form of array-simulation, for instance for variable number of slices in GEOM slicers etc. >Note that we neither check if allocation succeeded nor individual >additions. The library is designed so that write-like operations can >gracefully handle previous failures. Yes, this is important, otherwise the error handling gets too maddening. Also remember a function for debugging which renders a nvlist_t (into a sbuf ?) >So, to add elements to the list one can use either nvlist_add_() >or nvlist_move_() functions. To get the values one also have two >choices: nvlist_get_() and nvlist_take_() - the former just >returns the value (but the value is still part of the nvlist) and the >latter removes associated nvpair from nvlist and returns its value. Do consider having these functions in a variant with a default argument, so that people don't have to wrap each optioanl n/v pair in an if(). One of the things you will need in the kernel is to check that sets of nv's contain what they should, before continuing processing. One particular thing to detect is extra, unwanted, elements, which at least represent a programmer mistake and which may become a security risk down the road, if they suddenly gets used. I would do this globally, by insisting that (kernel-)code cleans the nv lists it is passed, and then in syscall-return check that all nv-lists are in fact empty. Leaving it to programmers is more risky, most people will not grasp the need to be that careful for the long term, but in that case you will need functions like: int nvlist_accept_only(nvlist_t *, const char *, ...); which complains if there are any "strange" nv pairs. (see above for arrays, it should support that.) The complementary function, is also needed: int nvlist_has_all_of(nvlist_t *, const char *, ...); Finally, this kind of complex-data API causes errors which errno does not have the expressive power to communicate: "foo" not allowed when "bar" also specified Some years ago I proposed an API extension to allow the kernel/libs to return "detailed error strings" and I think this is a better idea than trying to shoe-horn errorstrings into individual API call. Something like: int error_buffer(char *, size_t len) Registers a char[] where kernel/libs can dump error strings. Thereafter, whenever a library or kernel call fail, you *MAY* have additional details in that bufffer: static char myerrbuf[1024]; [...] assert(0 == error_buffer(myerrbuf, sizeof myerrbuf)); strcpy(myerrbuf, "(No details available)"; [...] nvl = nvlist_recv(sock); if (nvl == NULL) err(1, "nvlist_recv() failed: %s", myerrbuf); [...] This is just to show the principle, it should be thought through, for instance maybe strerror() should know about and report the buffer contents, to get instant benefit without code changes ? And now for a really nasty question: Have you considered marshalling the n/v lists as JSON strings ? It would be pretty dang efficient just to stuff it into an sbuf and it would make transport into the kernel a breeze. Nobody says the kernel has to support anything but that exact JSON subset generated by your library functions, so parsing it it not too bad. Poul-Henning -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Fri Jul 5 19:52:41 2013 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B7735AAC for ; Fri, 5 Jul 2013 19:52:41 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.dawidek.net (garage.dawidek.net [91.121.88.72]) by mx1.freebsd.org (Postfix) with ESMTP id 2362B1FD4 for ; Fri, 5 Jul 2013 19:52:39 +0000 (UTC) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) by mail.dawidek.net (Postfix) with ESMTPSA id 987E3209; Fri, 5 Jul 2013 21:47:49 +0200 (CEST) Date: Fri, 5 Jul 2013 21:52:55 +0200 From: Pawel Jakub Dawidek To: Poul-Henning Kamp Subject: Re: General purpose library for name/value pairs. Message-ID: <20130705195255.GB25842@garage.freebsd.pl> References: <20130704215329.GG1402@garage.freebsd.pl> <4818.1373008073@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="PmA2V3Z32TCmWXqI" Content-Disposition: inline In-Reply-To: <4818.1373008073@critter.freebsd.dk> X-OS: FreeBSD 10.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) Cc: arch@FreeBSD.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Jul 2013 19:52:41 -0000 --PmA2V3Z32TCmWXqI Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Jul 05, 2013 at 07:07:53AM +0000, Poul-Henning Kamp wrote: > In message <20130704215329.GG1402@garage.freebsd.pl>, Pawel Jakub Dawidek= write > s: >=20 > >Currently we have plenty of random methods to pass data around: > >- nmount(2) implements its own name/value pair approach, > >- GEOM uses XML to pass data to userland, > >- various sysctls and ioctls use binary structures to export data to > > userland. >=20 > You even overlooked a number of other marshalling implementations, > NETGRAPH for instance. >=20 > As should be obvious from the above list, where I'm guilty of most > of it, I think this is a fundamentally a good and overdue idea. >=20 > I am not entirely convinced that one-size-fits-all is possible, but > a name/value list is basically what I ended up with both in nmount() > and GEOM::gctl. >=20 > Exporting complex state with XML is very hard to beat when it comes > to code complexity in the kernel. >=20 > For one thing, you'd need nested n/v lists which is where I decided > in GEOM that XML was the way to go. I do support nesting, but with a limit to avoid too deep recursion during nvlist_pack/unpack. I haven't said that, but my plan is not to replace existing mechanisms, ie. XML in GEOM or even nmount(2). This would be hard as we would need to retain backward compatibility anyway. My main goal is to have something for use with new code. If someone will be willing to replace existing formats, that's fine be me, but not my main goal. > Also, exporting performance counters via mmap(2) (see: devstat) is > far superior to anything else, because it means no-syscall high-speed > monitoring is possible. Yes, libnv is not for use in very-performance-sensitive cases. > ifconfig(8) and netstat(1) could really use a total rewrite along > those lines. >=20 > So, yes I think your n/v idea has a lot of merits, but it's not > going to be The Final Solution, and you should not force it on > problems where we have better solutions. I'm not planning to:) > Implementation issues: >=20 > Are duplicate names allowed, and if so, do they keep stable order ? Duplicates are allowed and the ordering is stable unless one of those flags is specified: /* Duplicated names are not allowed. */ #define NV_FLAG_UNIQUE_NAME 0x01 /* Duplicated names of the same type are not allowed. */ #define NV_FLAG_UNIQUE_NAME_TYPE 0x02 To be honest I'd much prefer not to support duplicates, because arrays of values are supported as well as nesting. Not supporting duplicates would simplify implementation a bit. > Otherwise we will need some other API-convenient form of array-simulation, > for instance for variable number of slices in GEOM slicers etc. The API does support arrays. All nvlist_add_() functions: void nvlist_add_nvpair(nvlist_t *nvl, nvpair_t *nvp); void nvlist_add_null(nvlist_t *nvl, const char *name); void nvlist_add_bool(nvlist_t *nvl, const char *name, bool value); void nvlist_add_int8(nvlist_t *nvl, const char *name, int8_t value); void nvlist_add_uint8(nvlist_t *nvl, const char *name, uint8_t value); void nvlist_add_int16(nvlist_t *nvl, const char *name, int16_t value); void nvlist_add_uint16(nvlist_t *nvl, const char *name, uint16_t value); void nvlist_add_int32(nvlist_t *nvl, const char *name, int32_t value); void nvlist_add_uint32(nvlist_t *nvl, const char *name, uint32_t value); void nvlist_add_int64(nvlist_t *nvl, const char *name, int64_t value); void nvlist_add_uint64(nvlist_t *nvl, const char *name, uint64_t value); void nvlist_add_string(nvlist_t *nvl, const char *name, const char *value); void nvlist_add_stringf(nvlist_t *nvl, const char *name, const char *value= fmt, ...) __printflike(3, 4); void nvlist_add_stringv(nvlist_t *nvl, const char *name, const char *value= fmt, va_list valueap) __printflike(3, 0); void nvlist_add_nvlist(nvlist_t *nvl, const char *name, const nvlist_t *va= lue); void nvlist_add_descriptor(nvlist_t *nvl, const char *name, int value); void nvlist_add_bool_array(nvlist_t *nvl, const char *name, const bool *va= lue, size_t nitems); void nvlist_add_int8_array(nvlist_t *nvl, const char *name, const int8_t *= value, size_t nitems); void nvlist_add_uint8_array(nvlist_t *nvl, const char *name, const uint8_t= *value, size_t nitems); void nvlist_add_int16_array(nvlist_t *nvl, const char *name, const int16_t= *value, size_t nitems); void nvlist_add_uint16_array(nvlist_t *nvl, const char *name, const uint16= _t *value, size_t nitems); void nvlist_add_int32_array(nvlist_t *nvl, const char *name, const int32_t= *value, size_t nitems); void nvlist_add_uint32_array(nvlist_t *nvl, const char *name, const uint32= _t *value, size_t nitems); void nvlist_add_int64_array(nvlist_t *nvl, const char *name, const int64_t= *value, size_t nitems); void nvlist_add_uint64_array(nvlist_t *nvl, const char *name, const uint64= _t *value, size_t nitems); void nvlist_add_string_array(nvlist_t *nvl, const char *name, const char *= const *value, size_t nitems); void nvlist_add_nvlist_array(nvlist_t *nvl, const char *name, const nvlist= _t * const *value, size_t nitems); void nvlist_add_descriptor_array(nvlist_t *nvl, const char *name, const in= t *value, size_t nitems); Because it is easy to use format strings for names and because the library support nesting, one could also do the following: nvlist_t *nvl, *slice; int i; nvl =3D nvlist_create(0); nvlist_add_uint32(nvl, "nslices", nslices); for (i =3D 0; i < nslices; i++) { slice =3D nvlist_create(0); nvlist_add_uint64(slice, "offset", slices[i]->offset); nvlist_add_uint64(slice, "size", slices[i]->size); nvlist_add_string(slice, "label", slices[i]->label); nvlist_addf_nvlist(nvl, slice, "slice%d", i); } > >Note that we neither check if allocation succeeded nor individual > >additions. The library is designed so that write-like operations can > >gracefully handle previous failures.=20 >=20 > Yes, this is important, otherwise the error handling gets too maddening. Right. sbuf(9) was inspiration for that:) > Also remember a function for debugging which renders a nvlist_t (into > a sbuf ?) Currently I have: void nvlist_dump(const nvlist_t *nvl, int fd); void nvlist_fdump(const nvlist_t *nvl, FILE *out); Not depending on sbuf could encourage wider addoption, so I'd prefer not to do that. Converting it to JSON as simple string might be good idea. > >So, to add elements to the list one can use either nvlist_add_() > >or nvlist_move_() functions. To get the values one also have two > >choices: nvlist_get_() and nvlist_take_() - the former just > >returns the value (but the value is still part of the nvlist) and the > >latter removes associated nvpair from nvlist and returns its value. >=20 > Do consider having these functions in a variant with a default > argument, so that people don't have to wrap each optioanl n/v pair > in an if(). Using default value to report a problem is too error-prone for my taste and not intuitive. I was considering it (I even have earlier nv implementation in src/sbin/hastd/nv.c that does that), but I decided against it. For some types it is trivial, ie. if nvlist_get_string() returns NULL we know there is a problem, but if nvlist_get_int32() returns 0 it is hard to say if it is a bug or not. In Solaris/IllumOS there is additional set of functions that return an error and use additional function argument for the result, for example we could have: int32_t nvlist_get_int32(const nvlist_t *nvl, const char *name); int nvlist_cget_int32(const nvlist_t *nvl, const char *name, int32_t *valu= ep); I was considering this as well, but there is a lot of functions as it is now and I'm not sure I want to add ~100 more (nvlist_cget_() and nvlist_ctake_() for every type multiplied by format string version multiplied by va_list version). > One of the things you will need in the kernel is to check that sets > of nv's contain what they should, before continuing processing. >=20 > One particular thing to detect is extra, unwanted, elements, > which at least represent a programmer mistake and which may become > a security risk down the road, if they suddenly gets used. >=20 > I would do this globally, by insisting that (kernel-)code cleans > the nv lists it is passed, and then in syscall-return check that all > nv-lists are in fact empty. >=20 > Leaving it to programmers is more risky, most people will not > grasp the need to be that careful for the long term, but in > that case you will need functions like: >=20 > int nvlist_accept_only(nvlist_t *, const char *, ...); >=20 > which complains if there are any "strange" nv pairs. > (see above for arrays, it should support that.) >=20 > The complementary function, is also needed: >=20 > int nvlist_has_all_of(nvlist_t *, const char *, ...); That's interesting idea. I do that in my current consumers of this API by walking nvlist_t in a loop and checking for extra nvpairs. Having more reusable mechanism for that would be nice. What you propose is not enough, as we need three informations about nvpair (in most cases): 1. name 2. type 3. required/optional So we could allow to create static const arrays with nvlist specification that we could compare against, eg. struct nvpair_spec { const char *name; int type; bool required; }; #define NVPAIR_SPEC_SENTINEL { NULL, NV_TYPE_NONE, false } static const struct nvpair_spec geom_provider_spec[] =3D { { "name", NV_TYPE_STRING, true }, { "mediasize", NV_TYPE_UINT64, true }, { "sectorsize", NV_TYPE_UINT64, true }, { "stripesize", NV_TYPE_UINT64, false }, NVPAIR_SPEC_SENTINEL }; /* nvlist_check() could take buffer for detailed error message */ if (nvlist_check(nvl, geom_provider_spec) < 0) err(1, "nvlist doesn't match specification"); > Finally, this kind of complex-data API causes errors which errno > does not have the expressive power to communicate: >=20 > "foo" not allowed when "bar" also specified >=20 > Some years ago I proposed an API extension to allow the kernel/libs > to return "detailed error strings" and I think this is a better > idea than trying to shoe-horn errorstrings into individual API call. >=20 > Something like: >=20 > int error_buffer(char *, size_t len) >=20 > Registers a char[] where kernel/libs can dump error strings. >=20 > Thereafter, whenever a library or kernel call fail, you *MAY* have > additional details in that bufffer: >=20 > static char myerrbuf[1024]; > [...] > assert(0 =3D=3D error_buffer(myerrbuf, sizeof myerrbuf)); > strcpy(myerrbuf, "(No details available)"; > [...] > nvl =3D nvlist_recv(sock); > if (nvl =3D=3D NULL) > err(1, "nvlist_recv() failed: %s", myerrbuf); > [...] >=20 > This is just to show the principle, it should be thought through, > for instance maybe strerror() should know about and report the > buffer contents, to get instant benefit without code changes ? Fine idea, but doesn't seem to directly related to libnv. I can imagine implementing it in libnv internally, so that we not only keep internal error number, but also internal error message that can be obtained with nvlist_errmsg(). If nvl=3DNULL is passed it will just return "No memory" or similar. It would be especially useful for the nvlist_check() above, but also to tell which nvlist_add_() exactly failed, etc. I like it. There is only a problem with translating those messages, which we keep avoiding. > And now for a really nasty question: >=20 > Have you considered marshalling the n/v lists as JSON strings ? >=20 > It would be pretty dang efficient just to stuff it into an sbuf > and it would make transport into the kernel a breeze. >=20 > Nobody says the kernel has to support anything but that exact > JSON subset generated by your library functions, so parsing it > it not too bad. No, I haven't considered it, but currently with: void *nvlist_pack(const nvlist_t *nvl, size_t *sizep); nvlist_t *nvlist_unpack(const void *buf, size_t size); you can easly pack it to a binary blob and pass to the kernel. Thanks for your input. --=20 Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://mobter.com --PmA2V3Z32TCmWXqI Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (FreeBSD) iEYEARECAAYFAlHXJBcACgkQForvXbEpPzTSrwCg+GegJJpkjgGsQmxO5d8OhD2E uoUAoMCny23u1qwbDbynXG8nJqDKAIQh =1saQ -----END PGP SIGNATURE----- --PmA2V3Z32TCmWXqI-- From owner-freebsd-arch@FreeBSD.ORG Fri Jul 5 20:10:43 2013 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 081AFFCD; Fri, 5 Jul 2013 20:10:43 +0000 (UTC) (envelope-from phk@phk.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id C52B11090; Fri, 5 Jul 2013 20:10:42 +0000 (UTC) Received: from critter.freebsd.dk (unknown [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id 939193EB30; Fri, 5 Jul 2013 20:10:41 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.7/8.14.7) with ESMTP id r65KAfkw060318; Fri, 5 Jul 2013 20:10:41 GMT (envelope-from phk@phk.freebsd.dk) To: Pawel Jakub Dawidek Subject: Re: General purpose library for name/value pairs. In-reply-to: <20130705195255.GB25842@garage.freebsd.pl> From: "Poul-Henning Kamp" References: <20130704215329.GG1402@garage.freebsd.pl> <4818.1373008073@critter.freebsd.dk> <20130705195255.GB25842@garage.freebsd.pl> Content-Type: text/plain; charset=ISO-8859-1 Date: Fri, 05 Jul 2013 20:10:40 +0000 Message-ID: <60317.1373055040@critter.freebsd.dk> Cc: arch@FreeBSD.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Jul 2013 20:10:43 -0000 In message <20130705195255.GB25842@garage.freebsd.pl>, Pawel Jakub Dawidek writ es: >To be honest I'd much prefer not to support duplicates, because arrays >of values are supported as well as nesting. Not supporting duplicates >would simplify implementation a bit. So why do you support it ? >> Also remember a function for debugging which renders a nvlist_t (into >> a sbuf ?) > >Currently I have: > > void nvlist_dump(const nvlist_t *nvl, int fd); > void nvlist_fdump(const nvlist_t *nvl, FILE *out); I would just make one, and have it return an sbuf. That way the same code can be used in kernel and userland, and people get to decide where it should end up and how. >> Do consider having these functions in a variant with a default >> argument, so that people don't have to wrap each optioanl n/v pair >> in an if(). > >Using default value to report a problem [...] Uhm, it's not to report problems, it's to implement default values for non-specified names: give me $foo or if there is no $foo, give me "32" The return value should always be a "int where zero means OK" style error indicator, the returned value should go into a pointer arg. >I was considering this as well, but there is a lot of functions as it is >now [...] Yes, hate to say it, but that to me indicates that you're not quite on the right path. Maybe the basic n/v should just do strings, and interpretation of strings be a layer above ? >> [error string API] >Fine idea, but doesn't seem to directly related to libnv. No, but you'll have a hard time emitting meaningful errors from libnv without such a facility. Right now everybody rolls their own, getaddrinfo has its own, GEOM has its own, netgraph has its own... At some point we should call a stop, and I'd say "before libnv grows one too" is at good time :-) >There is only a problem with translating those messages, which we keep >avoiding. You know ? Screw that! Having usable errors only in english is far better than having only "Invalid argument" in all the languages of the world. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Fri Jul 5 20:59:12 2013 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A06971F2; Fri, 5 Jul 2013 20:59:12 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (unknown [IPv6:2001:610:1108:5012::107]) by mx1.freebsd.org (Postfix) with ESMTP id 54AAC12C3; Fri, 5 Jul 2013 20:59:12 +0000 (UTC) Received: from snail.stack.nl (snail.stack.nl [IPv6:2001:610:1108:5010::131]) by mx1.stack.nl (Postfix) with ESMTP id CC5931203CC; Fri, 5 Jul 2013 22:58:56 +0200 (CEST) Received: by snail.stack.nl (Postfix, from userid 1677) id A649828493; Fri, 5 Jul 2013 22:58:56 +0200 (CEST) Date: Fri, 5 Jul 2013 22:58:56 +0200 From: Jilles Tjoelker To: Pawel Jakub Dawidek Subject: Re: General purpose library for name/value pairs. Message-ID: <20130705205856.GA19346@stack.nl> References: <20130704215329.GG1402@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130704215329.GG1402@garage.freebsd.pl> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: arch@FreeBSD.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Jul 2013 20:59:12 -0000 On Thu, Jul 04, 2013 at 11:53:30PM +0200, Pawel Jakub Dawidek wrote: > The library allows to send and receive descriptors, of course only over > UNIX domain sockets: > nvlist_t *nvl; > int fd; > > fd = open("/etc/passwd", O_RDONLY); > if (fd < 0) > err(1, "open(/etc/passwd) failed"); > > nvl = nvlist_create(0); > nvlist_add_string(nvl, "filename", "/etc/passwd"); > nvlist_move_descriptor(nvl, "fd", fd); > if (nvlist_send(sock, nvl) < 0) > err(1, "nvlist_send() failed"); > nvlist_destroy(nvl); > Also note that I used nvlist_move_descriptor() function and not > nvlist_add_descriptor(). The former will allow nvlist to consume the > given descriptor, so we don't have to close it, the latter will dup(2) > the given descriptor and then add it to the nvlist. The library should use fcntl(fd, F_DUPFD_CLOEXEC, 0) instead of dup(fd) so it does not pass the fd in case another thread forks. This is available in sufficiently recent head, stable/9 and stable/8. (On the other hand, if the application provides a file descriptor, I think it is not necessary to set the close-on-exec flag because only the creator of the file descriptor can do so in a race-free manner.) The recvmsg() call should use the MSG_CMSG_CLOEXEC flag for the same reason. This is currently only available in head. It is probably best to fcntl(fd, F_SETFD, 1) if MSG_CMSG_CLOEXEC is not available so that people do not write applications that assume close-on-exec is clear. -- Jilles Tjoelker From owner-freebsd-arch@FreeBSD.ORG Sat Jul 6 19:41:04 2013 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id AAB8A708 for ; Sat, 6 Jul 2013 19:41:04 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.dawidek.net (garage.dawidek.net [91.121.88.72]) by mx1.freebsd.org (Postfix) with ESMTP id 3A8CD11E7 for ; Sat, 6 Jul 2013 19:41:03 +0000 (UTC) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) by mail.dawidek.net (Postfix) with ESMTPSA id 0F8DC41F; Sat, 6 Jul 2013 21:36:21 +0200 (CEST) Date: Sat, 6 Jul 2013 21:41:25 +0200 From: Pawel Jakub Dawidek To: Poul-Henning Kamp Subject: Re: General purpose library for name/value pairs. Message-ID: <20130706194124.GE25842@garage.freebsd.pl> References: <20130704215329.GG1402@garage.freebsd.pl> <4818.1373008073@critter.freebsd.dk> <20130705195255.GB25842@garage.freebsd.pl> <60317.1373055040@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="10jrOL3x2xqLmOsH" Content-Disposition: inline In-Reply-To: <60317.1373055040@critter.freebsd.dk> X-OS: FreeBSD 10.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) Cc: arch@FreeBSD.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Jul 2013 19:41:04 -0000 --10jrOL3x2xqLmOsH Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Jul 05, 2013 at 08:10:40PM +0000, Poul-Henning Kamp wrote: > In message <20130705195255.GB25842@garage.freebsd.pl>, Pawel Jakub Dawide= k writ > es: >=20 > >To be honest I'd much prefer not to support duplicates, because arrays > >of values are supported as well as nesting. Not supporting duplicates > >would simplify implementation a bit. >=20 > So why do you support it ? Before I proposed libnv here I did two things (which I find healthy): 1. I implemented this functionality in the past (in HAST and auditdistd) as internal functionality to gather some experience and learn about shortcomings. The previous implementation for example was focused on speed - this caused alignment issues and wasn't flexible. Later I realized that performance shouldn't be the top priority for this API. 2. I implemented consumers for this API. I use it in many places in my capsicum branch and I know it will fit HAST and auditdistd too. And the point 2 actually showed that supporting duplicates has some value. For example when I pass nvlist around for unrelated code to add its arguments, I don't have to pass current counter, as everyone can add its own argument under the same name, which is pretty convenient. Not sure if this is good enough argument to support duplicates when you take into account what it actually takes to support them: - If receiver wants to be sure there are no duplicates, nvlist_recv() should probably take 'flags' argument, so the caller can declare at that point if he expects duplicates or not. This is currently not done. - Without duplicates we could even try to get rid of nvpair_t altogether. nvpair_t is exposed mostly to support easy walking through nvlist because of duplicates. If this would be much rare operation, instead of doing: nvlist_t *nvl; nvpair_t *nvp; for (nvp =3D nvlist_first_nvpair(nvl); nvp !=3D NULL; nvp =3D nvlist_next_nvpair(nvl, nvp)) { /* ... */ } one could do: nvlist_t *nvl; const char *name; int type; void *cookie; cookie =3D NULL; while ((name =3D nvlist_next(nvl, &type, &cookie)) !=3D NULL) { /* ... */ } Although passing nvpair around without even knowing its type might be also valuable sometimes. I'm not really sure. I do want the API to be as simple as possible and as easy to use as possible, but not too simple and too easy, so people will keep hacking their own stuff. Opinions welcome. > >> Also remember a function for debugging which renders a nvlist_t (into > >> a sbuf ?) > > > >Currently I have: > > > > void nvlist_dump(const nvlist_t *nvl, int fd); > > void nvlist_fdump(const nvlist_t *nvl, FILE *out); >=20 > I would just make one, and have it return an sbuf. That way the > same code can be used in kernel and userland, and people get to > decide where it should end up and how. You seem to ignored my comment about not willing to depend on sbuf, because it will make adoption harder. Returning it as a regular string, which you could add to sbuf is something I'd be more happy to add. > >> Do consider having these functions in a variant with a default > >> argument, so that people don't have to wrap each optioanl n/v pair > >> in an if(). > > > >Using default value to report a problem [...] >=20 > Uhm, it's not to report problems, it's to implement default values > for non-specified names: >=20 > give me $foo or if there is no $foo, give me "32" >=20 > The return value should always be a "int where zero means OK" style > error indicator, the returned value should go into a pointer arg. What API do you imagine for that? And I wonder if it would be really needed if I'd implement nvlist_spec I proposed. Then one would need to do the checking only for optional nvpairs. > >I was considering this as well, but there is a lot of functions as it is > >now [...] >=20 > Yes, hate to say it, but that to me indicates that you're not quite on > the right path. :) Yes, I'm fully aware this doesn't look good:) Although I don't think you can say the library is trying to do too much. The core functionality is limited and well defined: nvlist_exists_() nvlist_add_() nvlist_move_() nvlist_get_() nvlist_take_() nvlist_free_() nvpair_create_() nvpair_move_() nvpair_get_() The number of functions (not the functionality) is pretty big, because of multiple types and format-string- and va_list- versions. > Maybe the basic n/v should just do strings, and interpretation of > strings be a layer above ? It would make getting values from the nvlist a hell - dealing with strto* functions and checking if conversion succeeded, that just too complex. If you look at the functionality it doesn't look that bad. atomic(9) has the same "problem" with multiple types. > >> [error string API] > >Fine idea, but doesn't seem to directly related to libnv.=20 >=20 > No, but you'll have a hard time emitting meaningful errors from > libnv without such a facility. >=20 > Right now everybody rolls their own, getaddrinfo has its own, > GEOM has its own, netgraph has its own... >=20 > At some point we should call a stop, and I'd say "before libnv grows > one too" is at good time :-) Nice try, but libnv *itself* is my input to improve infrastrucutre, so let's not get caried away, shall we?:) Having nvlist_errmsg() still sounds useful. > >There is only a problem with translating those messages, which we keep > >avoiding. >=20 > You know ? Screw that! Having usable errors only in english is > far better than having only "Invalid argument" in all the languages > of the world. Well, can't we do better than that? This argument goes both ways. In the past I saw people blocking useful functionality going in because it wasn't good enough for our standards and because it will stop someone =66rom doing it right. Many years from there are we are still without both right and not-so-right-but-useful functionalities. In commercial products having error messages visible through some webgui translated to configured language if definiately desired. Maybe having error message as an nvlist where format string (as one of the nvpairs) can be translated and arguments are stored as nvpairs instead of 'char *' would do the trick?:) --=20 Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://mobter.com --10jrOL3x2xqLmOsH Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (FreeBSD) iEYEARECAAYFAlHYcuQACgkQForvXbEpPzRf3wCeKHmVbnDmcYCvEqrgL+tEky82 i+YAn2l2d5ytptJMbV3qXOBB1Jhhqswr =bt/a -----END PGP SIGNATURE----- --10jrOL3x2xqLmOsH-- From owner-freebsd-arch@FreeBSD.ORG Sat Jul 6 20:47:38 2013 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 3E902E51 for ; Sat, 6 Jul 2013 20:47:38 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.dawidek.net (garage.dawidek.net [91.121.88.72]) by mx1.freebsd.org (Postfix) with ESMTP id 096FD147B for ; Sat, 6 Jul 2013 20:47:37 +0000 (UTC) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) by mail.dawidek.net (Postfix) with ESMTPSA id DF33A442; Sat, 6 Jul 2013 22:42:54 +0200 (CEST) Date: Sat, 6 Jul 2013 22:47:58 +0200 From: Pawel Jakub Dawidek To: Jilles Tjoelker Subject: Re: General purpose library for name/value pairs. Message-ID: <20130706204758.GG25842@garage.freebsd.pl> References: <20130704215329.GG1402@garage.freebsd.pl> <20130705205856.GA19346@stack.nl> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="bygAmIonOAIqBxQB" Content-Disposition: inline In-Reply-To: <20130705205856.GA19346@stack.nl> X-OS: FreeBSD 10.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) Cc: arch@FreeBSD.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Jul 2013 20:47:38 -0000 --bygAmIonOAIqBxQB Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Jul 05, 2013 at 10:58:56PM +0200, Jilles Tjoelker wrote: > On Thu, Jul 04, 2013 at 11:53:30PM +0200, Pawel Jakub Dawidek wrote: > > The library allows to send and receive descriptors, of course only over > > UNIX domain sockets: >=20 > > nvlist_t *nvl; > > int fd; > >=20 > > fd =3D open("/etc/passwd", O_RDONLY); > > if (fd < 0) > > err(1, "open(/etc/passwd) failed"); > >=20 > > nvl =3D nvlist_create(0); > > nvlist_add_string(nvl, "filename", "/etc/passwd"); > > nvlist_move_descriptor(nvl, "fd", fd); > > if (nvlist_send(sock, nvl) < 0) > > err(1, "nvlist_send() failed"); > > nvlist_destroy(nvl); >=20 > > Also note that I used nvlist_move_descriptor() function and not > > nvlist_add_descriptor(). The former will allow nvlist to consume the > > given descriptor, so we don't have to close it, the latter will dup(2) > > the given descriptor and then add it to the nvlist. >=20 > The library should use fcntl(fd, F_DUPFD_CLOEXEC, 0) instead of dup(fd) > so it does not pass the fd in case another thread forks. This is > available in sufficiently recent head, stable/9 and stable/8. >=20 > (On the other hand, if the application provides a file descriptor, I > think it is not necessary to set the close-on-exec flag because only the > creator of the file descriptor can do so in a race-free manner.) >=20 > The recvmsg() call should use the MSG_CMSG_CLOEXEC flag for the same > reason. This is currently only available in head. It is probably best to > fcntl(fd, F_SETFD, 1) if MSG_CMSG_CLOEXEC is not available so that > people do not write applications that assume close-on-exec is clear. I fully agree and implemented what you have suggested. Thanks! --=20 Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://mobter.com --bygAmIonOAIqBxQB Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (FreeBSD) iEYEARECAAYFAlHYgn4ACgkQForvXbEpPzS39gCcDAOWBYaxdJlMOO7QgJh7sGR7 CSgAnRLBhO/ww9+//jpX45PTuUV8UMHA =J0Ln -----END PGP SIGNATURE----- --bygAmIonOAIqBxQB-- From owner-freebsd-arch@FreeBSD.ORG Sat Jul 6 23:53:56 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 6543DF7C for ; Sat, 6 Jul 2013 23:53:56 +0000 (UTC) (envelope-from benno@FreeBSD.org) Received: from out1-smtp.messagingengine.com (out1-smtp.messagingengine.com [66.111.4.25]) by mx1.freebsd.org (Postfix) with ESMTP id 3CFC119F2 for ; Sat, 6 Jul 2013 23:53:55 +0000 (UTC) Received: from compute1.internal (compute1.nyi.mail.srv.osa [10.202.2.41]) by gateway1.nyi.mail.srv.osa (Postfix) with ESMTP id C4096208E8 for ; Sat, 6 Jul 2013 19:53:41 -0400 (EDT) Received: from frontend2.nyi.mail.srv.osa ([10.202.2.161]) by compute1.internal (MEProxy); Sat, 06 Jul 2013 19:53:41 -0400 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=from:content-type:subject:message-id:date :to:mime-version; s=smtpout; bh=FvTYqruILKWDE3S0DD9AfLGXmDI=; b= ZxbAUBjAOhK58LrCVTd5UV+nE6c9KzsLL0xwk1uuXa6zZfVSGWkFeA5sBC6qzu2/ 9m3T9P+FRy3JQvOw+eT5sOs1sN7aBCOJDxKhSZPMK1nb5WIzLY3UhJcko/KTq9u7 MRfaj1d7Fd+1mIHhcS7xxqGjEANf5BJQ/l4pWynWbqA= X-Sasl-enc: 0v4yUoyVyd7wikREblvcTLW2a82Fq2bc629glkK3b16R 1373154821 Received: from [172.20.10.2] (unknown [1.136.198.218]) by mail.messagingengine.com (Postfix) with ESMTPA id A37DA680239 for ; Sat, 6 Jul 2013 19:53:40 -0400 (EDT) From: Benno Rice Subject: Reworking vmmeter Message-Id: Date: Sun, 7 Jul 2013 09:53:37 +1000 To: freebsd-arch@freebsd.org Mime-Version: 1.0 (Mac OS X Mail 6.5 \(1508\)) X-Mailer: Apple Mail (2.1508) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Jul 2013 23:53:56 -0000 So I've put together this patch: http://people.freebsd.org/~benno/vmmeter.diff This patch does a few things: - Renames the singleton "cnt" to "vmmeter". - Replaces all the per-cpu counters with counter_u64_t. - Removes the vmmeter instance from struct pcpu, due to the above = mentioned change. - Adds includes for vmmeter.h to a few files that were only getting it = via pollution in pcpu.h - Removes some entries from assym that weren't being used. This has been tested on amd64 and nothing else right now, I'm more = posting this to get general comments on whether people think this is a = good idea. My motivation for this was twofold, firstly to rename cnt and = secondly to move the counters to the common counter framework. More = testing will be done prior to commit. Cheers, Benno.=