From owner-svn-src-head@freebsd.org Wed Nov 18 23:05:53 2020 Return-Path: Delivered-To: svn-src-head@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id B603F474553; Wed, 18 Nov 2020 23:05:53 +0000 (UTC) (envelope-from se@freebsd.org) Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Cbz1s4cQfz4mYV; Wed, 18 Nov 2020 23:05:53 +0000 (UTC) (envelope-from se@freebsd.org) Received: from Stefans-MBP-WLAN.fritz.box (p200300cd5f3b88002402e588537088de.dip0.t-ipconnect.de [IPv6:2003:cd:5f3b:8800:2402:e588:5370:88de]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) (Authenticated sender: se/mail) by smtp.freebsd.org (Postfix) with ESMTPSA id EEAE22D27D; Wed, 18 Nov 2020 23:05:52 +0000 (UTC) (envelope-from se@freebsd.org) To: Jessica Clarke References: <202011181944.0AIJiUU3003699@repo.freebsd.org> <25465269-5497-4981-A1E4-CC1FFAB68CF4@freebsd.org> <92a87ed8-4b22-4d03-9481-b02f92dcaaa0@freebsd.org> From: Stefan Esser Cc: src-committers , svn-src-all , svn-src-head Subject: Re: svn commit: r367813 - head/lib/libutil Message-ID: <9f719272-f1e4-5668-02d2-bd2fbd0e5d6a@freebsd.org> Date: Thu, 19 Nov 2020 00:05:51 +0100 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.16; rv:78.0) Gecko/20100101 Thunderbird/78.5.0 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="pluecdo9FMtq9TgEdev51iMzFYslnfLvb" X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 Nov 2020 23:05:53 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --pluecdo9FMtq9TgEdev51iMzFYslnfLvb Content-Type: multipart/mixed; boundary="Wmw5wadY4r4IyARbwXccjPLs8sY45i0Ho"; protected-headers="v1" From: Stefan Esser To: Jessica Clarke Cc: src-committers , svn-src-all , svn-src-head Message-ID: <9f719272-f1e4-5668-02d2-bd2fbd0e5d6a@freebsd.org> Subject: Re: svn commit: r367813 - head/lib/libutil References: <202011181944.0AIJiUU3003699@repo.freebsd.org> <25465269-5497-4981-A1E4-CC1FFAB68CF4@freebsd.org> <92a87ed8-4b22-4d03-9481-b02f92dcaaa0@freebsd.org> In-Reply-To: --Wmw5wadY4r4IyARbwXccjPLs8sY45i0Ho Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: quoted-printable Am 18.11.20 um 23:14 schrieb Jessica Clarke: > On 18 Nov 2020, at 21:52, Stefan Esser wrote: >> Am 18.11.20 um 22:15 schrieb Jessica Clarke: >>> On 18 Nov 2020, at 19:44, Stefan E=C3=9Fer wrote: >>>> + /* >>>> + * Check for some other thread already having >>>> + * set localbase - this should use atomic ops. >>>> + * The amount of memory allocated above may leak, >>>> + * if a parallel update in another thread is not >>>> + * detected and the non-NULL pointer is overwritten. >>>> + */ >>> Why was this committed with a known racy/leaky implementation? >> >> Because the alternatives that I offered for discussion were >> less acceptable. >=20 > That has no bearing over whether this one is. Three alternatives have been discussed: 1) static buffer of size MAXPATHLEN 2) dynamically allocated buffer (as committed) 3) dynamically allocated buffer filled by a constructor 1) has been rejected, since it adds 1 KB of bss to each program that is linked with libutil, whether it uses getlocalbase() or not. 3) has been rejected since it causes 1 getenv() and 2 sysctl() calls when the program linked with libutil starts, independently of whether getlocalbase() is used or not. 2) has been selected, since it is only called when needed and it does not pre-allocate a large buffer. Which other alternative do you want to suggest? Did you have a look at the review announced on the -current list and mentioned in the commit? >>> What happens if I set the value with a sysctl and call this? >> >> You'll get the value set with sysctl, unless overridden by the >> environment variable. There is a window of a few nano-seconds >> where a thread executing in parallel on another core might be >> able to set the localbase variable (between the test for NULL >> in this function and the assignment to it). The value that will >> be returned by either thread will be identical, so there is no >> risk of corruption of the result. >=20 > But if I call getlocalbase, then set it via sysctl, then call > getlocalbase again, I see the old value. If, however, I omit the first > getlocalbase, I see the new value. This differs from how getenv/setenv > of the value work, where you always see the up-to-date value. Maybe you= > think that's a feature, but it's something to watch out for and > explicitly call out in the documentation. Actual programs call getlocalbase() exactly once and create the required pathes from the value returned. It is possible to copy the value from the environment to a buffer, but at added complexity and introducing another race condition. The program itself might have modified its environment and then use an inconsistent value when it calls getlocalbase() again thereafter. But why would you want to do this - it seems to be a complicated way lof foot-shooting to me. I'm not a native speaker of English and not best qualified to clearly express this in the man-page. Feel free to commit any clarification or suggest an addition for me to commit. > You also misunderstand all the subtleties of multithreading here. There= > are no acquire/release pairs so it is entirely legal for Thread 2 to > read Thread 1's initialised value for localbase before the contents of > it are visible (i.e. the pointer is initialised but the data is > garbage). Yes, and thread 2's value will be identical to the one from thread 1, just in a different heap location, unless there is a write to the sysctl variable in the kernel at just the same time. And you cannot protect against this race and you'll get either the old or new value. > The `(volatile const char*)localbase` cast is also a complete waste of > time. You probably meant to write `(const char * volatile)localbase` > but even then that does nothing useful as the cast is too late. What > you really were trying to write was > `*(const char * volatile *)&localbase`, but you need proper atomics > anyway for this to be safe. I was not sure about the correct volatile declaration, I've got to admit. It was in the review and I did not get any comments regarding that specific modifier. The goal was to enforce the access to the localbase pointer in memory after returning from the sysctl to shorten the window. Perhaps I should just have completely ignore any multi-threading issues and accepted that an overlapping execution of this function would allocate multiple sysctl destination buffers but loose all references but one in the assignment to localbase. It will not happen in practice, it just does not make sense to call this function more than once in a program. >> This unlikely case may actually leak a heap allocated string >> of typically tens of bytes (but with negligible probability). >> >> But this really is a non-issue, since there should never be a >> reason to invoke this function in a multi-threaded context. >=20 > Why not? There could easily be code out there calling getenv in a > multi-threaded context so this is inadequate as a replacement. Yes it's= > inefficient but it's perfectly legal and imaginable. Yes, calling getenv() might occur, but getlocalbase() is generally called before configuration files are accessed, and the resulting path is saved in the program. Other uses are possible, but this is the recurring pattern. And unless the program modifies its own environment at run-time, all invocsations of getlocalbase() are guaranteed to return the same result again and again. > Also if malloc returns NULL I would quite like that to be an error for > the function and not silently fall back on _PATH_LOCALBASE. The function is specifically specified to always succeed. Else there is extra code required at each invocation to check for errors. Not being able to allocate the required buffer of at most 1 KB will cause every non-trivial program to fail anyway. Returning the compiled in default value will direct the program to access files in /usr/local, which is an administrator controlled path, even on systems with non-default localbase. Returning NULL is possible in that case, but the only ways a program could deal with this case is to either crash, sue the default path, or ignore the accesses to files in localbase. In either case it will not operate as intended, but I consider directing accesses to the FreeBSD default location (or the modified _PATH_LOCALBASE) to be a reasonable fall-back. But this will not happen unless provoked (by not allowing the program to allocate any memory from the heap, which will cause it to fail as soon as any other library function or start-up code tries to allocate heap-space). I had put this code up for review and it has been discussed before (and another implementation has been reverted after several iterations of changes). If there had been consensus to return NULL, I could have committed that - it is a trivial change and ENOMEM would have been set after a malloc failure. I have written about this possibility and I had appreciated if comments had been made on Phabricator before the commit. Regards, STefan --Wmw5wadY4r4IyARbwXccjPLs8sY45i0Ho-- --pluecdo9FMtq9TgEdev51iMzFYslnfLvb Content-Type: application/pgp-signature; name="OpenPGP_signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="OpenPGP_signature" -----BEGIN PGP SIGNATURE----- wsB5BAABCAAjFiEEo3HqZZwL7MgrcVMTR+u171r99UQFAl+1qM8FAwAAAAAACgkQR+u171r99US5 DAf/b+xunZK6AfbbAawlLu7QfLx7QQrplIaD7rQw7Yn80AH1PJe3uyqp1yitqHgpEXwpmBuhaILP Oa5kGfXpju0f8ipSAesNv+FfVO2r4YG65eTUftV6fCQPSjFTYX8yNZMSpPtal+tp9fLyvTliZJKQ PLVXu3GqICFajeO2Bk0+TRWfgbiwJ5dLcT3PsX3p9fc0inwGfu3lhsO9ReDEXrFbCKMCY6sEOFMv 4Q3r4MyGUOfvBdv+aIZUH27zfiBnSOMk08E0qBYAHH2MHYWVHuiSsCF82I1HVPQM19/v7G/BE7Eo +VEg6ulgJL/L7q0+2gOqzA/X6HMaS8ttBl2ZIDCBYw== =8Tf8 -----END PGP SIGNATURE----- --pluecdo9FMtq9TgEdev51iMzFYslnfLvb--