From owner-freebsd-threads@FreeBSD.ORG Wed Jun 18 22:59:35 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8BA8A37B401; Wed, 18 Jun 2003 22:59:35 -0700 (PDT) Received: from out003.verizon.net (out003pub.verizon.net [206.46.170.103]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8A10443F93; Wed, 18 Jun 2003 22:59:34 -0700 (PDT) (envelope-from mtm@identd.net) Received: from kokeb.ambesa.net ([138.88.125.205]) by out003.verizon.net (InterMail vM.5.01.05.33 201-253-122-126-133-20030313) with ESMTP id <20030619055933.FNBD4805.out003.verizon.net@kokeb.ambesa.net>; Thu, 19 Jun 2003 00:59:33 -0500 Date: Thu, 19 Jun 2003 01:59:32 -0400 From: Mike Makonnen To: Daniel Eischen In-Reply-To: References: <20030618210812.GB21622@rot13.obsecurity.org> X-Mailer: Sylpheed version 0.8.10 (GTK+ 1.2.10; i386-portbld-freebsd5.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Authentication-Info: Submitted using SMTP AUTH at out003.verizon.net from [138.88.125.205] at Thu, 19 Jun 2003 00:59:33 -0500 Message-Id: <20030619055933.FNBD4805.out003.verizon.net@kokeb.ambesa.net> cc: nectar@freeBSD.org cc: threads@FreeBSD.org cc: kris@obsecurity.org Subject: making nsswitch(3) thread-safe (was Re: Removal of bogus gethostbyaddr_r()) X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 Jun 2003 05:59:35 -0000 [ Jacques, I'm CC'ing you since this affects nss ] I just took a look into what it would take to make the gethostby*_r functions thread safe, and it isn't pretty. The "thread-unsafeness" goes all the way down into the individual methods for the various nsswitch(3) databases (dns, files, etc). So, the most expedient and least invasive way to go about it would be to put a mutex in the gethostby* functions that is dependent on __isthreaded being true. This, offcourse, means that the entire call-chain (from top to bottom) is locked for the duration of a call to one of these functions, which can be a considerable amount of time if it has to timeout. During this time no other thread in the application can resolve a host because it will be blocked on said mutex. To my mind the way to fix this properly is to make the nsdispatch(3) call-chain in libc thread-safe from the ground-up. This is a huge task in and of itself, but is complicated even further by the fact that whatever we do can't change the semantics of the existing user-visible functions. I think there are three points to keep in mind: 1. The nsswitch(3) database functions should be made thread-safe, but at the same time not change the thread-unsafe semantics that third-party apps might expect (through nsdispatch(3)). 2. The thread-unsafeness starts at the bottom (low-level functions) of the call-chain, in the nsswitch(3) database functions. 3. Because so many databases (hosts, passwd, group, etc) will be affected by this it would be too huge a task to do it all at once. The safest approach is probably to introduce the changes separately a few at a time. We can achieve this by Introducing a variable that gets passed down the call-chain telling the nsswitch(3) database functions whether the "destination address" (return value) will by allocated by the caller or not. If this variable is false then the routines can use a static buffer, otherwise, they will assume the caller has allocated the necessary space. Most of the functions will need a lot more work than this to make them fully thread-safe. The advantage of doing it this way is that it maintains the status-quo while allowing us to make the individual subsystems thread-safe separately and as time and resources permit. The following patch shows what I have in mind. It won't compile just yet, but it might make what I'm saying a little clearer :-) This initial phase is really just a mechanical change of argument lists. It doesn't introduce any thread-safeness on its own. It just makes it easier to introduce such changes safely on a subsytem by subsystem basis. Basically, the _nsdispatch() functionality is moved into nsdispatch_common(), which takes a va_list as its last argument instead of a variable number of arguments. It also takes one additional argument(int is_r) so callers can tell it what kind of semantics they want. Applications calling nsdispatch() get a wrapper that passes a false (0) value in the is_r argument. Callers from within libc will continue calling_nsdispatch(), which also grows an additional argument (int is_r) that callers can pass down to the nsswitch(3) database functions through nsdispatch_common(). What do you think? Should I continue with this or do you think it's bogus? The important point to keep in mind is that we have to keep the dual semantics (thread-safe and unsafe) all the way down the call-chain because only the top-level and low-level functions know the actual memory type and size that needs to be allocated for the return value to the library users. So, we can't make the low-level nsswitch(2) database functions unconditionally thread-safe and restrict the dual semantics to the top-level functions. Cheers. -- Mike Makonnen | GPG-KEY: http://www.identd.net/~mtm/mtm.asc mtm@identd.net | D228 1A6F C64E 120A A1C9 A3AA DAE1 E2AF DBCC 68B9 mtm@FreeBSD.Org| FreeBSD - The Power To Serve Index: include/nsswitch.h =================================================================== RCS file: /home/ncvs/src/include/nsswitch.h,v retrieving revision 1.3 diff -u -r1.3 nsswitch.h --- include/nsswitch.h 17 Apr 2003 14:14:21 -0000 1.3 +++ include/nsswitch.h 19 Jun 2003 03:13:29 -0000 @@ -104,13 +104,13 @@ /* * ns_dtab `method' function signature. */ -typedef int (*nss_method)(void *_retval, void *_mdata, va_list _ap); +typedef int (*nss_method)(void *_retval, void *_mdata, int is_r, va_list _ap); /* * Macro for generating method prototypes. */ #define NSS_METHOD_PROTOTYPE(method) \ - int method(void *, void *, va_list) + int method(void *, void *, int, va_list) /* * ns_dtab - `nsswitch dispatch table' Index: lib/libc/net/gethostbydns.c =================================================================== RCS file: /home/ncvs/src/lib/libc/net/gethostbydns.c,v retrieving revision 1.43 diff -u -r1.43 gethostbydns.c --- lib/libc/net/gethostbydns.c 1 May 2003 19:03:13 -0000 1.43 +++ lib/libc/net/gethostbydns.c 19 Jun 2003 03:33:43 -0000 @@ -470,7 +470,7 @@ } int -_dns_gethostbyname(void *rval, void *cb_data, va_list ap) +_dns_gethostbyname(void *rval, void *cb_data, int is_r, va_list ap) { const char *name; int af; @@ -604,7 +604,7 @@ } int -_dns_gethostbyaddr(void *rval, void *cb_data, va_list ap) +_dns_gethostbyaddr(void *rval, void *cb_data, int is_r, va_list ap) { const char *addr; /* XXX should have been def'd as u_char! */ int len, af; Index: lib/libc/net/nsdispatch.c =================================================================== RCS file: /home/ncvs/src/lib/libc/net/nsdispatch.c,v retrieving revision 1.8 diff -u -r1.8 nsdispatch.c --- lib/libc/net/nsdispatch.c 24 Apr 2003 19:57:31 -0000 1.8 +++ lib/libc/net/nsdispatch.c 19 Jun 2003 03:17:17 -0000 @@ -557,13 +557,41 @@ } -__weak_reference(_nsdispatch, nsdispatch); +__weak_reference(__nsdispatch, nsdispatch); + +int +__nsdispatch(void *retval, const ns_dtab disp_tab[], const char *database, + const char *method_name, const ns_src defaults[], ...) +{ + va_list ap; + int error; + + va_start(ap, defaults); + error = nsdispatch_common(retval, disp_tab, database, method_name, + defaults, 0, ap); + va_end(ap); + return (error); +} int _nsdispatch(void *retval, const ns_dtab disp_tab[], const char *database, - const char *method_name, const ns_src defaults[], ...) + const char *method_name, const ns_src defaults[], int is_r, ...) +{ + va_list ap; + int error; + + va_start(ap, is_r); + error = nsdispatch_common(retval, disp_tab, database, method_name, + defaults, is_r, ap); + va_end(ap); + return (error); +} + +int +nsdispatch_common(void *retval, const ns_dtab disp_tab[], const char *database, + const char *method_name, const ns_src defaults[], int is_r, + va_list ap) { - va_list ap; const ns_dbt *dbt; const ns_src *srclist; nss_method method; @@ -597,9 +625,7 @@ method = nss_method_lookup(srclist[i].name, database, method_name, disp_tab, &mdata); if (method != NULL) { - va_start(ap, defaults); - result = method(retval, mdata, ap); - va_end(ap); + result = method(retval, mdata, is_r, ap); if (result & (srclist[i].flags)) break; }