Date: Tue, 29 Jul 2003 16:03:16 -0400 From: Mike Makonnen <mtm@identd.net> To: "Jacques A. Vidrine" <nectar@FreeBSD.org> Cc: kris@obsecurity.org Subject: Re: making nsswitch(3) thread-safe (was Re: Removal of bogus gethostbyaddr_r()) Message-ID: <20030729200315.GA53927@kokeb.ambesa.net> In-Reply-To: <20030623150627.GD82167@madman.celabo.org> References: <20030618210812.GB21622@rot13.obsecurity.org> <Pine.GSO.4.10.10306182206300.3647-100000@pcnet5.pcnet.com> <20030619055933.FNBD4805.out003.verizon.net@kokeb.ambesa.net> <20030623150627.GD82167@madman.celabo.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Ok Jacques, I have abandoned the method I was thinking of doing it in. Changes to third-party nss modules is definitely a show-stopper IMO. So, this email is essentially to let you know I have decided to use the modifications from PR: bin/29581 (as posted by sobomax a while back) as a starting point since it generally got a good response. For those who don't remember, it's the one that uses the POSIX thread-local storage primitives. BTW, in case it's not clear what I'm talking about: I won't be touching the nsswitch stuff, I'll just be making the the gethostbyXXX functions thread-safe. The essence of the patch won't change, I'll just be modifying the backend to make some things less static, and also hiding the whole thing behind __isthreaded. I should be done in a few days and will post it here after some testing. On Mon, Jun 23, 2003 at 10:06:27AM -0500, Jacques A. Vidrine wrote: > [Sorry for the late reply. As you may or may not have noticed :-) > my FreeBSD hiatus continues :-( I'll be back, just not sure when > yet.] > > On Thu, Jun 19, 2003 at 01:59:32AM -0400, Mike Makonnen wrote: > > > > [ Jacques, I'm CC'ing you since this affects nss ] > > Thanks, Mike! -- for CC'ing me and for looking at this in the first > place. > > > I just took a look into what it would take to make the gethostby*_r functions > > thread safe, and it isn't pretty. The "thread-unsafeness" goes all the way > > down into the individual methods for the various nsswitch(3) databases (dns, > > files, etc). So, the most expedient and least invasive way to go about it would > > be to put a mutex in the gethostby* functions that is dependent on __isthreaded > > being true. This, offcourse, means that the entire call-chain (from top to > > bottom) is locked for the duration of a call to one of these functions, which > > can be a considerable amount of time if it has to timeout. During this time no > > other thread in the application can resolve a host because it will be blocked on > > said mutex. > > You couldn't take this approach even if you wanted. No part of libc > should hold a lock while making calls through nsdispatch. NSS modules > are free to call back into libc. Imagine holding a lock in > gethostbyname_r, calling through nsdispatch and getting to, say, the > LDAP backend. The LDAP backend might itself invoke gethostbyname_r > or some other nsdispatch consumer resulting in recursive locking or > lock order reversals. > > > To my mind the way to fix this properly is to make the nsdispatch(3) > > call-chain in libc thread-safe from the ground-up. This is a huge task in and of > > itself, but is complicated even further by the fact that whatever we do can't > > change the semantics of the existing user-visible functions. > > Yes, this is the approach that must be taken. The nsdispatch engine > is itself thread-safe. The getpw*_r, getgr*_r functions are > thread-safe. I believe that with the exception of the `dns' backend, > the existing backends are thread-safe. > > > I think there are three points to keep in mind: > > > > 1. The nsswitch(3) database functions should be made thread-safe, but at the > > same time not change the thread-unsafe semantics that third-party apps > > might expect (through nsdispatch(3)). > > 2. The thread-unsafeness starts at the bottom (low-level functions) of the > > call-chain, in the nsswitch(3) database functions. > > 3. Because so many databases (hosts, passwd, group, etc) will be affected by > > this it would be too huge a task to do it all at once. The safest approach > > is probably to introduce the changes separately a few at a time. > > What do you mean by the `nsswitch(3) database functions' ? Do you > mean the backends? If so, yes, the backends must ultimately be > thread-safe, and it is best to manage them one database at a time > i.e., all built-in backends --- files, nis, dns --- for a database > must be done at once. > > > We can achieve this by > > Introducing a variable that gets passed down the call-chain telling the > > nsswitch(3) database functions whether the "destination address" (return > > value) will by allocated by the caller or not. If this variable is false then > > the routines can use a static buffer, otherwise, they will assume the caller has > > allocated the necessary space. > > > > Most of the functions will need a lot more work than this to make them > > fully thread-safe. The advantage of doing it this way is that it maintains > > the status-quo while allowing us to make the individual subsystems > > thread-safe separately and as time and resources permit. > > > > The following patch shows what I have in mind. It won't > > compile just yet, but it might make what I'm saying a little clearer :-) > > This initial phase is really just a mechanical change of argument lists. It > > doesn't introduce any thread-safeness on its own. It just makes it easier to > > introduce such changes safely on a subsytem by subsystem basis. > > > > Basically, the _nsdispatch() functionality is moved into nsdispatch_common(), > > which takes a va_list as its last argument instead of a variable number of > > arguments. It also takes one additional argument(int is_r) so callers can tell > > it what kind of semantics they want. Applications calling nsdispatch() get a > > wrapper that passes a false (0) value in the is_r argument. Callers from within > > libc will continue calling_nsdispatch(), which also grows an additional argument > > (int is_r) that callers can pass down to the nsswitch(3) database functions > > through nsdispatch_common(). > > > > What do you think? Should I continue with this or do you think it's bogus? > > > > The important point to keep in mind is that we have to keep the dual semantics > > (thread-safe and unsafe) all the way down the call-chain because only the > > top-level and low-level functions know the actual memory type and size that > > needs to be allocated for the return value to the library users. So, we can't > > make the low-level nsswitch(2) database functions unconditionally thread-safe > > and restrict the dual semantics to the top-level functions. > > I think this is bogus :-) I must say I don't really understand what > you are trying to accomplish ... maybe we should get on IRC and chat. > Also, please look at getpwent.c to see how thread-safe versus > non-thread-safe entry points to getpw* are handled if you have not > already. I can't see any need to add this `is_r' argument. I don't > see the purpose of it. > > The hardest part of making the thread-safe gethost*_r functions, IMHO > (after having done some work in this area) is untangling the > KAME-contributed code. There is a lot of incest between the > implementations of the various high-level resolver routines > (gethostbyname, getaddrinfo, getipnodebyname, etc.) that complicates > things. But the approach is conceptually simple: > > 1. Write gethostbyname_r (thread-safe), which invokes > nsdispatch("gethostbyname_r", ...). > > 2. Make each back-end thread-safe. Starting with a global lock is > reasonable. I don't think we can ever do anything efficient > with the BIND 4 resolver, anyway. > > Only I think you want to do the whole family of functions in one blow, > not one-at-a-time. (But maybe I'm wrong here, 'cause that has > certainly kept me from committing any partial work in this area.) > > Note also that the libc<-->NSS modules interface is constrained. > Nominally we use the GNU libc interface, which has different entry > points for e.g. gethostbyname and gethostbyname_r, with a set number > and type of parameters. We don't want to change these interface lest > we make porting NSS modules to FreeBSD harder. We can add some > interfaces, though, where needed (i.e. getaddrinfo). > > You can contact me in a number of ways to discuss: > > - EFnet nickname `nectar' (or whatever nick shows up for > `nectar@free.bsd') > - AIM `nectar5' > - ICQ `17783107' > - Yahoo! `nectar' > - MSN `nectar523@hotmail.com' > - Voice 612-743-3364 > > It would be best if we could `schedule' something. I apologize, but > my freetime is very slim lately. > > Cheers, > -- > Jacques Vidrine . NTT/Verio SME . FreeBSD UNIX . Heimdal > nectar@celabo.org . jvidrine@verio.net . nectar@freebsd.org . nectar@kth.se -- Mike Makonnen | GPG-KEY: http://www.identd.net/~mtm/mtm.asc mtm@identd.net | D228 1A6F C64E 120A A1C9 A3AA DAE1 E2AF DBCC 68B9 mtm@FreeBSD.Org| FreeBSD - Unleash the Daemon!
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20030729200315.GA53927>