From owner-freebsd-hackers Sat Feb 1 23:07:53 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5)
          id XAA04990 for hackers-outgoing; Sat, 1 Feb 1997 23:07:53 -0800 (PST)
Received: from skynet.ctr.columbia.edu (skynet.ctr.columbia.edu [128.59.64.70])
          by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id XAA04985
          for ; Sat, 1 Feb 1997 23:07:46 -0800 (PST)
Received: (from wpaul@localhost) by skynet.ctr.columbia.edu (8.6.12/8.6.9)
          id CAA01954; Sun, 2 Feb 1997 02:04:37 -0500
From: Bill Paul
Message-Id: <199702020704.CAA01954@skynet.ctr.columbia.edu>
Subject: Re: slow ypserv problem
To: tom@sdf.com (Tom Samplonius)
Date: Sun, 2 Feb 1997 02:04:35 -0500 (EST)
Cc: shovey@buffnet.net, hackers@FreeBSD.ORG
In-Reply-To: from "Tom Samplonius" at Feb 1, 97 07:45:21 am
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Of all the gin joints in all the towns in all the world, Tom Samplonius
had to walk into mine and say:

> On Sat, 1 Feb 1997, Steve wrote:
>
> > On Sat, 1 Feb 1997, Tom Samplonius wrote:
> > > Bill knows.
> > >
> > > ypserv was completely re-written for 2.2.

Darn straight.

> > > Tom
> >
> > Is it possible to get just that piece for 2.1? or wouldn't it work
> > under 2.1?
>
> It won't compile without modification on 2.1.x. I asked about this a
> long time ago, and Bill said that you'd have to drag over some rcp
> library stuff too.
>
> Tom

I think you mean RPC. Actually, the RPC library isn't the problem: the
stuff under src/lib/libc/yplib is the problem. It changed quite a lot,
and the new ypserv just won't link on 2.1.x unless you take certain
steps. Also, I recently added async DNS lookups, which require calling
into part of the resolver code in libc; that won't work on 2.1.x either
unless you supply some support code.

Can you do it? Sure: I still run 2.1.0 at home and use it to do most of
my development (though I test on a 2.2 machine too). But I've already
tried to explain to people how to do it, and I get the feeling I just
leave them confused.

For the record, the main reason the 2.1.x ypserv is so slow in this
case (many repeated calls to getpwent() to enumerate the passwd
database) is that it opens/searches/closes the database on each access.
The Linux version may not have this problem since it uses GDBM, and I
think GDBM supports a special 'nextkey' routine which allows sequential
enumeration of a database in an efficient manner. With the Berkeley DB
hash method, there's no simple way to say "here's a key 'foo': find me
the record that goes with this key and then return me the record that
comes immediately _after_ it." Unfortunately, this is exactly what you
need to do for the yp_next() operation. The 2.1.x ypserv fakes this up
by starting at the first record in the database and searching through
it one record at a time until it finds the specified key, then skipping
ahead one record to retrieve the one that follows. (The exception is
yp_all(): in that case the server just spews the whole map from start
to finish, so no lookups are necessary.)

This mess is caused by the fact that, when using the hash method,
Berkeley DB's 'cursor' can only be positioned by performing fetches
with the R_FIRST or R_NEXT flags. Simply fetching an arbitrary record
without any flags does not move the cursor. There is an R_CURSOR flag
which is supposed to let you shift the cursor to an arbitrary location,
but it doesn't work for the hash method.
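To make that a little more concrete, here's a rough sketch of what the
server effectively has to do for every yp_next() with the hash method.
This is not the actual ypserv source (the fake_yp_next() name is made
up for illustration); it's just the Berkeley DB 1.85 seq() pattern the
problem boils down to:

#include <sys/types.h>
#include <limits.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <db.h>

/*
 * NOT the real ypserv code: a sketch of the scan a 2.1.x-style server
 * is forced into.  R_CURSOR can't position the cursor for a hash
 * database, so to answer "what comes after key X?" we walk the map
 * from the top with R_FIRST/R_NEXT until X turns up, then take one
 * more step -- and the map is opened and closed again on every request.
 */
int
fake_yp_next(const char *map, DBT *inkey)
{
        DB *dbp;
        DBT key, val;
        int rv;

        if ((dbp = dbopen(map, O_RDONLY, 0, DB_HASH, NULL)) == NULL)
                return (1);

        /* Linear scan from the first record until we hit the caller's key. */
        for (rv = (dbp->seq)(dbp, &key, &val, R_FIRST); rv == 0;
            rv = (dbp->seq)(dbp, &key, &val, R_NEXT))
                if (key.size == inkey->size &&
                    memcmp(key.data, inkey->data, key.size) == 0)
                        break;

        /* One more R_NEXT retrieves the record that follows the key. */
        if (rv == 0 && (dbp->seq)(dbp, &key, &val, R_NEXT) == 0)
                printf("next key: %.*s\n", (int)key.size, (char *)key.data);

        (dbp->close)(dbp);      /* ...and we do all of this again next time */
        return (rv);
}

Multiply that linear scan by one request per map entry and you can see
where the time in a getpwent() loop goes.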
R_CURSOR does work for the btree method, but I couldn't decide which
method was faster and chose to just stick with hashing.

In 2.2, the server gets around this problem by caching database
handles: after you access a database the first time, the server keeps
it open and thus preserves the Berkeley DB 'cursor' location. On the
next access, the code that would otherwise open the map checks to see
if there's already a handle in the cache that happens to be positioned
at the right location; if so, it uses that handle rather than opening a
new one. In the case of the getpwent() loop, this lets the server open
the database just once. If two clients run the same kind of loop, each
loop ends up with its own cached handle. As more client requests are
received, more handles are cached until all the slots are filled; once
the slot limit is reached, the server starts evicting handles at the
bottom of the list to make room for new ones.

Note that the client-side NIS code in libc has also been changed to
cache handles, though in this case it's RPC client handles rather than
Berkeley DB database handles. Previously, the yplib code would call
clnt_create(), clnt_call() and clnt_destroy() for every NIS call. This
is not necessary: once a client handle has been created with
clnt_create(), you can make as many clnt_call()s with it as you like.
The code now reuses the same handle as often as it can until it
encounters an error; only then will it destroy the binding and try to
create a new one. (There's a rough sketch of the idea below my .sig.)

With any luck, both the server-side and client-side changes should make
the overall performance much better. It's still not as fast as I'd
like, but I think the remaining performance problems stem from the fact
that the transactions involved are very small: the client and server
are both transmitting and receiving a large number of very small UDP
packets. A 'ypcat passwd' is usually many times faster than a
getpwent() loop since yp_all() uses a TCP pipe to send the whole map in
one shot; by contrast, yp_first() and yp_next() can only send one
record at a time. No batching is possible due to the nature of the NIS
protocol, which is too bad, as it would probably help a lot.

-Bill

-- 
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=============================================================================
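For the curious, here's a minimal sketch of the client handle reuse
idea. It is not the actual yplib code: the do_ypcall() name and the
cached_clnt variable are made up for illustration, the YPPROG/YPVERS
defines are just the standard NIS program number and version, and a
null-procedure ping stands in for a real yp_match()/yp_next() call.
The point is simply that clnt_create() happens once, and the binding is
only thrown away when a clnt_call() actually fails:

#include <sys/time.h>
#include <rpc/rpc.h>

#define YPPROG  100004UL        /* NIS RPC program number */
#define YPVERS  2UL             /* NIS protocol version 2 */

static CLIENT *cached_clnt;     /* one binding, reused across calls */

/*
 * NOT the real yplib code: just the create-once/call-many pattern.
 */
enum clnt_stat
do_ypcall(const char *server)
{
        struct timeval tv = { 10, 0 };
        enum clnt_stat rv;

        /* Create a client handle only if we don't already have one cached. */
        if (cached_clnt == NULL) {
                cached_clnt = clnt_create(server, YPPROG, YPVERS, "udp");
                if (cached_clnt == NULL)
                        return (RPC_FAILED);
        }

        /* Null-procedure ping in place of a real NIS request. */
        rv = clnt_call(cached_clnt, NULLPROC,
            (xdrproc_t)xdr_void, NULL, (xdrproc_t)xdr_void, NULL, tv);

        /* Only an error makes us destroy the binding and rebind next time. */
        if (rv != RPC_SUCCESS) {
                clnt_destroy(cached_clnt);
                cached_clnt = NULL;
        }
        return (rv);
}

The real yplib is more involved than this, but the create-once,
call-many pattern is what buys the speedup.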