Date: Mon, 2 Jun 2003 15:24:18 -0700 (PDT)
From: Matthew Dillon
Message-Id: <200306022224.h52MOIwj002729@apollo.backplane.com>
To: Gordon Tetlow
References: <20030602171942.GA87863@roark.gnf.org> <20030602202947.GE87863@roark.gnf.org> <200306022125.h52LPhhc002291@apollo.backplane.com> <20030602214956.GG87863@roark.gnf.org>
cc: arch@freebsd.org
cc: Dag-Erling Smorgrav
Subject: Re: Making a dynamically-linked root
List-Id: Discussion related to FreeBSD architecture

:Actually, it was a diskless boot, so it was in the system cache. =) I
:know this is a rigged demo, but the point is the same, yes, it's slower,
:but we also have a huge gain from going to a dynamically linked world.
:It would also serve as encouragement to get things like pre-binding and
:caching working.
:
:-gordon

Ah, but you are still waiting on 'disk I/O'... it just happens to be *network* disk I/O, so it doesn't matter if it's in the server's cache or not.
A lot of the delay is due to the client program stalling until the page is faulted in over the network (and not doing any other work in the meantime), then running for a few cycles and stalling again waiting for the next random page to be faulted in.

Another big issue with the diskless code is the path cache. Whenever a shell script runs a program by bare name (like 'ls' instead of '/bin/ls'), the shell tries to stat the program file in each directory of the search path. With a local disk the local system's name cache is coherent and these operations are nearly instantaneous. Over NFS, however, the same paths are retested over the network again and again, leading to a massive perceived slowdown. Even retesting a good path like /bin/sh often generates NFS traffic looking up "/bin/sh" over and over again.

For example, if in one window you start a tcpdump and monitor port 2049 (typically nfsd), and in another window you run /bin/sh, you will see at least 3 NFS lookups. If you exit the shell and run it again you will see the same 3 NFS lookups again. And again, and again. This alone is probably responsible for most of the rc script slowdown. It is probably all the path lookups on the dynamic link libraries at program startup that are causing the problem, not exec() per se.

If you think running /bin/sh produces a lot of NFS traffic, try running '/usr/bin/nm' without any arguments and look at the NFS traffic. /usr/bin/nm, being a dynamic executable, will do no fewer than 14 uncacheable synchronous NFS operations just to deal with its shared libraries.

-Matt
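
To make the search-path cost described above concrete, here is a minimal sketch in Python. It is a toy model, not the shell's or the kernel's actual code: the function names (probe_path, count_lookups), the fake file list, and the four-directory search path are all made up for illustration. Each "lookup" simply stands in for one stat that would have to go over NFS when the client cannot trust its name cache.

```python
# Toy model of a shell-style search-path lookup (all names here are hypothetical).
FAKE_FS = {"/bin/ls", "/bin/sh", "/usr/bin/nm"}          # files that "exist"
PATH_DIRS = ["/sbin", "/bin", "/usr/sbin", "/usr/bin"]   # assumed search path


def probe_path(command, path_dirs, exists):
    """Return (found_path, probes): the candidate paths tested, in order."""
    probes = []
    if "/" in command:
        # A name containing a slash is used as-is; no directory search.
        probes.append(command)
        return (command if exists(command) else None), probes
    for directory in path_dirs:
        candidate = f"{directory}/{command}"
        probes.append(candidate)
        if exists(candidate):
            return candidate, probes
    return None, probes


def count_lookups(commands, path_dirs, exists, coherent_cache):
    """Count lookups that would hit the server.

    With coherent_cache=True, a path that was already tested is answered
    locally (the local-disk case). With False, every test goes to the
    server again (the pessimistic diskless/NFS case in this model).
    """
    seen = set()
    total = 0
    for command in commands:
        _, probes = probe_path(command, path_dirs, exists)
        for path in probes:
            if coherent_cache and path in seen:
                continue
            total += 1
            seen.add(path)
    return total


if __name__ == "__main__":
    script = ["ls", "ls", "nm"]
    exists = FAKE_FS.__contains__
    print("no cache:      ", count_lookups(script, PATH_DIRS, exists, False))
    print("coherent cache:", count_lookups(script, PATH_DIRS, exists, True))
```

In this model, running ls, ls, nm costs 8 lookups when nothing can be cached and 6 when repeat lookups are free. The gap widens with every repeated command, which is the pattern an rc script full of bare command names would produce.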