From: Matthew Dillon
Date: Mon, 17 Nov 2003 11:54:04 -0800 (PST)
To: Doug White
cc: freebsd-current@freebsd.org
Subject: Re: HEADS-UP new statfs structure condidered harmful
Message-Id: <200311171954.hAHJs4YD085915@apollo.backplane.com>

:Well, there's some glue there now, but it's pretty slim.  What you
:advocate would swap system call numbers for doing structure reloading per
:call, which would significantly increase the cost of the call.
:Considering that *BSD system call overhead is pretty bad as is, I don't
:think I'd be putting structure recopies into the critical path of a
:syscall.
:
:--
:Doug White                    |  FreeBSD: The Power to Serve

Umm, no.
I'm not sure why you are taking such a negative view of things; the
actual implementation is a whole lot simpler than you seem to believe.

What we will be doing is adding new system calls to replace *stat() and
*statfs().  They will for obvious reasons not be named the same, nor
would the old system calls be removed.

The new system calls will generate a capability list into a buffer
supplied by userland, which is really no different from the copyout that
the old system calls already do.  The only difference is that the
userland libc function that takes over the *stat() and *statfs()
functionality using the new system calls (obsoleting the original system
calls) will have to loop through the capability list and populate the
user-supplied statfs or stat structure from it.  Since the returned
capability list is simply a stack-based buffer there won't be any cache
contention and the data will already be in the L1 cache.  My guess is
that it would add perhaps 150ns to these system calls, compared to the
3-5us they already take for the non-I/O case.

The capability list would be 'chunky', e.g. one capability record would
represent all three timespecs, another record would represent uid and
gid, another record would represent file size and block count, and so
forth.

The key point is that the individual capability elements would not
change, ever.  Instead, if a change is needed a new capability element
would be added, and an argument to the new syscalls will let the system
know whether it needs to generate the older elements that the newer ones
replace.  Userland will ignore capabilities it does not understand.  The
result is full forwards and backwards compatibility, forever.

I do not believe there is any performance impact at all, especially if
stat has to go do I/O.  If you care about performance then I recommend
that you fix the syscall path in 5.x instead of worrying yourself over
stat().
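To make the "userland ignores capabilities it does not understand" point concrete, here is a minimal self-contained sketch of a record walker.  Every name in it (cap_header, CAP_UIDGID1, CAP_SIZE1, CAP_FUTURE, mini_stat, cap_parse and the helpers) is hypothetical and invented for illustration, as are the record layouts and type numbers; it assumes only that each record begins with a type and a total length.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record types; the numeric values are invented for this sketch. */
enum { CAP_UIDGID1 = 1, CAP_SIZE1 = 2, CAP_FUTURE = 99 };

/* Assumed layout: every record starts with its type and total length. */
struct cap_header {
    uint32_t c_type;
    uint32_t c_len;             /* header plus payload, in bytes */
};

struct cap_uidgid1 { struct cap_header h; uint32_t uid, gid; };
struct cap_size1   { struct cap_header h; uint64_t size, blocks; };
struct cap_future  { struct cap_header h; uint64_t unknown[2]; };

/* Cut-down stand-in for struct stat. */
struct mini_stat { uint32_t uid, gid; uint64_t size, blocks; };

/*
 * Walk the list, copy the records we know, and step over the rest.
 * Returns the number of records skipped, or -1 if the list is malformed.
 */
static int
cap_parse(const unsigned char *buf, size_t len, struct mini_stat *st)
{
    size_t off = 0;
    int skipped = 0;

    while (off < len) {
        struct cap_header h;

        if (len - off < sizeof(h))
            return (-1);
        memcpy(&h, buf + off, sizeof(h));   /* avoids alignment traps */
        if (h.c_len < sizeof(h) || h.c_len > len - off)
            return (-1);                    /* also rules out c_len == 0 */

        switch (h.c_type) {
        case CAP_UIDGID1: {
            struct cap_uidgid1 r;
            if (h.c_len < sizeof(r))
                return (-1);
            memcpy(&r, buf + off, sizeof(r));
            st->uid = r.uid;
            st->gid = r.gid;
            break;
        }
        case CAP_SIZE1: {
            struct cap_size1 r;
            if (h.c_len < sizeof(r))
                return (-1);
            memcpy(&r, buf + off, sizeof(r));
            st->size = r.size;
            st->blocks = r.blocks;
            break;
        }
        default:
            skipped++;          /* unknown record: rely on c_len to skip it */
            break;
        }
        off += h.c_len;
    }
    return (skipped);
}

/* Append one record to buf at *off; cap is the buffer capacity. */
static void
cap_put(unsigned char *buf, size_t cap, size_t *off, const void *rec, size_t n)
{
    assert(*off + n <= cap);
    memcpy(buf + *off, rec, n);
    *off += n;
}

/* Build a sample list holding two known records and one unknown one. */
static size_t
cap_demo_build(unsigned char *buf, size_t cap)
{
    struct cap_uidgid1 ug = { { CAP_UIDGID1, (uint32_t)sizeof(ug) }, 1001, 20 };
    struct cap_future  fu = { { CAP_FUTURE,  (uint32_t)sizeof(fu) }, { 7, 7 } };
    struct cap_size1   sz = { { CAP_SIZE1,   (uint32_t)sizeof(sz) }, 4096, 8 };
    size_t off = 0;

    cap_put(buf, cap, &off, &ug, sizeof(ug));
    cap_put(buf, cap, &off, &fu, sizeof(fu));
    cap_put(buf, cap, &off, &sz, sizeof(sz));
    return (off);
}
```

Feeding the buffer from cap_demo_build() to cap_parse() fills in uid, gid, size and blocks and reports one skipped record.  The parser needs no knowledge of CAP_FUTURE beyond its length, which is what lets an older libc keep working against a newer kernel.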
If a particular program really needs to save the 150ns, say 'find', then
it can call the new system call directly.  But I really doubt anyone
would notice 'find' running any slower.  I certainly care a great deal
about performance in DragonFly and I am not worried about the capability
idea's impact *AT* *ALL*.

The userland implementation would be something like this:

    int
    stat(const char *file, struct stat *st)
    {
        char tmpbuf[SMALLBUF];      /* stat info is expected to fit */
        char *buf = tmpbuf;
        int off;
        int len;
        struct stat_cap_header *cap;

        /*
         * Run the system call.  Try a small buffer first (designed to
         * succeed for the current version of the OS).  If it fails then
         * allocate a larger buffer (compatibility with future OSs that
         * might provide more information).
         */
        if ((len = stat_cap(file, buf, STAT_CAP_STDFIELDS)) < 0) {
            if (errno != E2BIG)
                return(-1);
            /* the header left in tmpbuf carries the size to allocate */
            buf = malloc(((struct stat_cap_header *)tmpbuf)->c_len);
            if (buf == NULL)
                return(-1);
            if ((len = stat_cap(file, buf, STAT_CAP_STDFIELDS)) < 0) {
                free(buf);
                return(-1);
            }
        }

        /*
         * Populate the stat structure (this could be common code for
         * all stat*() calls).
         */
        off = 0;
        while (off < len) {
            cap = (struct stat_cap_header *)(buf + off);
            switch(cap->c_type) {
            case STAT_TIMESPEC1:
                st->st_atimespec = cap->c_timespec1.atimespec;
                st->st_mtimespec = cap->c_timespec1.mtimespec;
                st->st_ctimespec = cap->c_timespec1.ctimespec;
                break;
            case STAT_UIDGID1:
                st->st_uid = cap->c_uidgid1.uid;
                st->st_gid = cap->c_uidgid1.gid;
                break;
            ...
            }
            off += cap->c_len;      /* unknown types are simply skipped */
        }
        if (buf != tmpbuf)
            free(buf);
        return(0);
    }

                                        -Matt