Date: Tue, 7 May 2002 09:46:56 -0400 (EDT)
From: Robert Watson
To: Matthew Dillon
Cc: Poul-Henning Kamp, arch@FreeBSD.ORG
Subject: Re: syscall changes to deal with 32->64 changes.
In-Reply-To: <200205070815.g478Fn180961@apollo.backplane.com>

On Tue, 7 May 2002, Matthew Dillon wrote:

> I tried to add new syscalls to the existing vector but so many
> syscalls had to be changed to support 64 bit time_t's that it
> became a huge mess... so much so that I would expect the other
> BSD's to cry foul on us if we tried to do it with the existing
> vector.  It will be far, far cleaner to simply implement an
> entirely new syscall vector / ELF identifier.

Sounds like we actually have relatively firm consensus on this point
[thus far].  The only real concern has been the level of effort and the
chances of completing it in a reasonable time.

> In regards to other things that may need to change size:  A complete
> audit will have to be performed.

I would be happy to take a run through.

> Getting it right the first time is extremely important.  A bunch of
> things starting leaking out of the woodwork as I was playing around
> with 64 bit time_t's.
> At the very least I would pad the structures
> to handle things like 64 bit dev_t, ino_t, and file flags, and I would
> even consider padding now 96 bit structures like timespec (on IA32 with
> 64 bit time_t's + nanoseconds long = 96 bits) to 128 bits across the
> board.  It might also be worthwhile to make uid_t, gid_t, and pid_t
> 64 bits to support probable future work in those areas.

It's certainly worth talking about, in that it would be a good breaking
point for any of these.  One thing to keep in mind is how much inode
growth is acceptable -- we get some as part of the natural process of
moving to 64-bit block pointers, but we shouldn't let it get out of
control, or we risk a serious reduction in cached inode information for
the same memory footprint.

My understanding from talking with Kirk and Poul-Henning is that ino_t is
definitely on the list already, as are a number of the others at the file
system image layer (such as block pointers, et al).  They should already
have much of this underway, and the temptation is to let them do the
grunt work if they're already doing it :-).

dev_t I don't really have an opinion about.  For file flags, last I
checked, the current leaning was to add a new flags field to the inode
for internal system use, and possibly to break the current field out into
two fields.  That would allow moving the snapshot flag out of the normal
field's range and reduce the amount of hard-to-read flag masking in the
UFS2 code.  The system flags would also allow for some EA information
caching, improving performance for ACLs and related things.  All of this
is at the fs image level, however.

No opinion on the time type stuff, but I'm sure phk has an opinion :-).

WRT uid_t and gid_t: I'm not sure if there's enough benefit here.  The
temptation is certainly there, but unlike things like ino_t, this stuff
tends to get mixed up in data files a lot.  Same with time type stuff.
This is more likely to cause problems with persistent application state --
for example, password databases.

> In regards to the 5.0R question that comes up later in the thread...
> I just don't know.  I will say that creating a new syscall vector
> cannot be done piecemeal... you have to get it *all* in from the
> get-go or you create huge issues with things like bootstrapping new
> systems and general compatibility and useability, etc..

Well, I think it's possible to start doing back-end stuff in the kernel
and just "cast down", but I haven't thought it through much yet.  Where
we'll really need a flag day is with the default compiled ABI for user
binaries.  It depends how we handle the change to some of the types, I
suppose -- whether we use "holding types" during a changeover, etc.  One
of the reasons I mentioned the "one week" figure in my earlier responses
is that when we do the cut-over, we need to do it decisively and without
a lot of lag in getting stuff working.  At the filesystem level,
introducing new types doesn't hurt us too much (ufs_ino_t, ufs2_ino_t, et
al), but at the system call layer it could be that doing so would hurt
too much.

I guess the one opinion I haven't heard yet, and am a little surprised
not to have heard, is: No, we shouldn't do this on architectural grounds.
We've heard "yes" in various flavors, including a moderated "yes if we
can manage it by the release".  Not to invite a bikeshed, but if there's
going to be a strong argument against such a change, it would be nice to
hear it sooner.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services