Date: Mon, 24 May 2004 00:07:34 -0400
To: freebsd-ports@freebsd.org
From: Garance A Drosihn
Subject: Third "RFC" on on pkg-data ideas for ports

[this is BCC'ed to -hackers and -arch just so everyone has a chance
to see it, but I expect the bulk of the discussion should take place
on the freebsd-ports mailing list]

Well, Darren and I have done more work on my "pkg-data" ideas, but
we're also getting closer to the time when Darren will be busy with
his own full-time job, at which point the progress on this will be
much slower. So, I'd like to show some of what we've been working
on, make a third proposal, and see if this one is interesting enough
for us to pursue. If not, then I'll probably just update my web
pages with my thoughts so far, and then put this whole idea on a
back-burner.
[and if you thought progress was slow before, imagine how slow it
will be when moved to a back-burner!]

In the last go-round, someone pointed out that it could be helpful
just to have a better idea of what the ports-collection really *is*.
So we took some time to write a script which goes through a ports
collection and gathers some statistics on what files exist (on a
per-port basis), and how much room they take up. I'll post some
results of that script as a follow-up to this message (that reply
will only go to freebsd-ports). So, hopefully that information will
be of some interest even if we never do anything with the pkg-data
ideas.

Someone else (whose name I also forget) said something which focused
my attention a bit more on patch-files per se, and how they really
aren't the same as the other files I'm trying to collapse into
pkg-data. Also, I haven't gotten quite as far along with figuring
out what to do with pkg-descr files, so (in the interests of time),
I think I'll "leave those alone" for this proposal.

We've worked on some other ideas too, but those aren't far enough
along yet. So I'll just write them up as "future work" (when I
update the web pages...).

The third proposal is basically:

  a) move most "standard" files into a new pkg-data file, as
     described in previous proposals, except for pkg-descr and
     "patch" files.

  b) create a new directory at the root directory of the ports
     collection. That directory would be called "Patches", and
     inside would be a directory for each category. Inside each
     Patches/category directory would be a single file for each
     port in that category, where that single file would have all
     the "ports-collection patches" for the matching port.
  c) [minor] in the pkg-data section for distinfo, I'd like to
     change the format for each file from, eg:

         MD5 (bash-2.05b.tar.gz) = 5238251b4926d778dfe162f6ce729733
         SIZE (bash-2.05b.tar.gz) = 1956216

     to

         5238251b4926d778dfe162f6ce729733 1956216 bash-2.05b.tar.gz

So this proposal collapses most standard files into the pkg-data
file, and collapses the patch-related files for a given port into
files such as:

    ports/Patches/shells/patches-bash2

This will not result in as dramatic a drop in inodes, but it has the
nice side-effect that Patches are separated from all the other
files. Thus, end-users could 'cvsup refuse' the patches for
categories that they do not care about, and it would not break
operations which work on the entire ports collection (such as
`make index').

Our current transform script doesn't do part 'c' yet, but I thought
it would be interesting to note the result of 'b':

    (63) du -sk pd-new/ports pd-new/ports/Patches
    190944  pd-new/ports
     28414  pd-new/ports/Patches
    162530  == "ports without the Patches"

And to compare the present ports collection to a transformed ports
collection, the result would look like:

    1K-blocks    Inodes
         Used      Used
       238742     79154   pd-orig/ports
       190944     49321   pd-new/ports
          20%       37%   = reduction

So, should we pursue any of this?

-- 
Garance Alistair Drosehn            =   gad@gilead.netel.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu
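
[Since the transform script doesn't do part 'c' yet, here is a
minimal sketch of what that conversion could look like. It is not
the actual script. The function name collapse_distinfo is
hypothetical, and the sketch assumes a distfile is only emitted when
both its MD5 and SIZE lines are present; anything else is skipped.]

```python
import re

# Matches the current distinfo lines, e.g.
#   MD5 (bash-2.05b.tar.gz) = 5238251b4926d778dfe162f6ce729733
#   SIZE (bash-2.05b.tar.gz) = 1956216
_DISTINFO_LINE = re.compile(r"^(MD5|SIZE) \((.+)\) = (\S+)$")


def collapse_distinfo(text):
    """Return one 'md5 size filename' string per distfile.

    Hypothetical helper: distfiles missing either their MD5 or their
    SIZE line are skipped (an assumption, not part of the proposal).
    """
    entries = {}  # filename -> {"MD5": ..., "SIZE": ...}, in first-seen order
    for raw in text.splitlines():
        match = _DISTINFO_LINE.match(raw.strip())
        if match is None:
            continue  # blank or unrecognized line
        key, name, value = match.groups()
        entries.setdefault(name, {})[key] = value

    return [
        f"{fields['MD5']} {fields['SIZE']} {name}"
        for name, fields in entries.items()
        if "MD5" in fields and "SIZE" in fields
    ]


if __name__ == "__main__":
    sample = (
        "MD5 (bash-2.05b.tar.gz) = 5238251b4926d778dfe162f6ce729733\n"
        "SIZE (bash-2.05b.tar.gz) = 1956216\n"
    )
    for line in collapse_distinfo(sample):
        print(line)
```

Putting the checksum and size first keeps the filename as the last
field, so each distfile takes one line instead of two.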