From owner-freebsd-hackers Thu May 16 16:19:50 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id QAA14667 for hackers-outgoing; Thu, 16 May 1996 16:19:50 -0700 (PDT)
Received: from DATAPLEX.NET (SHARK.DATAPLEX.NET [199.183.109.241]) by freefall.freebsd.org (8.7.3/8.7.3) with ESMTP id QAA14662 for ; Thu, 16 May 1996 16:19:46 -0700 (PDT)
Received: from 199.183.109.242 by DATAPLEX.NET with SMTP (MailShare 1.0fc5); Thu, 16 May 1996 18:15:59 -0600
Message-ID:
Date: 16 May 1996 18:15:46 -0500
From: "Richard Wackerbarth"
Subject: Standard Shipping Containers - A Proposal for Distributing FreeBSD
To: "FreeBSD Hackers"
Cc: "FreeBSD Current", "freebsd-stable@freebsd.org"
X-Mailer: Mail*Link PT/Internet 1.6.0
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

I see what appears to me to be a problem in the distribution of FreeBSD sources. I also propose a solution. I welcome your discussion.

Richard

The Problem:

There are too many different variations of the same basic information.

The Product:

There are, and logically should be, four different "product lines". At the moment, they are 2.1, 2.2, "current", and "cvs". Each has its purpose, and I don't intend to comment on that except to note that the similarities among the first three exceed their differences.

The Distribution:

There are seven distribution channels upon which I will comment.

1) Direct access to the master tree. This really applies only to the cvs tree and is "the only way to go" for committers who are well connected.
2) Using "mirror".
3) Using "mirror" with directory listing cached on the server.
4) Using "sup".
5) Using "ctm".
6) Using distribution tarballs.
7) Using the "live file system" from CD.

Characteristics of the Distribution Mechanisms.

a) Only (1) and (2) provide "up to the minute" copies. All the rest give only a snapshot at server-defined intervals. However, (1) and (2) exert an extremely heavy load on the server.
   The remainder compromise (in a reasonable manner) by reusing the tree scan for multiple users at the expense of a delay in the update.
b) (3) and (4) are functionally similar.
c) (1) through (5) offer incremental updates.

The Specific Difficulty.

Each distribution mechanism has its own way of getting started. If I start with a clean disk, I must obtain a very large (28M compressed for the whole source) "update" to get started. In general, I cannot use the results of another distribution in place of a large portion of that transfer.

CTM is perhaps better in that with it, we can create an update to transform one tree into another. However, it is significant work to attempt to identify and create the transformations from multiple starting points.

The Proposal.

Since all the reasonable distribution mechanisms are based upon server-initiated snapshots, I suggest that, for each product, we do the following:

1) Have a single mechanism to define the snapshots that will be delivered. Then assure that everyone delivers exactly the same "product".
2) Include with that distribution the identifier(s) which would allow a user to use that distribution as a starting place for another distribution method. (In the case of CTM, this would mean the .ctm_status file.)

Suggested details.

1) Since we are running CTM for each of the products, I would start by having the CTM servers define the snapshots. The .ctm_status file would then become a part of the source tree and everyone would distribute it. In particular, it would get included on the sup servers, in the distribution tarballs, and on the live file system CD. This would allow anyone who has a copy of the tree from any of these sources to update it by applying the ctm files.
2) I would also make available the directory of sup update keys. Although the keys on the CD should match that distribution, they do not have to be maintained totally up-to-date.
   If you use a slightly out-of-date version, sup will simply replace a few additional files.
3) In order to coordinate these events, the sup servers would trigger their updates on the basis of the receipt of a ctm update.
4) In preparing a CD-ROM, we need to either
   a) freeze the source tree far enough in advance of the release to allow the updates to make the update circuit, or
   b) freeze the update circuit and anticipate the effect of the final update, or
   c) use a combination of the two. Freeze the ctm updates before the fact. Allow the sup update to propagate for inclusion on the CD. Anticipate the ctm update by adding one to the last count propagated if there were any changes. After the CD is frozen, use it to generate the next input to the ctm sequence.

Conclusions:

1) Such a methodology will assure that it is easy for any user to jump from a CD or tarball to ctm or sup without having to re-acquire the bulk of the sources.
2) Sup can be used to repair a damaged tree when a complete ctm sequence is not available locally.
3) Ctm can be used for routine updates to avoid transferring the entire file to realize a minor change.
4) We need to enhance ctm to allow it to recognize intentionally pruned trees and ignore that portion of the update. (The argument for this conclusion was not included in this document.)

--
...computers in the future may have only 1,000 vacuum tubes and weigh only 1/2 tons.
   -- Popular Mechanics, March 1949
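
Illustration of conclusion 1: a minimal sketch of how a tree obtained from a CD or tarball could be brought forward with ctm once the .ctm_status file ships with it. It assumes the status file holds one line of the form "name number" (for example "src-cur 1700") and that deltas are named "name.number.gz"; both conventions, and the function names, are assumptions made for this example and are not taken from the proposal. The sketch only selects which deltas to apply; it does not run ctm.

```python
import re


def parse_ctm_status(text):
    """Split an assumed 'name number' status line into (name, int)."""
    name, number = text.split()
    return name, int(number)


def pending_deltas(status_text, available):
    """Return the delta filenames to apply next, in order.

    Only deltas numbered above the one recorded in the status text
    are considered, and the list stops at the first gap, since each
    delta builds on the one before it.
    """
    name, applied = parse_ctm_status(status_text)
    # Assumed naming convention: <name>.<number>.gz
    pattern = re.compile(r"^" + re.escape(name) + r"\.(\d+)\.gz$")

    found = {}
    for filename in available:
        match = pattern.match(filename)
        if match and int(match.group(1)) > applied:
            found[int(match.group(1))] = filename

    ordered = []
    expected = applied + 1
    while expected in found:
        ordered.append(found[expected])
        expected += 1
    return ordered


if __name__ == "__main__":
    # Hypothetical tree from a CD whose status file records delta 1700.
    status = "src-cur 1700\n"
    files = [
        "src-cur.1699.gz",
        "src-cur.1701.gz",
        "src-cur.1702.gz",
        "src-cur.1704.gz",  # 1703 is missing, so this one must wait
        "src-2.1.0005.gz",  # belongs to a different product line
    ]
    for delta in pending_deltas(status, files):
        print(delta)
```

Stopping at the first gap is a deliberate choice in this sketch: a missing delta is the case where, per conclusion 2, sup could be used to repair or catch up the tree instead.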