Date: Thu, 4 Apr 2002 15:17:34 -0500 (EST)
From: Mikhail Teterin <mi@aldan.algebra.com>
To: anarcat@anarcat.dyndns.org
Cc: jhb@FreeBSD.org, imp@village.org, des@ofug.org, pst@pst.org, obrien@FreeBSD.org, cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org, winter@jurai.net, jkh@winston.freebsd.org, rwatson@FreeBSD.org
Subject: Re: cvs commit: src/usr.sbin/sysinstall install.c installUpgrade
Message-ID: <200204042017.g34KHYnF006405@aldan.algebra.com>
In-Reply-To: <20020404181423.GB279@lenny.anarcat.dyndns.org>
[Reply-To set]

On 4 Apr, The Anarcat wrote:
> Indexed packages might take up more space on a CD, but regardless of
> the network connection, it should speed up package installs a 2-fold
> at least.

But if that makes them 15% bigger, I think I'd rather wait. A 15%
increase in download time is more than 100% of the extraction time for
too many people. And then you store the 15% bigger archives forever...

> I'm not sure I understand what you mean by seekable. Some network
> connections (HTTP 1.1 and FTP, IIRC) are seekable, ie you can start
> downloading http files at any given location.

By "seekable" I mean that the same data can be read multiple times.
True, you can do that over the network too, but the net-bandwidth
available to even the most fortunate of us is nowhere close to the
local storage bandwidth.

> The problem is with non-seekable (non-indexed would be the proper
> word) archives. For .tgz (or .tbz2), wether you have the seekable file
> or network connection doesn't matter since you must extract the whole
> file in the order to seek individual files in the archive.
>
> Repeat after me: there's no way to access a given individual file in a
> tar(1) archive without extracting the archive up to the given file.

It is true. ZIP provides a _generic_ index, which is good for many. We
can do better by placing the "important" files, such as the "install"
script or "+CONTENTS", at the beginning of the file at archiving
time... In fact, I think that's what happens now; the package tools
just don't rely on it...

>> What's left are the people, who like to install directly from the
>> network and don't mind redownloading in case of a failure. My
>> guesstimate is those are not big in number and mostly don't care for the
>> method chosen one way or the other...
>
> Choosing an index archive format doesn't mean you can't keep a local
> copy, and actually, right now, libh does keep a copy of the .zip
> locally, as a temporary, yes, but that is a simple toggle.

What I was saying is that if you are likely to have the local copy
anyway, it does not matter that much whether it is indexed or not --
extraction is very fast anyway... Again -- indexing saves you time but
wastes space. Some (myself included) think space is more important.

>> >> And I suspect, those who disagree are simply blinded by their
>> >> blazingly fast connections and fat disks. :-)
>> >>
>> > No, the fact is that we have thought about some of the problems the
>> > current scheme doesn't address and which you haven't apparently
>> > thought about how to address either.
>>
>> Mmm, sounds familiar :( Can you explain, what those are, or point me to
>> the mail archive, where this was discussed?
>
> I can point you to the libh design document on /projects/libh.html.

Ok... I just read it. It does not contain anything that was not
expressed in this thread -- regarding package format, that is. Zip is
advocated as the most suitable in there... And I remain convinced that
the overhead of compressing each file individually (which is what Zip
does) is too much of a price to pay...

The present pkg_add can read +CONTENTS (or whatever the meta-data
file(s) is(are) going to be named) from the beginning of the tarball and
proceed to extracting from the rest of the file, preferably directly
into the right place, or into a temporary directory _on the same
filesystem_, so that the bits can be quickly mv-ed to the right place.

The document describes having to extract into a temporary location as
"evil", which is not necessarily true. If the location is chosen on the
same filesystem as the final destination, there will be enough space,
and there will be very little overhead -- rename(2) is very quick...
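To make the last two paragraphs concrete, here is a rough sketch in Python of the single-pass idea: the meta-data is the first member of the tarball, the reader never seeks, and each file is staged on the destination filesystem and then renamed into place. This is not how pkg_add is implemented. The names build_package and install, the ".staging-" directory, and the treatment of +CONTENTS as a plain whitespace-separated list of file names are all assumptions made for illustration only.

```python
import io
import os
import tarfile
import tempfile


def build_package(files):
    """Pack files (name -> bytes) into a .tgz with +CONTENTS as the first member."""
    buf = io.BytesIO()
    names = ["+CONTENTS"] + [n for n in files if n != "+CONTENTS"]
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in names:
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def install(package_bytes, prefix):
    """Read the package once, front to back, and install its files under prefix."""
    # Stage inside prefix so that staging area and destination share a filesystem.
    staging = tempfile.mkdtemp(prefix=".staging-", dir=prefix)
    wanted, installed = set(), []
    # "r|gz" is a forward-only stream, as with a pipe or a network socket.
    with tarfile.open(fileobj=io.BytesIO(package_bytes), mode="r|gz") as tar:
        for index, member in enumerate(tar):
            if index == 0:
                if member.name != "+CONTENTS":
                    raise ValueError("+CONTENTS must be the first member")
                # Hypothetical format: one file name per whitespace-separated token.
                wanted = set(tar.extractfile(member).read().decode().split())
                continue
            name = member.name
            if not member.isfile() or name not in wanted:
                continue
            if os.path.isabs(name) or ".." in name.split("/"):
                raise ValueError("unsafe path: " + name)
            staged = os.path.join(staging, str(index))
            with open(staged, "wb") as out:
                out.write(tar.extractfile(member).read())
            dest = os.path.join(prefix, name)
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            os.rename(staged, dest)  # same filesystem: no data is copied
            installed.append(name)
    os.rmdir(staging)
    return installed
```

Because the archive is opened as a stream, the same code would work on data arriving from the network, and the only seek-free requirement on the package is that +CONTENTS comes first.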
-mi