Date: Sun, 20 Aug 2006 16:23:24 -0700
From: Doug Barton <dougb@FreeBSD.org>
To: Andrew Pantyukhin <infofarmer@FreeBSD.org>
Cc: FreeBSD Ports <ports@FreeBSD.org>, portmgr@FreeBSD.org
Subject: Re: Enforcing "DIST_SUBDIR/DISTFILE" uniqueness
Message-ID: <44E8EEEC.3040907@FreeBSD.org>
In-Reply-To: <cb5206420608200158q22edef00jd53e646439207149@mail.gmail.com>
References: <cb5206420608160931q65adc8fft6084e7f498b403f5@mail.gmail.com>
 <cb5206420608190944o5c07dbefwfdf50586ae23ef5a@mail.gmail.com>
 <44E81C12.9050306@FreeBSD.org>
 <cb5206420608200158q22edef00jd53e646439207149@mail.gmail.com>
I'm combining both of your responses to save time.

Andrew Pantyukhin wrote:
> On 8/20/06, Doug Barton <dougb@freebsd.org> wrote:
>> Andrew Pantyukhin wrote:
>> > On 8/16/06, Andrew Pantyukhin <infofarmer@freebsd.org> wrote:
>> >> I'd like to propose a policy to enforce a change in
>> >> DIST_SUBDIR whenever a distfile is rerolled in-place, i.e.
>> >> when checksum changes, but name stays unchanged.
>> >>
>> >> Moreover, effort should be made whenever possible to
>> >> make the old file available for download from an
>> >> alternative location.
>> >>
>> >> This policy will rid us of some fetch-related headaches.
>> >> It also will make it possible to share distfiles between
>> >> hosts with ports trees of different dates. Some rare issues
>> >> might also be resolved as a result of this. For one, ftp
>> >> mirrors could be configured to allow upload, but deny
>> >> modification and/or deletion.
>> >>
>> >> One thing I would personally frown upon is using
>> >> something like "fetch -o othername" to save a file with a
>> >> different name. It looks all right, but it prevents us from
>> >> looking for mirrors in an automated way when master
>> >> sites go down.
>> >
>> > Well, if no one is really against,
>>
>> I am violently against this proposal, but I was really hoping
>> that someone else would speak up first.
>
> No need to be that violent, pal. Nothing's been set in stone yet
> and the reason for me writing here is to discuss it, not fight
> over it.

My intention is not to fight over it either. If the terminology is
problematic for you, feel free to substitute "very strongly opposed"
instead.

>> > I'll start preparing statements
>> > for documentation and thinking about a way to watch for
>> > "violations". I also intend to go through CVS and find past
>> > "offenders" to prod them about it.
>> >
>> > The recent openoffice update rerolled a file in-place, and while
>> > it may seem irrelevant or even beneficial (erasing 286Mb of
>> > the old file), the fact is that it prevents us from keeping distfile
>> > history on unversioned file servers,
>>
>> IMO this represents a very small minority of FreeBSD users,
>> and frankly I feel that it is incumbent on you to solve this problem
>> for your circumstance.
>
> The percentage of FreeBSD users who need 5-10 year old
> sources in the CVS is very small, too.

Therefore, IMO, we should not be complicating the lives of the vast
majority of freebsd users (not to mention taking up some small portion
of additional space on the mirrors, etc.) in order to do what you
suggest.

> But we treasure our src history and don't throw out any commits.

I don't see the two things as being equivalent at all. The least of the
reasons being that what's in our repo is the history of our project.
What you're asking for is that we dedicate resources to archiving the
history of other projects. (And yes, I realize that you could argue that
because version xyz was in _our_ ports tree at some point in time that
it's part of _our_ history, but I don't buy it.)

> Well, I happen
> to treasure our ports history. I really want people to have a
> chance, however slim, to be able to build ports using a very
> old tree.

Then I think, by all means, you should put together a resource for them
to be able to do that. I don't think (for whatever that's worth) that it
should be the ports tree.

>> OTOH, your solution would break the logic that portmaster (and I believe
>> portupgrade also) uses to detect and delete stale distfiles.
>
> AFAICT portmaster's logic still misses the case when
> DIST_SUBDIR has changed for whatever reason.
>
> portupgrade --distclean will not be broken, it deals with
> distfiles at the current DIST_SUBDIR
>
> portsclean -D is actually broken now, and will be fixed if
> my proposal is implemented.
> It doesn't erase an old file if
> its path/name match those of a new file.

Actually portmaster and portupgrade share these characteristics. If the
subdir changes with a new version of the port, portmaster will not "see"
the old files.

> Oh, now that I've had another look at portmaster's logic it
> doesn't make sense at all.

It might not make sense to you, but it actually works in the vast
majority of cases, so it's not entirely without merit. :)

> What if distfiles of different
> ports have similar %[-_]* names?

Then the user is given a choice of whether or not to delete the file,
unless they've chosen to always or never delete distfiles. My design
choice is to be aggressive, and try to clean up more, not less. That
said, the new method that I use (as of version 1.6) creates
significantly fewer false positives than it did previously.

> What if different ports require the same distfile of different versions?

That's an edge case, but it does happen. The user either needs to know
this, or run the risk of downloading the distfile again. Users who value
network bits more than disk bits can either use the -D option, or choose
to carefully monitor what files are deleted. Or, not use portmaster,
which is of course a valid option. :)

> What if distname changed radically?

Again, an edge case, but it does happen. See below.

> You can't make such broad
> assumptions about distfile patterns. You should probably
> do it the same way portsclean -D does - i.e. to check
> "dist_subdir/distfile" against distinfo files of all installed
> ports or all ports, whichever a user prefers.

IMO this wouldn't actually help with either of the cases that you
describe, unless you were to build a database of installed ports and
distfiles. And building "extra" databases is exactly what I'm trying to
avoid doing.
I could also go into some detail about why even using the file name
patterns from the distinfo file to glob against really isn't any better
than the way I do it, but I won't because ...

The real solution to this is something that a few of us kicked around a
while back, but unfortunately it never gained traction. Namely to record
the subdir (if any) and distfile/patchfile names in the +CONTENTS file
at install/package time. That would completely remove the ambiguity as
to which distfiles to remove for the _current_ (installed) port. It
would still leave the problem of how to deal with some of the edge cases
that you described, and of course you still have to use something
similar to the way I do it in order to find stale files that are older
than the version that we're deinstalling.

But IMO we're bordering on a 95/5 rule here, and _my_ goal is not
antiseptic cleanliness in this area. With over 15,000 ports, any
solution that is "right" most of the time is way ahead of the game, and
adding this info to the +CONTENTS file would make it easier (and
cheaper) to get it right way more often than not.

So meanwhile, back to your original proposal, I think you're asking to
add a lot of complexity, and other costs to something that is fairly
simple now, without providing a corresponding benefit to even a
significant minority of our users.

And I'll leave it at that for now, and let some other folks speak up if
they so desire.

Doug

--
This .signature sanitized for your protection
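
For illustration only, the distinfo-based check that the thread attributes
to portsclean -D (comparing each "dist_subdir/distfile" path against the
distinfo files of a chosen set of ports) could be sketched roughly as
below. This is not the actual code of portsclean, portmaster, or
portupgrade. The function names are hypothetical, and the sketch assumes
distinfo lines of the form "ALGO (dist_subdir/distfile) = value".

```python
import re

# Assumed distinfo entry shape, e.g.
#   "SHA256 (subdir/name.tar.gz) = <hash>" or "SIZE (subdir/name.tar.gz) = <bytes>"
_ENTRY = re.compile(r"^\w+ \((?P<path>.+)\) = \S+$")


def referenced_distfiles(distinfo_texts):
    """Collect every dist_subdir/distfile path named in the given distinfo contents."""
    paths = set()
    for text in distinfo_texts:
        for line in text.splitlines():
            match = _ENTRY.match(line.strip())
            if match:
                paths.add(match.group("path"))
    return paths


def stale_distfiles(on_disk, distinfo_texts):
    """Return on-disk paths (relative to the distfiles directory)
    that none of the given distinfo contents refer to."""
    return sorted(set(on_disk) - referenced_distfiles(distinfo_texts))
```

Because the comparison is by path only, a distfile that was rerolled in
place under the same name is indistinguishable from its replacement, and
a file left behind in an old subdir is found only when no scanned
distinfo still names it.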