Date:      Mon, 21 Aug 2006 08:55:00 +0400
From:      "Andrew Pantyukhin" <infofarmer@FreeBSD.org>
To:        "Doug Barton" <dougb@FreeBSD.org>
Cc:        FreeBSD Ports <ports@FreeBSD.org>, portmgr@FreeBSD.org
Subject:   Re: Enforcing "DIST_SUBDIR/DISTFILE" uniqueness
Message-ID:  <cb5206420608202155r78e1f96fg2590bf2dcdcc8f72@mail.gmail.com>
In-Reply-To: <44E8EEEC.3040907@FreeBSD.org>
References:  <cb5206420608160931q65adc8fft6084e7f498b403f5@mail.gmail.com> <cb5206420608190944o5c07dbefwfdf50586ae23ef5a@mail.gmail.com> <44E81C12.9050306@FreeBSD.org> <cb5206420608200158q22edef00jd53e646439207149@mail.gmail.com> <44E8EEEC.3040907@FreeBSD.org>

On 8/21/06, Doug Barton <dougb@freebsd.org> wrote:
> I'm combining both of your responses to save time.
>
> Andrew Pantyukhin wrote:
> > On 8/20/06, Doug Barton <dougb@freebsd.org> wrote:
> >> Andrew Pantyukhin wrote:
> >> > On 8/16/06, Andrew Pantyukhin <infofarmer@freebsd.org> wrote:
> >> >> I'd like to propose a policy to enforce a change in
> >> >> DIST_SUBDIR whenever a distfile is rerolled in-place, i.e.
> >> >> when checksum changes, but name stays unchanged.
> >> >>
> >> >> Moreover, effort should be made whenever possible to
> >> >> make the old file available for download from an
> >> >> alternative location.
> >> >>
> >> >> This policy will rid us of some fetch-related headaches.
> >> >> It also will make it possible to share distfiles between
> >> >> hosts with ports trees of different dates. Some rare issues
> >> >> might also be resolved as a result of this. For one, ftp
> >> >> mirrors could be configured to allow upload, but deny
> >> >> modification and/or deletion.
> >> >>
> >> >> One thing I would personally frown upon is using
> >> >> something like "fetch -o othername" to save a file with a
> >> >> different name. It looks all right, but it prevents us from
> >> >> looking for mirrors in an automated way when master
> >> >> sites go down.
> >> >
> >> > I'll start preparing statements
> >> > for documentation and thinking about a way to watch for
> >> > "violations". I also intend to go through CVS and find past
> >> > "offenders" to prod them about it.
> >> >
> >> > The recent openoffice update rerolled a file in-place, and while
> >> > it may seem irrelevant or even beneficial (erasing 286 MB of
> >> > the old file), the fact is that it prevents us from keeping distfile
> >> > history on unversioned file servers,
> >>
> >> IMO this represents a very small minority of FreeBSD users,
> >> and frankly I feel that it is incumbent on you to solve this problem
> >> for your circumstance.
> >
> > The percentage of FreeBSD users who need 5-10 year old
> > sources in the CVS is very small, too.
>
> Therefore, IMO, we should not be complicating the lives of the vast majority
> of freebsd users (not to mention taking up some small portion of additional
> space on the mirrors, etc.) in order to do what you suggest.

How would your life be complicated? It's totally transparent to
the user.

> > But we treasure our src history and don't throw out any commits.
>
> I don't see the two things as being equivalent at all. The least of the
> reasons being that what's in our repo is the history of our project. What
> you're asking for is that we dedicate resources to archiving the history of
> other projects. (And yes, I realize that you could argue that because
> version xyz was in _our_ ports tree at some point in time that it's part of
> _our_ history, but I don't buy it.)

My point is that many people do run "make fetch" for many or
all ports and make their distdir available via ftp/web.

> > Well, I happen
> > to treasure our ports history. I really want people to have a
> > chance, however slim, to be able to build ports using a very
> > old tree.
>
> Then I think, by all means, you should put together a resource for them to
> be able to do that. I don't think (for whatever that's worth) that it should
> be the ports tree.

The ports tree has a history of filenames. If they're unique at
their paths, then that's all we need. A user can override
MASTER_SITES, and yes, I've maintained and will keep
maintaining one of the distfile mirrors.

> >> OTOH, your solution would break the logic that portmaster (and I believe
> >> portupgrade also) uses to detect and delete stale distfiles.
> >
> > AFAICT portmaster's logic still misses the case when
> > DIST_SUBDIR has changed for whatever reason.
> >
> > portupgrade --distclean will not be broken, it deals with
> > distfiles at the current DIST_SUBDIR
> >
> > portsclean -D is actually broken now, and will be fixed if
> > my proposal is implemented. It doesn't erase an old file if
> > its path/name match those of a new file.
>
> Actually portmaster and portupgrade share these characteristics. If the
> subdir changes with a new version of the port, portmaster will not "see" the
> old files.

portupgrade -D is to deal with the name clashes, not to
remove old files.

> > Oh, now that I've had another look at portmaster's logic it
> > doesn't make sense at all.
>
> It might not make sense to you, but it actually works in the vast majority
> of cases, so it's not entirely without merit. :)
>
> > What if distfiles of different
> > ports have similar %[-_]* names?
>
> Then the user is given a choice of whether or not to delete the file, unless
> they've chosen to always or never delete distfiles. My design choice is to be
> aggressive, and try to clean up more, not less. That said, the new method
> that I use (as of version 1.6) creates significantly fewer false positives
> than it did previously.
>
> > What if different ports require the same distfile of different versions?
>
> That's an edge case, but it does happen. The user either needs to know this,
> or run the risk of downloading the distfile again. For users that value
> network bits more than disk bits, they can either use the -D option, or
> choose to carefully monitor what files are deleted. Or, not use portmaster,
> which is of course a valid option. :)

I'm already in love with it, sorry :)

> > What if distname changed radically?
>
> Again, an edge case, but it does happen. See below.
>
> > You can't make such broad
> > assumptions about distfile patterns. You should probably
> > do it the same way portsclean -D does - i.e. to check
> > "dist_subdir/distfile" against distinfo files of all installed
> > ports or all ports, whichever a user prefers.
>
> IMO this wouldn't actually help with either of the cases that you describe,
> unless you were to build a database of installed ports and distfiles. And
> building "extra" databases is exactly what I'm trying to avoid doing. I
> could also go into some detail about why even using the file name patterns
> from the distinfo file to glob against really isn't any better than the way
> I do it, but I won't because ...

I was going to ask you if you'd be willing to use tmp files,
i.e. to traverse the whole tree and save info from the distinfo
files to some file whenever some flag is set.

portsclean doesn't glob against anything, it looks for strict
matches. That's why the only case it misses is the distfile
name clash I'm trying to get rid of.
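
To make the strict-match idea concrete, here is a minimal sketch
in Python. It is not portsclean's actual code, only an
illustration of the approach. It assumes distinfo lines have the
form "ALGO (DIST_SUBDIR/DISTFILE) = value"; the function names
and the sample paths are made up for the example.

```python
import re

# Matches distinfo lines such as "MD5 (subdir/file.tar.gz) = abc123".
# SIZE lines have the same shape, so they are matched as well.
# Assumption: this is the only line format we need to understand.
DISTINFO_LINE = re.compile(r"^(\w+) \((.+)\) = (\S+)$")


def referenced_paths(distinfo_text):
    """Return the set of "DIST_SUBDIR/DISTFILE" paths named in one distinfo."""
    paths = set()
    for line in distinfo_text.splitlines():
        match = DISTINFO_LINE.match(line.strip())
        if match:
            paths.add(match.group(2))
    return paths


def stale_distfiles(files_on_disk, distinfo_texts):
    """Return files on disk that no distinfo names, using exact path matches."""
    wanted = set()
    for text in distinfo_texts:
        wanted |= referenced_paths(text)
    return sorted(set(files_on_disk) - wanted)


# Hypothetical data: one port with a subdir, one without.
files = ["foo/foo-1.0.tar.gz", "bar-1.9.tar.gz", "bar-2.0.tar.gz"]
distinfos = [
    "MD5 (foo/foo-1.0.tar.gz) = aaa\nSIZE (foo/foo-1.0.tar.gz) = 100\n",
    "MD5 (bar-2.0.tar.gz) = bbb\nSIZE (bar-2.0.tar.gz) = 200\n",
]
stale = stale_distfiles(files, distinfos)
```

With this data only bar-1.9.tar.gz is reported. Note that a file
rerolled in place keeps its path, so an old copy of
foo/foo-1.0.tar.gz on disk would never be reported as stale by a
check like this one. That is the clash in question.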

> The real solution to this is something that a few of us kicked around a
> while back, but unfortunately it never gained traction. <..>

IMO, what we need is a good ports database. portmaster
is good enough because it's written in shell. I don't really
see why databases should be avoided. And if portmaster
is to be present in the base, it might as well be integrated
with portsnap and provide incremental DB updates.

> So meanwhile, back to your original proposal, I think you're asking to add a
> lot of complexity, and other costs to something that is fairly simple now,
> without providing a corresponding benefit to even a significant minority of
> our users. And I'll leave it at that for now, and let some other folks speak
> up if they so desire.

Do occasional subdir changes, only to avoid distfile name
clashes, really sound like a lot of complexity to you? It's
not even a strict requirement, I'll just bark at the "offenders".
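
As for watching for "violations", a check could be as small as
the sketch below. It is only an assumption about how such a
check might look, not an existing tool. It compares two
revisions of a port's distinfo and reports every path whose
checksum changed while "DIST_SUBDIR/DISTFILE" stayed the same.
The line format "ALGO (path) = value" and all names are
assumptions made for the example.

```python
import re

# Assumed distinfo line shape: "ALGO (DIST_SUBDIR/DISTFILE) = value".
DISTINFO_LINE = re.compile(r"^(\w+) \((.+)\) = (\S+)$")


def checksums(distinfo_text):
    """Map each path to a dict of {algorithm: value} taken from a distinfo."""
    result = {}
    for line in distinfo_text.splitlines():
        match = DISTINFO_LINE.match(line.strip())
        if match:
            algo, path, value = match.groups()
            result.setdefault(path, {})[algo] = value
    return result


def rerolled_in_place(old_text, new_text):
    """Return paths present in both revisions whose recorded values differ."""
    old, new = checksums(old_text), checksums(new_text)
    clashes = []
    for path in sorted(old.keys() & new.keys()):
        # Compare only the algorithms that both revisions record.
        shared = old[path].keys() & new[path].keys()
        if any(old[path][algo] != new[path][algo] for algo in shared):
            clashes.append(path)
    return clashes


# Hypothetical revisions of one port's distinfo.
old_rev = "MD5 (example/example-1.0.tar.gz) = aaa\n"
bad_rev = "MD5 (example/example-1.0.tar.gz) = bbb\n"
good_rev = "MD5 (example-r2/example-1.0.tar.gz) = bbb\n"

violations = rerolled_in_place(old_rev, bad_rev)
no_violations = rerolled_in_place(old_rev, good_rev)
```

The first comparison flags example/example-1.0.tar.gz; the
second flags nothing, because the subdir change gives the
rerolled file a new path.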


