Date:      Mon, 30 Mar 2015 22:18:16 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-ports-bugs@FreeBSD.org
Subject:   [Bug 199052] [PATCH] Update WRKSRC for GH_COMMIT, related to "legacy.tar.gz" (codeload) GitHub backend method
Message-ID:  <bug-199052-13@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199052

            Bug ID: 199052
           Summary: [PATCH] Update WRKSRC for GH_COMMIT, related to
                    "legacy.tar.gz" (codeload) GitHub backend method
           Product: Ports & Packages
           Version: Latest
          Hardware: Any
                OS: Any
            Status: New
          Keywords: patch
          Severity: Affects Some People
          Priority: ---
         Component: Ports Framework
          Assignee: portmgr@FreeBSD.org
          Reporter: lightside@gmx.com
                CC: freebsd-ports-bugs@FreeBSD.org

Created attachment 155032
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=155032&action=edit
Proposed patch (since 382622 revision)

Currently, GH_COMMIT is considered deprecated in favor of the "new"
USE_GITHUB and GH_TAGNAME (see the changes after ports r381618).

However, I ran some tests (bug 194898, comment #24) and found that a
7-character abbreviated commit hash may stop working after further changes to
the repository (this also applies to GH_TAGNAME and the related "new"
USE_GITHUB). It looks like a 7-character abbreviated hash is used for newer Git
commits, but the abbreviation may later grow longer as the repository changes.
For the download to work reliably, a longer abbreviated commit hash or the full
commit hash is needed. For example, the FreeBSD ports tree on GitHub has a
commit whose abbreviated hash is currently 11 characters long, "f45daed3cf3":
https://github.com/freebsd/freebsd-ports/tree/f45daed3cf3
The 7-character abbreviation does not work in this case:
% curl -Is https://github.com/freebsd/freebsd-ports/tree/f45daed | grep ^HTTP
HTTP/1.1 404 Not Found
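
A likely explanation (an assumption, not confirmed in this report) is that an
abbreviated hash only resolves while it is unique among the repository's
objects. The minimal sketch below illustrates that idea locally; it is written
for POSIX sh rather than the csh prompt used above. The function name
count_prefix_matches and the second hash are hypothetical.

```shell
# Count how many full hashes (one per line on stdin) start with the prefix $1.
# A prefix is a usable abbreviation only while exactly one hash matches it.
count_prefix_matches() {
    # grep -c exits non-zero when the count is 0; "|| true" hides that status.
    grep -c "^$1" || true
}

# First hash: the commit discussed above.
# Second hash: HYPOTHETICAL, made up to share the same 7-character prefix.
printf '%s\n' \
    f45daed3cf3b694a969192c615bceba0a247b4d4 \
    f45daed000000000000000000000000000000000 |
    count_prefix_matches f45daed
```

With the two hashes above, "f45daed" matches both while "f45daed3cf3" matches
only the first.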

Currently, 5208 of the 381033 commits in this repository have an abbreviated
hash that is not exactly 7 characters long:
% git log --pretty=%h | grep -v '^.\{7\}$' | wc -l
    5208

Distribution of abbreviated commit hash lengths (current counts):
Length  Count
7       375825
8       4884
9       311
10      12
11      1
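
A table like the one above could be produced with a small filter such as the
following sketch (POSIX sh and awk). The function name hash_length_histogram is
hypothetical, and the sample input is made up for illustration.

```shell
# Read abbreviated hashes (one per line) on stdin and print "length: count"
# pairs, sorted by length.
hash_length_histogram() {
    awk '{ count[length($0)]++ }
         END { for (len in count) printf "%d:\t%d\n", len, count[len] }' |
        sort -n
}

# Illustrative input only: two 7-character strings and one 8-character string.
printf '%s\n' abcdef0 1234567 abcdef01 | hash_length_histogram
```

Inside a Git checkout, the input would come from "git log --pretty=%h" piped
into the function.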

Personally, I think the deprecation of GH_COMMIT (and the related
"legacy.tar.gz" codeload GitHub backend method) was premature. GH_TAGNAME (and
the related "tar.gz" codeload GitHub backend method) has the same issue with
7-character commit hashes. A longer abbreviated (or full) commit hash can be
used with the "legacy.tar.gz" method once the automatic determination of WRKSRC
is fixed; the patch attached to this PR does that.

If you run your own tests, you will see that GitHub's frontend methods
(archive, tarball, zipball) correctly resolve to its codeload backend methods
(legacy.tar.gz, legacy.zip, tar.gz, zip), e.g. when given a full commit hash,
tag or branch:

1. Using the "tarball" GitHub frontend method:
% curl -Lv -o freebsd-ports-f45daed3cf3-tarball.tar.gz \
    https://github.com/freebsd/freebsd-ports/tarball/f45daed3cf3
Location:
https://codeload.github.com/freebsd/freebsd-ports/legacy.tar.gz/f45daed3cf3b694a969192c615bceba0a247b4d4
% sha256 freebsd-ports-f45daed3cf3-tarball.tar.gz
SHA256 (freebsd-ports-f45daed3cf3-tarball.tar.gz) =
fac04ff56d18ef6af23dd3d7b4271fd4d7e8e9d5924d84317921a07224745f93
% tar -tf freebsd-ports-f45daed3cf3-tarball.tar.gz | head -1 | cut -d'/' -f1
freebsd-freebsd-ports-f45daed

2. Using the "archive" GitHub frontend method:
% curl -Lv -o freebsd-ports-f45daed3cf3-archive.tar.gz \
    https://github.com/freebsd/freebsd-ports/archive/f45daed3cf3.tar.gz
Location:
https://codeload.github.com/freebsd/freebsd-ports/tar.gz/f45daed3cf3b694a969192c615bceba0a247b4d4
% sha256 freebsd-ports-f45daed3cf3-archive.tar.gz
SHA256 (freebsd-ports-f45daed3cf3-archive.tar.gz) =
71b5def07f84f522cba78cc9d570c70521f404ecea0bd291743f9b81a6140d0b
% tar -tf freebsd-ports-f45daed3cf3-archive.tar.gz | head -1 | cut -d'/' -f1
freebsd-ports-f45daed3cf3b694a969192c615bceba0a247b4d4
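
The URL shapes seen in the two examples above can be summarized in a short
sketch (POSIX sh). The function names github_frontend_url and
github_codeload_url are hypothetical, and the patterns are taken only from the
URLs shown in this report, not from any GitHub documentation.

```shell
# Build a frontend URL as used in examples 1 and 2.
# $1 = method (tarball|archive), $2 = account, $3 = project, $4 = ref
github_frontend_url() {
    case "$1" in
        tarball) printf 'https://github.com/%s/%s/tarball/%s\n' "$2" "$3" "$4" ;;
        archive) printf 'https://github.com/%s/%s/archive/%s.tar.gz\n' "$2" "$3" "$4" ;;
        *) echo "unknown method: $1" >&2; return 1 ;;
    esac
}

# Build the codeload backend URL that the Location header pointed to.
# $1 = backend (legacy.tar.gz|tar.gz), $2 = account, $3 = project, $4 = ref
github_codeload_url() {
    printf 'https://codeload.github.com/%s/%s/%s/%s\n' "$2" "$3" "$1" "$4"
}

github_frontend_url tarball freebsd freebsd-ports f45daed3cf3
github_codeload_url legacy.tar.gz freebsd freebsd-ports \
    f45daed3cf3b694a969192c615bceba0a247b4d4
```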

The contents of archives 1 and 2 are the same; only the parent directory
differs:
% tar -xf freebsd-ports-f45daed3cf3-tarball.tar.gz
% tar -xf freebsd-ports-f45daed3cf3-archive.tar.gz
% diff -qruN freebsd-freebsd-ports-f45daed \
    freebsd-ports-f45daed3cf3b694a969192c615bceba0a247b4d4

The benefit of using the "legacy.tar.gz" GitHub backend with a full (or
abbreviated) commit hash is the short commit hash in the parent directory name.
The tests show exactly a 7-character commit hash there, even when the full
commit hash was requested.
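
The parent directory names observed in the two tests can be expressed as a
sketch (POSIX sh). This is not the attached patch, and the function names
topdir_legacy and topdir_archive are hypothetical. The naming rules are
inferred only from the two archives examined above and are assumed to hold
when a full commit hash is passed in; tags and branches are not covered.

```shell
# Parent directory observed for the "legacy.tar.gz" backend:
# <account>-<project>-<first 7 characters of the commit hash>
# $1 = account, $2 = project, $3 = full commit hash
topdir_legacy() {
    printf '%s-%s-%.7s\n' "$1" "$2" "$3"
}

# Parent directory observed for the "tar.gz" backend:
# <project>-<full commit hash>
# $1 = account (unused, kept for a uniform signature), $2 = project,
# $3 = full commit hash
topdir_archive() {
    printf '%s-%s\n' "$2" "$3"
}

topdir_legacy  freebsd freebsd-ports f45daed3cf3b694a969192c615bceba0a247b4d4
topdir_archive freebsd freebsd-ports f45daed3cf3b694a969192c615bceba0a247b4d4
```

For the commit used in the tests, this reproduces the two directory names
listed by tar above.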

Again, since both the "legacy.tar.gz" and "tar.gz" backend methods exist on the
same https://codeload.github.com server and work, deprecating either of them is
premature, in my opinion. The existing methods remain usable if used correctly.

There are other changes that might need attention (e.g. removing the forced
external DISTNAME changes and the constant addition of _GH* (which could be a
one-time feature or be handled inside a specific port when needed)), but they
are out of scope for this PR and face some opposition, related to previous
trouble with GitHub methods and differing opinions about their implementation
in the ports framework. I attached some GitHub usage examples to bug 194898
(e.g. attachment 154942 and attachment 154943), in case someone is interested.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


