Date:      Mon, 20 Jul 2020 16:24:11 +0300
From:      Sergei Vyshenski <svysh.fbsd@gmail.com>
To:        FreeBSD Ports <freebsd-ports@freebsd.org>
Subject:   Autogenerated versus handmade distributions from Github (possible new section to PHB)
Message-ID:  <0202562f-8035-03b7-d308-e2a2faf30b3e@gmail.com>

Hi,
Please comment.
Regards, Sergei
=============================
See PDF version:
https://drive.google.com/file/d/1ly9ylKJehzcqXxy3NhqN3Pywy0iqZwn9/view?usp=sharing

The text below is offered for discussion as a candidate for a new
section of the PHB, which could be placed after the existing section
"5.17. How to Use USE_GITHUB with Git Submodules?"

This text was written on the advice of Adam Weinberger (adamw@):

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221634

and is based on posts by

Yasuhiro KIMURA (yasu at utahime.org):

https://lists.freebsd.org/pipermail/freebsd-ports/2017-December/111758.html

Mathieu Arnold (mat@):

https://lists.freebsd.org/pipermail/freebsd-ports/2017-December/111759.html

and Sergei Vyshenski (svysh.fbsd at gmail.com):

https://lists.freebsd.org/pipermail/freebsd-ports/2017-December/111753.html

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221634

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=247459 (this one has 
status: open)

=== A proposed new section of PHB starts here ===

*Autogenerated versus handmade distributions from GitHub*


*A-distributions*

USE_GITHUB relies on a convenient feature of the GitHub infrastructure:
automatic generation of distributions. Upon user request, GitHub
generates a distribution archive (also known as a git archive, source
code archive, or tarball) from the repository of a specified project.
The request also names the wanted format (tgz, zip, etc.) and the
wanted tag or commit. The author of a project can label some commits
with version numbers, thus releasing a version. Version numbers are
published on dedicated web pages such as:

https://github.com/openxpki/openxpki/tags

https://github.com/openxpki/openxpki/releases

Let us call such automatically (dynamically) generated distributions
"A-distributions". A-distributions are not stored as files on GitHub;
they are generated upon request. Such a request can be made in two ways:

1) Click on a URL (shown on the web page) like this:
https://github.com/openxpki/openxpki/archive/v3.6.1.tar.gz
Note the keyword "archive" in this type of URL.

2) Through USE_GITHUB, the same file is fetched from a URL like this:
https://codeload.github.com/openxpki/openxpki/tar.gz/v3.6.1?dummy=/openxpki-openxpki-v3.6.1_GH0.tar.gz
Note the keyword "codeload" in this type of URL.

Both URLs are virtual in the sense that the referenced file does not
actually exist and is generated only upon request.
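
As an illustration only, the two URL forms can be assembled from the
account, project and tag. This is a minimal Python sketch: the function
name is hypothetical, and the patterns (including the "_GH0" suffix and
the "?dummy=/" part) are generalized from the single openxpki example
above, which is an assumption rather than a documented rule.

```python
def a_distribution_urls(account: str, project: str, tag: str) -> dict:
    """Build both request URLs for an autogenerated (A-) distribution.

    Assumption: the patterns are copied from the openxpki example in the
    text; other projects may need different pieces.
    """
    # Form 1: the "archive" link shown on the project's web pages.
    archive = f"https://github.com/{account}/{project}/archive/{tag}.tar.gz"
    # Form 2: the "codeload" link; the trailing part after "?dummy=/"
    # mirrors the file name seen in the example URL.
    distfile = f"{account}-{project}-{tag}_GH0.tar.gz"
    codeload = (
        f"https://codeload.github.com/{account}/{project}/tar.gz/{tag}"
        f"?dummy=/{distfile}"
    )
    return {"archive": archive, "codeload": codeload}


urls = a_distribution_urls("openxpki", "openxpki", "v3.6.1")
print(urls["archive"])
print(urls["codeload"])
```

Called with the values from the example, the sketch reproduces the two
URLs quoted above.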

*H-distributions*

Sometimes relying on an A-distribution puts the porter in a difficult
position. Two cases serve as illustration: 1) cyclic dependencies and
2) excessive complexity.

1) Project P is said to have a cyclic dependency if P depends on
project P1, which in turn depends on P.
In the simplest case of cyclic dependency, project P depends on
itself (autodependency). Imagine that project P builds a library P_L
meant for text processing. The author of such a project often
showcases the library by using P_L to build the documentation of the
same project P. A straightforward port of project P to FreeBSD will
bring the ports infrastructure to a halt with a complaint that
library P_L is not installed: P_L is needed to build the
documentation, which in turn is part of project P.

2) Project P is said to have excessive complexity if a tiny (and
non-essential) part of it requires a huge number of dependencies.
Imagine that a project uses a high-level workflow language to describe
the logic of some complex business process. The project provides docs
and examples in which the workflow language serves as the source for
graphical visualization. Building these docs and examples would then
require tons of other packages and take much more time than all the
rest of project P.

The autodependency mentioned in point 1 above can be avoided in two ways:

* *Porter's way:* The porter can split the port of project P into two
ports: P_library and P_doc, where port P_doc depends on port P_library.
* *Author's way:* The author of project P can prepare the documentation
in a form that does not require further use of library P_L. To
achieve this, the author uses the copy of library P_L that is
available on the author's own host anyway. The author then prepares
by hand a distribution that contains the sources for building library
P_L together with pre-built documentation in ready-to-use form. Let
us call such distributions "H-distributions".

An example of a project with autodependency is the GitHub project
libexpat/libexpat, which is ported to FreeBSD as port textproc/expat2
using an H-distribution.

The excessive complexity mentioned in point 2 above can be avoided in
the Author's way as follows. The author pre-builds the docs and
examples, fancy pictures included. The author then prepares by hand an
H-distribution that contains the pre-built docs and examples along with
the sources for the rest of the project (that is, without the sources
for the docs and examples).

An example of a project with excessive complexity is the GitHub project
tdiary/tdiary-core, which is ported to FreeBSD as port www/tdiary
using an H-distribution.

H-distributions are not available via USE_GITHUB. They can only be
downloaded from the pages where the author has put them, such as:

https://github.com/tdiary/tdiary-core/tags

https://github.com/tdiary/tdiary-core/releases

using a clickable URL like:

https://github.com/tdiary/tdiary-core/releases/download/v5.1.2/tdiary-full-v5.1.2.tar.gz

Note the keywords "releases" and "download" in this type of URL.

URLs for H-distributions are real in the sense that the referenced
file actually exists, having been built and uploaded by the author by hand.
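
The keywords noted above are enough to tell the two kinds of URL apart.
The following Python sketch is only an illustration: the function name
is hypothetical, and the rules cover just the three URL shapes shown in
this text, so anything else is reported as "unknown".

```python
from urllib.parse import urlparse


def distribution_kind(url: str) -> str:
    """Classify a GitHub download URL as "A", "H" or "unknown".

    Assumption: only the URL shapes quoted in the text are recognized.
    """
    parts = urlparse(url)
    host = parts.netloc.lower()
    # Path segments look like: <account>/<project>/<keyword>/...
    segments = [s for s in parts.path.split("/") if s]
    if host == "codeload.github.com":
        return "A"  # the form fetched through USE_GITHUB
    if host == "github.com" and len(segments) >= 3:
        if segments[2] == "archive":
            return "A"  # autogenerated "Source code" link
        if segments[2:4] == ["releases", "download"]:
            return "H"  # file uploaded by the author
    return "unknown"


print(distribution_kind(
    "https://github.com/openxpki/openxpki/archive/v3.6.1.tar.gz"))
print(distribution_kind(
    "https://github.com/tdiary/tdiary-core/releases/download/v5.1.2/"
    "tdiary-full-v5.1.2.tar.gz"))
```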

To fetch an H-distribution from a port, the porter should remove
USE_GITHUB and add an explicit MASTER_SITES, and possibly DISTNAME or
DISTVERSIONPREFIX.
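
To make the mapping concrete, the Python sketch below splits an
H-distribution URL into candidate values for MASTER_SITES and DISTNAME.
The helper name is hypothetical, and the sketch assumes that the file
ends in ".tar.gz"; it only shows which part of the URL goes where and
does not replace reading the port's Makefile.

```python
import posixpath
from urllib.parse import urlparse


def h_distribution_knobs(url: str, suffix: str = ".tar.gz") -> dict:
    """Split an H-distribution URL into MASTER_SITES and DISTNAME values.

    Assumption: the distfile uses the given suffix (".tar.gz" here).
    """
    parts = urlparse(url)
    directory, filename = posixpath.split(parts.path)
    if not filename.endswith(suffix):
        raise ValueError(f"{filename!r} does not end with {suffix!r}")
    return {
        # MASTER_SITES holds the directory part and ends with a slash.
        "MASTER_SITES": f"{parts.scheme}://{parts.netloc}{directory}/",
        # DISTNAME is the file name without the archive suffix.
        "DISTNAME": filename[: -len(suffix)],
    }


knobs = h_distribution_knobs(
    "https://github.com/tdiary/tdiary-core/releases/download/v5.1.2/"
    "tdiary-full-v5.1.2.tar.gz"
)
for name, value in knobs.items():
    print(f"{name}=\t{value}")
```

In an actual port the version would normally be expressed through the
port's version variables rather than hard-coded as in this output.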

*Examples of ports using H-distributions:*

textproc/expat2

https://github.com/libexpat/libexpat

www/tdiary

https://github.com/tdiary/tdiary-core

net/uriparser

https://github.com/uriparser/uriparser (will use an H-distribution if
PR 247459 is committed)

*A- and H-distributions mixed*

A striking example of A- and H-distributions offered side by side can
be seen on the page

https://github.com/libexpat/libexpat/releases/tag/R_2_2_8

which provides a URL for the H-distribution:

https://github.com/libexpat/libexpat/releases/download/R_2_2_8/expat-2.2.8.tar.gz

and a URL for the A-distribution (labeled "Source code" there):

https://github.com/libexpat/libexpat/archive/R_2_2_8.tar.gz

Compare the two distributions to see how different they are.
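
One way to make such a comparison, sketched in Python under stated
assumptions: both tarballs have already been downloaded to local files,
and each unpacks into a single top-level directory, which the sketch
strips before comparing names. The function names are hypothetical.

```python
import tarfile


def member_names(path: str) -> set:
    """Return regular-file paths of a tarball, top-level directory removed."""
    names = set()
    with tarfile.open(path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # Drop the first path component (assumed top-level directory);
            # entries without one are skipped.
            _, _, rest = member.name.partition("/")
            if rest:
                names.add(rest)
    return names


def compare_distributions(h_path: str, a_path: str) -> dict:
    """List files present in only one of the two distributions."""
    h_names = member_names(h_path)
    a_names = member_names(a_path)
    return {
        "only_in_h": sorted(h_names - a_names),
        "only_in_a": sorted(a_names - h_names),
    }
```

For example, compare_distributions("expat-2.2.8.tar.gz",
"R_2_2_8.tar.gz") would list the files unique to each archive, assuming
the two downloads were saved under those names. If the two archives
nest their files at different depths, every path will appear as
different, which is itself a sign of how the layouts diverge.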

The H-distribution is used by the port textproc/expat2, while the
A-distribution is left for connoisseurs.

*A- and H-distributions compared*

You should always use released H-distributions if they are available:
authors go to great effort to make them so that other people can
build their software more easily. The git archive (A-distribution) you
get from USE_GITHUB should only be used if nothing else is available.

=== A proposed new section of PHB ends here ===





