Date:      Fri, 14 Nov 2008 23:42:14 +0300
From:      Eygene Ryabinkin <rea-fbsd@codelabs.ru>
To:        freebsd-ports@freebsd.org, bug-followup@freebsd.org
Subject:   Re: ports/128754: [port infrastructure] implement master sites randomization
Message-ID:  <39xW7jCcvncjT8ROmhiSV7bt%2BfI@St9vgMtsbWIYRnEpVt0VUBJqeA4>
In-Reply-To: <eug34ZXB3oCwzaFhy76lDCxgwDo@L0uXKD5VodxBs2kecYjbmbfI7F4>
References:  <20081110155616.DA66A1AF424@void.codelabs.ru> <20081111032350.0b22a853@gumby.homeunix.com> <20081111153554.GA4294@wep4035.physik.uni-wuerzburg.de> <eug34ZXB3oCwzaFhy76lDCxgwDo@L0uXKD5VodxBs2kecYjbmbfI7F4>

--CUfgB8w4ZwR/yMy5
Content-Type: multipart/mixed; boundary="tThc/1wpZn/ma/RB"
Content-Disposition: inline


--tThc/1wpZn/ma/RB
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Tue, Nov 11, 2008 at 07:19:03PM +0300, Eygene Ryabinkin wrote:
> > On Tue, Nov 11, 2008 at 03:23:50AM +0000, RW wrote:
> > > On Mon, 10 Nov 2008 18:56:16 +0300 (MSK)
> > > I think it would be sensible to seed srand from a hash of something
> > > reproducible to make better use of caches - maybe DISTNAME+DISTVERSION.
> > >
>
> Regarding seeding from a hash: RW, do you mean HTTP caches?  In principle,
> this is a neat idea: it will achieve load-balancing between the sites.
> But since it will use the same master site order for a given port, it
> will fail badly when the first download site is nearly down: the
> download will take very long.  A stable site order could probably be
> made optional via a variable, e.g.
> RANDOMIZE_MASTER_SITE_REPRODUCIBLY.  Would that be fine?  Please note that
> this is achievable only with the awk script: random(6) cannot
> currently be directed to do this.

OK, I reworked the patch to add RANDOMIZE_MASTER_SITES_REPRODUCIBLY:
it guarantees the same order of master sites for a given combination
of port name and version.
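
To illustrate the idea of a reproducible order, here is a minimal sketch,
not the patch itself.  It assumes a hypothetical shell helper,
shuffle_reproducibly, that derives a numeric seed from a key such as the
port name and version and passes it to awk's srand().  As an assumption
made for this sketch, it uses cksum(1) rather than md5 so that the seed is
a plain decimal number, and the example URLs are placeholders.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: shuffle the lines on stdin
# in an order that depends only on the key given as $1.
shuffle_reproducibly() {
	# cksum prints a decimal CRC as its first field; use it as the seed.
	seed=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
	awk -v seed="$seed" '
	BEGIN { srand(seed + 0) }
	{ site[n++] = $0 }
	END {
		# Fisher-Yates shuffle: pick j uniformly from 0..i.
		for (i = n - 1; i > 0; i--) {
			j = int(rand() * (i + 1))
			t = site[i]; site[i] = site[j]; site[j] = t
		}
		for (i = 0; i < n; i++)
			print site[i]
	}'
}

# Example: placeholder sites shuffled with a made-up key.
printf '%s\n' http://a.example/ http://b.example/ http://c.example/ |
	shuffle_reproducibly "foo-1.2"
```

Running it twice with the same key should print the sites in the same
order on a given awk implementation; a different key will usually give a
different order.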

Please give it a shot and comment.

Thanks!
-- 
Eygene
 _                ___       _.--.   #
 \`.|\..----...-'`   `-._.-'_.-'`   #  Remember that it is hard
 /  ' `         ,       __.--'      #  to read the on-line manual
 )/' _/     \   `-_,   /            #  while single-stepping the kernel.
 `-'" `"\_  ,_.-;_.-\_ ',  fsc/as   #
     _.-'_./   {_.'   ; /           #    -- FreeBSD Developers handbook
    {_.-``-'         {_/            #

--tThc/1wpZn/ma/RB
Content-Type: text/x-diff; charset=koi8-r
Content-Disposition: attachment;
	filename="0001-Add-awk-randomization-script.patch"
Content-Transfer-Encoding: quoted-printable

From 40644084ccdb42bd34bf85cf43088c3ecb3e205f Mon Sep 17 00:00:00 2001
From: Eygene Ryabinkin <rea-fbsd@codelabs.ru>
Date: Mon, 10 Nov 2008 18:38:01 +0300
Subject: [PATCH] Add awk randomization script

Currently port download sites can be randomized only if the system has
/usr/games/random installed.  I introduce a script that acts as a
replacement for /usr/games/random and needs nothing that is missing from
the base system (only AWK is needed ;))

I have also added a new directive, RANDOMIZE_MASTER_SITES_REPRODUCIBLY.  It
applies randomization, but guarantees that the site order for a given
portname and portversion will be the same.  This might be useful when one
uses HTTP caching and has many FreeBSD servers behind the cache.

Signed-off-by: Eygene Ryabinkin <rea-fbsd@codelabs.ru>
---
 Mk/bsd.port.mk |   15 +++++++++++++--
 Mk/rnd.awk     |   26 ++++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 2 deletions(-)
 create mode 100644 Mk/rnd.awk

diff --git a/Mk/bsd.port.mk b/Mk/bsd.port.mk
index 85ec297..dcc2d31 100644
--- a/Mk/bsd.port.mk
+++ b/Mk/bsd.port.mk
@@ -2169,11 +2169,22 @@ FETCH_REGET?=	0
 FETCH_CMD?=		${FETCH_BINARY} ${FETCH_ARGS}
 
 .if defined(RANDOMIZE_MASTER_SITES)
-.if exists(/usr/games/random)
+.if defined(RANDOMIZE_MASTER_SITES_REPRODUCIBLY)
+RANDOM_CMD?=	${AWK}
+RANDOM_SEED!=	${ECHO} ${DISTNAME} | ${MD5}
+RANDOM_ARGS?=	"-v value='${RANDOM_SEED}' -f ${PORTSDIR}/Mk/rnd.awk"
+.elif exists(/usr/games/random)
 RANDOM_CMD?=	/usr/games/random
 RANDOM_ARGS?=	"-w -f -"
-_RANDOMIZE_SITES=	" |${RANDOM_CMD} ${RANDOM_ARGS}"
+.else
+RANDOM_CMD?=	${AWK}
+RANDOM_ARGS?=	"-v value='' -f ${PORTSDIR}/Mk/rnd.awk"
 .endif
+.if defined(RANDOMIZE_MASTER_SITES_REPRODUCIBLY)
+RANDOM_CMD?=	${AWK}
+RANDOM_ARGS?=	"-v value=`echo ${DISTNAME}-${DISTVERSION} | md5` -f ${PORTSDIR}/Mk/rnd.awk"
+.endif
+_RANDOMIZE_SITES=	" |${RANDOM_CMD} ${RANDOM_ARGS}"
 .endif
 
 TOUCH?=3D			/usr/bin/touch
diff --git a/Mk/rnd.awk b/Mk/rnd.awk
new file mode 100644
index 0000000..ce9943f
--- /dev/null
+++ b/Mk/rnd.awk
@@ -0,0 +1,26 @@
+BEGIN {
+	count = 0;
+	if (length(value) != 0)
+		srand(value);
+	else
+		srand();
+}
+{
+	for (i = 1; i <= NF; i++)
+		array[count++] = $i;
+}
+END {
+# Need to drop a couple of initial rand() values: they tend
+# to be around 0.8 - 0.9, so for fairly small array length
+# they will produce identical values at the beginning.
+	rand(); rand(); rand(); rand();
+
+	for (i = count - 1; i > 0; i--) {
+		j = int((i + 1) * rand());
+		if (j == i) continue;
+		t = array[i]; array[i] = array[j]; array[j] = t;
+	}
+
+	for (i = 0; i < count; i++)
+		print array[i];
+}
-- 
1.6.0.3


--tThc/1wpZn/ma/RB
Content-Type: text/x-diff; charset=koi8-r
Content-Disposition: attachment;
	filename="0001-ports-7-document-new-option-RANDOMIZE_MASTER_SITES.patch"
Content-Transfer-Encoding: quoted-printable

From ffcaf1247756f403dc1e7308b1c68f6d6fcea524 Mon Sep 17 00:00:00 2001
From: Eygene Ryabinkin <rea-fbsd@codelabs.ru>
Date: Fri, 14 Nov 2008 23:33:39 +0300
Subject: [PATCH] ports(7): document new option RANDOMIZE_MASTER_SITES_REPRODUCIBLY

Applies master site randomization, but guarantees that for a given
combination of port name and version, the order of the download
sites will be the same on each invocation.

Signed-off-by: Eygene Ryabinkin <rea-fbsd@codelabs.ru>
---
 share/man/man7/ports.7 |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/share/man/man7/ports.7 b/share/man/man7/ports.7
index 7f8fb07..08c109b 100644
--- a/share/man/man7/ports.7
+++ b/share/man/man7/ports.7
@@ -420,6 +420,12 @@ Try going to these sites for all files and patches, first.
 Try going to these sites for all files and patches, last.
 .It Va RANDOMIZE_MASTER_SITES
 Try the download locations in a random order.
+.It Va RANDOMIZE_MASTER_SITES_REPRODUCIBLY
+Try the download locations in a random order, but keep the same order
+for each combination of port name and version.
+Useful if you have several hosts sitting behind an
+HTTP cache and you want to save some traffic, but still want to
+randomize the list of master sites.
 .It Va MASTER_SORT
 Sort the download locations according to user supplied pattern.
 Example:
-- 
1.6.0.3


--tThc/1wpZn/ma/RB--

--CUfgB8w4ZwR/yMy5
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (FreeBSD)

iEYEARECAAYFAkkd4qUACgkQthUKNsbL7YheFwCeJWnhgk8bMTw5xdzrX+hjTVHo
Y0EAnRIr8OubMchkXjuqdqlOFD0DPE/8
=xBZi
-----END PGP SIGNATURE-----

--CUfgB8w4ZwR/yMy5--
