Date:      Fri, 29 Jul 2022 13:41:53 +0200
From:      Mathieu Arnold <mat@freebsd.org>
To:        freebsd-git <freebsd-git@freebsd.org>
Subject:   Git new feature when cloning
Message-ID:  <20220729114153.cl2p3kpap5qcspz2@aching.in.mat.cc>



Hi,

A while back, Git grew a way to filter the objects it asks the server
for when cloning. This can speed up the clone because less data is
downloaded. It also stores less data locally, which is a bonus.

The only drawback is that when you ask for information it does not have
locally, it has to download the missing data, which it then stores
locally, so you never download the same thing twice. (This is done under
the hood and you don't see it happening; the only thing you'll notice is
the command taking a bit longer to return.)

It all happens in the --filter argument to git clone; see
git-rev-list(1) for the full explanation and the range of things you can
do. It can filter on several criteria but, in decreasing order of
information downloaded, the most useful values I can see for our usage
are:

--filter=blob:none
  This will download all the commits and all the trees (which are the
  file list of a directory), and only the blobs needed to check out the
  branch you asked for.

--filter=tree:0
  This will download all the commits, and only the trees and blobs
  needed to check out the branch you asked for.
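
To see the difference between the two filters without touching the
network, here is a minimal sketch that builds a throwaway local
repository and clones it three ways. The local "server" repository, the
file name and the directory names are all made up for the illustration.
The two uploadpack settings are what a stock Git server needs to answer
filtered requests; they are assumed here, not taken from the FreeBSD
server configuration.

```shell
#!/bin/sh
# Sketch: compare a full clone with blob:none and tree:0 partial clones.
# The "server" is a throwaway local repository, not a FreeBSD one.
set -eu
work=$(mktemp -d)

git init -q "$work/server"
cd "$work/server"
git config user.name "Example User"
git config user.email "user@example.org"
git config commit.gpgsign false
# Server-side settings that let clients request filtered packs and
# later fetch individual objects on demand.
git config uploadpack.allowFilter true
git config uploadpack.allowAnySHA1InWant true

# Two commits touching the same file, so history holds an old blob
# and an old tree that the latest checkout does not need.
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -a -m "second"

cd "$work"
# file:// makes git use the regular transport; with a plain local
# path git ignores --filter.
url="file://$work/server"
git clone -q "$url" full
git clone -q --filter=blob:none "$url" blobless
git clone -q --filter=tree:0 "$url" treeless

# Count the objects actually stored in a clone.
count_objects() {
    git -C "$1" cat-file --batch-all-objects --batch-check | wc -l | tr -d ' '
}

full_count=$(count_objects full)
blobless_count=$(count_objects blobless)
treeless_count=$(count_objects treeless)
echo "full=$full_count blob:none=$blobless_count tree:0=$treeless_count"
```

On a real repository you would replace the file:// URL with the
repository URL and look at the size of .git/objects instead of counting
objects.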

Both of those can be combined with --sparse, which enables sparse
checkout: initially only the files in the root directory are checked
out, and you use git sparse-checkout to add or remove paths from the
checkout. That can be useful if you don't have a lot of disk space and
need multiple checkouts to work on. Note that you can't really use
--sparse on the ports tree if you want to build things out of it,
because you would need to add the framework and all the dependencies to
build a port. A kernel developer, though, can probably live with only
the kernel sources and not the whole world.
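
As a rough illustration of the sparse workflow, the sketch below clones
a throwaway local repository with --filter=blob:none --sparse and then
adds a single directory. The repository layout (README, sys/, bin/) is
hypothetical and only loosely mimics a source tree; the uploadpack
settings on the local "server" are assumptions needed to make the
example self-contained.

```shell
#!/bin/sh
# Sketch: partial clone plus sparse checkout against a throwaway local
# repository. README, sys/ and bin/ are made-up names.
set -eu
work=$(mktemp -d)

git init -q "$work/server"
cd "$work/server"
git config user.name "Example User"
git config user.email "user@example.org"
git config commit.gpgsign false
git config uploadpack.allowFilter true
git config uploadpack.allowAnySHA1InWant true
mkdir sys bin
echo "readme" > README
echo "kernel" > sys/kern.c
echo "userland" > bin/ls.c
git add .
git commit -q -m "initial import"

cd "$work"
git clone -q --filter=blob:none --sparse "file://$work/server" tree

# Right after the clone only top-level files are checked out.
before=$(ls tree | tr '\n' ' ')

# Add one directory to the checkout; its blobs are fetched on demand.
git -C tree sparse-checkout add sys
after=$(ls tree | tr '\n' ' ')

echo "before: $before"
echo "after:  $after"
```

The bin/ directory never appears in the working tree, and its blob is
never downloaded unless something asks for it.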

And now some numbers, because we all love numbers:

| filter           | SRC   | PORTS | DOC  |
|------------------|-------|-------|------|
| blob:none        |  605M |  576M | 119M |
| blob:none sparse |  314M |  498M |  37M |
| tree:0           |  407M |  238M |  97M |
| tree:0 sparse    |  115M |  115M |  15M |
| no filtering     | 1461M | 1010M | 321M |

This is the size of .git/objects, for a checkout done this morning. So
it is basically the amount of data downloaded from the server.

Note that, unlike --depth=X, which limits the number of commits you get
from the server and leaves you with a repository that is fine for
testing but not great for development because of some limitations, the
repository you get with --filter is fully usable. The only drawback is
that if you need bits of history you filtered out, they will be
downloaded on the fly, so internet access may be required.
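
The contrast with --depth, and the on-the-fly download, can be sketched
as follows. Everything runs against a throwaway local repository; the
file name, the clone directory names and the uploadpack settings on the
local "server" are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: a shallow clone loses history, a filtered clone keeps it and
# fetches missing objects when they are first needed.
set -eu
work=$(mktemp -d)

git init -q "$work/server"
cd "$work/server"
git config user.name "Example User"
git config user.email "user@example.org"
git config commit.gpgsign false
git config uploadpack.allowFilter true
git config uploadpack.allowAnySHA1InWant true
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -a -m "second"

cd "$work"
url="file://$work/server"
git clone -q --depth=1 "$url" shallow
git clone -q --filter=blob:none "$url" partial

# Number of commits each clone knows about.
shallow_commits=$(git -C shallow rev-list --count HEAD)
partial_commits=$(git -C partial rev-list --count HEAD)

# Count objects that history references but that are not stored locally.
count_missing() {
    git -C "$1" rev-list --objects --all --missing=print | grep -c '^?' || true
}

missing_before=$(count_missing partial)
# Reading the old version of the file makes git fetch the missing blob.
old_content=$(git -C partial show HEAD~1:file.txt)
missing_after=$(count_missing partial)

echo "commits: shallow=$shallow_commits partial=$partial_commits"
echo "missing objects: before=$missing_before after=$missing_after"
```

The git show call is the only point where the partial clone needs to
reach the server again; once fetched, the blob stays in the local
object store.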

PS: as filtering is done on the server, a knob needed to be enabled on
    our servers; GitLab and GitHub already supported the feature.
    gitrepo.f.o and gitrepo-dev.f.o have it enabled. I am unsure about
    the mirrors' status, but they should be ok too.
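
The message does not name the knob. In stock Git the setting that makes
a server honour --filter is uploadpack.allowFilter, so the sketch below
assumes that is the one meant. It clones a throwaway local repository
once before and once after enabling the setting; when the server does
not support filtering, git prints a warning and falls back to a full
clone.

```shell
#!/bin/sh
# Sketch: effect of the server-side setting on a --filter clone.
# Assumes the knob is uploadpack.allowFilter (not named in the message).
set -eu
work=$(mktemp -d)

git init -q "$work/server"
cd "$work/server"
git config user.name "Example User"
git config user.email "user@example.org"
git config commit.gpgsign false
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -a -m "second"

cd "$work"
url="file://$work/server"

# Server has filtering disabled: the filter is ignored with a warning.
git clone -q --filter=blob:none "$url" knob-off

# Enable filtering (and on-demand object fetches) on the server side.
git -C server config uploadpack.allowFilter true
git -C server config uploadpack.allowAnySHA1InWant true
git clone -q --filter=blob:none "$url" knob-on

# Count the objects actually stored in a clone.
count_objects() {
    git -C "$1" cat-file --batch-all-objects --batch-check | wc -l | tr -d ' '
}

off_count=$(count_objects knob-off)
on_count=$(count_objects knob-on)
echo "objects with knob off=$off_count, with knob on=$on_count"
```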

-- 
Mathieu Arnold



