Date:      Sun, 11 Apr 2021 01:03:30 +0200
From:      Martin Matuska <mm@FreeBSD.org>
To:        Warner Losh <imp@bsdimp.com>
Cc:        Ed Maste <emaste@freebsd.org>, freebsd-git <freebsd-git@freebsd.org>, Xin Li <delphij@freebsd.org>, Ryan Moeller <freqlabs@freebsd.org>, Alexander Motin <mav@freebsd.org>, Mateusz Guzik <mjg@freebsd.org>
Subject:   Re: OpenZFS branch tracking policy
Message-ID:  <9679ec9d-4916-92b7-ff70-0050d699875c@FreeBSD.org>
In-Reply-To: <CANCZdfoPzNFSp2sW94Ken=u7DstHL_BWFmjV5MBD4cRBo3t_Uw@mail.gmail.com>
References:  <21c7313e-315c-ec48-9437-e0a3d4ec14d2@FreeBSD.org> <CANCZdfopOxm-HTYkVPHkEweHw-F%2BA9mk3Vv26x4t3MEAVEd2gQ@mail.gmail.com> <CAPyFy2DS=nsE3-JQdqQC797xQhAiBACkuyA%2BcxkcRY0yeB_6=w@mail.gmail.com> <CANCZdfoPm0tfDpBTU8ORy-_Oa-tkiNX0_MeAdJn0T5ZJdQe6MQ@mail.gmail.com> <41924e9d-9d61-6646-6c8f-e4458f94296e@FreeBSD.org> <30f529c1-6087-e704-8cc7-0c48a40b7430@FreeBSD.org> <CANCZdfp3EJ%2BbrNM02Sfzu_Y42VDEADiApFaX0V9bu_jb5NWd4w@mail.gmail.com> <f8d7a7f3-63a2-434f-054c-fadb9131cf82@FreeBSD.org> <CANCZdfoPzNFSp2sW94Ken=u7DstHL_BWFmjV5MBD4cRBo3t_Uw@mail.gmail.com>

Thank you for your comments, Warner.

What I would like to know is the timing: how much time do we need to
resolve the issues? I can pull in the OpenZFS code up to commit
3522f57b6 the "old" way. This is the last commit common to master and
zfs-2.1-release, and it can be cherry-picked to stable/13 the "old" way.
This will keep our code on par with openzfs-2.1-rc1 (rc2 is out now), and
I can add a 2-week MFC for stable/13 as usual, but there are no
significant changes at all. After that we need to split main and
stable/13 and ideally move to direct tracking of OpenZFS.
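
For reference, the last commit common to two branches can be located with git merge-base. The following is only a minimal sketch on a throwaway repository: the branch names mirror the OpenZFS ones, but the commits are empty placeholders and the resulting hash has nothing to do with 3522f57b6.

```shell
#!/bin/sh
# Sketch: find the last commit shared by two branches with git merge-base.
# The toy repository stands in for OpenZFS; all commits are placeholders.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

git commit -q --allow-empty -m "shared history 1"
git commit -q --allow-empty -m "shared history 2"
fork_point=$(git rev-parse HEAD)

# The release branch forks here; both branches then move on independently.
git branch zfs-2.1-release
git commit -q --allow-empty -m "master only"
git checkout -q zfs-2.1-release
git commit -q --allow-empty -m "release only"

# merge-base prints the best common ancestor of the two branches.
common=$(git merge-base master zfs-2.1-release)
# Counting everything reachable from it gives the number of shared commits.
shared_count=$(git rev-list --count "$common")
echo "last common commit: $common ($shared_count shared commits)"
```

Run against a real OpenZFS clone, the same two commands (merge-base, then rev-list --count) would give the split point and the size of the shared history.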

I have added some comments below.

On 10. 4. 2021 21:22, Warner Losh wrote:
> Thanks for the update Martin.
>
> The tl;dr is I think this will be fine. However, I'd like to document
> the reasoning here for future cases that we may need to judge. There's
> also a couple of logistical issues at the end we need to address, one
> critical.
>
> On Sat, Apr 10, 2021 at 11:15 AM Martin Matuska <mm@freebsd.org> wrote:
>
>     Here are some of the facts:
>
>     - In my merge, there are 15 conflicting files due to changes in
>     FreeBSD (add/add)
>     - Some of the changes have already been upstreamed in later
>     revisions of openzfs than 891568c99
>     - A significant majority of the diffs are candidates for upstreaming.
>     The ideal state would be to have all changes upstreamed. Sometimes
>     changes get upstreamed with modifications.
>     - In general our developers open pull requests and commit to
>     OpenZFS, then we merge the changes
>
>     What our developers would like is to use a "git blame" on
>     sys/contrib/openzfs/something to see the history path from OpenZFS.
>
>     I agree that the merge commits should be more verbose, ideally
>     containing a "git log --oneline" of the commits since last merge.
>
>     If I do a "squashed" merge like you described with bzip2, then I
>     do not import the history from OpenZFS. That way we don't need
>     that at all and can continue working the way we did until now.
>
>     What you say about adding "unnecessary" history - since development
>     at OpenZFS became common, the majority of commits directly affect
>     FreeBSD. Only "Linux-Only" and "CI-related" commits are not
>     relevant for FreeBSD.
>
>     I have updated my example branch to show how it may look with more
>     detailed commit messages, nicely clickable from github:
>     https://github.com/mmatuska/freebsd-src/tree/openzfs_master_merged
>
>     So the current question is quite simple, we can do one of the
>     following:
>     a) do the unsquashed merge I suggest that imports the openzfs
>     history - this will make the commits very transparent, future
>     merges and upstream tracking very easy and the
>     --allow-unrelated-histories flag is not required anymore. The
>     "common" part of the histories in main and stable/13 will be
>     identical.
>     b) if that is not desired or we are undecided I will continue the
>     way we go now until a better solution is found. In that case I
>     will fork a second vendor branch (vendor/openzfs-2.1) that starts
>     with the latest common commit of openzfs/master and
>     openzfs/zfs-2.1-release and will merge (or cherry-pick?) from this
>     branch directly to stable/13. As an alternative to merging, git
>     cherry-pick supports -Xsubtree= as well.
>
> I'm leaning towards 'a', but that's a new way for the project to track
> vendor changes. Many of my comments were on how to mirror pulling in
> upstreams that we would want to do infrequently, and where we didn't
> care about the details so much. llvm is a good example, as would be
> bzip, though for different reasons. The former more due to the sheer
> size of the llvm repo and the extremely infrequent need for users and
> developers of FreeBSD to peer into the details. They simply aren't
> relevant for those cases. For these cases, a squashed commit makes
> sense: people don't care about the details and it keeps our repo size
> manageable and 'b' is appropriate. I had initially thought OpenZFS
> would fall into this category, but your additional details suggest
> that my initial thinking might be a poor fit to our needs.
I agree with your opinion here. The other project I maintain, libarchive,
is another example for the 'b' approach. Imports are infrequent and
FreeBSD is primarily a "downstream consumer" of libarchive even if there
is some code dedicated privately to FreeBSD. As for OpenZFS, there is
much more dedicated code, we are interested in more frequent pulls and
several of our developers are directly involved in the project,
developing both "common" and "FreeBSD-related" code. What I especially
like about the OpenZFS project are the high development standards.
>
> I think that you've made a compelling case to merge in the tree. The
> potential downsides need to be looked at for doing something new.
> First is size. From the numbers you provided, OpenZFS is on the larger
> side of things we'd want to do this with. The expansion of the repo is
> concerning, so there would need to be some benefit from that. Here,
> you've clearly articulated the benefit: our OpenZFS developers drift
> back and forth between OpenZFS and FreeBSD and do development in both
> places. If these merges are frequent, this allows a more efficient
> workflow for OpenZFS maintenance. This also allows better bisecting in
> the case of trouble. One reason we don't generally want to open things
> up to merge commits is the crazy merges we did with svn that created
> weird loops. While the git transition work endeavored to eliminate
> them, a number slipped through. We do not want any more of them
> created. By that test, these commits pose no risk given the OpenZFS
> practices (and little risk outside the contrib/openzfs tree).
Are such (messy) situations even possible in git?
>
> So, the practical aspects of this: how do we do this? We'll need to
> have the OpenZFS mainline and branches in the tree, so the question of
> what namespace to put them into comes to mind. The obvious answer
> would be 'openzfs' or 'vendor/openzfs', but you want two
> branches, so maybe vendor/openzfs/main (or master, whatever it is
> called upstream) and vendor/openzfs/<branch-name> would be better
> since we could then recommend a 'refs' line for people working on
> openzfs that would let git do all the heavy lifting here. There's no
> issue with having both vendor/openzfs and vendor/openzfs/<foo> in the
> tree at the same time, I don't think. The current rule sets would
> allow this, and you could carefully push both the branches first. I
> don't think we need to do anything special except document how to do
> the first commit (for others who need to do this) and document how to
> update, which I'm more than happy to help out with.
I would be happy with vendor/openzfs/master and
vendor/openzfs/zfs-2.1-release to use the same naming as OpenZFS does.
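
One practical point about this naming: git treats branch names as paths, so to my knowledge a branch named vendor/openzfs and branches named vendor/openzfs/<something> cannot exist in the same repository at the same time. The sketch below shows this on a throwaway repository. The name vendor/openzfs-legacy is purely hypothetical, and renaming is only one possible way out, not an agreed plan.

```shell
#!/bin/sh
# Sketch: a flat branch "vendor/openzfs" blocks nested "vendor/openzfs/*" names.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "initial import"

# The existing flat vendor branch.
git branch vendor/openzfs

# Creating a branch "below" it is refused while the flat branch exists.
if git branch vendor/openzfs/master 2>/dev/null; then
    nested_created=yes
else
    nested_created=no
fi
echo "nested branch created while flat branch exists: $nested_created"

# After moving the flat branch out of the way (hypothetical new name),
# the nested names become available.
git branch -m vendor/openzfs vendor/openzfs-legacy
git branch vendor/openzfs/master
git branch vendor/openzfs/zfs-2.1-release

nested_count=$(git for-each-ref --format='%(refname:short)' \
    'refs/heads/vendor/openzfs/*' | grep -c .)
echo "nested vendor branches: $nested_count"
```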
>
> One critical thing we need to assess before you proceed, however:
> mail. We need to make sure we're not about to send 7k emails as all
> these revisions suddenly appear in the repo... Having an extra
> 7k revs in the repo will be no problem, but 7k extra emails might
> raise a comment or two...
Is there a way to simulate this?
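
A rough local estimate may be possible by counting the commits that a push of the merged branch would newly add to main. This is only a sketch: whether the commit mailer sends one message per newly reachable commit, or only per first-parent commit, is an assumption about the hook that would need to be checked. The repository, the commits and the branch name merge-preview are all made up for the demonstration.

```shell
#!/bin/sh
# Sketch: count commits that an unsquashed merge would add to main.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Two commits standing in for the existing FreeBSD history.
git commit -q --allow-empty -m "freebsd 1"
git commit -q --allow-empty -m "freebsd 2"

# Five commits of unrelated history standing in for upstream OpenZFS.
git checkout -q --orphan upstream
for i in 1 2 3 4 5; do
    git commit -q --allow-empty -m "upstream $i"
done

# Do the unsquashed merge on a scratch branch that is never pushed.
git checkout -q -b merge-preview main
git merge -q --allow-unrelated-histories -m "Merge upstream" upstream

# Every commit reachable from the merge but not from main: 5 upstream + 1 merge.
new_commits=$(git rev-list --count main..merge-preview)
# Only commits along the first-parent line: just the merge commit itself.
first_parent=$(git rev-list --count --first-parent main..merge-preview)
echo "new commits: $new_commits, first-parent only: $first_parent"
```

If the mailer walks all new commits, the first number is the one to worry about; if it follows first parents only, the second one applies.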
>
> Comments?
>
> Warner
>
>     Best regards,
>     mm
>
>     On 10. 4. 2021 0:15, Warner Losh wrote:
>>
>>
>>     On Fri, Apr 2, 2021 at 6:44 PM Martin Matuska <mm@freebsd.org> wrote:
>>
>>         I have prepared an example merged branch here:
>>         https://github.com/mmatuska/freebsd-src/tree/openzfs_master_merged
>>
>>         The magical command was:
>>         git merge -s subtree -Xsubtree="sys/contrib/openzfs" 891568c99
>>         --allow-unrelated-histories
>>
>>         Luckily, our current diff is manageable.
>>
>>
>>     So I did this for bzip2 using approximately:
>>
>>     git remote add bzip2 <url>
>>     git fetch bzip2
>>     git merge -s subtree -Xsubtree=contrib/bzip2 bzip2/master
>>     --allow-unrelated-histories --squash
>>
>>     [1] At this point I resolved conflicts, which were the entire
>>     files since I guess I didn't bootstrap right to the last merge.
>>     There were 4 files in conflict.
>>
>>     Then I did a git add of all the files in conflict and a git commit.
>>
>>     This produced a good commit. Since it was a squash commit, there
>>     were no issues.
>>
>>     However, it turns out I botched the commit at point [1] above. So
>>     I ran this again and got a conflict for the whole file that I'd
>>     removed a blank line from.
>>
>>     So, this looks like it could be workable, but does lead me to a
>>     few questions:
>>
>>     (1) How do we do this so that the conflicts aren't add/add
>>     conflicts? Is there some way to bootstrap this?
>>     (2) Do we need to keep track of the last merge point and use that
>>     in merging the next one in?
>>     (3) I assume we keep track of FreeBSD diffs in a branch off <url>
>>     and we merge that instead of master.
>>     (4) What do we do about adjustments to the build that are needed?
>>     (5) Do we need to host a FreeBSD-specific repo with this stuff,
>>     maybe with tags we don't want widely pushed to ease the next
>>     merge? Eg, make this the first case of a 'vendor repo' that we
>>     then pull squash commits from so that the vendor repo can track
>>     upstream, but not otherwise be pushed to all our users....
>>
>>     Finally, how did you deal with [1] producing so many full-file
>>     add/add conflicts? Oh, and what kind of commit message when
>>     things merge do you suggest? I rather like your 'bring in hash
>>     XXXX branch blah, here's the important highlights' emails and
>>     think that would be a good first cut at advice on what to put in
>>     these.
>>
>>     This suggests the current answer is 'seems doable, but we need to
>>     document it and come up with recommendations for how to do it'.
>>
>>     Warner
>>
>>         On 3. 4. 2021 1:37, Martin Matuska wrote:
>>         > Hi Warner and Ed,
>>         >
>>         > 2.1-release has already been branched. The stable branch policy in
>>         > OpenZFS is somewhat strange, they make a staging branch for each
>>         > patchlevel release, but the commits are continuous.
>>         >
>>         > To have some idea how big the repo history is:
>>         >
>>         > $ git rev-list master --count
>>         > 6662
>>         >
>>         > $ git rev-list zfs-2.1-release --count
>>         > 6650
>>         >
>>         > master and zfs-2.1-release have 6650 common commits at the moment
>>         >
>>         > $ git log master | wc -l
>>         > 129868
>>         >
>>         > (linecount - 4 * revcount) / revcount = linecount / revcount - 4 =
>>         > 15.4938 comment lines per commit on average
>>         >
>>         > Initial commit was made on Feb 26, 2008.
>>         >
>>         > Yearly commit counts:
>>         >
>>         > $ git log master | grep -c -E '^Date:.* 2020 -[0-9]+$'
>>         > 666
>>         >
>>         > $ git log master | grep -c -E '^Date:.* 2019 -[0-9]+$'
>>         > 535
>>         >
>>         > $ git log master | grep -c -E '^Date:.* 2018 -[0-9]+$'
>>         > 428
>>         >
>>         > Martin
>>         >
>>         > On 2. 4. 2021 20:15, Warner Losh wrote:
>>         >>
>>         >>
>>         >> On Fri, Apr 2, 2021 at 11:56 AM Ed Maste <emaste@freebsd.org> wrote:
>>         >>
>>         >>     On Fri, 2 Apr 2021 at 11:50, Warner Losh <imp@bsdimp.com> wrote:
>>         >>     >
>>         >>     > We'd always hoped that we'd be able to do subtree merges from
>>         >>     > upstreams that use git into FreeBSD. The big worry, though, was
>>         >>     > that this would needlessly bloat the repo with a lot of history.
>>         >>     > We don't want, for example, all of LLVM's history in the tree.
>>         >>     > We'd always anticipated that there'd be some things we'd just
>>         >>     > accept the history for, since it is similar in character to the
>>         >>     > vendor branches (though of course a bit more).
>>         >>
>>         >>     Note that if we do want to avoid bringing in the full history `git
>>         >>     subtree merge` supports a `--squash` option. This brings in the set
>>         >>     of upstream changes as a single commit, without bringing along the
>>         >>     associated history. We will need to do more experimentation to
>>         >>     confirm that the full process, including bootstrapping, will work
>>         >>     as we want. Assuming this all works it should allow us to forgo
>>         >>     the use of a FreeBSD-specific vendor branch in src.
>>         >>
>>         >>     We've discussed mirroring any such 3rd-party source in some
>>         >>     FreeBSD-controlled repository. This would allow the project to
>>         >>     retain a full copy of the history, but avoid bloating src with it.
>>         >>
>>         >>     I agree with Warner that we may want a different policy (full
>>         >>     history or snapshots) for different contrib sources.
>>         >>
>>         >>
>>         >> Good points Ed. I'd forgotten about --squash.
>>         >>
>>         >> Martin, what's your timeline for wanting to implement these things?
>>         >> I'm unfamiliar with the OpenZFS schedules.
>>         >>
>>         >> Warner
>>


