Date: Sun, 11 Apr 2021 01:03:30 +0200
From: Martin Matuska <mm@FreeBSD.org>
To: Warner Losh <imp@bsdimp.com>
Cc: Ed Maste <emaste@freebsd.org>, freebsd-git <freebsd-git@freebsd.org>,
    Xin Li <delphij@freebsd.org>, Ryan Moeller <freqlabs@freebsd.org>,
    Alexander Motin <mav@freebsd.org>, Mateusz Guzik <mjg@freebsd.org>
Subject: Re: OpenZFS branch tracking policy
Message-ID: <9679ec9d-4916-92b7-ff70-0050d699875c@FreeBSD.org>
In-Reply-To: <CANCZdfoPzNFSp2sW94Ken=u7DstHL_BWFmjV5MBD4cRBo3t_Uw@mail.gmail.com>
References: <21c7313e-315c-ec48-9437-e0a3d4ec14d2@FreeBSD.org>
    <CANCZdfopOxm-HTYkVPHkEweHw-F%2BA9mk3Vv26x4t3MEAVEd2gQ@mail.gmail.com>
    <CAPyFy2DS=nsE3-JQdqQC797xQhAiBACkuyA%2BcxkcRY0yeB_6=w@mail.gmail.com>
    <CANCZdfoPm0tfDpBTU8ORy-_Oa-tkiNX0_MeAdJn0T5ZJdQe6MQ@mail.gmail.com>
    <41924e9d-9d61-6646-6c8f-e4458f94296e@FreeBSD.org>
    <30f529c1-6087-e704-8cc7-0c48a40b7430@FreeBSD.org>
    <CANCZdfp3EJ%2BbrNM02Sfzu_Y42VDEADiApFaX0V9bu_jb5NWd4w@mail.gmail.com>
    <f8d7a7f3-63a2-434f-054c-fadb9131cf82@FreeBSD.org>
    <CANCZdfoPzNFSp2sW94Ken=u7DstHL_BWFmjV5MBD4cRBo3t_Uw@mail.gmail.com>

Thank you for your comments, Warner.

What I would like to know is the timing: how much time do we need to
resolve the issues? I can pull in the OpenZFS code up to commit
3522f57b6 the "old" way. This is the last commit common to master and
zfs-2.1-release and can be cherry-picked to stable/13 the "old" way.
This will keep our code on par with openzfs-2.1-rc1 (rc2 is out now) and
I can add a 2-week MFC for stable/13 as usual, but there are no
significant changes at all. After that we need to split main and
stable/13 and ideally move to direct tracking of OpenZFS.

I have added some comments below.

On 10. 4. 2021 21:22, Warner Losh wrote:
> Thanks for the update Martin.
>
> The tl;dr is I think this will be fine. However, I'd like to document
> the reasoning here for future cases that we may need to judge. There's
> also a couple of logistical issues at the end we need to address, one
> critical.
>
> On Sat, Apr 10, 2021 at 11:15 AM Martin Matuska <mm@freebsd.org> wrote:
>
>     Here are some of the facts:
>
>     - In my merge, there are 15 conflicting files due to changes in
>     FreeBSD (add/add)
>     - Some of the changes have already been upstreamed in later
>     revisions of openzfs than 891568c99
>     - A significant majority of the diffs are candidates for
>     upstreaming. The ideal state would be to have all changes
>     upstreamed. Sometimes changes get upstreamed with modifications.
>     - In general our developers open pull requests and commit to
>     OpenZFS, then we merge the changes
>
>     What our developers would like is to use a "git blame" on
>     sys/contrib/openzfs/something to see the history path from OpenZFS.
>
>     I agree that the merge commits should be more verbose, ideally
>     containing a "git log --oneline" of the commits since last merge.
>
>     If I do a "squashed" merge like you described with bzip2, then I
>     do not import the history from OpenZFS.
>     That way we don't need that at all and can continue working the
>     way we did until now.
>
>     Regarding what you say about adding "unnecessary" history: since
>     development became common at OpenZFS, the majority of commits
>     directly affect FreeBSD. Only "Linux-Only" and "CI-related"
>     commits are not relevant for FreeBSD.
>
>     I have updated my example branch to show how it may look with
>     more detailed commit messages, nicely clickable from github:
>     https://github.com/mmatuska/freebsd-src/tree/openzfs_master_merged
>
>     So the current question is quite simple, we can do one of the
>     following:
>     a) do the unsquashed merge I suggest that imports the openzfs
>     history - this will make the commits very transparent, future
>     merges and upstream tracking very easy, and the
>     --allow-unrelated-histories flag is no longer required. The
>     "common" part of the histories in main and stable/13 will be
>     identical.
>     b) if that is not desired or we are undecided I will continue the
>     way we go now until a better solution is found. In that case I
>     will fork a second vendor branch (vendor/openzfs-2.1) that starts
>     with the latest common commit of openzfs/master and
>     openzfs/zfs-2.1-release and will merge (or cherry-pick?) from this
>     branch directly to stable/13. As an alternative to merging, git
>     cherry-pick supports -Xsubtree= as well.
>
> I'm leaning towards 'a', but that's a new way for the project to track
> vendor changes. Many of my comments were on how to mirror pulling in
> upstreams that we would want to do infrequently, and where we didn't
> care about the details so much. llvm is a good example, as would be
> bzip, though for different reasons. The former more due to the sheer
> size of the llvm repo and the extremely infrequent need for users and
> developers of FreeBSD to peer into the details. They simply aren't
> relevant for those cases.
> For these cases, a squashed commit makes
> sense: people don't care about the details and it keeps our repo size
> manageable and 'b' is appropriate. I had initially thought OpenZFS
> would fall into this category, but your additional details suggest
> that my initial thinking might be a poor fit to our needs.

I agree with your opinion here. The other project I maintain, libarchive,
is another example for the 'b' approach. Imports are infrequent and
FreeBSD is primarily a "downstream consumer" of libarchive even if there
is some code dedicated privately to FreeBSD. As for OpenZFS, there is
much more dedicated code, we are interested in more frequent pulls and
several of our developers are directly involved in the project
developing both "common" and "FreeBSD-related" code. What I especially
like about the OpenZFS project are the high development standards.

>
> I think that you've made a compelling case to merge in the tree. The
> potential downsides need to be looked at for doing something new.
> First is size. From the numbers you provided, OpenZFS is on the larger
> side of things we'd want to do this with. The expansion of the repo is
> concerning, so there would need to be some benefit from that. Here,
> you've clearly articulated the benefit: our OpenZFS developers drift
> back and forth between OpenZFS and FreeBSD and do development in both
> places. If these merges are frequent, this allows a more efficient
> workflow for OpenZFS maintenance. This also allows better bisecting in
> the case of trouble. One reason we don't generally want to open things
> up to merge commits is the crazy merges we did with svn that created
> weird loops. While the git transition work endeavored to eliminate
> them, a number slipped through. We do not want any more of them
> created.
> By that test, these commits pose no risk given the OpenZFS
> practices (and little risk outside the contrib/openzfs tree).

Are such (messy) situations even possible in git?

>
> So, the practical aspects of this: how do we do this? We'll need to
> have the OpenZFS mainline and branches in the tree, so the question of
> what namespace to put them into comes to mind. The obvious answer
> would be 'openzfs' or 'vendor/openzfs', but you want two
> branches, so maybe vendor/openzfs/main (or master, whatever it is
> called upstream) and vendor/openzfs/<branch-name> would be better
> since we could then recommend a 'refs' line for people working on
> openzfs that would let git do all the heavy lifting here. There's no
> issue with having both vendor/openzfs and vendor/openzfs/<foo> in the
> tree at the same time, I don't think. The current rule sets would
> allow this, and you could carefully push both the branches first. I
> don't think we need to do anything special except document how to do
> the first commit (for others who need to do this) and document how to
> update, which I'm more than happy to help out with.

I would be happy with vendor/openzfs/master and
vendor/openzfs/zfs-2.1-release to use the same naming as OpenZFS does.

>
> One critical thing we need to assess before you proceed, however:
> mail. We need to make sure we're not about to send 7k emails as all
> these revisions suddenly appear in the repo... While having an extra
> 7k revs in the repo will be no problem, 7k extra emails might
> raise a comment or two...

Is there a way to simulate this?

>
> Comments?
>
> Warner
>
>     Best regards,
>     mm
>
>     On 10. 4. 2021 0:15, Warner Losh wrote:
>>
>>
>> On Fri, Apr 2, 2021 at 6:44 PM Martin Matuska <mm@freebsd.org> wrote:
>>
>>     I have prepared an example merged branch here:
>>     https://github.com/mmatuska/freebsd-src/tree/openzfs_master_merged
>>
>>     The magical command was:
>>     git merge -s subtree -Xsubtree="sys/contrib/openzfs" 891568c99 \
>>         --allow-unrelated-histories
>>
>>     Luckily, our current diff is manageable.
>>
>>
>> So I did this for bzip2 using approximately:
>>
>> git remote add bzip2 <url>
>> git fetch bzip2
>> git merge -s subtree -Xsubtree=contrib/bzip2 bzip2/master \
>>     --allow-unrelated-histories --squash
>>
>> [1] At this point I resolved conflicts, which were the entire
>> files since I guess I didn't bootstrap right to the last merge.
>> There were 4 files in conflict.
>>
>> Then I did a git add of all the files in conflict and a git commit.
>>
>> This produced a good commit. Since it was a squash commit, there
>> were no issues.
>>
>> However, it turns out I botched the commit at point [1] above. So
>> I ran this again and got a conflict for the whole file that I'd
>> removed a blank line from.
>>
>> So, this looks like it could be workable, but does lead me to a
>> few questions:
>>
>> (1) How do we do this so that the conflicts aren't add/add
>> conflicts? Is there some way to bootstrap this?
>> (2) Do we need to keep track of the last merge point and use that
>> in merging the next one in?
>> (3) I assume we keep track of FreeBSD diffs in a branch off <url>
>> and we merge that instead of master.
>> (4) What do we do about adjustments to the build that are needed?
>> (5) Do we need to host a FreeBSD-specific repo with this stuff,
>> maybe with tags we don't want widely pushed to ease the next
>> merge?
>> E.g., make this the first case of a 'vendor repo' that we
>> then pull squash commits from so that the vendor repo can track
>> upstream, but not otherwise be pushed to all our users....
>>
>> Finally, how did you deal with [1] producing so many full-file
>> add/add conflicts? Oh, and what kind of commit message do you
>> suggest when things merge? I rather like your 'bring in hash
>> XXXX branch blah, here's the important highlights' emails and
>> think that would be a good first cut at advice on what to put in
>> these.
>>
>> This suggests the current answer is 'seems doable, but we need to
>> document it and come up with recommendations for how to do it'.
>>
>> Warner
>>
>>     On 3. 4. 2021 1:37, Martin Matuska wrote:
>>     > Hi Warner and Ed,
>>     >
>>     > 2.1-release has already been branched. The stable branch
>>     > policy in OpenZFS is somewhat strange, they make a staging
>>     > branch for each patchlevel release, but the commits are
>>     > continuous.
>>     >
>>     > To have some idea how big the repo history is:
>>     >
>>     > $ git rev-list master --count
>>     > 6662
>>     >
>>     > $ git rev-list zfs-2.1-release --count
>>     > 6650
>>     >
>>     > master and zfs-2.1-release have 6650 common commits at the
>>     > moment
>>     >
>>     > $ git log master | wc -l
>>     > 129868
>>     >
>>     > (linecount - 4 * revcount) / revcount = linecount / revcount - 4
>>     > = 15.4938 comment lines per commit on average
>>     >
>>     > Initial commit was made on Feb 26, 2008.
>>     >
>>     > Yearly commit counts:
>>     >
>>     > $ git log master | grep -c -E '^Date:.* 2020 -[0-9]+$'
>>     > 666
>>     >
>>     > $ git log master | grep -c -E '^Date:.* 2019 -[0-9]+$'
>>     > 535
>>     >
>>     > $ git log master | grep -c -E '^Date:.* 2018 -[0-9]+$'
>>     > 428
>>     >
>>     > Martin
>>     >
>>     > On 2. 4. 2021 20:15, Warner Losh wrote:
>>     >>
>>     >>
>>     >> On Fri, Apr 2, 2021 at 11:56 AM Ed Maste <emaste@freebsd.org>
>>     >> wrote:
>>     >>
>>     >>     On Fri, 2 Apr 2021 at 11:50, Warner Losh <imp@bsdimp.com>
>>     >>     wrote:
>>     >>     >
>>     >>     > We'd always hoped that we'd be able to do subtree merges
>>     >>     > from upstreams that use git into FreeBSD. The big worry,
>>     >>     > though, was that this would needlessly bloat the repo
>>     >>     > with a lot of history. We don't want, for example, all of
>>     >>     > LLVM's history in the tree. We'd always anticipated that
>>     >>     > there'd be some things we'd just accept the history for,
>>     >>     > since it is similar in character to the vendor branches
>>     >>     > (though of course a bit more).
>>     >>
>>     >>     Note that if we do want to avoid bringing in the full
>>     >>     history `git subtree merge` supports a `--squash` option.
>>     >>     This brings in the set of upstream changes as a single
>>     >>     commit, without bringing along the associated history. We
>>     >>     will need to do more experimentation to confirm that the
>>     >>     full process, including bootstrapping, will work as we
>>     >>     want. Assuming this all works it should allow us to forgo
>>     >>     the use of a FreeBSD-specific vendor branch in src.
>>     >>
>>     >>     We've discussed mirroring any such 3rd-party source in
>>     >>     some FreeBSD-controlled repository.
>>     >>     This would allow the project to retain a full copy of the
>>     >>     history, but avoid bloating src with it.
>>     >>
>>     >>     I agree with Warner that we may want a different policy
>>     >>     (full history or snapshots) for different contrib sources.
>>     >>
>>     >>
>>     >> Good points Ed. I'd forgotten about --squash.
>>     >>
>>     >> Martin, what's your timeline for wanting to implement these
>>     >> things? I'm unfamiliar with the OpenZFS schedules.
>>     >>
>>     >> Warner
>>     > _______________________________________________
>>     > freebsd-git@freebsd.org mailing list
>>     > https://lists.freebsd.org/mailman/listinfo/freebsd-git
>>     > To unsubscribe, send any mail to
>>     > "freebsd-git-unsubscribe@freebsd.org"
>>
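
For reference, the per-commit average quoted in the thread (linecount /
revcount - 4) can be reproduced with a small shell helper. This is only
a sketch: the function name avg_message_lines is made up for
illustration, and the constant 4 is the per-commit header overhead
assumed by the formula in the thread, not something git itself reports.

```shell
#!/bin/sh
# Hypothetical helper: average commit-message lines per commit, using the
# formula from the thread: linecount / revcount - 4.
# $1 = total lines of "git log" output, $2 = number of commits.
avg_message_lines() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v lines="$1" -v revs="$2" \
        'BEGIN { printf "%.4f\n", lines / revs - 4 }'
}

# Figures quoted in the thread for OpenZFS master.
avg_message_lines 129868 6662   # prints 15.4938
```

Against a real checkout the two inputs would come from
"git log master | wc -l" and "git rev-list master --count", as in the
thread.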
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?9679ec9d-4916-92b7-ff70-0050d699875c>