Date:      Thu, 26 Sep 2019 11:26:07 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Ed Maste <emaste@freebsd.org>
Cc:        Sean Chittenden <sean@chittenden.org>, freebsd-git@freebsd.org,  Ulrich Spörlein <uspoerlein@gmail.com>
Subject:   Re: Service disruption: git converter currently down
Message-ID:  <CANCZdfpOhqJKxDJ_8xYnzrm3_sTKftgeE8e50pmUGPZq1wE-ng@mail.gmail.com>
In-Reply-To: <CAPyFy2C5FNwHOTuamwKQXY9Z_uMJJGnmo_4fG8UOp8expxiN%2BQ@mail.gmail.com>
References:  <CAJ9axoR41gM5BGzT-nPJqqjym1cPYv31dDUwXwi4wsApfDJW%2Bw@mail.gmail.com> <CAJ9axoToynYpF=ZdWdtn_CkkA2nVkgtckQSu%2BcMis1NOXgUdnA@mail.gmail.com> <CAJ9axoR2VXFo9_hx9Z1Qwgs7U-dkan56hrUKO9f7uN6Wpd15xQ@mail.gmail.com> <CAHevUJHwDet8pBdrE4SN3nuoAUgP-ixpCz9uOTdwbE31UDDsbA@mail.gmail.com> <CAPyFy2AMqft2EwdZHYnFUOFxSDOmN1Rv0A9jnR3VdE38SP87pw@mail.gmail.com> <CANCZdfq71yYjGGog9qm2-xb0RRZG8=YdCg3g0%2BotLvPn6r3xJw@mail.gmail.com> <CAPyFy2AWOqtb_DNiekKUx07LbQPzvOkw_qvf58DKuopsvHySTQ@mail.gmail.com> <CANCZdfoBYwp6Gn9nh754yQGXFR0MWkg3hKo8LF-RX_YgdSBycA@mail.gmail.com> <CAPyFy2C5FNwHOTuamwKQXY9Z_uMJJGnmo_4fG8UOp8expxiN%2BQ@mail.gmail.com>

On Thu, Sep 26, 2019 at 8:26 AM Ed Maste <emaste@freebsd.org> wrote:

> On Wed, 25 Sep 2019 at 15:50, Warner Losh <imp@bsdimp.com> wrote:
> >
> > git log always requires added care. There's not actually 9000 commits
> > there. The tree looks fine topologically. It's purely an artifact of git log.
>
> This seems to be getting into a philosophical discussion of what it
> means for a commit to exist. But, given the constraints in the way git
> represents commits the history crafted by the svn-git exporter indeed
> shows thousands of "phantom" commits. The converter should (and with
> uqs tweaks, would) represent the offending commit here as if it were a
> cherry pick, not a merge.
>

Yes. This is the artifact of 'git log' I was talking about. Unlike
subversion, it will log both sides of the merge commit back to the common
ancestor unless you tell it to do otherwise, which is why we're seeing both
legs of the tree.
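
To make the difference concrete, here is a minimal sketch in a throwaway repository. The branch names (trunk, side), file names and commit messages are made up for illustration; only the git options themselves are real.

```shell
#!/bin/sh
# Sketch: compare how many commits are reachable with and without
# --first-parent after a merge. All names here are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/trunk   # fixed branch name for the demo

echo base > f
git add f
git commit -q -m "base"

# Three commits on a side branch.
git checkout -q -b side
for i in 1 2 3; do
    echo "$i" >> g
    git add g
    git commit -q -m "side $i"
done

# One commit on trunk, then a real merge commit.
git checkout -q trunk
echo more >> f
git commit -q -a -m "trunk work"
git merge -q --no-ff -m "merge side" side

# Default traversal follows both parents of the merge.
all=$(git rev-list --count HEAD)
# --first-parent follows only the mainline parent of each merge.
first=$(git rev-list --count --first-parent HEAD)
echo "all=$all first_parent=$first"
```

Here the default walk counts the three side-branch commits as well as the mainline ones, while --first-parent counts only the merge and its mainline ancestors. The same options apply to git log.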

But git also has no way to represent cherry-picks. It just does them as
regular commits.


> In order to really represent this correctly we need to add to git
> metadata tracking file operations. Recording that path d1/f1 was
> copied from d2/f2 at some hash would allow us to properly represent
> this case as well as renames/moves.
>

Right. We're not going to get that since the 'git' way is to divine that
the copies happened. Unlike subversion, git doesn't use paths for version
control...
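
As a rough illustration of that "divining": the commit itself records only snapshots, and the copy is inferred when the diff is computed. The paths d1/f1 and d2/f2 are borrowed from the example above; the file content and commit messages are invented for this sketch.

```shell
#!/bin/sh
# Sketch: a copy is not stored in the commit; git infers it at diff time.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir d1 d2
printf 'int\nmain(void)\n{\n\treturn (0);\n}\n' > d2/f2
git add d2/f2
git commit -q -m "add d2/f2"

# Copy the file and commit the copy; no copy metadata is recorded.
cp d2/f2 d1/f1
git add d1/f1
git commit -q -m "copy d2/f2 to d1/f1"

# Without detection the new path is reported as a plain addition.
stored=$(git diff --no-renames --name-status HEAD~1 HEAD)
# With copy detection git matches the content against existing files.
inferred=$(git diff --find-copies-harder --name-status HEAD~1 HEAD)

echo "without detection: $stored"
echo "with detection:    $inferred"
```

The first diff reports d1/f1 as an added file; the second reports it as a copy of d2/f2, based purely on content similarity.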


> >> git log --first-parent isn't really a solution here either, because
> >> there are cases where one legitimately does want history from both
> >> parents, especially working in downstream projects.
> >
> > I'm pretty sure it would be fine, even in that case.
>
> It's not fine, because it omits the commits I want to see.
>

The --first-parent option actually mirrors what svn log shows today. What
commits do you think that it omits?


> >> > I'd offer the opinion that needing to know about things like git log
> >> > --first-parent vs having to rebase every single downstream fork,
> >>
> >> We won't need to rebase every fork - in no case should the path
> >> forward be worse than uqs's suggestion of a merge from both old/new
> >> conversions.
> >
> > IMHO, uqs's suggestion is a complete non-starter, at least the "git diff |
> > git patch" one. It destroys all local history, commit messages, etc. Except
> > for the most trivial cases, it's not really going to fly with our users.
> > His other, followup ones might be workable into scripts.
>
> diff | patch is not the suggestion; the suggestion is to perform a
> merge from the "new" conversion. Other options (e.g. some sort of
> scripted commit replay) are at least no worse than that base case.
>

I'm curious how the 'merge' operation would work with such a remote common
ancestor, especially with the complicated glitches in the currently
exported tree. I'll have to play around with this, since I have a similar
situation: my git-svn trees have commits identical to the github commits,
but with different hashes (due to the differing ways git-svn and the github
converter export the rXXXXX number). My trees may not have as deep a common
ancestor as the old and new conversion trees would, though.

For the rebase workflow, I agree, it's pretty simple to graft a series of
commits to a new branch, even one without a common ancestor. I do it all
the time to move things back and forth between my git from github tree and
my git from git-svn trees depending on my needs.
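
One way such a graft can be done is with git rebase --onto; this is only a sketch under assumed names, not necessarily the exact commands used day to day. The branches trunk-old, trunk-new and local are hypothetical stand-ins for two conversions of the same content and a set of local commits.

```shell
#!/bin/sh
# Sketch: move local commits from one history onto an unrelated one.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/trunk-old

printf 'line1\nline2\n' > file.c
git add file.c
git commit -q -m "import (old conversion)"

# Two local commits on top of the old conversion.
git checkout -q -b local
echo "local change 1" >> file.c
git commit -q -a -m "local 1"
echo "local change 2" >> file.c
git commit -q -a -m "local 2"

# A second root commit with the same content but a different hash.
git checkout -q trunk-old
git checkout -q --orphan trunk-new
git commit -q -m "import (new conversion)"

# Replay the commits in trunk-old..local on top of trunk-new.
git rebase -q --onto trunk-new trunk-old local

moved=$(git rev-list --count trunk-new..local)
if git merge-base --is-ancestor trunk-old local; then
    old_base=still-present
else
    old_base=gone
fi
echo "moved=$moved old_base=$old_base"
```

After the rebase, local sits on trunk-new and no longer has trunk-old in its ancestry; the replayed commits keep their messages but get new hashes.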

> > I'm not sure you can merge, as there's no common ancestor that's recent
> > enough to give it a chance at succeeding (since the different exports would
> > have different hashes starting fairly early in our history). My experience
> > with qemu is that long-lived merge-updated branches become quite difficult
> > to cope with after a while. It took me three weeks to sort out that
> > relatively simple repo.
>
> In fact, the merge works fine, even with completely unrelated
> histories. You can try this by merging 'svn_head' (from git svn) to
> 'master' (from svn2git), using `git merge --allow-unrelated-histories
> origin/svn_head`. The resulting history has two copies of every
> commit, but the file contents are unchanged over the merge.
>

I'll have to try that to see how well it works. I'd not used
--allow-unrelated-histories and had frequently run into this issue. In the
past, I'd been warned off using that flag, but I'll give it another try.
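
A small experiment along those lines, assuming git 2.9 or later (where a plain merge of unrelated histories is refused). The repository names old and new, the branch name trunk and the file are placeholders for the two conversions.

```shell
#!/bin/sh
# Sketch: merge two histories that share content but no commits.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

work=$(mktemp -d)
cd "$work"

# Two independent "conversions": same file content, different hashes.
for name in old new; do
    git init -q "$name"
    (
        cd "$name"
        git symbolic-ref HEAD refs/heads/trunk
        echo "hello" > file.c
        git add file.c
        git commit -q -m "import ($name conversion)"
    )
done

cd old
git fetch -q ../new trunk:refs/remotes/new/trunk

# A plain merge is refused because there is no common ancestor.
if git merge -q -m "plain merge" new/trunk 2>/dev/null; then
    plain=merged
else
    plain=refused
fi

# With the flag the merge goes through.
git merge -q --allow-unrelated-histories -m "join conversions" new/trunk

commits=$(git rev-list --count HEAD)
if git diff --quiet HEAD^1 HEAD; then tree=unchanged; else tree=changed; fi
echo "plain=$plain commits=$commits tree=$tree"
```

The merged history contains both root commits plus the merge commit, and the tree is identical to what was there before the merge.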


> If you try this in a tree with changes (i.e., try applying it to a
> long-running merge-based branch) every modified file will result in a
> conflict, but they can be trivially resolved in favour of the first
> version. From that point on merging from the "new" conversion will
> work as expected.
>

OK. I'll have to give it a spin to see where it takes me.
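
A sketch of that scenario, again with placeholder names (old, new, trunk, file.c, other.c) and assuming git 2.9 or later. It resolves the conflict with git checkout --ours, which is one way of taking the first version; the quoted text does not say which method was intended.

```shell
#!/bin/sh
# Sketch: unrelated-histories merge on a tree with local changes,
# resolved in favour of our side, followed by a normal merge.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

work=$(mktemp -d)
cd "$work"
for name in old new; do
    git init -q "$name"
    (
        cd "$name"
        git symbolic-ref HEAD refs/heads/trunk
        echo "hello" > file.c
        git add file.c
        git commit -q -m "import ($name conversion)"
    )
done

cd old
echo "local change" >> file.c
git commit -q -a -m "local change"
git fetch -q ../new trunk:refs/remotes/new/trunk

# The locally modified file conflicts because there is no merge base.
if git merge -q --allow-unrelated-histories -m "join" new/trunk >/dev/null 2>&1; then
    outcome=clean
else
    outcome=conflict
fi

# Resolve in favour of our version and conclude the merge.
git checkout --ours -- file.c
git add file.c
git commit -q -m "join conversions, keep our side"
if git diff --quiet HEAD^1 HEAD; then kept=yes; else kept=no; fi

# Later upstream work in the new conversion now merges normally.
( cd ../new && echo "new file" > other.c && git add other.c &&
    git commit -q -m "upstream change" )
git fetch -q ../new trunk:refs/remotes/new/trunk
if git merge -q -m "regular merge" new/trunk; then
    followup=clean
else
    followup=conflict
fi
echo "outcome=$outcome kept=$kept followup=$followup"
```

The first merge conflicts on the modified file, the resolution leaves the local tree as it was, and the follow-up merge finds the new conversion's history as its merge base.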


> > A rebase has a chance of working for people following a 'rebase'
> > workflow.
>
> Indeed, for rebase workflow it's fairly straightforward.
>
> > However, for people like CHERIBSD who follow a 'merge from upstream'
> > model which never rebases (since that would be anti-social to their
> > downstreams), I'm having trouble understanding how that could work. At work,
> > we basically do the merge from upstream with collapse model, which I'm having
> > trouble seeing how to move from old hashes to new. I'd like to know what
> > the plan for that would be and would happily test any solution there with a
> > copy of our repo. I'd even be happy to run experiments in advance of there
> > being something more public available to see what options do or don't work.
>
> Could you expand on the "merge from upstream with collapse" -
> specifically, can you provide an example command used when merging
> from FreeBSD?
>

We basically have an upstream called 'FreeBSD' that we fetch into our git
repo:
% git fetch FreeBSD master
and then we create a branch
% git checkout -b merge-branch-rXXXXXX
then we do the merge:
% git subtree -P FreeBSD merge FreeBSD/master $HASH
# resolve conflicts
% git commit
% git push
then we use Stash's pull-request to manage the landing into our master,
but it's effectively a
% git checkout master
% git merge merge-branch-rXXXXXX

which results in a fairly ugly master for us, which is the other reason I
know the difference between git log and svn log behaviors so well :(.
Effectively, the merges from upstream show up as a single merge commit,
plus a number of follow-on fix-up commits. git subtree is both awesome and
evil.

Warner


