From: Ed Maste
Date: Tue, 16 Jun 2020 14:41:57 -0400
Subject: Next odd commit affecting `git subtree split` experiments with contrib/elftoolchain
To: freebsd-git@freebsd.org
List-Id: Discussion of git use in the FreeBSD project

I'm currently excluding the following additional revisions from
mergeinfo parsing (on git_conv 9ae420c081):

242545 249429 250837 255263 255477 256424 265006 265006 265044 265547
265720 267888 283595

It may not be necessary to exclude all of these - they're just the
list I've arrived at after iterative trial and error.

When I run `git subtree split --prefix=contrib/elftoolchain` this
includes some legitimate, unnecessary, but innocuous commits - e.g.
r298092 is included, which tracks a long-running project branch that
brought in elftoolchain updates in MFH commits, but didn't otherwise
touch elftoolchain on the branch. These clutter the history slightly
but cause no real issue.

I've found one new class of strange commit - r265044, which I've
excluded from mergeinfo parsing (in the above list). However, svnweb
highlights an issue:

https://svnweb.freebsd.org/base?limit_changes=0&view=revision&revision=265044

projects/bmake/contrib/elftoolchain/ (Copied from head/contrib/elftoolchain, r265036)

It looks like contrib/elftoolchain was added to svn head during the
time that /projects/bmake/ was in use, and brought to that projects
branch via an MFH. It looks like this triggers svn2git's existing
svn cp-based merge detection. I think svn2git is doing the right thing
here; it just legitimately triggers the existing issue/bug in subtree
split.

From: Ed Maste
Date: Tue, 16 Jun 2020 22:25:02 -0400
Subject: Re: Next odd commit affecting `git subtree split` experiments with contrib/elftoolchain
To: freebsd-git@freebsd.org

On Tue, 16 Jun 2020 at 14:41, Ed Maste wrote:
>
> I've found one new class of strange commit - r265044, which I've
> excluded from mergeinfo parsing (in the above list). However, svnweb
> highlights an issue:

I've now successfully used Tom Clarkson's patched git subtree[1] to
split elftoolchain out of cgit-beta, keeping the vendor branch history
intact, and avoiding the extraneous and bogus commits.

I used https://cgit-beta.freebsd.org/src.git at
4c09cab462f2d27794d51a4dcb06df806dc9f3a6, and performed the following
steps:

% git log contrib/elftoolchain

1. Observe that the initial commit under that prefix is:

commit f4b5186d24f3e1969c32a49f126c8ad29ece63e9
Merge: 8b06418614b0 5265ace0e440
Author: Kai Wang
Date: Wed Jan 15 22:30:48 2014 +0000

    Copy libelf, libdwarf and common files from vendor/ to contrib/.

2. Inspect the two merge parents and determine that the second
(5265ace0e440) is the initial elftoolchain import:

commit 5265ace0e440a23fb522c516f4ee20f43eaed2b3
(origin/vendor/elftoolchain/elftoolchain-r2974)
Author: Kai Wang
Date: Wed Jan 15 08:43:20 2014 +0000

    Initial import of elftoolchain r2974.

    Obtained from: elftoolchain.org

3. Split the subtree, using patched git:

~/src/git/contrib/subtree/git-subtree.sh split \
    --prefix=contrib/elftoolchain \
    --onto=5265ace0e440a23fb522c516f4ee20f43eaed2b3

The "onto" revision is the initial hash in the subtree.
With the patched git subtree this is needed, because we don't have
metadata about the subtree (as would be created if `git subtree add`
was used to initially create the subtree).

After some time the hash of the split tree is printed:
5f7e5a9dbe67c06fb6baf08f26745e537fcfe9dd

At this point I can do what I like with the subtree. I could create a
branch for it, or push it to a GitHub repo:

% git push github-elftoolchain \
    5f7e5a9dbe67c06fb6baf08f26745e537fcfe9dd:refs/heads/split-from-cgit-beta

Which is now available here:
https://github.com/emaste/elftoolchain/tree/split-from-cgit-beta

From: Ed Maste
Date: Wed, 17 Jun 2020 09:36:21 -0400
Subject: Re: Next odd commit affecting `git subtree split` experiments with contrib/elftoolchain
To: freebsd-git@freebsd.org

On Tue, 16 Jun 2020 at 22:25, Ed Maste wrote:
>
> I've now successfully used Tom Clarkson's patched git subtree[1] to
> split elftoolchain out of cgit-beta, keeping the vendor branch history
> intact, and avoiding the extraneous and bogus commits.

I've done a cursory comparison of the history in this converted +
split repo against contrib/elftoolchain in svn, and it is broadly as
expected. There are some commits that are shown only in the git
conversion, and some that appear only in svn.

The git history includes the vendor branch commits, which do not
appear in svn log of the subdirectory. `git log --graph` makes the
relationship between the various branches clear.

Subversion history includes a number of commits that affect mergeinfo
only, such as r353358[1]. These are completely absent in the git
conversion.

As expected, the contents of the files in the git conversion and of
contrib/elftoolchain are identical.
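The content comparison above was made against svn. A related check that stays entirely within git is to compare tree object IDs: the root tree of the split commit should be the same object as the tree stored at the prefix in the mainline commit the split was taken from. The following is only a sketch under that assumption; the function names `same_tree` and `compare_ids` are hypothetical and not part of git or of the thread.

```shell
# Print "identical" if two object IDs match, "different" otherwise.
compare_ids() {
    if [ "$1" = "$2" ]; then
        echo identical
    else
        echo different
    fi
}

# Compare the root tree of a split commit with the tree stored at a
# prefix in a mainline commit.
# Arguments: <split-commit> <mainline-commit> <prefix>
# Returns 2 if either object cannot be resolved.
same_tree() {
    split_tree=$(git rev-parse --verify --quiet "$1^{tree}") || return 2
    main_tree=$(git rev-parse --verify --quiet "$2:$3") || return 2
    compare_ids "$split_tree" "$main_tree"
}
```

Run from the mainline clone, something like `same_tree 5f7e5a9dbe67c06fb6baf08f26745e537fcfe9dd 4c09cab462f2d27794d51a4dcb06df806dc9f3a6 contrib/elftoolchain` would be expected to print `identical` if the split was taken from that mainline commit. This does not replace the comparison against svn; it only confirms that the split did not alter file contents.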
[1] https://svnweb.freebsd.org/base?limit_changes=0&view=revision&revision=353358

From: Ed Maste
Date: Wed, 17 Jun 2020 09:37:43 -0400
Subject: Re: Next odd commit affecting `git subtree split` experiments with contrib/elftoolchain
To: freebsd-git@freebsd.org

On Tue, 16 Jun 2020 at 22:25, Ed Maste wrote:
>
> I've now successfully used Tom Clarkson's patched git subtree[1] to
> split elftoolchain out of cgit-beta, keeping the vendor branch history
> intact, and avoiding the extraneous and bogus commits.
I accidentally omitted the footnote, which should be:
[1] https://github.com/gitgitgadget/git/pull/493

From: Ulrich Spörlein
Date: Wed, 17 Jun 2020 17:59:51 +0200
Subject: Re: Next odd commit affecting `git subtree split` experiments with contrib/elftoolchain
To: Ed Maste
Cc: freebsd-git@freebsd.org

On Wed, Jun 17, 2020 at 4:25 AM Ed Maste wrote:

> On Tue, 16 Jun 2020 at 14:41, Ed Maste wrote:
> >
> > I've found one new class of strange commit - r265044, which I've
> > excluded from mergeinfo parsing (in the above list). However, svnweb
> > highlights an issue:
>

I've built a repo w/o the MFHead commits going into project branches.
That is, projects will merge into head, but head will *not* merge into
projects.

Running the subtree split, I get a history with about 437 commits. I
see in your
https://github.com/emaste/elftoolchain/tree/split-from-cgit-beta that
you only end up with 277 commits (if that display is to be trusted).
For this repo, it now looks like so (r265044 is from 2014-04-28)

* | | | | | | | 52046640c377 - Merge from head (Baptiste Daroussin, 4 years, 9 months ago, 2015-10-01)
* | | | | | | | 7d76ac023498 - Finish merging from head, messed up in previous attempt (Baptiste Daroussin, 4 years, 9 months ago, 2015-09-12)
* | | | | | | | 184614fff731 - Merge from head (Baptiste Daroussin, 4 years, 9 months ago, 2015-09-12)
* | | | | | | | 06d2968292d4 - Merge from head@274131 (Baptiste Daroussin, 5 years ago, 2015-06-16)
|\| | | | | | |
| * | | | | | | a8f362c5fe90 - Add META_MODE support. (Simon J. Gerraty, 5 years ago, 2015-06-13)
| |\ \ \ \ \ \ \
| | * | | | | | | 670c2113728e - Merge sync of head (Simon J. Gerraty, 5 years ago, 2015-05-27)
| | * | | | | | | 045693fbc721 - Merge from head@274682 (Simon J. Gerraty, 6 years ago, 2014-11-19)
| | * | | | | | | d3282ab267a8 - Merge head (Simon J. Gerraty, 6 years ago, 2014-04-28)
| | / / / / / /
| * | | / / / / 0233f02a35bc - elfcopy: Handle objects without a ".shstrtab" section string table (Ed Maste, 5 years ago, 2015-06-13)
| | |_|/ / / /
| |/| | | | |
| * | | | | | 01a7e6868f7a - Update to ELF Tool Chain r3223 (Ed Maste, 5 years ago, 2015-05-27)
| | |/ / / /
| |/| | | |
| * | | | | 356943f64b09 - Update to ELF Tool Chain r3197 (Ed Maste, 5 years ago, 2015-05-14)
| * | | | | 2dae10c39647 - Merge ^/projects/release-arm-redux into ^/head. (Glen Barber, 5 years ago, 2015-05-09)
| |\ \ \ \ \
| | * | | | | bb93d386e833 - MFH: r280643-r281852 (Glen Barber, 5 years ago, 2015-04-22)
| | * | | | | fad1f163a320 - MFH: r278968-r280640 (Glen Barber, 5 years ago, 2015-03-25)
| | * | | | | d2f00e64668a - MFH: r278593-r278966 (Glen Barber, 5 years ago, 2015-02-18)
| | * | | | | 0a8b1fb8842b - MFH: r278202,r278205-r278590 (Glen Barber, 5 years ago, 2015-02-11)
| | | |/ / /
| | |/| | |
| * | | | | cd6104d28a5c - Update elftoolchain to upstream revision 3179 (Ed Maste, 5 years ago, 2015-04-01)
| | |_|_|/
| |/| | |
| * | | | 14a1bbced50b - Merge ^/head r279163 through r279308. (Dimitry Andric, 5 years ago, 2015-02-26)

So this is clearly not ideal yet. All MFH are there as cherrypicks and
the merges to head from unrelated branches still get pulled in. A
potential post-processing step could be to squash subsequent commits
with the same tree hash. Here, the 2nd hash is the tree hash:

* 9b32ed0c66ef 74ab27147825 - elftoolchain nm(1): Initialize allocated memory before use (Conrad Meyer, 2 years, 3 months ago, 2018-03-16)
* 5c976b8c4811 1d39238f5607 - Merge ^/head r327886 through r327930. (Dimitry Andric, 2 years, 5 months ago, 2018-01-13)
|\
| * 344ba1731347 1d39238f5607 - elfcopy: copy raw (untranslated) contents to binary output (Ed Maste, 2 years, 5 months ago, 2018-01-02)
* | 3714d762a9e0 1d39238f5607 - Merge ^/head r327341 through r327623. (Dimitry Andric, 2 years, 5 months ago, 2018-01-06)
|/
* db9ffaaa5457 c3635117b95c - readelf: report byte size for DT_PREINIT_ARRAYSZ (Ed Maste, 2 years, 6 months ago, 2017-12-26)
* a616bc1dbec6 c1e10d6f9540 - Merge ^/head r326132 through r326161. (Hans Petter Selasky, 2 years, 7 months ago, 2017-11-24)
|\
| * 899122e1d197 c1e10d6f9540 - MFhead@r323646 (Enji Cooper, 2 years, 9 months ago, 2017-09-16)
| |\
| | * 09fe4662fc65 c1e10d6f9540 - Add missing newline after unknown MIPS-specific dynamic entries. (John Baldwin, 2 years, 9 months ago, 2017-09-15)
| | * a27ad9741c63 2bdfff10c1e6 - Recognize NT_PTLWPINFO and NT_ARM_VFP in FreeBSD ELF cores. (John Baldwin, 2 years, 9 months ago, 2017-09-14)
| * | 1e46d0a611f7 c1e10d6f9540 - MFhead@r323635 (Enji Cooper, 2 years, 9 months ago, 2017-09-16)
| * | 8caa235148b3 2200cd80945d - MFhead@r322515 (Enji Cooper, 2 years, 10 months ago, 2017-08-14)
| |\|
| | * 81c051e299b3 2200cd80945d - o Replace __riscv__ with __riscv o Replace __riscv64 with (__riscv && __riscv_xlen == 64) (Ruslan Bukin, 2 years, 10 months ago, 2017-08-07)
| * | da88e12ed370 2200cd80945d - MFhead@r322451 (Enji Cooper, 2 years, 10 months ago, 2017-08-13)
| * | 48461537238e b6e4a7671e42 - MFhead@r321431 (Enji Cooper, 2 years, 11 months ago, 2017-07-24)
| |\|

I'm not sure whether it would be straightforward to squash the right
commits and keep the ones with the proper commit message. Your repo
still has a few MFH commits that one might want to remove. Using
`git filter-repo` might do the trick ...

> I've now successfully used Tom Clarkson's patched git subtree[1] to
> split elftoolchain out of cgit-beta, keeping the vendor branch history
> intact, and avoiding the extraneous and bogus commits.
>
> I used https://cgit-beta.freebsd.org/src.git at
> 4c09cab462f2d27794d51a4dcb06df806dc9f3a6, and performed the following
> steps:
>
> % git log contrib/elftoolchain
>
> 1. Observe that the initial commit under that prefix is:
>
> commit f4b5186d24f3e1969c32a49f126c8ad29ece63e9
> Merge: 8b06418614b0 5265ace0e440
> Author: Kai Wang
> Date: Wed Jan 15 22:30:48 2014 +0000
>
>     Copy libelf, libdwarf and common files from vendor/ to contrib/.
>

Just to make sure, you know that you can get this like so:

% git log --reverse --format=%h master -- contrib/elftoolchain/ | head -1
37429c2aa7e7

(not sure why using -n1 instead of head(1) will result in the latest, not the oldest.
Seems that it ignores --reverse)

Would be good if you could run a script against all contrib prefixes
and later count the number of commits that a contrib-tree produces to
see if something weird happens.

>
> 2. Inspect the two merge parents and determine that the second
> (5265ace0e440) is the initial elftoolchain import:
>
> commit 5265ace0e440a23fb522c516f4ee20f43eaed2b3
> (origin/vendor/elftoolchain/elftoolchain-r2974)
> Author: Kai Wang
> Date: Wed Jan 15 08:43:20 2014 +0000
>
>     Initial import of elftoolchain r2974.
>
>     Obtained from: elftoolchain.org
>
>

You can test whether either parent is reachable from
vendor/elftoolchain/dist, or look at their notes:

% git log -n1 --format=%P 37429c2aa7e7 | xargs -n1 -I@ git log -n1 --format="%h %N" @
8a7f75c8fcc5 svn path=/head/; revision=260666

5265ace0e440 svn path=/vendor/elftoolchain/dist/; revision=260684
svn path=/vendor/elftoolchain/elftoolchain-r2974/; revision=260685; tag=vendor/elftoolchain/elftoolchain-r2974

> 3. Split the subtree, using patched git:
>
> ~/src/git/contrib/subtree/git-subtree.sh split \
>     --prefix=contrib/elftoolchain \
>     --onto=5265ace0e440a23fb522c516f4ee20f43eaed2b3
>
> The "onto" revision is the initial hash in the subtree. With the
> patched git subtree this is needed, because we don't have metadata
> about the subtree (as would be created if `git subtree add` was used
> to initially create the subtree).
>
> After some time the hash of the split tree is printed:
> 5f7e5a9dbe67c06fb6baf08f26745e537fcfe9dd
>
> At this point I can do what I like with the subtree. I could create a
> branch for it, or push it to a GitHub repo:
>
> % git push github-elftoolchain \
>     5f7e5a9dbe67c06fb6baf08f26745e537fcfe9dd:refs/heads/split-from-cgit-beta
>
> Which is now available here:
> https://github.com/emaste/elftoolchain/tree/split-from-cgit-beta

For my own understanding, all the issues around subtree splitting are
actually not blocking the conversion in any way, right?
All they do is make life miserable for contrib-software maintainers,
and they might delay new code drops under contrib/, yes?

Cheers
Uli

From: Ed Maste
Date: Wed, 17 Jun 2020 12:46:57 -0400
Subject: Re: Next odd commit affecting `git subtree split` experiments with contrib/elftoolchain
To: Ulrich Spörlein
Cc: freebsd-git@freebsd.org

On Wed, 17 Jun 2020 at 12:00, Ulrich Spörlein wrote:
>
> Running the subtree split, I get a history with about 437 commits. I see in your https://github.com/emaste/elftoolchain/tree/split-from-cgit-beta that you only end up with 277 commits (if that display is to be trusted).

Are you using unmodified subtree split, from git port/pkg? The patch
set from Tom Clarkson improves the detection of mainline vs subtree
significantly. In the existing cgit-beta (without the MFH changes you
discussed here) unmodified subtree split produces a subtree with
tens/hundreds of thousands of commits, because a mainline commit
"leaks" into the subtree via a merge. The patched git subtree is what
I used for the split elftoolchain that I shared.

> I'm not sure whether it would be straightforward to squash the right commits and keep
> the ones with the proper commit message. Your repo still has a few MFH commits that
> one might want to remove. Using git `filter-repo` might do the trick ...
Indeed, although I'm not particularly concerned if there are a few stray MFH commits - it's a little bit of clutter but accurately represents what happened in that subtree in the svn world.

> Just to make sure, you know that you can get this like so:
> % git log --reverse --format=%h master -- contrib/elftoolchain/ | head -1
> 37429c2aa7e7

For the email I sent I just reviewed all of the contrib/elftoolchain history anyway, and looked at the last commit. Thanks for this though; I suspect that if we try automating this we could add --merges.

> (not sure why using -n1 instead of head(1) will result in the latest, not the oldest. Seems that it ignores --reverse)

Indeed, this looks like a git bug.

> Would be good if you could run a script against all contrib prefixes and later count the number of commits that a contrib-tree produces to see if something weird happens.

You mean try running `git subtree split` on each contrib prefix, and checking that the number of commits in each generated tree is sensible? For example, inspect any subtree with over say 500 commits?

As a first pass for identifying contrib prefixes I tried:

ls -1d contrib/* sys/contrib/* crypto/* sys/crypto/* cddl/* sys/cddl/* sys/gnu/*

sys/crypto/ and the cddl ones aren't quite right, and I still need to check for additional hierarchy (e.g., if we have cases like contrib/netbsd/blocklist instead of contrib/blocklist)

> You can test both parents whether they are reachable from vendor/elftoolchain/dist,

I'm hoping to find an algorithm that could be made general and submitted upstream, so that we could have something like git subtree split --initial --prefix=contrib/elftoolchain, and have the --initial calculate the --onto revision automatically. If we produce some bespoke tooling for FreeBSD though this branch name approach should work, but I think we'd have to have a map of contrib directory to vendor branch. I believe that some are not the same in contrib and vendor.
> or look at their notes:
>
> % git log -n1 --format=%P 37429c2aa7e7 | xargs -n1 -I@ git log -n1 --format="%h %N" @
> 8a7f75c8fcc5 svn path=/head/; revision=260666
>
> 5265ace0e440 svn path=/vendor/elftoolchain/dist/; revision=260684
> svn path=/vendor/elftoolchain/elftoolchain-r2974/; revision=260685; tag=vendor/elftoolchain/elftoolchain-r2974

This seems like a simpler, workable approach for our tree - anything with a note containing "svn path=/vendor" is a subtree commit.

> For my own understanding, all the issues around subtree splitting are actually not blocking the conversion in any way, right? All they do is make the lives miserable for contrib-software maintainers and they might delay new code drops under contrib/ yes?

It depends on your definition of "blocking" I think, but your statement is generally true - we could use the existing cgit-beta conversion, build releases from it, etc. In the current form, with unpatched git-subtree, the bootstrap process will be quite awkward for contrib software maintainers though.

I think we have three ways we can address this:

1. Change the svn2git process so that we don't trip over unpatched git-subtree's issue with mainline history leaking into the subtree.
2. Get Tom Clarkson's git-subtree patches into upstream git, or require that contrib maintainers use our own patched git until that happens.
3. Develop and use an alternate subtree splitter.

I suppose there is also
4. Reconsider git subtree altogether (e.g. submodules).
but I think there's little appetite for this.

At this point I think that option 2 is the most straightforward, and I'm now reasonably confident that it will work as we want. With this being the case I'd say we should focus on tuning svn2git to produce "sensible" output without regard to how unpatched git-subtree handles the output. That is, I'd say I'm broadly happy with the state of conversion in cgit-beta today.
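The note-based test mentioned earlier in this message (a note containing "svn path=/vendor" marks a subtree commit) could be sketched roughly as follows. This is only an illustration: the function names `is_vendor_note` and `classify_commit` are made up here, and it assumes the converted repository exposes the svn metadata through `git log --format=%N`, as in the quoted command.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the note text on stdin contains
# "svn path=/vendor", i.e. the commit came from a vendor branch.
is_vendor_note() {
    grep -q 'svn path=/vendor'
}

# Hypothetical helper: print "subtree" or "mainline" for one commit,
# based on its note (assumes notes are visible to git log via %N).
classify_commit() {
    if git log -n1 --format=%N "$1" | is_vendor_note; then
        echo subtree
    else
        echo mainline
    fi
}

# Possible use, classifying both parents of a merge:
#   for p in $(git log -n1 --format=%P 37429c2aa7e7); do
#       printf '%s %s\n' "$p" "$(classify_commit "$p")"
#   done
```

Keeping the text match separate from the git call means the rule itself can be checked against sample notes without a repository.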
From owner-freebsd-git@freebsd.org Thu Jun 18 08:44:39 2020
From: Ulrich Spörlein
Date: Thu, 18 Jun 2020 10:44:25 +0200
Subject: Re: Next odd commit affecting `git subtree split` experiments with contrib/elftoolchain
To: Ed Maste
Cc: freebsd-git@freebsd.org

On Wed, Jun 17, 2020 at 6:47 PM Ed Maste wrote:

> On Wed, 17 Jun 2020 at 12:00, Ulrich Spörlein wrote:
> >
> > Running the subtree split, I get a history with about 437 commits. I see in your https://github.com/emaste/elftoolchain/tree/split-from-cgit-beta that you only end up with 277 commits (if that display is to be trusted).
>
> Are you using unmodified subtree split, from git port/pkg? The patch set from Tom Clarkson improves the detection of mainline vs subtree significantly. In the existing cgit-beta (without the MFH changes you discussed here) it produces a subtree with tens/hundreds of thousands of commits, because a mainline commit "leaks" into the subtree via a merge. The patched git subtree is what I used for the split elftoolchain that I shared.

Yes, this is using plain git subtree w/o patches, but it's on a repo that has no MFH head → project merges. As I wrote, it comes out with ~400 commits, while a patched subtree split will only produce about 280, so going with the patched subtree seems more sensible.

> > I'm not sure whether it would be straightforward to squash the right commits and keep the ones with the proper commit message.
> > Your repo still has a few MFH commits that one might want to remove. Using git `filter-repo` might do the trick ...
>
> Indeed, although I'm not particularly concerned if there are a few stray MFH commits - it's a little bit of clutter but accurately represents what happened in that subtree in the svn world.

I tried filter-repo on your elftoolchain history, and it does nothing :( There are still many tree objects that are identical between commits, but they are usually merges, so it's not a simple "empty" commit that we can toss out.

However, I wonder if we cannot flatten the history to be a single line instead of the merges. Do merge commits make any sense for the resulting repo? If we could make it all linear, then the "empty" commits would stand out better and filter-repo could toss them away.

Here's what I mean:

* | e7db0cf7 0a4a2d68 - Add META_MODE support.
|\ \
| * \ 3cc0c3e8 935facd3 - Merge sync of head
| |\ \
| * | | a16e339e d637a411 - Merge from head
| |\| |
* | | | eb3e8834 0a4a2d68 - elfcopy: Handle objects without a ".shstrtab" section string table
...
| |/
|/|
* | b5199d77 d637a411 - Fix the conversion macro for .note sections, broken in the cas

There's a long "branch" that was created from b5199d77, then sees 2 commits (with different trees, mind you) a16e339e and 3cc0c3e8. But once it is merged back into main with e7db0cf7, the resulting tree is 0a4a2d68, which is just the same as pre-merge (in commit eb3e8834). So the whole exercise of merging in the off-shoot branch doesn't alter the tree at all.

The algorithm would be simple:
- for all merge commits, check their tree vs. the tree of all parents.
  IFF one of the parents has the same tree as the merge commit → remove the merge commit

The last step is actually tricky, as you need to update all the children of the merge commit with the new parent. You need to check all revisions for that (in your repo, there are actually 2!)
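The detection half of this algorithm (not the rewriting of children) can be sketched in shell. This is an illustrative sketch, not a tested tool: `merge_trees` and `redundant_merges` are hypothetical names, and the intermediate format "commit tree parent-tree..." is an assumption made for this example.

```shell
#!/bin/sh
# Hypothetical helper: for every merge commit reachable from "$1", print
#   <commit> <tree> <tree of parent 1> <tree of parent 2> ...
merge_trees() {
    git rev-list --merges "$1" | while read -r c; do
        printf '%s %s' "$c" "$(git rev-parse "$c^{tree}")"
        for p in $(git log -n1 --format=%P "$c"); do
            printf ' %s' "$(git rev-parse "$p^{tree}")"
        done
        printf '\n'
    done
}

# Hypothetical helper: read lines in the format above and print the merge
# commits whose tree is identical to the tree of at least one parent.
# The trailing "" forces string comparison, so all-digit or "1e5"-style
# hashes are never compared as numbers.
redundant_merges() {
    awk '{
        for (i = 3; i <= NF; i++)
            if (($i "") == ($2 "")) { print $1; break }
    }'
}

# Possible use on the split branch:
#   merge_trees HEAD | redundant_merges
```

Each commit printed is a candidate for `git replace --graft <merge> <parent with the same tree>`; choosing that parent and fixing up the children is left out of the sketch.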
Doing it manually with a graft:

% git replace --graft e7db0cf7 eb3e8834

results in

* | e7db0cf7 0a4a2d68 - (replaced) Add META_MODE support.
* | eb3e8834 0a4a2d68 - elfcopy: Handle objects without a ".shstrtab" section string table
* | 666a6d09 7d6c3ca1 - Update to ELF Tool Chain r3223

This looks useful, I guess.

There is a different case of history also, where a branch is merged in multiple times, but the branch never had a commit so far!

* | ab49ac72 2fcd31e1 - Copy elftoolchain readelf from vendor branch
|\|
* | b5f0dac9 00d9d606 - Correct elftoolchain strip(1) memory size calculation
* | daa082dd 57cf3e02 - libelf: Fix cross-endian ELF note file / memory conversion
* | c7ce7bba 41742fa1 - Track libarchive API change
* | 1481386a 62292f3c - Temporarily disable non-FreeBSD NT_ note types
* | 2b9d3d05 d080f9c9 - Fix elftoolchain tools in-tree build
* | f4b35757 b87618f0 - Copy elftoolchain binutils replacements from vendor branch
|\|
* | b5199d77 d637a411 - Fix the conversion macro for .note sections, broken in ...
* | eda82609 cffe5c27 - GCC for PowerPC does not align .note secti...
* | 0501d6e9 b7163de0 - Reapply r221569, r233401, r233524 and r255105: Add support ...
* | 28d52963 195f7d2b - Remove trailing whitespace.
* | 3342d176 e298487d - * Allow API dwarf_loclist_n() and dwarf_loclist() to be called with ...
* | 297fc330 e3796a34 - Add a sanity check: The provided offset for the desired location ...
* | 8bd68bad 4613fb47 - API dwarf_attrval_flag() should properly handle an attribute ...
* | 039bcdce 651ee9b1 - Fix typo: the public API dwarf_child() should return ...
* | 605443d0 c431e1b8 - Fix a warning in libdwarf found by -Wmissing-variable
* | 6458e5be f938195c - Apply r241720 by ed:
* | fdaacf72 322e4f0d - Use FreeBSD's ELF headers instead of the elfdefinitions.h ...
* | afb9ed40 bcc80b4a - Copy libelf, libdwarf and common files from vendor/ to contrib/.
|/
* 5265ace0 13068447 - Initial import of elftoolchain r2974.
The branch on the right is merged in f4b35757 and ab49ac72, but it never actually had a commit itself till then. I don't know how to detect that, but let's graft it away anyway.

% git replace --graft f4b35757 b5199d77
% git replace --graft ab49ac72 b5f0dac9
% git filter-repo --debug --prune-empty=always --prune-degenerate=always --force

This results in at most (!) 3 branches active at the same time, and that only for 1 commit anyway. All MFHs are gone. The graph output has the branches swapped, so a diff(1) is useless, you need to manually look at them side-by-side (I hope this will be readable, the hashes are tree-hashes, left is old, right is after 3 grafts and filter-repo)

* | f5beebdd Update to ELF Tool Chain r3|* | f5beebdd Update to ELF Tool Chain r34
|\|                                     ||\|
| * 5d850df4 Import ELF Tool Chain snaps|| * 5d850df4 Import ELF Tool Chain snapsh
| * 4faefe71 Import ELF Tool Chain snaps|| * 4faefe71 Import ELF Tool Chain snapsh
* | ce92a093 elfcopy: map all !alnum cha|* | ce92a093 elfcopy: map all !alnum char
* | e1c6f66f MFH                        |* | e1c6f66f elfcopy: fix symbol table ha
|\ \                                    |* | fba12860 elfcopy: overhaul of LMA han
| * | e1c6f66f elfcopy: fix symbol table|* | b6574d32 libelf: correct byte count i
* | | fba12860 MFH                      |* | ecdda97c libdwarf: fix SHT_REL reloca
|\| |                                   |* | f3d41e3c elfcopy: fail if debug link
| * | fba12860 elfcopy: overhaul of LMA |* | 6c9c0b57 Allow elfcopy to convert bet
* | | b6574d32 MFH                      |* | a6ca6b74 Update ELF Tool Chain to ups
|\| |                                   ||\|
| * | b6574d32 libelf: correct byte coun|| * 3de9500b Import ELF Tool Chain snapsh
| * | ecdda97c libdwarf: fix SHT_REL rel|* | 848f4fb5 readelf: decode AArch64 TLS
* | | f3d41e3c MFH                      |* | bc4cfa9f readelf: report value of unk
|\| |                                   |* | 75cc50b8 readelf: avoid accidental fa
| * | f3d41e3c elfcopy: fail if debug li|* | 641e5321 Add config for RISC-V ISA.
* | | 6c9c0b57 MFH                      |* | f4bd9e9f Fixed uninitialized variable
|\| |                                   |* | 47b58c30 Update to ELF Tool Chain r32
| * | 6c9c0b57 Allow elfcopy to convert ||\|
* | | a6ca6b74 MFH                      || * fa4048de Import ELF Tool Chain snapsh
|\| |                                   |* | eff30dab elfcopy: include extension b
| * | a6ca6b74 Update ELF Tool Chain to |* | 49fea608 elfcopy: exclude extension w
| |\|                                   |* | e2028f7d readelf: add Xen ELF notes
| | * 3de9500b Import ELF Tool Chain sna|* | e1809d24 Add missing commas
* | | 848f4fb5 MFH                      |* | 3d703d04 Add definitions for MIPS TLS
|\| |                                   |* | 3f2daddd addr2line: initialize die to
| * | 848f4fb5 readelf: decode AArch64 T|* | 1314d2c4 Update to ELF Tool Chain r32
| * | bc4cfa9f readelf: report value of ||\|
| * | 75cc50b8 readelf: avoid accidental|| * 3204d66e Import ELF Tool Chain snapsh
* | | 641e5321 MFH                      |* | f56b32a7 Rename ELFOSABI_SYSV to ELFO
|\| |                                   |* | f1fa8947 readelf: Correct typo HPUS -
| * | 641e5321 Add config for RISC-V ISA|* | 5cdc13da addr2line: skip CUs lacking

> > Would be good if you could run a script against all contrib prefixes and later count the number of commits that a contrib-tree produces to see if something weird happens.
>
> You mean try running `git subtree split` on each contrib prefix, and checking that the number of commits in each generated tree is sensible? For example, inspect any subtree with over say 500 commits?
>
> As a first pass for identifying contrib prefixes I tried:
>
> ls -1d contrib/* sys/contrib/* crypto/* sys/crypto/* cddl/* sys/cddl/* sys/gnu/*
>
> sys/crypto/ and the cddl ones aren't quite right, and I still need to check for additional hierarchy (e.g., if we have cases like contrib/netbsd/blocklist instead of contrib/blocklist)

Yes, we need to check all of them and it might need a manual mapping (or at least a handful of exceptions).
> > You can test both parents whether they are reachable from vendor/elftoolchain/dist,
>
> I'm hoping to find an algorithm that could be made general and submitted upstream, so that we could have something like git subtree split --initial --prefix=contrib/elftoolchain, and have the --initial calculate the --onto revision automatically. If we produce some bespoke tooling for FreeBSD though this branch name approach should work, but I think we'd have to have a map of contrib directory to vendor branch. I believe that some are not the same in contrib and vendor.

It might really be too bespoke for FreeBSD and we only need to do all of this once, yes?

> I think we have three ways we can address this:
>
> 1. Change the svn2git process so that we don't trip over unpatched git-subtree's issue with mainline history leaking into the subtree.
> 2. Get Tom Clarkson's git-subtree patches into upstream git, or require that contrib maintainers use our own patched git until that happens.
> 3. Develop and use an alternate subtree splitter.
>
> I suppose there is also
> 4. Reconsider git subtree altogether (e.g. submodules).
> but I think there's little appetite for this.
>
> At this point I think that option 2 is the most straightforward, and I'm now reasonably confident that it will work as we want. With this being the case I'd say we should focus on tuning svn2git to produce "sensible" output without regard to how unpatched git-subtree handles the output. That is, I'd say I'm broadly happy with the state of conversion in cgit-beta today.

2 would be my preference as well. So someone will need to run subtree split for all contrib software and let me know if there are blockers that the svn2git conversion could help out with (or we need to help out all subtree splits with a dozen carefully placed grafts and a rewrite. Doesn't sound too bad?)
Cheers
Uli

From owner-freebsd-git@freebsd.org Thu Jun 18 17:39:27 2020
From: Ed Maste
Date: Thu, 18 Jun 2020 13:39:12 -0400
Subject: Re: Next odd commit affecting `git subtree split` experiments with contrib/elftoolchain
To: Ulrich Spörlein
Cc: freebsd-git@freebsd.org

On Thu, 18 Jun 2020 at 04:44, Ulrich Spörlein wrote:
>
> Yes, this is using plain git subtree w/o patches, but it's on a repo that has no MFH head → project merges. As I wrote, it comes out with ~400 commits, while a patched subtree split will only produce about 280, so going with the patched subtree seems more sensible.

Indeed, but it's good to know that option 1 is also workable - this gives me more confidence in our ability to have a final version of the conversion within weeks/months.

> However, I wonder if we cannot flatten the history to be a single line instead of the merges. Do merge commits make any sense for the resulting repo? If we could make it all linear, then the "empty" commits would stand out better and filter-repo could toss them away.

We can't make it entirely linear, because we do want to continue to represent upstream updates that went via the vendor branch happening concurrently with FreeBSD-local changes in contrib/, to support future updates.

> The algorithm would be simple:
> - for all merge commits, check their tree vs. the tree of all parents.
> IFF one of the parents has the same tree as the merge commit
> → remove the merge commit

I think this would work, but is not worth the effort, because this is an issue only for those maintaining some contrib/ software, and as long as there are a "reasonable" number of these extraneous commits they're easy to just ignore.

> It might really be too bespoke for FreeBSD and we only need to do all of this once, yes?

Yes, I expect that we'll have to do this once (for each piece of contrib software) to bootstrap. I have seen other examples of projects with subtree-style updates not using git subtree though, so if there is a sufficiently general version it's worth upstreaming.

From git subtree's help:

| If your subtree was originally imported using something other than
| git subtree, its history may not match what git subtree is
| expecting. In that case, you can specify the commit id that
| corresponds to the first revision of the subproject's history that
| was imported into your project, and git subtree will attempt to
| build its history from there.

However, having something upstreamable is not a requirement - I'm fine with a bespoke, even hacky solution, as we'll only need to use it once. (Option 2 is the patched git-subtree)

> 2 would be my preference as well. So someone will need to run subtree split for all contrib software and let me know if there are blockers that the svn2git conversion could help out with (or we need to help out all subtree splits with a dozen carefully placed grafts and a rewrite. Doesn't sound too bad?)

I will try a few more contrib/ subtree splits individually, and later try running it on every one. Once we have appropriate documentation individual maintainers can test their subtrees.
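Running the split on every prefix and flagging suspicious results, as proposed earlier in the thread, might look something like the sketch below. The names `count_split_commits` and `flag_large` are hypothetical, the default limit of 500 is the "say 500" figure floated above rather than a tuned value, and the prefix list still needs the manual exceptions discussed.

```shell
#!/bin/sh
# Hypothetical helper: for each prefix given as an argument, run the split
# and print "<prefix> <number of commits in the split history>".
# git subtree split prints the id of the resulting commit on stdout.
count_split_commits() {
    for prefix in "$@"; do
        rev=$(git subtree split --prefix="$prefix") || continue
        printf '%s %s\n' "$prefix" "$(git rev-list --count "$rev")"
    done
}

# Hypothetical helper: read "<prefix> <count>" lines and print the prefixes
# whose count exceeds the limit (first argument, default 500).
flag_large() {
    awk -v limit="${1:-500}" '($2 + 0) > (limit + 0) { print $1 }'
}

# Possible use, from the top of the converted tree:
#   count_split_commits contrib/* sys/contrib/* | flag_large 500
```

Anything this prints would be a candidate for manual inspection, for example a split where mainline history has leaked into the subtree.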