Date:      Thu, 4 May 2023 22:47:44 +0100
From:      Kaya Saman <kayasaman@optiplex-networks.com>
To:        Paul Procacci <pprocacci@gmail.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: Tool to compare directories and delete duplicate files from one directory
Message-ID:  <344b29c6-3d69-543d-678d-c2433dbf7152@optiplex-networks.com>
In-Reply-To: <CAFbbPugfhXGPfscKpx6B0ue=DcF_qssL6P-0GgB1CWKtm3U=tQ@mail.gmail.com>
References:  <9887a438-95e7-87cc-a162-4ad7a70d744f@optiplex-networks.com> <CAFbbPugfhXGPfscKpx6B0ue=DcF_qssL6P-0GgB1CWKtm3U=tQ@mail.gmail.com>


On 5/4/23 17:29, Paul Procacci wrote:
>
>
> On Thu, May 4, 2023 at 11:53 AM Kaya Saman
> <kayasaman@optiplex-networks.com> wrote:
>
>     Hi,
>
>
>     I'm wondering if anyone knows of a tool like diff or so that can also
>     delete files based on name and size from either left/right or
>     source/destination directory?
>
>
>     Basically what I have done is performed an rsync without using the
>     --remove-source-files option onto a newly bought and created disk
>     pool (yes, a zpool) where I am trying to consolidate my data, as it's
>     currently spread out over multiple pools with the same folder name.
>
>
>     The issue I am facing mainly is that if I perform another rsync and
>     use the --remove-source-files option, rsync will delete files based
>     on name, while there are some files that have the same name but not
>     the same size, and I would like to retain these files.
>
>
>     Right now I have looked at many different options in both rsync and
>     other tools but found nothing suitable. I even tested using a few
>     test dirs and files that I put into /tmp and whatever I tried, the
>     files of different size either got transferred or deleted.
>
>
>     What would be a good way to approach this problem?
>
>
>     Even if I create some kind of shell script and use diff, I think it
>     will only compare names and not file sizes.
>
>
>     I'm really lost here....
>
>
>     Regards,
>
>
>     Kaya
>
>
>
>
> It sounds like you want fdupes.  It's in the ports tree.
>
> ~Paul
>
> -- 
> __________________
>
> :(){ :|:& };:



I installed fdupes a while back and tried it. To me it felt like it only
works within a single directory.


My directory structure is:


/dir <- main directory where everything has now been rsync'ed to

/dir_1 <- old directory with partial content

/dir_2 <- more partial content

/dir_3 <- more partial content


The key thing here is that I need to compare:


/dir_(x) with /dir


If a file in /dir_(x) has a different size from the file of the same name
in /dir, leave it alone; if both the name and the file size are the same,
delete it from /dir_(x).
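
To make the rule above concrete, here is a rough sketch in Python of how it could be scripted. It is not a tested tool, only an illustration under some assumptions: the function names find_removable and prune are made up for this example, "same name" is taken to mean the same path relative to the top of each tree, and symlinks are skipped. It defaults to a dry run that deletes nothing.

```python
import os


def find_removable(old_root, main_root):
    """Yield files under old_root that also exist under main_root with the
    same relative path and the same size (hypothetical helper)."""
    for dirpath, _dirnames, filenames in os.walk(old_root):
        for name in filenames:
            old_path = os.path.join(dirpath, name)
            rel = os.path.relpath(old_path, old_root)
            main_path = os.path.join(main_root, rel)
            # Assumption: only regular files are considered; symlinks are skipped.
            if os.path.islink(old_path) or not os.path.isfile(old_path):
                continue
            if os.path.islink(main_path) or not os.path.isfile(main_path):
                continue
            # Same relative name and same size: candidate for deletion.
            if os.path.getsize(old_path) == os.path.getsize(main_path):
                yield old_path


def prune(old_root, main_root, dry_run=True):
    """Return the matching files in old_root; delete them only if dry_run is False."""
    # Collect the candidates first so nothing is deleted while still walking.
    candidates = sorted(find_removable(old_root, main_root))
    if not dry_run:
        for path in candidates:
            os.remove(path)
    return candidates
```

Calling prune("/dir_1", "/dir") would only list what matches; prune("/dir_1", "/dir", dry_run=False) would actually delete from /dir_1. Files with the same name but a different size, and files that exist only in /dir_1, are left where they are. Note that an equal size does not prove equal content, so comparing checksums as well would be the more careful variant.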



