Date: Thu, 4 May 2023 22:47:44 +0100
From: Kaya Saman <kayasaman@optiplex-networks.com>
To: Paul Procacci <pprocacci@gmail.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: Tool to compare directories and delete duplicate files from one directory
Message-ID: <344b29c6-3d69-543d-678d-c2433dbf7152@optiplex-networks.com>
In-Reply-To: <CAFbbPugfhXGPfscKpx6B0ue=DcF_qssL6P-0GgB1CWKtm3U=tQ@mail.gmail.com>
References: <9887a438-95e7-87cc-a162-4ad7a70d744f@optiplex-networks.com> <CAFbbPugfhXGPfscKpx6B0ue=DcF_qssL6P-0GgB1CWKtm3U=tQ@mail.gmail.com>
[-- Attachment #1 --]
On 5/4/23 17:29, Paul Procacci wrote:
>
>
> On Thu, May 4, 2023 at 11:53 AM Kaya Saman
> <kayasaman@optiplex-networks.com> wrote:
>
> Hi,
>
>
> I'm wondering if anyone knows of a tool like diff or so that can also
> delete files based on name and size from either left/right or
> source/destination directory?
>
>
> Basically what I have done is performed an rsync without using the
> --remove-source-files option onto a newly bought and created disk pool
> (yes zpool) that i am trying to consolidate my data - as it's currently
> spread out over multiple pools with the same folder name.
>
>
> The issue I am facing mainly is that I perform another rsync and use the
> --remove-source-files option, rsync will delete files based on name
> while there are some files that have the same name but not same size and
> I would like to retain these files.
>
>
> Right now I have looked at many different options in both rsync and
> other tools but found nothing suitable. I even tested using a few test
> dirs and files that I put into /tmp and whatever I tried, the files of
> different size either got transferred or deleted.
>
>
> How would be a good way to approach this problem?
>
>
> Even if I create some kind of shell script and use diff, I think it will
> only compare names and not file sizes.
>
>
> I'm really lost here....
>
>
> Regards,
>
>
> Kaya
>
>
>
>
> It sounds like you want fdupes. It's in the ports tree.
>
> ~Paul
>
> --
> __________________
>
> :(){ :|:& };:
I installed fdupes a while back and tried it. For me it felt like it
only works on a single directory.

My directory structure is:

/dir <- main directory where everything has now been rsync'ed to
/dir_1 <- old directory with partial content
/dir_2 <- more partial content
/dir_3 <- more partial content

The key thing here is that I need to compare each /dir_(x) with /dir:
if a file in /dir_(x) has a different size from the file of the same name
in /dir, leave it alone; if both the name and the file size are the same,
delete it from /dir_(x).
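
To make the rule concrete, here is a minimal sketch in Python 3. It is an illustration of the comparison described above, not a tested tool. The function names (find_duplicates, prune) are made up for this example. It assumes "same name" means the same path relative to the directory root, and it compares sizes only, never file contents.

```python
import os


def find_duplicates(old_root, main_root):
    """Yield files under old_root whose relative path and size match a file under main_root."""
    for dirpath, _dirnames, filenames in os.walk(old_root):
        for name in filenames:
            old_path = os.path.join(dirpath, name)
            rel = os.path.relpath(old_path, old_root)
            main_path = os.path.join(main_root, rel)
            # Skip symlinks in the old tree and names with no regular file in the main tree.
            if os.path.islink(old_path) or not os.path.isfile(main_path):
                continue
            # Same relative name and same size: treat as a duplicate.
            if os.path.getsize(old_path) == os.path.getsize(main_path):
                yield old_path


def prune(old_root, main_root, delete=False):
    """Return the duplicates found in old_root; remove them only when delete=True."""
    # Refuse to compare a directory with itself, which would match every file.
    if os.path.samefile(old_root, main_root):
        raise ValueError("old_root and main_root are the same directory")
    matched = list(find_duplicates(old_root, main_root))
    if delete:
        for path in matched:
            os.remove(path)
    return matched
```

Calling prune("/dir_1", "/dir") only lists what would be removed; prune("/dir_1", "/dir", delete=True) removes those files from /dir_1 and leaves /dir untouched. Files with the same name but a different size, and files that exist only in /dir_1, are left in place. Since equal size does not guarantee equal content, a dry run and a spot check before deleting seems prudent.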
