Date:      Fri, 5 May 2023 02:32:28 +0100
From:      Kaya Saman <kayasaman@optiplex-networks.com>
To:        Paul Procacci <pprocacci@gmail.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: Tool to compare directories and delete duplicate files from one directory
Message-ID:  <7747f587-f33e-f39c-ac97-fe4fe19e0b76@optiplex-networks.com>
In-Reply-To: <CAFbbPujUALOS%2BsUxsp=54vxVAHe_jkvi3d-CksK78c7rxAVoNg@mail.gmail.com>
References:  <9887a438-95e7-87cc-a162-4ad7a70d744f@optiplex-networks.com> <CAFbbPugfhXGPfscKpx6B0ue=DcF_qssL6P-0GgB1CWKtm3U=tQ@mail.gmail.com> <344b29c6-3d69-543d-678d-c2433dbf7152@optiplex-networks.com> <CAFbbPuiNqYLLg8wcg8S_3=y46osb06%2BduHqY9f0n=OuRgGVY=w@mail.gmail.com> <ef0328b0-caab-b6a2-5b33-1ab069a07f80@optiplex-networks.com> <CAFbbPujUALOS%2BsUxsp=54vxVAHe_jkvi3d-CksK78c7rxAVoNg@mail.gmail.com>


On 5/5/23 01:13, Paul Procacci wrote:
> #!/bin/sh
>
> #
> # dir_1, dir_2, and dir_3 are the directories I want to search through.
> for i in dir_1 dir_2 dir_3;
> do
>   # Retrieve the filenames within each of those directories
>   ls $i/ | while read file;
>   do
>     # If the file doesn't exist in the base dir, copy it and continue
>     # with the top of the loop.
>     [ ! -f dir_base/$file ] && cp $i/$file dir_base/ && continue
>
>     #
>     # Getting to this point means the file exists in both locations.
>     #
>
>     # Get the file size as it is in the dir_base
>     ref=`stat -f '%z' dir_base/$file`
>
>     # Get the file size as it is in $i
>     src=`stat -f '%z' $i/$file`
>
>     # If the sizes are the same, remove the file from the source directory
>     [ $ref -eq $src ] && rm -f $i/$file
>
>   done
> done


Thanks so much!


Just a quick question: you have dir_base written in the script. Do I
need to define this, or is it part of the shell language itself?
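
Note: nothing in sh defines dir_base. In the quoted script it is a plain
relative path, so it resolves against whatever directory the script is
started from. A minimal sketch of making that explicit, assuming a
hypothetical DIR_BASE environment variable, a hypothetical default path
/dir/base, and a hypothetical helper named in_base:

```shell
#!/bin/sh
# Hypothetical location of the reference copy; override via the environment.
dir_base="${DIR_BASE:-/dir/base}"

# Succeed if a regular file with this name exists in the base directory.
in_base() {
  [ -f "$dir_base/$1" ]
}
```

With this in place the existence check in the loop could be written as
in_base "$file" instead of [ -f dir_base/$file ], and the result no longer
depends on the current working directory.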


Right now I have modified the script to make it non-destructive, so that
it doesn't do any copying or removing yet; call it a test instance if
you like. I personally prefer doing things this way so I don't have any
accidents and lose things in the meantime.
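
One way to get the same safety without editing each command by hand is a
small wrapper that prints instead of executing. This is only a sketch: the
DRY_RUN variable and the run function are hypothetical names, not part of
the original script.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical switch: 1 (the default) only prints, 0 executes.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise run it unchanged.
run() {
  if [ "$DRY_RUN" -eq 1 ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}
```

The destructive lines would then become run cp ... and run rm -f ..., and
setting DRY_RUN=0 switches the script to doing the real work once the
printed output looks right.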


So my initial modification is this:


> #!/bin/sh
>
> #
> # dir_1, dir_2, and dir_3 are the directories I want to search through.
> for i in /dir_1 /dir_2 /dir_3;
> do
>   # Retrieve the filenames within each of those directories
>   ls $i/ | while read file;
>   do
>     # If the file doesn't exist in the base dir, list it and continue
>     # with the top of the loop.
>     [ ! -f dir_base/$file ] && ls $i/$file && continue
>
>     #
>     # Getting to this point means the file exists in both locations.
>     #
>
>     # Get the file size as it is in the dir_base
>     ref=`stat -f '%z' dir_base/$file`
>
>     # Get the file size as it is in $i
>     src=`stat -f '%z' $i/$file`
>
>     # If the sizes differ, append the file's listing to /tmp/file
>     [ $ref -ne $src ] && ls $i/$file >> /tmp/file
>
>   done
> done


If this works, it should just output the files that differ into a file
called "file" under /tmp.


OK, this didn't work at all. It just listed a whole bunch of top-level
folders and didn't recurse through them :-(


I ran it on the assumption that I needed to run the script under /dir
and that dir_base was a shell function which would essentially be /dir/.


[EDIT]


Currently, I have managed to get it partly running by changing ls to
ls -R, *but* I think that the 'stat' statements don't allow for recursion?


The script is running as I type this, but it's most likely just
outputting a whole bunch of ls information, as I see many 'stat'
errors in the shell output.
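
The 'stat' errors are what one would expect from ls -R: it prints
directory headers such as "./sub:" and blank lines, and it lists file
names without their directory prefix, so dir_base/$file rarely names a
real file. A possible alternative is to let find do the recursion and
compare files at the same relative path. The sketch below is an
assumption-laden illustration, not the script from this thread: the
function name report_tree is hypothetical, it swaps the size check for a
byte-for-byte cmp -s, and it only reports, never copies or deletes.

```shell
#!/bin/sh
# report_tree SRC BASE: walk SRC recursively and classify each regular file
# against the file at the same relative path under BASE.
# Nothing is copied or removed; one line is printed per file.
report_tree() {
  src=${1%/}    # drop one trailing slash so the prefix strip below matches
  base=${2%/}
  # find recurses into subdirectories; -type f skips the directories.
  find "$src" -type f | while IFS= read -r path; do
    rel=${path#"$src"/}    # path relative to SRC
    if [ ! -f "$base/$rel" ]; then
      echo "missing: $rel"
    elif cmp -s "$path" "$base/$rel"; then
      echo "same: $rel"
    else
      echo "differs: $rel"
    fi
  done
}
```

It could be called as report_tree /dir_1 /dir/base for each source
directory (both paths hypothetical here). File names containing newlines
are not handled by this line-based loop.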




