Use find(1) as a quick and dirty duplicate file finder

Run the following two commands in bash to list likely duplicate files under a directory. This helps you clean out the duplicates that tend to accumulate over time.

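The first command uses find to print one line per file under the current directory to /tmp/findcmd. Each line starts with a key made of the file's size in bytes and its name (%s-%f), followed by the name again, the status-change time (%c) and the path (%p). Files with the same name and the same size share a key, so they can be grouped together. A matching name and size is usually a strong indicator that two files are identical, although it is not proof. Note that -printf is a GNU find extension.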

The second command prints the lines for every key that occurs more than once, grouped together. This way, you can quickly pick out the likely duplicates and manually choose which ones to keep or delete.

find . -type f -printf "%s-%f\t %f %c\t %p\n" > /tmp/findcmd

sort -n /tmp/findcmd | cut -f1 | uniq -cd | sort -n | sed 's/^ *[0-9]* //' | while IFS= read -r key; do grep -F -- "$key" /tmp/findcmd; done
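The two steps above can also be combined into a single pipeline without a temporary file. The following is only a sketch under a few assumptions: it needs GNU find, the function name list_dupe_candidates is made up for this example, and file names containing tabs or newlines are not handled. It compares the whole size-name key exactly, so file names with spaces stay intact.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name chosen for illustration): print the lines of all
# files under "$1" that share both size and file name, grouped together.
# Assumes GNU find (-printf) and no tabs or newlines in file names.
list_dupe_candidates() {
    local dir=${1:-.}
    find "$dir" -type f -printf '%s-%f\t %f %c\t %p\n' |
        # Order by size numerically, then by the full size-name key,
        # so that lines with an identical key end up next to each other.
        LC_ALL=C sort -t $'\t' -k1,1n -k1,1 |
        awk -F'\t' '
            # When the key changes, print the previous group if it had
            # more than one member, then start a new group.
            $1 != key { if (n > 1) printf "%s", buf; key = $1; buf = ""; n = 0 }
            { buf = buf $0 "\n"; n++ }
            END { if (n > 1) printf "%s", buf }
        '
}
```

Calling list_dupe_candidates with a directory as its only argument (or with no argument for the current directory) prints lines in the same format as the two commands above.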

The output will look something like this. You can tell at a glance which files are likely duplicates, based on size, name and timestamp.

1067761-P4270521.JPG     P4270521.JPG Wed Apr 27 18:05:04.0000000000 2011        ./Backups Laptops/Ri-janne/2011 Diversen
1067761-P4270521.JPG     P4270521.JPG Wed Apr 27 18:05:04.0000000000 2011        ./Backups Laptops/Ri-janne/2011 camera
1067898-IMG_3418.JPG     IMG_3418.JPG Thu Aug 28 20:08:28.0000000000 2008        ./Piks/2008/Vakantie USA 2008/Dag 7 Louisville Shopping
1067898-IMG_3418.JPG     IMG_3418.JPG Thu Aug 28 19:08:28.0000000000 2008        ./Backups Laptops/Ri-janne/2008 USA
1067969-P9180184.JPG     P9180184.JPG Sat Sep 18 17:45:52.0000000000 2010        ./Backups Laptops/Ri-janne/2010 Diversen
1067969-P9180184.JPG     P9180184.JPG Sat Sep 18 17:45:52.0000000000 2010        ./Backups Laptops/Ri-janne/2010 uitzoeken
1068244-100_2962.jpg     100_2962.jpg Thu Jul 17 18:18:52.0000000000 2008        ./.Trash-1000/files/Mijn afbeeldingen/Italia 09/Greece '08
1068244-100_2962.jpg     100_2962.jpg Thu Jul 17 18:18:52.0000000000 2008        ./Backups Laptops/Jan/Mijn documenten/Mathea/Mijn afbeeldingen/Italia 09/Greece '08
1068284-DSC_7640.JPG     DSC_7640.JPG Sat Apr 26 14:47:58.0000000000 2014        ./Piks/2014/20140426 KDag
1068284-DSC_7640.JPG     DSC_7640.JPG Tue Apr 29 21:56:54.0000000000 2014        ./Piks/2014/20140426 Koningsdag
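A matching size and name only makes two files candidates. In the listing above, for example, both DSC_7640.JPG entries have the same size but different timestamps. Before deleting anything, the contents can be compared with a checksum. The sketch below is one possible way to do that, assuming GNU coreutils (sha256sum and uniq with -w and --all-repeated). The function name confirm_dupes is made up for this example, and it hashes every file under the directory, so it is much slower than the size and name comparison.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name chosen for illustration): print groups of files
# under "$1" whose contents have the same SHA-256 checksum.
# Assumes GNU coreutils; file names with newlines or backslashes are not handled.
confirm_dupes() {
    local dir=${1:-.}
    find "$dir" -type f -exec sha256sum {} + |
        # Sort by checksum so identical files end up on adjacent lines.
        LC_ALL=C sort |
        # Compare only the first 64 characters (the hex digest) and print
        # every repeated line, with a blank line between groups.
        uniq -w 64 --all-repeated=separate
}
```

Files that show up in the same group have identical contents, regardless of their names or timestamps.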