How to de-duplicate your albums
In this new recipe I will share my way of deduplicating my photo albums (you can also apply this to songs).
When I need to change my phone or simply reset it, I take all the pictures from the camera and make a backup. But when I need to put all the backups together, I struggle to know which pictures are already in a folder, so in the end I have the same picture several times with different names in different folders.
I don’t want to go through them one by one and check whether each already exists, so I use a script to help me with the job.
Stay tuned for it … but just in case, here are the commands … I will tidy them up shortly
First you need to have two folders:
- one for the new pictures you want to add to the library; let's creatively call it new
- another with the structure of your library, where everything is already in order: catalogue
Folder with the catalogue
We need to generate a file listing the pictures and their corresponding checksums. For that I used the following command:
|
|
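As a rough illustration of this step, here is a minimal Python sketch (a stand-in, not the shell command used in this recipe). The function name `write_checksums` and the output file name are hypothetical; the line format `MD5 (path) = checksum` mimics what the macOS `md5` tool prints.

```python
import hashlib
from pathlib import Path


def write_checksums(folder, out_file):
    """Write one 'MD5 (path) = checksum' line per file found under folder.

    Hypothetical helper: it walks the folder recursively and hashes
    the full content of every regular file.
    """
    lines = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            lines.append(f"MD5 ({path}) = {digest}")
    # One line per file, each terminated by a newline.
    Path(out_file).write_text("".join(line + "\n" for line in lines))
    return lines
```

Calling something like `write_checksums("catalogue", "catalogue.txt")` would produce the list for the catalogue; the same function could be pointed at any other folder.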
Folder with new images
For this folder we also need to create a similar file with the checksums, so we can compare it with the previous one:
|
|
Combine the two files with the checksum
Once you have both files, you need to combine and sort them, so that for each occurrence the first line is the original file in the catalogue and the lines below it are the duplicates of that image in the new folder. I use this command for that:
|
|
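The combine-and-sort idea can be sketched in Python as follows (again a stand-in, not the command from this recipe). It assumes each input line looks like `MD5 (path) = checksum`, the format macOS `md5` prints; `parse_line` and `combine` are hypothetical names.

```python
def parse_line(line):
    """Split 'MD5 (path) = checksum' into a (checksum, path) pair."""
    left, checksum = line.rstrip("\n").rsplit(" = ", 1)
    path = left[len("MD5 ("):-1]  # drop the 'MD5 (' prefix and the closing ')'
    return checksum, path


def combine(catalogue_lines, new_lines):
    """Merge both lists, sorted by checksum with catalogue entries first."""
    # The middle value (0 or 1) makes catalogue entries sort before
    # entries from the new folder when the checksum is the same.
    entries = [(c, 0, p) for c, p in map(parse_line, catalogue_lines)]
    entries += [(c, 1, p) for c, p in map(parse_line, new_lines)]
    entries.sort()
    return [(c, p) for c, _, p in entries]
```

With this ordering, any file from the new folder that follows a catalogue file with the same checksum is a duplicate of it.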
NOTE: This works on Mac, where the field separator is " = "; you will need to change it for Linux or Windows.
Generate the deletion file based on duplicates
In this step, we generate a file listing all the duplicates to delete. Once it is generated, you can check whether those files really are duplicates.
|
|
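One possible way to build the deletion list, sketched in Python under a few assumptions: the inputs are lists of `(checksum, path)` pairs, one list per folder, and the helper names `find_duplicates` and `write_deletion_file` are hypothetical. Only paths from the new folder are ever added to the list.

```python
from pathlib import Path


def find_duplicates(catalogue_entries, new_entries):
    """Return paths from the new folder whose content is already known."""
    # Start with every checksum present in the catalogue.
    seen = {checksum for checksum, _ in catalogue_entries}
    to_delete = []
    for checksum, path in new_entries:
        if checksum in seen:
            to_delete.append(path)  # already in catalogue, or repeated in new
        else:
            seen.add(checksum)  # first time we see this content: keep it
    return to_delete


def write_deletion_file(paths, out_file="to_delete.txt"):
    """Write one path per line so the list can be reviewed before deleting."""
    Path(out_file).write_text("".join(f"{p}\n" for p in paths))
```

Writing the list to a file first, rather than deleting straight away, leaves room to inspect it by hand.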
NOTE: if you want to check that the list doesn’t contain a file from your catalogue, run:
|
|
The last two numbers should be the same.
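A similar sanity check could be written in Python. This sketch is an assumption about what is worth verifying, not the original command: it counts how many listed paths there are in total and how many of them sit inside the new folder. The function name `count_in_new` is hypothetical.

```python
import os


def count_in_new(paths, new_folder):
    """Return (total, inside_new) for the given list of paths."""
    new_root = os.path.abspath(new_folder)
    inside = 0
    for p in paths:
        # A path is inside new_folder when their common prefix is new_folder.
        if os.path.commonpath([new_root, os.path.abspath(p)]) == new_root:
            inside += 1
    return len(paths), inside
```

If the two returned numbers differ, at least one listed file lives outside the new folder and the list should not be used as-is.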
Finally deleting the files
To delete the files listed in to_delete.txt:
|
|
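For illustration, the deletion step might look like this in Python (a sketch, not the command from this recipe). It assumes to_delete.txt holds one path per line; the `dry_run` flag and the function name `delete_listed_files` are hypothetical additions so that nothing is removed by accident.

```python
from pathlib import Path


def delete_listed_files(list_file="to_delete.txt", dry_run=True):
    """Delete every existing file named in list_file, one path per line.

    With dry_run=True (the default) nothing is deleted; the function only
    returns the paths that would be removed.
    """
    handled = []
    for line in Path(list_file).read_text().splitlines():
        if not line:
            continue  # skip blank lines
        path = Path(line)
        if path.is_file():
            if not dry_run:
                path.unlink()
            handled.append(str(path))
    return handled
```

Running it once with the default `dry_run=True` and reading the returned list is a cheap last check before calling it again with `dry_run=False`.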
At this point, if you have been following the instructions, all the duplicated files should be deleted. To confirm, you can run the commands again; the file to_delete.txt should be empty.