Thursday, November 27, 2008

Removing Duplicate Files Automatically

It's annoying to have two or more copies of the same file lying around on your hard drive. Out of curiosity, I devised a small script to automate the process of finding and deleting duplicate files across two different directories. Here it is.

#!/bin/bash
#
# This script checks whether a file is duplicated
# in two different directories and deletes the one in
# ${SRC_DIR} if it finds the same file
#

SRC_DIR="/home/darmawan/download"
DST_DIR="/home/sources"
TMP_FILENAMES="__filenames.txt"
CUR_FILE=""
RESULT=""

find "${SRC_DIR}" -type f > "${TMP_FILENAMES}"

while IFS= read -r LINE
do
    unset RESULT
    unset CUR_FILE

    CUR_FILE=$(basename "${LINE}")
    RESULT=$(find "${DST_DIR}" -type f -name "${CUR_FILE}" -print)

    if [ -n "${RESULT}" ] ; then
        echo "File of the same name found at ${LINE} and ${RESULT}"
        echo "Diffing.."

        # If diff returns 0, the two files are identical, so delete
        # the copy in ${SRC_DIR}; if they differ (non-zero return
        # value), diff -q has already printed a message saying so.
        if diff -q "${LINE}" "${RESULT}"; then
            rm -vf "${LINE}"
        fi
    fi

done < "${TMP_FILENAMES}"

unset LINE
unset CUR_FILE
unset RESULT

rm -v "${TMP_FILENAMES}"
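
To try it out, save the script under any name (dedup.sh is just an example), make it executable, and run it; the two directories are hard-coded at the top:

chmod +x dedup.sh
./dedup.sh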

This is a very rough script, so don't expect much robustness from it. In particular, it will trip up if more than one file under ${DST_DIR} matches a given name, since ${RESULT} then holds several paths and the diff no longer makes sense.
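
If you want to catch duplicates even after they have been renamed, matching by content is sturdier than matching by name. Here is a rough sketch of that idea using md5sum checksums; it reuses the same example paths as above and is just as unpolished:

#!/bin/bash
# Sketch: delete files in ${SRC_DIR} whose content (md5sum)
# already exists somewhere under ${DST_DIR}, regardless of name.

SRC_DIR="/home/darmawan/download"
DST_DIR="/home/sources"

# Collect the checksum of every file in ${DST_DIR} once, up front.
DST_SUMS=$(find "${DST_DIR}" -type f -exec md5sum '{}' ';' | cut -d' ' -f1)

find "${SRC_DIR}" -type f | while IFS= read -r FILE
do
    SUM=$(md5sum "${FILE}" | cut -d' ' -f1)
    # A matching checksum means the same content already exists
    # in ${DST_DIR} (barring the tiny chance of a hash collision).
    if echo "${DST_SUMS}" | grep -q "^${SUM}$"; then
        rm -v "${FILE}"
    fi
done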