Thursday, November 27, 2008

Removing Duplicate Files Automatically

It's annoying to have two or more copies of the same file lying around on your hard drive. Out of curiosity, I devised a small script to automate the process of finding and deleting duplicate files across two different directories. Here it is.

#!/bin/bash
#
# This script checks whether a file is duplicated
# in two different directories and deletes the one in
# ${SRC_DIR} if it finds the same file
#

SRC_DIR="/home/darmawan/download"
DST_DIR="/home/sources"
TMP_FILENAMES="__filenames.txt"
CUR_FILE=""
RESULT=""

find "${SRC_DIR}" -type f > "${TMP_FILENAMES}"

while IFS= read -r LINE
do
    unset RESULT
    unset CUR_FILE

    CUR_FILE=$(basename "${LINE}")
    RESULT=$(find "${DST_DIR}" -type f -name "${CUR_FILE}" -print)

    if [ -n "${RESULT}" ] ; then
        echo "File of the same name found at ${LINE} and ${RESULT}"
        echo "Diffing.."

        # If diff returns 0, the two files are identical, so delete
        # the copy in ${SRC_DIR}; if they differ (non-zero return
        # value), diff -q has already printed a message saying so.
        if diff -q "${LINE}" "${RESULT}"; then
            rm -vf "${LINE}"
        fi
    fi

done < "${TMP_FILENAMES}"

unset LINE
unset CUR_FILE
unset RESULT

rm -v "${TMP_FILENAMES}"
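
To try it out, save the script under any name (dedup.sh is just an example), make it executable, and run it; the two directories are hard-coded at the top:

chmod +x dedup.sh
./dedup.sh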

This is a very rough script, so don't expect much robustness from it. In particular, it will trip up if more than one file under ${DST_DIR} matches a given name, since ${RESULT} then holds several paths and the diff no longer makes sense.
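
If you want to catch duplicates even after they have been renamed, matching by content is sturdier than matching by name. Here is a rough sketch of that idea using md5sum checksums; it reuses the same example paths as above and is just as unpolished:

#!/bin/bash
# Sketch: delete files in ${SRC_DIR} whose content (md5sum)
# already exists somewhere under ${DST_DIR}, regardless of name.

SRC_DIR="/home/darmawan/download"
DST_DIR="/home/sources"

# Collect the checksum of every file in ${DST_DIR} once, up front.
DST_SUMS=$(find "${DST_DIR}" -type f -exec md5sum '{}' ';' | cut -d' ' -f1)

find "${SRC_DIR}" -type f | while IFS= read -r FILE
do
    SUM=$(md5sum "${FILE}" | cut -d' ' -f1)
    # A matching checksum means the same content already exists
    # in ${DST_DIR} (barring the tiny chance of a hash collision).
    if echo "${DST_SUMS}" | grep -q "^${SUM}$"; then
        rm -v "${FILE}"
    fi
done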