Repeated Files



Repeated files are "cloned" files:

by Name: files with the same name in different folders (possibly Duplicates).
by Size: files with the same size in the same or different folders. In some situations and jobs, a file with the same size may be considered a repeated file. Make sure this works for you, because files can have the same size but different content, in which case they are not repeated files.
by Content: this process takes more time, but it is the only way to ensure a file is repeated (or "cloned") in the same folder or in other folders. Using this method you may be surprised to find files that have different names, or are spread across different folders, but are the same file. This may apply to MP3 files, Movies, Documents, etc. Example: you may have the files "C:\MP3s\Top Ten.mp3" and "C:\MP3s\American Idol\American Idol's last hit.mp3". Both are really the same file (same content), repeated across 2 folders: "C:\MP3s" and "C:\MP3s\American Idol". This consumes more disk space, and with thousands of files you may lose more time. The analysis by content is based on a hash code calculated from each file's content. The hash commonly used is CRC32. Files with identical content always produce the same CRC32, so two or more files with the same CRC32 are treated as the same file, no matter the file name or location (different files sharing a CRC32 is possible but rare).
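The three search modes can be illustrated with a minimal Python sketch. This is not Comparator Fast's implementation; the helper names `crc32_of_file` and `find_repeated`, and the choice to pair the file size with the CRC32 for the content mode, are assumptions made for illustration.

```python
import os
import zlib
from collections import defaultdict


def crc32_of_file(path, chunk_size=65536):
    """Return the CRC32 of a file's content, read in chunks."""
    crc = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def find_repeated(root, mode="name"):
    """Group files under root by 'name', 'size' or 'content' (hypothetical helper)."""
    key_functions = {
        "name": lambda p: os.path.basename(p).lower(),
        "size": os.path.getsize,
        # Assumption: size plus CRC32 stands in for "same content".
        "content": lambda p: (os.path.getsize(p), crc32_of_file(p)),
    }
    key_of = key_functions[mode]
    groups = defaultdict(list)
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            groups[key_of(path)].append(path)
    # Only keys shared by two or more files count as repeated.
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}
```

Calling `find_repeated(root, "content")` on a folder that holds two copies of the same song under different names would return one group with both paths, while `find_repeated(root, "name")` would not group them.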

If you need to look for Repeated Files in multiple subfolders of a root folder, set the root folder as Source and choose the subfolders to scan. For example: to scan for Repeated Files in "c:\tmp", "c:\books" and "c:\downloads", set "c:\" as Source and choose the subfolders Tmp, Books and Downloads.

 

What's the difference from DUPLICATES?

Duplicates are files detected as duplicates based on the same Name and Size and the search mode you specified: Time, Data or CRC. Comparing by Data or CRC helps you detect whether two files are the same, but the files must also have the same Name and Size (files with a different Size are considered Bad Duplicates). Duplicates also belong to the same folder structure, using the specified Source and Target folders as reference. This means that if you have two duplicate files, "c:\folder1\data\file.doc" and "c:\folder2\documents\file.doc", you'll get different results depending on the folders you compare:

Comparing "c:\folder1\data" vs. "c:\folder2\documents": file.doc is detected as Duplicate.
Comparing "c:\folder1" vs. "c:\folder2": file.doc is detected as Missing on Source and also as Missing on Target. Comparator Fast starts the comparison at "c:\folder1" and "c:\folder2", and the folder "data" on Source does not exist on Target, so all files in the "data" subfolder appear as Missing. The same happens with the "documents" subfolder on Target.
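The effect of the chosen root folders can be sketched in Python. This is a simplified model, not Comparator Fast's actual logic: it assumes files are matched only by their path relative to the Source or Target root, and it ignores Size and the search mode. The function names and result keys are hypothetical.

```python
from pathlib import PureWindowsPath


def relative_keys(root, files):
    """Return each file's path relative to the comparison root, lowercased."""
    root_path = PureWindowsPath(root)
    return {str(PureWindowsPath(f).relative_to(root_path)).lower() for f in files}


def compare(source_root, source_files, target_root, target_files):
    """Match Source and Target files by relative path only (simplified model)."""
    source = relative_keys(source_root, source_files)
    target = relative_keys(target_root, target_files)
    return {
        "duplicate": sorted(source & target),
        "only_in_source": sorted(source - target),
        "only_in_target": sorted(target - source),
    }
```

With "c:\folder1\data" and "c:\folder2\documents" as roots, both files reduce to the relative path "file.doc" and match. With "c:\folder1" and "c:\folder2" as roots, the relative paths become "data\file.doc" and "documents\file.doc", which do not match, so each file is reported as missing on the other side.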

 

With Search Repeated Files enabled at Scan Parameters, file.doc is detected as Repeated, because the file is repeated (by Name) in different subfolders. Please note that it is important to choose the search engine that best fits your needs. Referring to the previous example, suppose file.doc at Source has 1,024 bytes and file.doc at Target has 3,231 bytes. The files are Repeated by Name because they have the same name, but they are not Repeated by Size because their sizes differ (files with different sizes have different content, so they are not Repeated by Content either).

 

GROUPS

Repeated files are shown in groups, and the first column is the group number. This is just an internal reference number to help you identify which repeated files belong together. Grouped files are shown in alternating colors, using light green as the background. If you change the sort order by clicking on any column other than Group, the files are no longer grouped and, to avoid accidents, the colors change to a light yellow background.

 

FILES

Each file shows its relevant information in a row. You can see which file is the newer file in a group, and also which is the older file in the group. This status can also be indeterminate: for example, when all files in a Group have the same modified Date and Time, Comparator Fast is unable to determine which file is newer and which is older.
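The newer, older and indeterminate statuses can be pictured with a short Python sketch. The function `age_status` and its labels are hypothetical, and how Comparator Fast marks files that are neither the newest nor the oldest in a group is not stated here, so the sketch leaves them unlabeled.

```python
def age_status(modified_times):
    """Label each file in a group by its modified time (hypothetical helper)."""
    newest = max(modified_times)
    oldest = min(modified_times)
    if newest == oldest:
        # All files share the same modified time: no newer or older file.
        return ["indeterminate"] * len(modified_times)
    labels = []
    for stamp in modified_times:
        if stamp == newest:
            labels.append("newer")
        elif stamp == oldest:
            labels.append("older")
        else:
            labels.append("")  # assumption: files in between get no mark
    return labels
```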

 

Right-click on any file name (in the file name column) to get the Windows Context Menu for the selected file(s). Right-click on any other area of the file's row to get the Repeated Files Context Menu with its options. Please note that if you alter files using the Windows Context Menu, you may need to repeat the scan to get the most recent results.

 

Each file row has a check box. In the Repeated Files section, you need to check the files to be processed by Copy to folder, Move to folder or Delete. Unchecked files will not be processed. This mechanism may look tedious, but it is a great help in avoiding accidental operations, and it lets you review many files to process without worrying about keeping them all selected.

 

The check boxes are only used with these specific commands: Copy to folder, Move to folder and Delete. You may select one or more rows as usual, and perform other tasks using the Context Menu or the Tasks button.

 

You need to decide what to do with repeated files: Copy to folder, Move to folder or Delete.

 

CHANGING SEARCH MODE

In the Search Repeated Files scan parameters, you define the search engine mode for Repeated Files, but you can easily switch the search engine mode while reviewing the results. If you change the search engine mode (from "by Name" to "by Size", for example), Comparator Fast reanalyzes the Repeated Results and shows Repeated Files using the new mode you marked. The results may change depending on the mode used to analyze Repeated Files.

 

CRC32 may appear disabled if you didn't mark Search for Repeated Files by CRC32 or MD5 for the scan.

MD5 may appear disabled if you didn't mark Search for Repeated Files by MD5 for the scan. Hashing files makes a scan take longer to complete, and if hashes weren't calculated during the scan, Repeated Files can't be analyzed by the hashes that were not selected. To avoid repeating a scan, use the Repeated Files context menu or the Tasks button, go to the Calculate Hashes option, and choose the hash you need (hold the Shift key when choosing the command to force recalculation of already hashed files, if any).
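Calculating missing hashes after a scan, with an option to force recalculation, can be sketched as follows. This is an illustrative Python model only: the `calculate_hashes` function, the `cache` dictionary and the `force` flag (standing in for holding the Shift key) are assumptions, not Comparator Fast's internals.

```python
import hashlib


def calculate_hashes(paths, cache, force=False):
    """Fill cache[path] with the MD5 of each file; skip hashed files unless forced."""
    for path in paths:
        if path in cache and not force:
            continue  # already hashed, keep the stored value
        digest = hashlib.md5()
        with open(path, "rb") as handle:
            # Read in chunks so large files are not loaded into memory at once.
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
        cache[path] = digest.hexdigest()
    return cache
```

Skipping files that already have a stored hash keeps the operation fast, at the cost of stale values if a file changed after it was hashed, which is the case the forced recalculation covers.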

Note: before the Scan, Search Repeated Files must be enabled by Name or Size. If it is not, Repeated Files is disabled completely for your current Scan (no scan data is processed for Repeated Files). In this case, you need to repeat the Scan with Search Repeated Files enabled.

 

See Also: Search Repeated Files, Hashes, Duplicates.