Showing content from https://github.com/unruledboy/SharpDups below:

unruledboy/SharpDups: find duplicate file with C# parallel MapReduce compute using quick hashing, quick search

Fast duplicate file search via parallel processing with C#.

The tool finds duplicate files using a Map/Reduce approach. It accepts a list of files and then performs the duplicate check. It could easily be extended to support file search filters and similar options.
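As a rough illustration of the Map/Reduce idea applied to a list of files, the sketch below maps each path to its size in parallel with PLINQ and then reduces by grouping on that size. The class and method names (`SizeGrouping`, `GroupBySize`) are hypothetical and are not taken from the repository; this is a minimal sketch, not the project's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SizeGrouping
{
    // Hypothetical helper, not part of SharpDups.
    // Map: read each file's length in parallel.
    // Reduce: group paths by length and keep only groups with more than
    // one member, since a file with a unique size cannot have a duplicate.
    public static List<List<string>> GroupBySize(IEnumerable<string> files)
    {
        return files
            .AsParallel()
            .Select(path => new { Path = path, Size = new FileInfo(path).Length })
            .GroupBy(f => f.Size, f => f.Path)
            .Where(g => g.Count() > 1)
            .Select(g => g.OrderBy(p => p, StringComparer.Ordinal).ToList())
            .ToList();
    }
}
```

The input here is a plain list of paths, matching the description above; a search filter could be applied to that list before it is passed in.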

Logic:

  1. Group files by size; only files of the same size can be duplicates.
  2. Compute a quick hash from the first, middle, and last bytes of each file.
  3. Group files with the same quick hash, that is, files whose bytes match at the start, middle, and end.
  4. Compute a progressive hash for files with the same quick hash; if an intermediate hash differs, skip the rest of the comparison.
  5. Group files with the same full hash.
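The quick-hash and progressive-hash steps above can be sketched as follows. Several details are assumptions, because the description does not specify them: the sample size of 16 bytes, the 64 KiB block size, the use of SHA-256, and hashing each block independently rather than keeping a running hash. The names `DuplicateChecks`, `QuickHash`, and `SameProgressiveHash` are hypothetical.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class DuplicateChecks
{
    // Assumed sample size; the source does not say how many bytes are read.
    const int SampleSize = 16;

    // Quick hash: sample bytes at the start, middle, and end of the file
    // and return them as a hex string that can serve as a grouping key.
    public static string QuickHash(string path)
    {
        using (var stream = File.OpenRead(path))
        {
            long length = stream.Length;
            int n = (int)Math.Min(SampleSize, length);
            var sample = new byte[n * 3];
            long[] offsets = { 0, (length - n) / 2, length - n };
            for (int i = 0; i < offsets.Length; i++)
            {
                stream.Seek(offsets[i], SeekOrigin.Begin);
                ReadFully(stream, sample, i * n, n);
            }
            return BitConverter.ToString(sample);
        }
    }

    // Progressive comparison: hash both files block by block and stop at
    // the first block whose hashes differ, so the rest is never read.
    public static bool SameProgressiveHash(string a, string b, int blockSize = 64 * 1024)
    {
        using (var sha = SHA256.Create())
        using (var fa = File.OpenRead(a))
        using (var fb = File.OpenRead(b))
        {
            if (fa.Length != fb.Length) return false;
            var bufA = new byte[blockSize];
            var bufB = new byte[blockSize];
            while (true)
            {
                int readA = ReadFully(fa, bufA, 0, blockSize);
                int readB = ReadFully(fb, bufB, 0, blockSize);
                if (readA != readB) return false;
                if (readA == 0) return true; // both files exhausted, no mismatch
                byte[] hashA = sha.ComputeHash(bufA, 0, readA);
                byte[] hashB = sha.ComputeHash(bufB, 0, readB);
                if (!hashA.SequenceEqual(hashB)) return false; // early exit
            }
        }
    }

    // Stream.Read may return fewer bytes than requested, so loop until
    // the requested count is read or the stream ends.
    static int ReadFully(Stream s, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            int read = s.Read(buffer, offset + total, count - total);
            if (read == 0) break;
            total += read;
        }
        return total;
    }
}
```

Two files that differ only in a region the quick hash does not sample will share a quick hash, which is why the progressive step is still needed before they are reported as duplicates.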

Methods:

Features:

Existing approach

To decide whether files are duplicates, we usually hash the two files being compared and then compare the hash values.
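For reference, a minimal sketch of this conventional check is shown here. SHA-256 is an assumed choice, since the text does not name a hash algorithm, and the names `FullHashCompare`, `FullHash`, and `AreDuplicates` are hypothetical.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FullHashCompare
{
    // Reads the entire file and returns its SHA-256 digest as a hex string.
    public static string FullHash(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return BitConverter.ToString(sha.ComputeHash(stream));
        }
    }

    // Both files are read from start to end, even when they already
    // differ in the very first byte.
    public static bool AreDuplicates(string a, string b)
    {
        return FullHash(a) == FullHash(b);
    }
}
```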

The problem with this approach is that it is relatively slow:

New approach

The steps are as follows:

Features of the approach
