Checksums are sometimes viewed as a kind of hashing, but collisions (different inputs producing the same result) are far more likely. Checksums are used when unrecoverable errors must be detected so that corrupted data does not propagate further through a system; they detect errors but cannot repair them. If correction is needed, an error-correcting code (ECC) such as Hamming, Reed–Solomon, or BCH is required.
Collision rate is a trade-off against checksum size and speed of computation. For an n-bit checksum, the probability that a random error escapes detection is, at best, 2^-n, so the choice of size depends on the application's requirements and its expected error model (the types and probabilities of errors). A 16-bit checksum field is compact, but at best one random corruption in every 65,536 goes undetected, so it suits cases where data blocks are small, the consequences are benign, and/or corruption is very unlikely. A 32-bit checksum doubles the size overhead but leaves only about one chance in four billion of an undetected error, so it is common when storing or transmitting large files that may be altered.
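As a quick worked check of those figures, assuming errors scramble the checksum uniformly at random (the idealized model behind the 2^-n bound):

    P_{\text{undetected}} = 2^{-n}, \qquad 2^{-16} = \frac{1}{65\,536} \approx 1.5 \times 10^{-5}, \qquad 2^{-32} = \frac{1}{4\,294\,967\,296} \approx 2.3 \times 10^{-10}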
A programmer writing a checksum can choose among several algorithms. One's-complement addition sacrifices error-detection effectiveness for very high speed and is used, for example, in EXE file headers and IPv4 packet headers. The Fletcher-32 checksum offers a balance of speed and error detection, though it cannot distinguish between 16-bit blocks of 0xFFFF and 0x0000, since both are congruent to zero modulo 65535. CRC-32 gives better error detection but a more complex implementation when performance is needed.[1] To detect malicious alterations to data, a programmer should use a MAC or digital signature, typically built on a cryptographically secure hash such as SHA-256 or SHA-3, at the cost of slower computation and 256 bits or more of output. Such hashes are covered in a later chapter, Hashing. Sketches of the three non-cryptographic checksums follow.
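The following C sketch illustrates all three options. The function names are illustrative assumptions, the one's-complement sum assumes big-endian pairing of bytes into 16-bit words (as in the IPv4 header checksum), and the CRC-32 shown is the simple bitwise form of the reflected IEEE 802.3 polynomial rather than the faster table-driven version.

    #include <stddef.h>
    #include <stdint.h>

    /* One's-complement sum of big-endian 16-bit words (IPv4-header style).
       An odd trailing byte is padded with zero; carries are folded back in. */
    uint16_t ones_complement_checksum(const uint8_t *data, size_t len)
    {
        uint32_t sum = 0;
        while (len > 1) {
            sum += ((uint32_t)data[0] << 8) | data[1];
            data += 2;
            len -= 2;
        }
        if (len == 1)
            sum += (uint32_t)data[0] << 8;      /* pad odd trailing byte */
        while (sum >> 16)
            sum = (sum & 0xFFFF) + (sum >> 16); /* fold carries back in */
        return (uint16_t)~sum;
    }

    /* Fletcher-32 over 16-bit words.  Both running sums are reduced modulo
       65535, which is why words of 0xFFFF and 0x0000 contribute identically. */
    uint32_t fletcher32(const uint16_t *data, size_t words)
    {
        uint32_t c0 = 0, c1 = 0;
        while (words > 0) {
            /* 359 words is a safe run length before c1 could overflow 32 bits */
            size_t block = words > 359 ? 359 : words;
            words -= block;
            do {
                c0 += *data++;
                c1 += c0;
            } while (--block);
            c0 %= 65535;
            c1 %= 65535;
        }
        return (c1 << 16) | c0;
    }

    /* Bitwise CRC-32 using the reflected polynomial 0xEDB88320 (IEEE 802.3). */
    uint32_t crc32(const uint8_t *data, size_t len)
    {
        uint32_t crc = 0xFFFFFFFF;
        for (size_t i = 0; i < len; i++) {
            crc ^= data[i];
            for (int bit = 0; bit < 8; bit++)
                crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320 : 0);
        }
        return ~crc;
    }

The speed ranking described above is visible in the inner loops: one addition per word for the one's-complement sum, two for Fletcher-32, and eight shift/XOR steps per byte for this CRC-32 (production implementations replace the bit loop with a 256-entry lookup table).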