Remote Differential Compression
Remote Differential Compression (RDC) is a client–server synchronization algorithm that allows the contents of two files to be synchronized by communicating only the differences between them. It was introduced with Windows Server 2003 R2 and is included with later Windows client and server operating systems.
Unlike Binary Delta Compression (BDC), which is designed to operate only on known versions of a single file, RDC does not make assumptions about file similarity or versioning. The differences between files are computed on the fly, therefore RDC is suitable for efficient synchronization of files that have been updated independently, where network bandwidth is small, or where the files are large but the differences between them are small.
The algorithm used is based on fingerprinting blocks on each file locally at both ends of the replication partners. Since many types of file changes can cause the file contents to move (for example, a small insertion or deletion at the beginning of a file can cause the rest of the file to become misaligned to the original content) the blocks used for comparison are not based on static arbitrary cut points but on cut points defined by the contents of each file segment. This means that if a part of a file changes in length or blocks of the contents get moved to other parts of the file, the block boundaries for the parts that have not changed remain fixed related to the contents, and thus the series of fingerprints for those blocks don't change either, they just change position. By comparing all hashes in a file to the hashes for the same file at the other end of the replication pair, RDC is able to identify which blocks of the file have changed and which haven't, even if the contents of the file has been significantly reshuffled. Since comparing large files could imply making large numbers of signature comparisons, the algorithm is recursively applied to the hash sets to detect which blocks of hashes have changed or moved around, significantly reducing the amount of data that needs to be transmitted for comparing files.
- Introduction to DFS replication
- About Remote Differential Compression
- Optimizing File Replication over Limited-Bandwidth Networks using Remote Differential Compression