
Duplicate reduction

Cost Effectiveness: WAS 3.8.1

WAS curators will have the option of switching on "duplicate reduction" features for any archive. When the feature is enabled, the web crawler will maintain a record of the size and checksum of each file captured from a website. Subsequent captures of that site will compare the live content to the benchmark content and will save only files that are new or changed. This will particularly benefit sites with extensive PDF or multimedia content, which rarely changes but requires significant storage. Archives built with duplicate reduction features will render content seamlessly; end users will not be affected by the more efficient use of storage.
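The size-and-checksum comparison described above can be illustrated with a minimal sketch. This is not the WAS crawler's implementation: the function names `fingerprint` and `capture`, the use of SHA-256 as the checksum, and the in-memory dictionaries standing in for the benchmark record and the live site are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, Tuple

# A fingerprint is the (size in bytes, checksum) pair recorded per file.
Fingerprint = Tuple[int, str]


def fingerprint(content: bytes) -> Fingerprint:
    """Return the size and checksum of a file's content.

    SHA-256 is an assumption; the source does not name the checksum used.
    """
    return len(content), hashlib.sha256(content).hexdigest()


def capture(live_files: Dict[str, bytes],
            benchmark: Dict[str, Fingerprint]) -> Dict[str, bytes]:
    """Return only the files that are new or changed since the benchmark.

    `live_files` maps a URL to its current content (hypothetical stand-in
    for a crawl). `benchmark` maps a URL to the fingerprint recorded by an
    earlier capture and is updated in place for every file that is saved.
    """
    saved: Dict[str, bytes] = {}
    for url, content in live_files.items():
        current = fingerprint(content)
        if benchmark.get(url) != current:
            # New file, or size/checksum differs from the benchmark: store it.
            saved[url] = content
            benchmark[url] = current
        # Otherwise the file is a duplicate and nothing is stored.
    return saved
```

Under these assumptions, a first capture against an empty benchmark saves every file, while a later capture of the same site saves only the files whose size or checksum has changed. A large PDF that is unchanged between captures would be stored once.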
