In support of the Memento: Time Travel for the Web challenge at dev8d, the University of California Curation Service at the California Digital Library has enabled access to content in Web Archiving Service public archives for Memento.
To help developers identify URLs that have multiple representations in the public archives, CDL is providing text files with URL lists. These are a selection of the URLs that have at least 4 instances in the public Web archives. They are divided into groups based on additional qualities that might be meaningful to developers working with Memento.
- California Wildfires 2007 [wildfire_urls.txt]
This is a capture of an event that unfolded over the course of several days. There are up to five captures of each site over a short time-span.
- Leftist Political Movements [left_urls.txt]
Captures from this set may span from January 2007 to August 2009.
- Middle-Eastern Political Organizations [middle_east_urls.txt]
In addition to a timespan from January 2007-August 2009, content from this set of URLs may contain Arabic and other character sets.
An Excel spreadsheet is also available that contains URLs from several other public archives, along with the archive name and site name. [multiple_urls.xls]
In all cases, the URLs should have 4 or more publicly available instances. The archive names in this spreadsheet correspond to the archives at http://webarchives.cdlib.org. The site names in this spreadsheet correspond to what you will see in the "Site List" tab if you go to that archive. (Note that Arabic site names did not survive the transition to Excel.)
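As a starting point for working with the text files, the following is a minimal Python sketch. It assumes each file holds one URL per line, which this page does not state, and it builds an `Accept-Datetime` request header as defined in the Memento protocol (RFC 7089); the header name in use at the time of the challenge may have differed. The function names and the example.org URLs are illustrative placeholders, not part of the CDL data, and no TimeGate address is assumed.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def read_url_list(text):
    """Return the non-empty lines of a URL list, stripped of whitespace.

    Assumption: the list files contain one URL per line.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def accept_datetime_headers(when):
    """Build an Accept-Datetime header (RFC 7089) for a target datetime.

    `when` must be timezone-aware; it is converted to GMT for the header.
    """
    as_gmt = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(as_gmt, usegmt=True)}


if __name__ == "__main__":
    # Placeholder content standing in for a file such as wildfire_urls.txt.
    sample = "http://example.org/a\n\nhttp://example.org/b\n"
    target = datetime(2007, 10, 23, 12, 0, tzinfo=timezone.utc)
    for url in read_url_list(sample):
        print(url, accept_datetime_headers(target))
```

The resulting headers can be sent with any HTTP client to a Memento TimeGate for the URL in question. Since content from the Middle-Eastern set may contain Arabic and other character sets, decode response bodies using the charset each response declares rather than assuming one.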