To help developers identify URLs that have multiple representations in the public archives, CDL is providing text files with URL lists. These are a selection of the URLs that have at least 4 instances in the public Web archives. They are divided into groups based on additional qualities that might be meaningful to developers working with Memento.

  • California Wildfires 2007 [wildfire_urls.txt]
    This is an event capture that unfolded over the course of several days. There are up to five captures of each site over a short time span.

  • Leftist Political Movements [left_urls.txt]
    Captures from this set may span from January 2007 to August 2009.

  • Middle-Eastern Political Organizations [middle_east_urls.txt]
    In addition to a time span from January 2007 to August 2009, content from this set of URLs may contain Arabic and other character sets.
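
As a starting point for working with these lists, the following is a minimal sketch of loading one of the text files in Python. It assumes, without confirmation from this page, that each file holds one URL per line and is UTF-8 encoded; the function names `parse_url_list` and `load_url_list` are illustrative only.

```python
from pathlib import Path


def parse_url_list(text):
    """Return the unique URLs in `text`, keeping first-seen order.

    Assumes one URL per line (an assumption about the file layout).
    Blank lines and surrounding whitespace are ignored.
    """
    seen = set()
    urls = []
    for line in text.splitlines():
        url = line.strip()
        if not url or url in seen:
            continue
        seen.add(url)
        urls.append(url)
    return urls


def load_url_list(path):
    """Read a URL list file, assuming UTF-8 text."""
    # "utf-8-sig" decodes UTF-8 and also drops a leading byte-order mark.
    return parse_url_list(Path(path).read_text(encoding="utf-8-sig"))
```

For example, `load_url_list("middle_east_urls.txt")` would return the URLs from a locally saved copy of that file. If the files turn out to use a different encoding, the `encoding` argument would need to change.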

An Excel spreadsheet is also available that contains URLs from several other public archives, along with the archive name and site name. [multiple_urls.xls]

In all cases, the URLs should have 4 or more publicly available instances. The archive names in this spreadsheet correspond to the archives at http://webarchives.cdlib.org. The site names in this spreadsheet correspond to what you will see in the "Site List" tab if you go to that archive. (Note that Arabic site names did not survive the transition to Excel.)
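
Developers who want to check the "4 or more instances" threshold themselves could count the mementos listed in a Memento TimeMap. The sketch below assumes the TimeMap text has already been fetched and is in the link format described by the Memento specification, with quoted attribute values. It is a simplified parser for illustration, not a complete link-format implementation, and the names `count_mementos` and `has_min_instances` are hypothetical.

```python
import re

# One link entry: <uri> followed by zero or more ;name="value" attributes.
# Assumption: attribute values are always double-quoted.
LINK_RE = re.compile(r'<([^>]+)>((?:\s*;\s*[\w-]+="[^"]*")*)')
REL_RE = re.compile(r'rel="([^"]*)"')


def count_mementos(timemap_text):
    """Count link entries whose rel attribute includes the token 'memento'."""
    count = 0
    for _uri, params in LINK_RE.findall(timemap_text):
        rel = REL_RE.search(params)
        # rel may hold several tokens, e.g. "first memento".
        if rel and "memento" in rel.group(1).split():
            count += 1
    return count


def has_min_instances(timemap_text, minimum=4):
    """True if the TimeMap lists at least `minimum` mementos."""
    return count_mementos(timemap_text) >= minimum
```

Entries such as `rel="original"`, `rel="self"`, and `rel="timegate"` are skipped because none of their tokens is exactly `memento`.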