ARK: Archival Resource Key
ARKs are URLs designed to support long-term access to information objects. They can identify objects of any type:
- digital objects – documents, databases, images, software, websites, etc.
- physical objects – books, bones, statues, etc.
- living beings and groups – people, animals, companies, orchestras, etc.
- intangible objects – places, chemicals, diseases, vocabulary terms, performances, etc.
ARKs are assigned by information providers for a variety of reasons:
- self-sufficiency – assigners are not required to join a group or pay fees
- simplicity – access mechanisms rely only on mainstream web protocols and "redirects"
- versatility – with "inflections" (different endings), an ARK can give access to metadata, promises, and more
- transparency – no identifier can guarantee stability, but ARK inflections help users make informed judgements
- visibility – syntax rules make ARKs easy to extract from texts and to compare with one another for variant and containment relationships between objects
- reliability – an ARK URL retains its core identity even when hosted by different providers, whether in parallel or by future succession
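As a rough illustration of the "visibility" point above, the following Python sketch pulls ARKs out of running text and checks a containment relationship. It rests on simplifying assumptions that are not taken from this page: the NAAN is five digits (like 13030), names use only letters, digits, and the separators "/", ".", and "-", and a "/" in the name marks containment. The name abc123 and the host example.org are hypothetical.

```python
import re

# Assumed, simplified pattern: "ark:" label, optional slash, five-digit NAAN,
# slash, then a name that ends in a letter or digit (so trailing sentence
# punctuation is not captured). The real specification allows more.
ARK_RE = re.compile(r"ark:/?(\d{5})/([0-9A-Za-z]+(?:[/.\-][0-9A-Za-z]+)*)")


def find_arks(text):
    """Return the core identifiers (hostname dropped) found in text."""
    return [f"ark:/{naan}/{name}" for naan, name in ARK_RE.findall(text)]


def is_contained_in(child, parent):
    """True if child extends parent by a "/"-separated part (assumed convention)."""
    return child.startswith(parent + "/")


sample = ("See http://example.org/ark:/13030/abc123 and its part "
          "ark:/13030/abc123/s3.")
arks = find_arks(sample)
print(arks)
print(is_contained_in(arks[1], arks[0]))
```

Because the hostname is dropped before comparison, two citations of the same object through different providers reduce to the same string.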
To date, about 100 organizations have registered to assign ARKs. Some of the largest users are:
- The California Digital Library
- The Internet Archive
- National Library of France (Bibliothèque nationale de France)
- Portico Digital Preservation Service
- University of California Berkeley
- University of Chicago
We are very interested in building a community of users and will be announcing an email forum soon. Here is a brief summary of other resources relevant to ARKs.
- The ARK Identifier Scheme Specification (PDF version, TXT version)
- Towards Electronic Persistence Using ARK Identifiers (July 2003)
- ARK and UC3/CDL Identifier conventions
- Archival Resource Key - Wikipedia
- EZID service: long term identifiers made easy
- N2T resolver: Name-to-Thing
- NOID: (Nice Opaque Identifier) Minting and Binding Tool
ARK Anatomy and NAANs (Name Assigning Authority Numbers)
An ARK is represented by a sequence of characters that contains the label, "ark:", optionally preceded by the protocol name ("http://") and hostname that begins every URL. That first part of the URL, or the "Name Mapping Authority" (NMA), is mutable and replaceable, as neither the web server itself nor the current web protocols are expected to last longer than the identified objects. The immutable, globally unique identifier follows the "ark:" label. This includes a "Name Assigning Authority Number" (NAAN) identifying the naming organization, followed by the name that it assigns to the object.
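The anatomy just described can be sketched in Python as a small parser that splits an ARK into its mutable NMA and its immutable NAAN and Name. This is only an illustration under assumptions: the regular expression is a simplification rather than the grammar from the specification, and the Ark type, the parse_ark function, the host example.org, and the name abc123 are all hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Ark(NamedTuple):
    nma: Optional[str]  # protocol and hostname; mutable, may be absent
    naan: str           # Name Assigning Authority Number
    name: str           # name assigned by that authority


# Simplified pattern: optional "http(s)://host/", the "ark:" label,
# an optional slash, the NAAN, a slash, and the rest as the name.
_ARK = re.compile(r"^(?:(https?://[^/]+)/)?ark:/?([^/]+)/(.+)$")


def parse_ark(text: str) -> Ark:
    """Split an ARK string into NMA, NAAN, and Name."""
    match = _ARK.match(text)
    if match is None:
        raise ValueError(f"not an ARK: {text!r}")
    return Ark(match.group(1), match.group(2), match.group(3))


print(parse_ark("http://example.org/ark:/13030/abc123"))
print(parse_ark("ark:/13030/abc123"))
```

The second call shows the form without an NMA: the identifier is still complete, just not directly clickable.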
Here is a diagrammed example:
The ARK syntax can be summarized as [http://NMA/]ark:/NAAN/Name.
The NMA part, which makes the ARK actionable (clickable in a web browser), is in brackets to indicate that it is optional and replaceable. ARKs are intended to work with objects that last longer than the organizations that provide services for them, so when the provider changes it should not affect the object's identity. A different provider hosting the object would simply replace the NMA to reflect the new "home" of the object. For example,
Note that the ark:/NAAN/Name remains the same.
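The NMA substitution described above can be sketched in a few lines of Python. The function names and the hosts old.example.org and new.example.net are hypothetical, and the sketch assumes the "ark:" label appears exactly once, at the start of the core identifier.

```python
def core_identity(ark_url: str) -> str:
    """Return the part from the "ark:" label onward (the immutable identity)."""
    start = ark_url.find("ark:")
    if start == -1:
        raise ValueError(f"no 'ark:' label in {ark_url!r}")
    return ark_url[start:]


def rehost(ark_url: str, new_nma: str) -> str:
    """Replace the NMA (protocol and hostname) and keep the core identity."""
    return new_nma.rstrip("/") + "/" + core_identity(ark_url)


old = "http://old.example.org/ark:/13030/abc123"
new = rehost(old, "http://new.example.net")
print(new)
print(core_identity(old) == core_identity(new))
```

Comparing the results of core_identity is one simple way to confirm that two URLs served by different providers name the same object.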
NAAN: Name Assigning Authority Number
The NAAN part, following the "ark:" label, uniquely identifies the organization that assigned the Name part of the ARK. Often the initial access provider (the first NMA) coincides with the original namer (represented by the NAAN). However, access may be provided by one or more different entities instead of, or in addition to, the original naming authority.
The NAAN used in the ARK anatomy diagram, 13030, represents the California Digital Library. As of 2012, roughly a hundred organizations had registered for ARK NAANs, including numerous universities, Google, the Internet Archive, WIPO, the British Library, and other national libraries.
UC3/CDL maintains a complete registry of all currently assigned NAANs, which is mirrored at the (U.S.) National Library of Medicine and the Bibliothèque nationale de France.
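To show how a NAAN ties an ARK back to its naming organization, here is a minimal Python sketch of a registry lookup. The dictionary is a toy stand-in holding only the one assignment mentioned in the text (13030 for the California Digital Library); it does not reflect the format of the actual registry, and the function name, host, and object name are hypothetical.

```python
# Toy stand-in for the NAAN registry; only the entry named in the text is included.
NAAN_REGISTRY = {"13030": "California Digital Library"}


def naming_authority(ark: str) -> str:
    """Look up the organization that assigned the Name part of an ARK."""
    start = ark.find("ark:")
    if start == -1:
        raise ValueError(f"no 'ark:' label in {ark!r}")
    # The NAAN is the first component after the label (and its optional slash).
    naan = ark[start + len("ark:"):].lstrip("/").split("/", 1)[0]
    return NAAN_REGISTRY.get(naan, "unknown NAAN " + naan)


print(naming_authority("http://example.org/ark:/13030/abc123"))
```

The lookup ignores the hostname entirely, which mirrors the point that the access provider and the naming authority need not be the same organization.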