A METS based information package for long term accessibility of web archives - British Library Research Repository
Conference paper (published)

A METS based information package for long term accessibility of web archives

2010

Abstract

The British Library’s web archive comprises several terabytes of harvested websites. Like other content streams, this data should be ingested into the library’s central preservation repository. The repository requires a standardized Submission Information Package and Archival Information Package. Harvested websites are stored in Archival Information Packages (AIPs). Each AIP is described by a METS file. Operational metadata for resource discovery as well as archival metadata are normalized and embedded in the METS descriptor using common metadata profiles such as PREMIS and MODS. The British Library’s METS profile for web archiving considers dissemination and preservation use cases, ensuring the authenticity of data. The underlying complex content model disaggregates websites into web pages, associated objects and their actual digital manifestations. This additional layer of abstraction ensures accessibility over the long term and the ability to carry out preservation actions such as migrations. As a result, the library-wide preservation policies and principles become applicable to web content as well.
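
As a rough illustration of the kind of descriptor the abstract refers to, the following sketch builds a minimal METS document that wraps a MODS title and a PREMIS object identifier and links a website to a web page and a stored file. It does not reproduce the British Library's actual METS profile: the function name `build_mets`, the IDs (`dmd1`, `tech1`, `file1`), the `TYPE` values and the sample inputs are all assumptions made for illustration, and the output is not validated against any schema.

```python
import xml.etree.ElementTree as ET

# Namespace URIs of the published METS, MODS, PREMIS (v2) and XLink schemas.
NS = {
    "mets": "http://www.loc.gov/METS/",
    "mods": "http://www.loc.gov/mods/v3",
    "premis": "info:lc/xmlns/premis-v2",
    "xlink": "http://www.w3.org/1999/xlink",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Return a namespace-qualified tag name in ElementTree's {uri}tag form."""
    return f"{{{NS[prefix]}}}{tag}"


def build_mets(title, page_url, file_path):
    """Build a minimal, illustrative METS descriptor (hypothetical layout)."""
    mets = ET.Element(q("mets", "mets"))

    # Descriptive metadata: a MODS record wrapped in a dmdSec.
    dmd = ET.SubElement(mets, q("mets", "dmdSec"), ID="dmd1")
    wrap = ET.SubElement(dmd, q("mets", "mdWrap"), MDTYPE="MODS")
    mods = ET.SubElement(ET.SubElement(wrap, q("mets", "xmlData")), q("mods", "mods"))
    title_info = ET.SubElement(mods, q("mods", "titleInfo"))
    ET.SubElement(title_info, q("mods", "title")).text = title

    # Administrative metadata: a PREMIS object wrapped in a techMD section.
    amd = ET.SubElement(mets, q("mets", "amdSec"), ID="amd1")
    tech = ET.SubElement(amd, q("mets", "techMD"), ID="tech1")
    wrap = ET.SubElement(tech, q("mets", "mdWrap"), MDTYPE="PREMIS:OBJECT")
    obj = ET.SubElement(ET.SubElement(wrap, q("mets", "xmlData")), q("premis", "object"))
    ident = ET.SubElement(obj, q("premis", "objectIdentifier"))
    ET.SubElement(ident, q("premis", "objectIdentifierType")).text = "URL"
    ET.SubElement(ident, q("premis", "objectIdentifierValue")).text = page_url

    # File section: the stored digital manifestation of the harvested page.
    file_sec = ET.SubElement(mets, q("mets", "fileSec"))
    grp = ET.SubElement(file_sec, q("mets", "fileGrp"))
    f = ET.SubElement(grp, q("mets", "file"), ID="file1", ADMID="tech1")
    ET.SubElement(f, q("mets", "FLocat"), {"LOCTYPE": "URL", q("xlink", "href"): file_path})

    # Structural map: website -> web page -> pointer to the stored file.
    struct = ET.SubElement(mets, q("mets", "structMap"))
    site = ET.SubElement(struct, q("mets", "div"), TYPE="website", DMDID="dmd1")
    page = ET.SubElement(site, q("mets", "div"), TYPE="webpage")
    ET.SubElement(page, q("mets", "fptr"), FILEID="file1")
    return mets


if __name__ == "__main__":
    root = build_mets("Example site", "http://www.example.org/", "example.arc.gz")
    print(ET.tostring(root, encoding="unicode"))
```

The nested `div` elements in the structural map are one possible way to express the separation between a website, its pages and the files that manifest them; a real profile would add further sections, for example for provenance events and rights.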

Files

File name | Date uploaded | Visibility | File size
enders_mets.pdf | 13 Sep 2019 | Public | 414 kB