Conference Item
Abstract: The role of standards in digital preservation is widely acknowledged. The current version of the ePub standard, used for publishing and disseminating eBooks, is ePub3, specifically 3.1 (January 2017). A marked difference from ePub2 is support for fixed layout files and, whilst several different ePub readers are available, not all have upgraded to provide full support for ePub3. In late 2017 and early 2018 the British Library’s digital preservation team undertook research into the impact of using an ePub viewer without explicit support for ePub3 on a mixed sample of ePub3 files. The sample comprised 54 files: 20 of these were fixed layout; the remainder used reflowable layouts. For the analysis, content was accessed using two different open source ePub readers: Calibre, which has a wide user base but does not currently explicitly support ePub3, and Readium, which does explicitly support ePub3. ePub3 files with a reflowable layout broadly rendered to an acceptable standard using both readers. There was one notable exception for both readers, and investigations indicated this was likely due to a problem with the file itself rather than the rendering software. Problems manifested in a more serious way when using Calibre to access just under half of the fixed layout ePub3 files. The rendering of these items inhibited access to intellectual content, for example by overlaying it on other content, misrepresenting it, or not including it at all. Other issues that were initially apparent with the other half of the fixed layout sample were mostly resolved by the simple fix of switching from ‘page’ view to ‘flow’ view. By contrast, Readium was able to display all of the fixed layout files correctly. The research serves as a reminder that whilst standards remain an essential tool in the digital preservation toolbox, updates to a standard may necessitate changes to the rendering software in use. It further underlines the importance of accurate characterization so that repositories can identify formats at a version level or be able to identify items with explicit rendering needs beyond those served by their default rendering viewers.
Pennock, Maureen; Day, Michael
Conference Item
Abstract: The British Library and the other UK Legal Deposit Libraries have been collecting various forms of born-digital publications since 2013 as part of what is known as Non-Print Legal Deposit (NPLD). In 2017, the UK Legal Deposit Libraries established an Emerging Formats project to look at selected types of content that were potentially in scope for NPLD. At the beginning of 2018, the British Library’s digital preservation team commenced research, as part of the project, into the preservation implications of some of these new forms of publication. Over the course of three months, the project analysed a small sample of interactive narrative works and mobile eBook apps from the Apple store. The evaluation revealed specific characteristics including: the automated personalization of narrative content; the integration of images taken by device cameras; movement-driven behavioural changes to item displays; the use of third-party content to drive narratives forward; and game-like features with a high dependency on visual, illustrated displays. Engagement with content experts underlined the significance of the interactive elements, suggesting that any subsequent preservation plans need to take this into account. Accompanying technical analysis using the Library’s format sustainability assessment framework identified issues around DRM (digital rights management), proprietary environments, and the use of third-party content. Whilst further research is clearly needed to validate the findings beyond the initial sample size and to resolve the technical challenges identified, other more conceptual questions remain about such matters as ownership, authority, provenance, versioning, and personalization. Engagement with content providers will be key to resolving all of these in a satisfactory manner.
Day, Michael; Pennock, Maureen; Smith, Caylin; Jenkins, Jeremy; Cooke, Ian
Conference Item
Abstract: File format assessments have been the subject of much debate in and outside of the preservation community in the past decade. Recognizing the unique structural, operational, and collecting context of the British Library, the Library’s digital preservation team recently initiated new format assessment work to deliver recommendations on which file formats will best enable the preservation of integral, authentic representations of British Library collection content over the long term. This paper describes the work carried out to review previous assessments, identify appropriate sustainability categories and newly assess formats accordingly. We posit that the relatively ‘fuzzy’ nature of a file format requires a relatively open-ended assessment framework and a nuanced understanding of preservation risk that does not solely lie with ‘all-or-nothing’ format obsolescence. We review other work in this area and suggest that whilst previous format assessment work has addressed a range of subtly different aims, experience has since indicated that some of the criteria used - such as considering number of pages in a format specification as a measure of complexity - may be invalid. British Library assessments are made on documented points of principle, for example, an emphasis on evidence-based preservation risks and the avoidance of numerical scores leading to comparisons between formats, and these have formed the base upon which sustainability categories are defined. We present these categories, which help to identify preservation risks or other challenges in the management of digital collections, and provide an overview of initial assessments of three formats: TIFF, JP2, and PDF. We acknowledge, however, that implementation of preservation requirements, e.g., the use of particular preservation-justified file formats, must be balanced against other business requirements, such as storage costs and access needs, and argue that transparency of this format assessment process is fundamental if the resulting recommendations are to be fully understood in the future.
Pennock, Maureen; Wheatley, Paul; May, Peter
Identifying digital preservation requirements: Digital Preservation Strategy and collection profiling at the British Library
Abstract: The British Library is increasingly a digital library. Over past decades, it has built up significant collections of digital content covering a very wide range of content types. In addition to the increasing amounts of digital content acquired by purchase or donation, the Library and its partners have also invested heavily in the digitization of selected collection content, helping to create large collections of certain types of content (e.g., newspapers, out-of-copyright books, and sound). Most recently, the extension of legal deposit provisions to non-print works in 2013 has meant that the British Library - working in conjunction with the other UK legal deposit libraries - has begun to collect new categories of digital content, including periodic harvests of the UK Web domain. In order to support this, the Library has also invested heavily in developing scalable infrastructures for the acquisition, storage and management of large amounts of digital content. The British Library Digital Preservation Strategy, 2013-2016 focuses on embedding digital sustainability as an organizational principle across the Library and on helping to manage preservation risks and challenges across all digital collection content lifecycles. This practice paper describes work being undertaken by the Digital Preservation Team at the British Library to develop content profiles of high-level digital collections that will support the implementation of the strategy, in particular for the capture of long-term preservation requirements.
Day, Michael; McDonald, Ann; Kimura, Akiko; Pennock, Maureen