Search results
-
Conference paper (published)
DeezyMatch: A Flexible Deep Learning Approach to Fuzzy String Matching
We present DeezyMatch, a free, open-source software library written in Python for fuzzy string matching and candidate ranking. Its pair classifier supports various deep neural network architectures for training new classifiers and for fine-tuning a pretrained model, which paves the way for transfer learning in fuzzy string matching. This approach...
Hosseini, Kasra ; Nanni, Federico ; Coll Ardanuy, Mariona
Natural Language Processing, string matching, toponym matching, machine learning, and digital humanities
-
Conference paper (published)
Archiving Interactive Narratives at the British Library
This paper describes the creation of the Interactive Narratives collection in the UK Web Archive, as part of the UK Legal Deposit Libraries Emerging Formats Project. The aim of the project is to identify, collect and preserve complex digital publications that are in scope for collection under UK Non-Print Legal...
Clark, Lynda ; Rossi, Giulia Carla ; Wisdom, Stella
Emerging Formats, digital storytelling, new media collection management, Interactive Narratives collection, digital preservation, and web archiving
-
Presentation
On the verge of success – or failure? Repositories and the wider knowledge infrastructure, plus a bit about Hyku
Samvera Connect (Online) 2020 keynote presentation.
Reimer, Torsten
open source, Samvera, open access, Hyku, and repositories
-
Presentation
British Library UK DataCite Summer Meeting
Learn about how institutions and projects in the UK and internationally are using DataCite DOIs to enhance discovery and citation of content, along with recent and upcoming changes for DataCite users in the UK, with the recording of our 2020 Summer Meeting. This year’s speakers were: • Rachael Kotarski, British...
British Library
-
Conference paper (unpublished)
Conference Panel: Documenting the Olympics & Paralympics
The online panel event Documenting the Olympics & Paralympics is a collaboration between the British Library, the International Centre for Sports History and Culture (ICSHC) at De Montfort University, and the British Society of Sports History (BSSH). Originally, this was supposed to be a full day face-to-face event, but due...
Byrne, Helena
-
Conference paper (unpublished)
Assessing the Impact of OCR Quality on Downstream NLP Tasks
A growing volume of heritage data is being digitized and made available as text via optical character recognition (OCR). Scholars and libraries are increasingly using OCR-generated text for retrieval and analysis. However, the process of creating text through OCR introduces varying degrees of error to the text. The impact of...
-
Conference paper (unpublished)
ICDAR2019 Competition on Recognition of Early Indian Printed Documents – REID2019
This paper presents an objective comparative evaluation of page analysis and recognition methods for historical documents with text mainly in Bengali language and script. It describes the competition rules, dataset, and evaluation methodology. Results are presented for five methods - three submitted, one re-run, and one open source state-of-the-art system....
Clausner, Christian ; Antonacopoulos, Apostolos ; Derrick, Tom ; Pletschacher, Stefan
-
Conference paper (published)
Preserving eBooks: Past, Present and Future - A Series of National Library Perspectives
This panel will present and discuss different eBook workflows and challenges from four national libraries, considering a range of issues from technical complexities to evolution of the content type and changes in the publishing/collecting landscape.
Owens, Trevor ; Pennock, Maureen ; Smyth, Tom ; Steinke, Tobias
access, ingest, ebooks, digital preservation, formats, and scale
-
Conference paper (published)
Dawn of Digital Repositories Certification under ISO 16363. Exploring the Horizon and beyond
The dawn of Trustworthy Digital Repository Certification under the ISO 16363:2012 standard is on the horizon. Across the digital preservation community, institutions are eager to learn more about the processes of preparing for and undergoing an ISO 16363 audit from an accredited third-party organization. As the first ISO 16363 audits...
Giaretta, David ; LaPlant, Lisa ; Shiers, Jamie ; Tieman, Jessica ; Pennock, Maureen …
repository, certification, trustworthy, audit, and standards
-
Conference paper (unpublished)
Victorian decorated books with cloth covers in the second half of the nineteenth century
The developments of how the manufacture, embossing and blocking of cloth covers was achieved between 1825 and 1850 have been set out elsewhere. However, to set the scene for what came after 1850, I wish to briefly describe the machine embossing of cloth, and show some of the grain types....
King, Ed
-
Lecture
The Crystal Palace
The Crystal Palace evokes a response from almost everyone that you meet. Its fame is part of our culture. There were, of course, two Crystal Palaces. They were built for different purposes. This talk will explain some of the motives that brought this about. I have concentrated upon British Library...
King, Ed
-
Conference paper (published)
Resolving places, past and present: toponym resolution in historical British newspapers using multiple resources
Newspapers and their metadata are richly geographical, not only in their distribution but also their content. Attending to these spatial features is a prerequisite in newspaper research. Following other projects to have geoparsed place names in newspapers, we describe our approach to linking historical geospatial information in text to real-world...
Coll Ardanuy, Mariona ; McDonough, Katherine ; Krause, Amrey ; Wilson, Daniel C.S. ; Hosseini, Kasra …
-
Conference paper (published)
Cross-disciplinary Collaborations to Enrich Access to Non-Western Language Material in the Cultural Heritage Sector
The British Library is home to millions of items representing every age of written civilisation, including books, manuscripts and newspapers in all written languages. Large digitisation programmes currently underway are opening up access to this rich and unique historical content on an ever increasing scale. However, particularly for historical material...
Derrick, Tom ; McGregor, Nora
HTR, page analysis, layout analysis, recognition, Bangla script, Arabic script, OCR, and datasets
-
Conference paper (published)
Using METS, PREMIS and MODS for Archiving eJournals: Paper - iPRES 2008 - London
As institutions turn towards developing archival digital repositories, many decisions on the use of metadata have to be made. In addition to deciding on the more traditional descriptive and administrative metadata, particular care needs to be given to the choice of structural and preservation metadata, as well as to integrating...
Dappert, Angela ; Enders, Markus
-
Conference paper (published)
Risk Assessment; using a risk based approach to prioritise handheld digital information
The British Library (BL) Digital Library Programme (DLP) has a broad set of objectives to achieve over the next few years, from web-archiving to the ingest of e-journals through to mass digitisation of newspapers and books. These projects are decided by the DLP programme board and are managed by the...
McLeod, Rory
-
Conference paper (published)
Modeling Organizational Preservation Goals to Guide Digital Preservation
Digital preservation activities can only succeed if they go beyond the technical properties of digital objects. They must consider the strategy, policy, goals, and constraints of the institution that undertakes them and take into account the cultural and institutional framework in which data, documents and records are preserved. Furthermore, because...
Dappert, Angela ; Farquhar, Adam
-
Conference paper (published)
Costing the Digital Preservation Lifecycle More Effectively
Having confidence in the permanence of a digital resource requires a deep understanding of the preservation activities that will need to be performed throughout its lifetime and an ability to plan and resource for those activities. The LIFE (Lifecycle Information For E-Literature) and LIFE2 Projects have advanced understanding of the...
Wheatley, Paul
-
Conference paper (published)
Adapting Existing Technologies for Digitally Archiving Personal Lives. Digital Forensics, Ancestral Computing, and Evolutionary Perspectives and Tools
The adoption of existing technologies for digital curation, most especially digital capture, is outlined in the context of personal digital archives and the Digital Manuscripts Project at the British Library. Technologies derived from computer forensics, data conversion and classic computing, and evolutionary computing are considered. The practical imperative of moving...
John, Jeremy Leighton
-
Conference paper (unpublished)
Is there a role for ILL in an open access world – a British Library perspective
The 2017 UUK report on the transition to open access reported that 54% of UK-authored articles in 2016 were accessible within 12 months of publication. This is compared to 32% of articles authored in 2014. Over the past five years, open access research has flourished in an environment of funding...
Flanagan, Dimity
-
Conference paper (published)
The Integrated Preservation Suite: Scaled and automated preservation planning for highly diverse digital collections (long paper)
The Integrated Preservation Suite is an internally funded project at the British Library to develop and enhance the Library's preservation planning capabilities, largely focussed on automation and addressing the Library's heterogeneous collections. Through agile development practices, the project is iteratively designing and implementing the technical infrastructure for the suite as...
May, Peter ; Pennock, Maureen ; Russo, David
software preservation, knowledge base, preservation watch, and preservation planning
-
Conference paper (published)
Developing a robust migration workflow for preserving and curating hand-held media
Many memory institutions hold large collections of hand-held media, which can comprise hundreds of terabytes of data spread over many thousands of data-carriers. Many of these carriers are at risk of significant physical degradation over time, depending on their composition. Unfortunately, handling them manually is enormously time consuming and so...
Dappert, Angela ; Jackson, Andrew ; Kimura, Akiko
disk-copying robot, iPRES, data-carrier stabilization, auto loader, and digital preservation
-
Conference paper (published)
An analysis of contemporary JPEG2000 codecs for image format migration
This paper presents results of an analysis of different implementations of the JPEG2000 standard, specifically part 1: JP2, an image format that is currently popular within the digital preservation community. In particular we are interested in the effect different JPEG2000 codecs (encoders and decoders) have on image quality in response...
Palmer, William ; May, Peter ; Cliff, Peter
TIFF, image quality, generational loss, JPEG2000, migration, codec, and PSNR
-
Conference paper (published)
Quality assured image file format migration in large digital object repositories
This article gives an overview on how different components developed by the SCAPE project are intended to be used in composite file format migration workflows; it will explain how the SCAPE platform can be employed to make sure that the workflows can be used to migrate very large image collections...
Schlarb, Sven ; Cliff, Peter ; May, Peter ; Palmer, William ; Hahn, Matthias …
-
Conference paper (published)
Capturing and replaying streaming media in a web archive – a British Library case study
A prerequisite for digital preservation is to be able to capture and retain the content which is considered worth preserving. This has been a significant challenge for web archiving, especially for websites with embedded streaming media content, which cannot be copied via a simple HTTP request to a URL. This...
Hockx-Yu, Helen ; Crawford, Lewis ; Coram, Roger ; Johnson, Stephen
-
Conference paper (published)
LIFE3: A predictive costing tool for digital collections
Predicting the costs of long-term digital preservation is a crucial yet complex task for even the largest repositories and institutions. For smaller projects and individual researchers faced with preservation requirements, the problem is even more overwhelming, as they lack the accumulated experience of the former. Yet being able to estimate...
Hole, Brian ; Lin, Li ; McCann, Patrick ; Wheatley, Paul
-
Conference paper (published)
Deal with conflict, capture the relationship: the case of digital object properties
Properties of digital objects play a central role in digital preservation. All key preservation services are linked via a common understanding of the properties which describe the digital objects in a repository's care. Unfortunately, different services deal with properties on sometimes different levels of description. While, for example, a preservation...
Dappert, Angela
-
Conference paper (published)
A METS based information package for long term accessibility of web archives
The British Library’s web archive comprises several terabytes of harvested websites. Like other content streams this data should be ingested into the library’s central preservation repository. The repository requires a standardized Submission- and Archival Information Package. Harvested Websites are stored in Archival Information Packages (AIP). Each AIP is described by...
Enders, Markus
-
Poster (published)
Malware threats in digital preservation: Extending the evidence base (poster)
Virus checking is an established process in most pre-ingest digital preservation workflows. It is typically included as part of a general threat model response and there has to date been relatively little research into the virus checking function specifically within a long term context. The British Library recently began a...
Pennock, Maureen ; Day, Michael ; Samaras, Evanthia
malware, Flashback, virus checking, and digital preservation
-
Conference paper (published)
Considerations on the acquisition and preservation of ebook mobile apps
In 2018 and 2019, as part of the UK Legal Deposit Libraries’ sponsored ‘Emerging Formats’ project, the British Library’s digital preservation team undertook a program of research into the preservation of new forms of content. One of these content types was eBooks published as Mobile Apps. Research considered a relatively...
Pennock, Maureen ; May, Peter ; Day, Michael
access, mobile apps, acquisition, digital preservation, and preservation
-
Presentation
The Integrated Preservation Suite: Demonstrating a scalable preservation planning toolset for diverse digital collections (demonstration)
The Integrated Preservation Suite is an internally funded project at the British Library to develop automated and scalable preservation planning capability for a highly diverse and growing digital collection. Core components include a technical knowledge base, a software repository, a policy and planning repository, and a preservation watch function, all...
May, Peter ; Pennock, Maureen ; Russo, David
software preservation, preservation planning, preservation watch, digital preservation strategies, and knowledge base
-
Conference paper (published)
Practical analysis of TIFF file size reductions achievable through compression
This paper presents results of a practical analysis into the effects of three main lossless TIFF compression algorithms – LZW, ZIP and Group 4 – on the storage requirements for a small set of digitized materials. In particular we are interested in understanding which algorithm achieves a greater reduction in...
May, Peter ; Davies, Kevin
LZW, Group 4, LibTiff, TIFF, ZIP, compression, and ImageMagick
-
Conference paper (published)
Not just a British library: enabling a global discovery experience
Within the walls of the British Library lies one of the greatest collections in the world. However, the value of the British Library lies not only in the preservation of heritage items, but also in its determination to keep pace with the many changes in the global information environment. As...
Flanagan, Dimity
open access, repositories, discovery, persistent identifiers, text and data mining, and digitisation
-
Presentation
FREYA halfway webinar 9 May 2019
During the webinar we looked back at the progress of the first part of the project and discussed the PID Graph and the growth of a PID community.
Lavasa, Artemis ; Dohna, Tina ; Ferguson, Christine A. ; Bunakov, Vasily ; Jong, Maaike De …
-
Presentation
The Power of PIDs
Lightning Talk slide presented at Carpentry Connect 2019 in Manchester UK.
Madden, Frances
-
Presentation
FREYA RDA UK Workshop July 2019
Presentation introducing the FREYA project at a joint RDA UK and FREYA workshop held 16 July 2019.
Madden, Frances
-
Presentation
Webs Of Life And Data: Impacts Of Open And Networked Data On Scientific Practices In Biodiversity Studies (Draft DPhil Research Proposal)
A presentation of my doctoral (DPhil) research topic at the Oxford Internet Institute, University of Oxford, on Nov. 22, 2017. This is an early-stage presentation outlining the context of my research, which will investigate the impacts of open and networked data derived from digital natural history collections, on scientific practices...
Stewart, Sarah A.
data management, museums, research data lifecycle, science and technology studies, biodiversity, digital collections, internet studies, DPhil, knowledge, and data
-
Conference paper (unpublished)
Towards a Networked Digital Cultural Heritage: Data Services and Persistent Identifiers at the British Library
Presentation given at the ‘Museums and Big Data’ Conference, April 30 - May 3rd, in Doha, Qatar. This presentation investigates the use of persistent identifiers in digital cultural heritage and digital collections.
Stewart, Sarah Anna
museums, persistent identifiers, DOIs, cultural heritage, data, open research, digital scholarship, archives, DataCite, art galleries, digital collections, and libraries
-
Conference paper (unpublished)
Research Data Management in 'GLAM': Managing Data for Cultural Heritage
A presentation given at the ‘Open Science Infrastructures for Big Cultural Data’ Masterclass, Dec. 13-15th, in Plovdiv, Bulgaria, looking at research data management in the context of open digital cultural collections, with a case study of the developments in data management and data management infrastructures at the British Library.
Stewart, Sarah Anna
data management, museums, research data management, cultural heritage, data, digital collections, digital scholarship, archives, libraries, art galleries, and open research
-
Conference paper (unpublished)
Conference Panel: The past, present and future of digital scholarship with newspaper collections
Historical newspapers are of interest to many humanities scholars, valued as sources of information and language closely tied to a particular time, social context and place. Following library and commercial microfilming and, more recently, digitisation projects, newspapers have been an accessible and valued source for researchers. The ability to use...
Ridge, Mia ; Colavizza, Giovanni
-
Presentation
"...with the Software in the Library": Best practices for research software management and citation
A presentation given at the University College London Knowledge Quarter KQ Codes Tech Social on July 17, 2019. Outlines the research data landscape, data services and collections at the British Library and best practices for research software management and citation including software management planning, licensing resources for open source software...
-
Conference paper (unpublished)
The sound of artists' books
Artists’ books – any books – are capable of sound, whether dropped, as in Keith Godard’s otherwise text-less and image-less Sounds (1972), or, fluttering noisily, drying out, in the chill spring wind, on the monastery roof in Sergo Paradjanov’s film, The Colour of Pomegranates (1969).
Bury, Stephen
-
Conference paper (unpublished)
Structuring curatorial responsibilities to incorporate sabbaticals, research etc
A.W. Pollard, a keeper of printed books at the British Museum at the beginning of last century and an important Shakespearean scholar in his own right, remarked that one of the incentives to his career as a published writer was the low pay of the curator. So the simple way...
Bury, Stephen
-
Lecture
The William Dyce and Edward Machell Cox collections of art sale catalogues in the British Library
In the flyer for this talk I called sale catalogues ‘unassuming’ and ‘half-hidden’. ‘Unassuming’ because they often are simple lists of works, ‘half-hidden’ because, in the British Library at least, they are not always catalogued separately and therefore are easy to miss. They are, however, extremely important for the study...
Michaelides, Chris
-
Conference paper (unpublished)
Translating the enemy
This paper has three sources or “causes”, two of them “prior”, the third “final”. These are: firstly, the translation by the present writer of a fairly large group of poems and texts by the poet Velimir Khlebnikov (b1885, d1922), intended as a contribution to an anthology of English language translations...
Chadwick, Brian
-
Conference paper (unpublished)
The epic unwriting of Empire: a case study. Khlebnikov -nash edinstvennyi poet-epik XX veka
I was discussing with a friend the problems I was having in introducing my topic or theme. The friend in question is one of the artists who has been working on the film which I will show later. He had read through my text, which was, I thought, mainly finished,...
Chadwick, Brian
-
Conference paper (unpublished)
How to look after your Collection - A basic guide
Many philatelists understand that they are the guardians of the material in their collections for themselves and for future owners. It is unfortunate when some collectors show a disregard for looking after their collection and dismiss comment with a remark like “it will be OK in my life time”. It...
Beech, David R.
-
Conference paper (unpublished)
The Time of Place: Louis-Sebastien Mercier and the hours of the day
I was recently reading The White Cities, Joseph Roth’s reports from France, 1925–1939, when, amongst many other moments, I was struck by the following passage: The manufacturers have their villas on the other side of the Rhône. That’s where the workers live – not in villas, alas, but in tenements....
Shaw, Matthew J.
-
Conference paper (unpublished)
The British Library Philatelic Collections 1998 to 2005
This Paper is the third in a series that has reported to the Society and the philatelic world on the activities of the British Library Philatelic Collections. The first was given on 1st December, 1988 by my predecessor R F “Bob” Schoolley-West FRPSL and the second I gave on 9th...
Beech, David R.
-
Conference paper (unpublished)
The Philately of the Edwardian era as shown in its literature
As this Paper is being given in 2006 no one can be alive who has any meaningful experience of philately in the reign of His Majesty King Edward VII. To discover virtually anything at all the researcher must examine the literature and the archives of the period. As far as...
Beech, David R.
-
Conference paper (unpublished)
Collective intelligence or intelligent collecting: alternative survival strategies for audiovisual archives in the Information Age
Despite the evident prescriptive statement in the sub-title to this presentation, this sketch of the way things appear to me to be is intended to generate collaborative inquiry within IASA and its institutional members rather than present strategic actions that can be applied on return from this Conference.
Clark, Chris
-
Conference paper (unpublished)
Lost for words? The Earliest Representations of the Americas in European Sources
West, Geoffrey
-
Conference paper (unpublished)
La adquisición del Amadís de Gaula, Libros I-IV (Zaragoza, 1508) por el Museo Británico
West, Geoffrey
-
Conference paper (published)
Adventures with ePub3: when rendering goes wrong
The role of standards in digital preservation is widely acknowledged. The current version of the ePub standard, used for publishing and disseminating eBooks, is ePub3, specifically 3.1 (January 2017). A marked difference from ePub2 is support for fixed layout files and, whilst several different ePub readers are available, not all...
Pennock, Maureen ; Day, Michael
-
Conference paper (published)
Preservation planning for emerging formats at the British Library
The British Library and the other UK Legal Deposit Libraries have been collecting various forms of born-digital digital publications since 2013 as part of what is known as Non-Print Legal Deposit (NPLD). In 2017, the UK Legal Deposit Libraries established an Emerging Formats project to look at selected types of...
Day, Michael ; Pennock, Maureen ; Smith, Caylin ; Jenkins, Jeremy ; Cooke, Ian
-
Poster (published)
Scaled and automated preservation planning for highly diverse digital collections: the Integrated Preservation Suite
This poster describes the Integrated Preservation Suite (IPS) project. IPS is a British Library initiative to develop and populate an infrastructure capable of supporting preservation planning of highly diverse digital collections at scale. IPS comprises: A Representation Information Registry with information about formats and wider technical environments relevant to the...
Pennock, Maureen ; May, Peter
-
Conference paper (published)
The Flashback Project: Rescuing disk-based content from the 1980s to the present day
This paper introduces the British Library's Flashback project, a proof-of-concept that explored the practical challenges of preserving digital content stored on physical media (magnetic and optical disks) using a sample of content from hybrid collection items dating from between 1980 and 2010. It describes some of the activities undertaken by...
Pennock, Maureen ; May, Peter ; Day, Michael ; Davies, Kevin ; Whibley, Simon …
-
Conference paper (published)
Sustainability assessments at the British Library: Formats, frameworks and findings
File format assessments have been the subject of much debate in and outside of the preservation community in the past decade. Recognizing the unique structural, operational, and collecting context of the British Library, the Library’s digital preservation team recently initiated new format assessment work to deliver recommendations on which file...
Pennock, Maureen ; Wheatley, Paul ; May, Peter
file formats, British Library, assessments, transparency, sustainability, and preservation master
-
Conference paper (published)
Identifying digital preservation requirements: Digital Preservation Strategy and collection profiling at the British Library
The British Library is increasingly a digital library. Over past decades, it has built up significant collections of digital content covering a very wide range of content types. In addition to the increasing amounts of digital content acquired by purchase or donation, the Library and its partners have also invested...
Day, Michael ; McDonald, Ann ; Kimura, Akiko ; Pennock, Maureen
preservation planning, institutional contexts of preservation, collection content profiling, and digital preservation
-
Conference paper (unpublished)
From standard to community resource: a view on ISNIs and ORG IDs
Over the last year, the International Standard Name Identifier board have been considering the ways in which ISNI as a system can improve to meet new challenges and become more open and transparent. One particular consideration has been to make ISNIs a better solution for organisation identifiers. The British Library...
Reimer, Torsten ; Madden, Frances
-
Conference paper (published)
Arabic dialect identification in the context of bivalency and code-switching
In this paper we use a novel approach towards Arabic dialect identification using language bivalency and written code-switching. Bivalency between languages or dialects is where a word or element is treated by language users as having a fundamentally similar semantic content in more than one language or dialect. Arabic dialect...
El-Haj, Mahmoud ; Rayson, Paul ; Aboelezz, Mariam
Arabic, machine learning, dialects, language identification, NLP, and bivalency
-
Conference paper (published)
A Latinised Arabic for All? Issues of representation, purpose and audience
This paper reviews two major issues which account for much of the variation in representing Arabic using Latin characters. Since the Latinisation of Arabic entails encoding additional phonetic information (by adding short vowels), how we choose to represent Arabic for Latinisation becomes a central issue. This representation may either reflect the...
Aboelezz, Mariam
-
Conference paper (unpublished)
Bibliographic records as 'Big Data': seeking harmony in music metadata
The collaborative research project ‘A Big Data History of Music’ draws on a disparate array of music catalogues created over nearly two centuries. During that time, many different cataloguing rules have existed; national and international standards have developed for cataloguing printed materials, and, in many countries, separate protocols established for...
Tuppen, Sandra
-
Conference paper (published)
Prospects for a Big Data History of Music
This position paper sets out the possibility of a musicology based on the analysis of musical-bibliographical metadata as Big Data. It outlines the work underway, as part of the AHRC-funded project A Big Data History of Music, to align seven major datasets of musical-bibliographical metadata. After discussing some of the...
Rose, Stephen ; Tuppen, Sandra
-
Conference paper (unpublished)
Ejemplares del Quijote de la British Library: algunos datos sobre las procedencias de las ediciones de 1604/1605
West, Geoffrey
-
Conference paper (unpublished)
La biblioteca de F.W. Cosens, su dispersión y las adquisiciones de la Biblioteca Nacional de España
West, Geoffrey
-
Conference paper (unpublished)
Indicios de procedencias en libros españoles antiguos de la British Library de Londres
West, Geoffrey