Search Results
-
Dataset
Digitised 19th Century Books - Metadata - 01/09/2013
The dataset holds metadata on the books digitised within this collection, providing a quick means to connect a book identifier with some of the key bibliographic metadata about it. The metadata is held in JSON notation for ease of reuse.
British Library ; British Library Labs
JSON, Microsoft, books, metadata, and bibliographic
-
Dataset
OCR text derived from digitised books (unknown precise publication dates) in ALTO XML
This set consists of 284 volumes. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO) Extensible Markup Language...
British Library ; British Library Labs
XML, Microsoft, books, digitised, metadata, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1890 - 1899 in ALTO XML
This set consists of 14847 volumes, published between 1890-1899. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1870 - 1879 in ALTO XML
This set consists of 8630 volumes, published between 1870-1879. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1850 - 1859 in ALTO XML
This set consists of 5818 volumes, published between 1850-1859. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1860 - 1869 in ALTO XML
This set consists of 7498 volumes, published between 1860-1869. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1880 - 1889 in ALTO XML
This set consists of 10856 volumes, published between 1880-1889. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, ALTO, and nineteenth century
-
Dataset
OCR text derived from digitised books published 1830 - 1839 in ALTO XML
This set consists of 2639 volumes, published between 1830-1839. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1810 - 1819 in ALTO XML
This set consists of 2338 volumes, published between 1810-1819. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1820 - 1829 in ALTO XML
This set consists of 2739 volumes, published between 1820-1829. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1700 - 1799 in ALTO XML.
The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO) Extensible Markup Language (XML) format.
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, eighteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1800 - 1809 in ALTO XML
This set consists of 1502 volumes, published between 1800-1809. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published c. 1510 - 1699 in ALTO XML
This set consists of 693 volumes, published between c. 1510 - 1699. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and...
British Library ; British Library Labs
XML, sixteenth century, books, digitised, seventeenth century, Microsoft, ALTO, and metadata
-
Dataset
OCR text derived from digitised books published 1840 - 1849 in ALTO XML
This set consists of 4070 volumes, published between 1840-1849. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
Digitised Quarterly Lists PDFs and Metadata
The files in this dataset are derived from the British Library’s collection of bound volume Quarterly Lists: printed catalogue records of Indian books published quarterly and by province of British India between 1867 and 1947. The dataset comprises full-text searchable PDFs of 215 volumes as well as the associated metadata...
Derrick, Tom
-
Dataset
Pelagios Project: Digitised Cornaro Atlas. Egerton MS 73
This dataset comprises 37 images from a Portolano executed by different Venetian artists between 1489 and 1492, known as the 'Cornaro Atlas'. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Due to UK copyright law they are technically in copyright until 2039. However, given the...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
Pelagios Project: Single Sheets and Supplementary Materials
This dataset comprises 30 images selected among individual maps and non-map based medieval materials such as travel accounts. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Due to UK copyright law they are technically in copyright until 2039. However, given the age of the manuscripts...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
Pelagios Project: Liber insularum Cycladum. Arundel MS 93.art.7
This dataset comprises 45 images from the Liber insularum Cycladum produced by Christophori Bondelmonti around 1422. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Due to UK copyright law they are technically in copyright until 2039. However, given the age of the manuscripts and their...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
Pelagios Project: Portolano. Egerton MS 2855
This dataset comprises 7 images from a Portolano produced in 1473 by Grazioso Benincasa. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Update – 9/3/2018: Please note that there were technical issues with this dataset. If you downloaded before 9/3/2018, please replace with this repaired...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
Pelagios Project: Digitised Liber insularum Arcipelagi Cotton MS Vespasian a.XIII.art.1
This dataset comprises 82 images from the Liber insularum Arcipelagi, an illustrated account of the islands and major ports of the Mediterranean produced by Christophori Bondelmonti around 1422. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Due to UK copyright law they are technically in...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
Pelagios Project: Digitised Insularium Illustratum. Additional MS 15760
This dataset comprises 123 images from the Insularium Illustratum, an account of the islands of the Mediterranean, and of some others produced by Henricus Martellus Germanus in 1495. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Due to UK copyright law they are technically in...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
Pelagios Project: Maps after Ptolemy's Geographia. Burney MS 111
This dataset comprises 68 images from a Greek manuscript edition of Ptolemy's Geography containing many diagrams and coloured maps, and produced between 1375 and 1425. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Due to UK copyright law they are technically in copyright until...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
Linked Open British National Bibliography - Books. 1950- N-Triples and RDF/XML.
This dataset includes metadata for books published or distributed in the UK since 1950.
Deliot, Corine
British National Bibliography, BNB, NT, linked open data, RDF/XML, N-Triples, books, and metadata
-
Dataset
Digitised Quarterly Lists XML and Metadata
Two Centuries of Indian Print 1867-1947. The files in this dataset are derived from the British Library’s collection of bound volume Quarterly Lists: printed catalogue records of Indian books published quarterly and by province of British India between 1867 and 1947. The dataset comprises text from the collection of digitised...
Derrick, Tom
-
Dataset
Linked Open British National Bibliography - Forthcoming Books N-Triples and RDF/XML
This dataset includes metadata for forthcoming books to be published or distributed in the UK.
Deliot, Corine
British National Bibliography, forthcoming, linked open data, BNB, N-Triples, RDF/XML, NT, CIP, and metadata
-
Dataset
Linked Open British National Bibliography - Serials. 1950- N-Triples and RDF/XML
This dataset includes metadata for serials published or distributed in the UK since 1950.
Deliot, Corine
British National Bibliography, serials, linked open data, BNB, N-Triples, RDF/XML, NT, and metadata
-
Journal article
Automated Language Identification of Bibliographic Resources
This article describes experiments in the use of machine learning techniques at the British Library to assign language codes to catalog records, in order to provide information about the language of content of the resources described. In the first phase of the project, language codes were assigned to 1.15 million...
Morris, Victoria
machine learning, automatic metadata generation, legacy record enhancement, metadata, and language identification
-
Dataset
Books divided by Genre from the Digitised 19th century books dataset
A dataset derived from the Digitised 19th Century Books dataset which classifies the books by genre (Drama, Poetry, Prose, Music and unidentified). For Drama, Music and Prose several types were identified. For Drama: comedy, play, recitation and tragedy. For Prose: novel, parody, romance, satire, story, history subset of story and...
British Library ; British Library Labs
Music, Genre, Prose, books, Poetry, metadata, bibliographic, and Drama
-
Dataset
Books containing images about Finland
A dataset derived from the Digitised 19th Century books dataset comprising approximately 40 titles with images about Finland. This dataset was compiled by Ruby Dixon, a student at Graveney School who completed work experience at British Library Labs in 2016.
British Library ; British Library Labs
books, Finland, metadata, and bibliographic
-
Dataset
Russian language books in the Digitised 19th century books dataset
A dataset which is a subset of the Digitised 19th Century books dataset comprising Russian language books. The spreadsheet contains metadata of 585 books in Russian. This dataset was compiled by Nadya Miryanova, a student at Lady Eleanor Holles who completed work experience at British Library Labs in 2017.
British Library ; British Library Labs
books, metadata, bibliographic, and Russia
-
Dataset
Latin American books in Digitised 19th century books
A dataset which is derived from the 19th Century Books dataset comprising c.1,100 books which are related to Latin America, written in Spanish, English, German, French, Italian, Swedish and Dutch.
British Library ; British Library Labs
books, Latin America, metadata, and bibliographic
-
Dataset
British and Irish Newspapers
A title-level list of British, Irish, British Overseas Territories and Crown Dependencies newspapers held by the British Library.
British Library
datasets, catalogues, media, newspapers, periodicals, and metadata
-
Interactive resource
Project FREYA: How persistent identifiers can connect research together
This webinar will showcase the latest developments from the EC-funded FREYA project, including the PID Graph which provides a method to discover the relationships between different researchers and their organisations and find out the full impact of research outputs. It will also describe upcoming developments planned in the final year...
Madden, Frances
persistent identifiers, DOIs, FREYA, metadata, research services, and PID graph
-
Blog post
When is a persistent identifier not persistent? Or an identifier?
Ever wondered what that bar code on the back of every book is? It’s an ISBN: an International Standard Book Number. Every modern book published has an ISBN, which uniquely identifies that book, and anyone publishing a book can get an ISBN for it whether an individual or a huge...
Cope, Jez
-
Dataset
Sir Hans Sloane's Catalogues of his Library and Manuscripts
The files in this dataset are derived from microfilm copies of the original library catalogue of Sir Hans Sloane, now presented across 9 volumes, Sloane MS 3972 C 1-8, and the name index to the Sloane library catalogue, Sloane MS 3972 D. The catalogues are crucial for understanding the development...
British Library
library, catalogues, Sloane, and metadata
-
Journal article
MARC transformed: MARC and XML – the perfect partnership?
I first met a very British version of MARC (Machine Readable Cataloguing) in 1983, straight out of university. I didn't know anything about cataloguing, indexing, classification, or data. MARC made sense of it all. AACR2 (Anglo-American Cataloguing Rules, 2nd Edition) was impenetrable without MARC as a framework. LCSH (Library of...
Rosie, Heather
MarcEdit, LCSH, EThOS, Dublin Core, TDM, XML, MARC Report, metadata, MARC 21, MARC, OAI-PMH, and AACR2
-
Dataset
Books related to 19th Century British Colonies derived from the Digitised 19th Century books dataset
A dataset derived from the Digitised 19th Century Books dataset which contains books related to 19th Century British Colonies. The dataset of 1288 items was created using filtering by keywords of locations and then manually checked for accuracy. The data was augmented with additional columns including 'City', 'Colony Name' and...
British Library ; British Library Labs
Africa, colonialism, Canada, Ceylon, metadata, bibliographic, India, Australia, books, British Colony, and British Colonies
-
Dataset
Books related to War derived from the Digitised 19th Century Books Dataset
A dataset which is derived from the Digitised 19th Century Books dataset comprising all non-fiction English language books related to armed conflicts. The dataset of 1127 items was developed by refining based on keywords such as 'war', 'battle', 'uprising', 'revolt', 'rebellion', 'invasion' and 'mutiny'. This dataset was curated by students...
British Library ; British Library Labs
non-fiction, War, books, metadata, and bibliographic
-
Dataset
Books related to India from the Digitised 19th Century Books dataset
A dataset which is derived from the Digitised 19th Century books dataset focusing on books related to India. The dataset was created by refining the book title field using keywords related to names used for India during the period, places within India, cultural terms such as 'Hindu' and another term...
British Library ; British Library Labs
books, metadata, bibliographic, and India
-
Dataset
Books related to theatre derived from the Digitised 19th Century Books dataset
A dataset derived from the Digitised 19th Century Books dataset which contains books pertaining to theatre written in English. The dataset of 841 items was created by filtering by keywords which are related to different genres of play including Drama, Act, Scene, Play, Comedy, Farce, Pantomime, Tragedy and Shakespeare and...
British Library ; British Library Labs
act, genre, books, metadata, bibliographic, theatre, and play
-
Dataset
Books related to the Industrial Revolution derived from the Digitised 19th Century books dataset
A dataset which is a subset of the Digitised 19th Century Books dataset comprising books related to the Industrial Revolution in Britain. The subset of 354 items was refined by using keywords associated with placenames and the topic of industrialism. This dataset was curated by the Aepyi student group at...
British Library ; British Library Labs
books, industrialism, metadata, bibliographic, and Industrial Revolution
-
Dataset
19th Century Books - metadata with additional crowdsourced annotations
This dataset contains metadata for resources belonging to the British Library’s digitised printed books (18th-19th century) collection (bl.uk/collection-guides/digitised-printed-books). This metadata has been extracted from British Library catalogue records. The metadata held within our main catalogue is updated regularly. This metadata dataset should be considered a snapshot of this metadata. For...
British Library
metadata, zooniverse, and monographs
-
Dataset
Digitised 19th Century Books - Metadata - 01/09/2021
This dataset contains metadata for resources belonging to the British Library’s digitised printed books (18th-19th century) collection (https://www.bl.uk/collection-guides/digitised-printed-books). This metadata has been extracted from British Library catalogue records. The metadata held within our main catalogue is updated regularly. This metadata dataset should be considered a snapshot of this metadata. For...
British Library ; British Library Labs
Microsoft, JSON, metadata, books, and bibliographic
-
Dataset
19th Century Books - Metadata 05/2021
This dataset contains metadata for a selection of monographs that are identified in the catalogue as being published during the 19th Century. This metadata has been extracted from British Library catalogue records. The metadata held within our main catalogue is updated regularly. This metadata dataset should be considered a snapshot...
British Library
metadata and monographs
-
Dataset
British Library Newspaper Title-level List: A list of catalogued newspaper titles held by the British Library
A title-level list of catalogued newspapers held by the British Library.
British Library
datasets, catalogues, media, newspapers, periodicals, and metadata
-
Dataset
Incunabula Printed Catalogue Dataset: Volumes 1-10 copy of GitHub repository
This dataset includes the GitHub repository used to derive catalogue entries from volumes 1-10 of the "Catalogue of books printed in the 15th century now at the British Museum" (known as BMC). The BMC was published between 1908-2007 and comprises detailed descriptions of the incunabula collection at the British Library....
British Library
book history, metadata, catalogues, datasets, incunabula, early printed books, and early printing
-
Report
Archive of Tomorrow: Capturing public health discourse in the UK Web Archive
This report provides an overview of the Archive of Tomorrow project, a pilot project and partnership between four UK libraries: the National Library of Scotland, the University of Oxford’s Bodleian Libraries, Cambridge University Library, and Edinburgh University Library. The project was funded by the Wellcome Trust in 2022-2023 with a budget...
UK Legal Deposit Libraries
public health, research networks, web archiving, and metadata
-
Journal article
FAST the Inside Track: Where We Are, Where Do We Want to Be, and How Do We Get There?
This is an overview of the development of FAST (Faceted Application of Subject Terminology) from its inception in the late 1990s, through its development and implementation to the work being undertaken by OCLC and the FAST Policy and Outreach Committee (FPOC) to develop and promote FAST. FPOC members explain how...
-
Dataset
Incunabula Printed Catalogue Dataset: Volumes 1-10
This dataset includes the catalogue entries derived from volumes 1-10 of the "Catalogue of books printed in the 15th century now at the British Museum" (known as BMC). The BMC was published between 1908-2007 and comprises detailed descriptions of the incunabula collection at the British Library. The dataset was created...
British Library
datasets, catalogues, early printing, book history, early printed books, metadata, and incunabula