Search results
-
Dataset
Ground Truth transcriptions for training OCR of historical Bengali printed texts - Recognition of Early Indian Printed Documents competition
This dataset comprises 81 digitised images (TIFF files) drawn from a selection of early printed Bengali books (1713-1914) digitised through the Two Centuries of Indian Print project (https://www.bl.uk/projects/two-centuries-of-indian-print). Also contained are ground truth transcriptions (XML) for each page that can be used for training optical character recognition software on historical...British Library ; Derrick, Tom
Indian, transcription, and OCR
-
Dataset
Ground Truth transcriptions for training OCR of historical Bengali printed texts - Transkribus
This dataset comprises 74 digitised images (TIFF files) drawn from a selection of early printed Bengali books (1713-1914) digitised through the Two Centuries of Indian Print project (https://www.bl.uk/projects/two-centuries-of-indian-print). Also contained are ground truth transcriptions (XML) for each page that can be used for training optical character recognition software on historical...British Library ; Derrick, Tom
OCR, transcription, and Indian
-
Dataset
Judicial Committee of the Privy Council: Linked Appeals Data
The dataset in this collection contains Linked Data about appeal cases heard by the Judicial Committee of the Privy Council between 1860 and 1998. The Judicial Committee of the Privy Council (JCPC) is the final court of appeal for British overseas territories and Crown dependencies, as well as ecclesiastical and...Middle, Sarah
-
Dataset
Ground Truth transcriptions for training OCR of historical Arabic handwritten texts
This dataset comprises 120 digitised images (TIFF files) drawn from a selection of historical Arabic scientific manuscripts (10th-19th century) digitised through the British Library Qatar Foundation Partnership. Also contained are ground truth transcriptions (XML) for each page that can be used for training optical character recognition (OCR) or handwritten text...British Library ; Keinan-Schoonbaert, Adi
Arabic, transcription, and OCR
-
Dataset
British and Irish Newspapers
A title-level list of British, Irish, British Overseas Territories and Crown Dependencies newspapers held by the British Library.British Library
datasets, catalogues, media, newspapers, periodicals, and metadata
-
Dataset
UK Doctoral Thesis Metadata from EThOS
This dataset has been superseded by a more recent version: https://doi.org/10.23636/1188 If you require access to an earlier version, please email openaccess@bl.uk, including the dataset title, date, and DOI in your request. The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS, the...British Library ; Rosie, Heather
-
Dataset
Russian language books in the Digitised 19th century books dataset
A dataset which is a subset of the Digitised 19th Century books dataset comprising Russian-language books. The spreadsheet contains metadata for 585 books in Russian. This dataset was compiled by Nadya Miryanova, a student at Lady Eleanor Holles who completed work experience at British Library Labs in 2017.British Library ; British Library Labs
books, metadata, bibliographic, and Russia
-
Dataset
Books containing images about Finland
A dataset derived from the Digitised 19th Century books dataset comprising books with images about Finland, approximately 40 titles. This dataset was compiled by Ruby Dixon, a student at Graveney School who completed work experience at British Library Labs in 2016.British Library ; British Library Labs
books, Finland, metadata, and bibliographic
-
Dataset
Books divided by Genre from the Digitised 19th century books dataset
A dataset derived from the Digitised 19th Century Books dataset which classifies the books by genre (Drama, Poetry, Prose, Music and unidentified). For Drama, Music and Prose several types were identified. For Drama: comedy, play, recitation and tragedy. For Prose: novel, parody, romance, satire, story, history subset of story and...British Library ; British Library Labs
Music, Genre, Prose, books, Poetry, metadata, bibliographic, and Drama
-
Dataset
Latin American books in Digitised 19th century books
A dataset which is derived from the 19th Century Books dataset comprising c.1,100 books which are related to Latin America, written in Spanish, English, German, French, Italian, Swedish and Dutch.British Library ; British Library Labs
books, Latin America, metadata, and bibliographic
-
Dataset
Books related to the Industrial Revolution derived from the Digitised 19th Century books dataset
A dataset which is a subset of the Digitised 19th Century Books dataset comprising books related to the Industrial Revolution in Britain. The subset of 354 items was refined by using keywords associated with placenames and the topic of industrialism. This dataset was curated by the Aepyi student group at...British Library ; British Library Labs
books, industrialism, metadata, bibliographic, and Industrial Revolution
-
Dataset
Books related to War derived from the Digitised 19th Century Books Dataset
A dataset which is derived from the Digitised 19th Century Books dataset comprising all non-fiction English language books related to armed conflicts. The dataset of 1127 items was developed by refining based on keywords such as 'war', 'battle', 'uprising', 'revolt', 'rebellion', 'invasion' and 'mutiny'. This dataset was curated by students...British Library ; British Library Labs
non-fiction, War, books, metadata, and bibliographic
-
Dataset
Books related to 19th Century British Colonies derived from the Digitised 19th Century books dataset
A dataset derived from the Digitised 19th Century Books dataset which contains books related to 19th Century British Colonies. The dataset of 1288 items was created using filtering by keywords of locations and then manually checked for accuracy. The data was augmented with additional columns including 'City', 'Colony Name' and...British Library ; British Library Labs
Africa, colonialism, Canada, Ceylon, metadata, bibliographic, India, Australia, books, British Colony, and British Colonies
-
Dataset
Books related to India from the Digitised 19th Century Books dataset
A dataset which is derived from the Digitised 19th Century books dataset focusing on books related to India. The dataset was created by refining the book title field using keywords related to names used for India during the period, places within India, cultural terms such as 'Hindu' and another term...British Library ; British Library Labs
books, metadata, bibliographic, and India
-
Dataset
Books related to theatre derived from the Digitised 19th Century Books dataset
A dataset derived from the Digitised 19th Century Books dataset which contains books pertaining to theatre written in English. The dataset of 841 items was created by filtering by keywords which are related to different genres of play including Drama, Act, Scene, Play, Comedy, Farce, Pantomime, Tragedy and Shakespeare and...British Library ; British Library Labs
act, genre, books, metadata, bibliographic, theatre, and play
-
Dataset
Freemason manuscripts in the Modern Archive collections
Dataset of brief biographical entries for 365 prominent Freemasons, with links to relevant material held with the British Library’s Modern Archives and Manuscript Collections, from the 18th to the 20th century. The dataset was supported with funding from the American Friends of the British Library, and was created by Tabitha...British Library
Freemasons and archives
-
Dataset
Sir Hans Sloane's Catalogues of his Library and Manuscripts
The files in this dataset are derived from microfilm copies of the original library catalogue of Sir Hans Sloane, now presented across 9 volumes, Sloane MS 3972 C 1-8, and the name index to the Sloane library catalogue, Sloane MS 3972 D. The catalogues are crucial for understanding the development...British Library
library, catalogues, Sloane, and metadata
-
Dataset
Government of India, Annual Administration Reports
Annual administration reports of the territories of British India for the following areas: Government of India 1870-1871; Government of Bengal, 1871-1936 ; Government of Burma, 1872-1899 ; Chin Hills, 1909-1923; Shan and Karenni States, 1889. For researchers seeking an overview of events and developments in the territories of British India...British Library
government, India Office, Bengal, Burma, Shan and Karenni States, administration reports, and Chin Hills
-
Dataset
India Office Lists
The India Office Lists are annual reference works giving details of departments and post holders in: the India Office, London, 1858-1947; the Burma Office, 1935-47; the Government of India, Calcutta, later Delhi, 1858-1947; the main provincial administrations of Bengal, Bombay and Madras and minor administrations for the same dates. From...British Library
government, India Office, Bengal, Bombay, Colonial India, and Madras
-
Dataset
UK Doctoral Thesis Metadata from EThOS
This dataset has been superseded by a more recent version (5): https://doi.org/10.23636/1344 If you require access to an earlier version, please email openaccess@bl.uk, including the dataset title, date, and DOI in your request. The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS,...British Library ; Rosie, Heather
higher education, ethos, dissertations, HE, research, PhD, doctoral, student, UK, theses, and thesis
-
Dataset
Indexes to the Dispatches of the East India Company Court of Directors to Indian Governments
The IOR/E/4 Correspondence with India comprises 1112 volumes dating from 1703-1858. The material is arranged into eight series: four series of letters received by the Court of Directors from the administration in India; and four series of dispatches sent by the Court to the same administrations. Subject, name and place...India Office Library and Records ; British Library ; Hailey, Alex
-
Dataset
DUKweb (Diachronic UK web)
We present DUKweb, a set of large-scale resources useful for the diachronic analysis of contemporary English. The dataset is derived from JISC UK Web Domain Dataset (1996-2013), which collects resources from the Internet Archive that were hosted on domains ending in ‘.uk’. The dataset includes co-occurrences matrices for each year...Basile, Pierpaolo ; Tsakalidis, Adam
-
Dataset
Persistent Identifiers as IRO Infrastructure: Survey Data
The survey ran from 28 May to 14 September 2020 and was open to everyone working in Galleries, Libraries, Archives and Museums internationally but the survey had a clear UK focus. Some responses have been removed or recoded to protect the identity of respondents. It is intended to re-run the...Kotarski, Rachael ; Madden, Frances
-
Dataset
al-Durr al-naqī fī fann al-mūsīqī (Add MS 23494)
This dataset is a PDF file containing the images and transcription of the manuscript titled al-Durr al-naqī fī fann al-mūsīqī الدرّ النقيّ في فنّ الموسيقي by Aḥmad ibn 'Abd al-Raḥmān al-Mawṣilī أحمد بن عبد الرحمن الموصلي. The manuscript was digitised through the British Library Qatar Foundation Partnership, and made available through...British Library ; Keinan-Schoonbaert, Adi
transcription, Arabic, and OCR
-
Dataset
Catalogue records of photographs (1850-1950)
A set of catalogue records for photographs (created 1850-1950) that are held at the British Library. Export from the Integrated Archives and Manuscripts System of only CC0 published records. Personal or sensitive information has been removed. This dataset was created specifically for the Legacies of Catalogue Descriptions and Curatorial Voice:...British Library ; Moretto, Nicolas
datasets, metadata, catalogues, and photographs
-
Dataset
UK Doctoral Theses (EThOS) Abstracts and Metadata - 01/03/2015. XLS.
This dataset has been superseded by a more recent version: https://doi.org/10.22021/ETHOSCSV201810 If you require access to an earlier version, please email openaccess@bl.uk, including the dataset title, date, and DOI in your request. The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS, the UK's national thesis service. We...British Library ; Rosie, Heather
-
Dataset
Developing identifiers workshop analysis
A collection of use cases gathered for the Developing Identifiers for Heritage Collections resource (https://tanc-ahrc.github.io/PIDResources/). It describes all the use cases for which PIDs are used and was used to inform the aspects described in the resource.Madden, Frances
-
Dataset
Locating a National Collection Our Place audience survey results
Results of an audience survey conducted by Locating a National Collection and funded by the AHRC. The research has been led by the National Trust in collaboration with the British Library and Research Bods, a market research company who have delivered results using the NT’s ‘Our Place’ online audience research...Vitale, Valeria ; Rees, Gethin ; Hunt, Alex
-
Dataset
Collective Wisdom crowdsourcing organiser and volunteer survey results
Results from two short surveys run for the Collective Wisdom project. Funded by the UK Arts and Humanities Research Council (AHRC), the Collective Wisdom project captures the collective wisdom of researchers and practitioners in crowdsourcing, citizen history, citizen science and public / community participation in research with cultural heritage collections....Ridge, Mia ; Ferriter, Meghan ; Blickhan, Samantha
-
Dataset
Digitised 19th Century Books - Metadata - 01/09/2021
This dataset contains metadata for resources belonging to the British Library’s digitised printed books (18th-19th century) collection (https://www.bl.uk/collection-guides/digitised-printed-books). This metadata has been extracted from British Library catalogue records. The metadata held within our main catalogue is updated regularly. This metadata dataset should be considered a snapshot of this metadata. For...British Library ; British Library Labs
Microsoft, JSON, metadata, books, and bibliographic
-
Dataset
19th Century Books - metadata with additional crowdsourced annotations
This dataset contains metadata for resources belonging to the British Library’s digitised printed books (18th-19th century) collection (bl.uk/collection-guides/digitised-printed-books). This metadata has been extracted from British Library catalogue records. The metadata held within our main catalogue is updated regularly. This metadata dataset should be considered a snapshot of this metadata. For...British Library
metadata, zooniverse, and monographs
-
Database
Siddham: the South Asia Inscriptions Database
The Siddham database is a resource for the study of inscriptions from South and Central Asia. The project focuses on the period of the Guptas (circa 320 to 550), a pivotal moment in the history of Asia, marked by an astonishing florescence in every field of endeavour. The Gupta kingdom...Rees, Gethin ; van Schaik, Sam ; Balogh, Danièl
-
Dataset
Early Music Online
Access is provided via the Official URL. These digitised volumes contain approximately 10,000 musical compositions, which have been individually indexed. The volumes mainly consist of partbooks of vocal polyphony, but also include some early printed tablatures for keyboard or plucked string instruments. They include music printed in Italy, Germany, France...Rose, Stephen ; Tuppen, Sandra
-
Software
epidoc-headers
Python script to create TEI epidoc headers created as part of Mapping the Jewish Communities of the Byzantine Empire using GIS project.Rees, Gethin
-
Software
LD bounding_box
Python script to open a file system of vector and raster GIS data, find the extent and output this into a spreadsheet in a format suitable for Marc cataloguing.Rees, Gethin
-
Dataset
IMPACT Digitisation Centre of Competence Dataset
The Impact Centre of Competence dataset contains more than half a million representative text-based images compiled by a number of major European libraries. Covering texts from as early as 1500, and containing material from newspapers, books, pamphlets and typewritten notes, the dataset is an invaluable resource for future research into...Universitat d’Alacant ; Instituut voor de Nederlandse Taal ; Koninklijke Bibliotheek ; Bibliothèque Nationale de France ; British Library …
-
Dataset
19th Century Books - Metadata 05/2021
This dataset contains metadata for a selection of monographs that are identified in the catalogue as being published during the 19th Century. This metadata has been extracted from British Library catalogue records. The metadata held within our main catalogue is updated regularly. This metadata dataset should be considered a snapshot...British Library
metadata and monographs
-
Dataset
UK Doctoral Thesis Metadata from EThOS
This dataset has been superseded by a more recent version: https://doi.org/10.23636/ybpt-nh33 If you require access to an earlier version, please email openaccess@bl.uk, including the dataset title, date, and DOI in your request. The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS, the...British Library ; Rosie, Heather
higher education, ethos, dissertations, thesis, research, PhD, doctoral, student, UK, and theses
-
Dataset
Mapping Irish Football
The Mapping Irish Football project called on the crowd to share any newspaper references they may have come across of women and any code of football prior to and including 1973. It is hoped that this project will start a conversation amongst researchers interested in Irish sports to do more...Byrne, Helena ; Bolton, Steve ; Carrier, John ; Farrell, Gerald ; Faller, Helge …
women's football, newspaper data, crowdsourcing, and Irish football history
-
Dataset
UK Doctoral Thesis Metadata from EThOS
This dataset has been superseded by a more recent version: https://doi.org/10.23636/1137 If you require access to an earlier version, please email openaccess@bl.uk, including the dataset title, date, and DOI in your request. The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS, the UK's national thesis service. We...British Library ; Rosie, Heather
Higher education, dissertations, HE, research, doctoral, student, UK, theses, ethos, and thesis
-
Dataset
EThOS metadata files augmented with identifiers
A selection of files of the EThOS metadata augmented by the organisational identifiers listed below. These files were created to inform a deliverable which is part of the FREYA project which aims to gather enhanced provenance information in the EThOS metadata. The ReadMe file contains full details of the files, their...British Library
persistent identifiers, ISNI, ROR, and GRID
-
Dataset
Digitised maps of the former British East Africa
This dataset comprises 581 images of maps of the former British East Africa created between 1890 and 1940 and a spreadsheet of related catalogue records. All Open Government Licence v1.0 (OGL). A user-friendly geographical search index of the maps is available on Google Maps. These JPEG files were converted from...Dykes, Nick
War Office Archive, documents, Intelligence, Uganda, East Africa, British East Africa, Maps, Military maps, and Kenya
-
Dataset
Digitised Books. c. 1510 - c. 1900. JSONL (OCR derived text + metadata)
The dataset comprises metadata and OCR generated text from 49,455 digitised books published between c. 1510 - c. 1900. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in JSON Lines (JSONL) text format.British Library Labs ; British Library
OCR and monographs
-
Dataset
Persistent Identifiers as IRO Infrastructure: Survey 2 Data
The survey ran from 4 October to 8 November 2021 and was open to everyone working in Galleries, Libraries, Archives and Museums internationally but the survey had a clear UK focus. Some responses have been removed or recoded to protect the identity of respondents.Kotarski, Rachael ; Madden, Frances
-
Dataset
British Library Newspaper Title-level List: A list of catalogued newspaper titles held by the British Library
A title-level list of catalogued newspapers held by the British Library.British Library
datasets, catalogues, media, newspapers, periodicals, and metadata
-
Dataset
UK Doctoral Thesis Metadata from EThOS
This dataset has been superseded by a more recent version: https://doi.org/10.23636/j278-4b96 If you require access to an earlier version, please email openaccess@bl.uk, including the dataset title, date, and DOI in your request. The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS, the...British Library ; Rosie, Heather
higher education, student, EThOS, research, doctoral, thesis, PhD, UK, dissertations, and theses
-
Dataset
OCR text derived from digitised books published 1890 - 1899 in ALTO XML
This set consists of 14847 volumes, published between 1890-1899. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1860 - 1869 in ALTO XML
This set consists of 7498 volumes, published between 1860-1869. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1850 - 1859 in ALTO XML
This set consists of 5818 volumes, published between 1850-1859. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1870 - 1879 in ALTO XML
This set consists of 8630 volumes, published between 1870-1879. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1830 - 1839 in ALTO XML
This set consists of 2639 volumes, published between 1830-1839. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1820 - 1829 in ALTO XML
This set consists of 2739 volumes, published between 1820-1829. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1810 - 1819 in ALTO XML
This set consists of 2338 volumes, published between 1810-1819. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1800 - 1809 in ALTO XML
This set consists of 1502 volumes, published between 1800-1809. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1700 - 1799 in ALTO XML.
The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO) Extensible Markup Language (XML) format.British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, eighteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1840 - 1849 in ALTO XML
This set consists of 4070 volumes, published between 1840-1849. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published c. 1510 - 1699 in ALTO XML
This set consists of 693 volumes, published between c. 1510 - 1699. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and...British Library ; British Library Labs
XML, sixteenth century, books, digitised, seventeenth century, Microsoft, ALTO, and metadata
-
Dataset
OCR text derived from digitised books published 1880 - 1889 in ALTO XML
This set consists of 10856 volumes, published between 1880-1889. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, ALTO, and nineteenth century
-
Dataset
OCR text derived from digitised books published 1900 - c. 1946. ALTO XML.
Unfortunately, we are unable to make this dataset available due to copyright reasons. This set consists of 1251 volumes, published between 1900-1946. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history,...British Library ; British Library Labs
-
Dataset
Digitised Books - Images identified as Medium Sized Images. c. 1567 - c. 1900. JPG
The dataset comprises c. 217,101 images identified as 'Medium Sized Images' from the British Library's Flickr Commons collections, dating between c. 1567 - c. 1900. The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900; Medium Sized...British Library ; British Library Labs
medium sized images, books, images, digitised, and Microsoft
-
Dataset
Digitised Books. c. 1510 - c. 1900. JSON (OCR derived text)
The dataset comprises text created by OCR from the 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in JavaScript Object Notation (JSON) text...British Library ; British Library Labs
-
Dataset
Digitised Books - Images identified as Medium Sized Images. c. 1567 - c. 1900. JPG
The dataset comprises c. 217,101 images identified as 'Medium Sized Images' from the British Library's Flickr Commons collections, dating between c. 1567 - c. 1900. The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900; Medium Sized...British Library ; British Library Labs
digitised, books, Microsoft, images, and medium sized images
-
Dataset
Digitised Books - Images identified as Embellishments. c. 1510 - c. 1900. JPG
The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The images are in .JPEG format.British Library ; British Library Labs
embellishments, digitised, books, Microsoft, and images
-
Dataset
Digitised Books - Images identified as Plates. c. 1528 - c. 1900. JPG
The dataset comprises c. 385,237 images identified as 'Plates' from the British Library's Flickr Commons collections, dating between c. 1528 – c. 1900. The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900; Plates have currently been...British Library ; British Library Labs
-
Dataset
Digitised Books - Images of the bound covers of books. c. 1510 - c. 1900. JPG
The dataset comprises c. 61,561 images identified as 'Book Covers' from the British Library's Flickr Commons collections, dating between c. 1510 - c. 1900.British Library ; British Library Labs
digitised, books, Microsoft, images, and bookcovers
-
Dataset
John Jaffray dataset; a hand list of printed books and scrap books compiled by Jaffray relating to bookbinding and trade unionism in (mainly) 19th century Victorian London
This set comprises 169 records on 222 pages of a PDF listing the contents of the Jaffray Collection (shelf mark Jaff 1 to Jaff 169) composed using free text. John Jaffray (1811-1869) was a bookbinder in Victorian London, interested in bookbinding, trade unionism and Chartism. This is a restricted collection...Marks, P. J. M.
John Jaffray bookbinder, 19th century, trade unionism, and bookbinding
-
Dataset
StopsGB: Structured Timeline of Passenger Stations in Great Britain
Michael Quick's book _Railway Passenger Stations in Great Britain: a Chronology_ offers a uniquely rich and detailed account of Britain's changing railway infrastructure. Its listing of over 12,000 stations allows us to reconstruct the coming of rail at both micro- and macro-scales. However, being published originally as a book (and...
-
Dataset
British Library Television News Programme-Level List
This list provides a programme-level record of all television news and current affairs programmes recorded by the British Library’s Broadcast News service between March 2010 and May 2022. All of the channels featured were receivable free-to-air in the UK and licensed by Ofcom. All of the programmes listed can be...British Library
television, news, and current affairs
-
Dataset
Selected edge painting on British Library printed books: A work in progress
Bookbindings were (and are) sometimes decorated via painting the edges of the leaves, usually but not exclusively, the fore edges of text blocks. The painting can be visible when the book is closed, or hidden beneath a layer of gold, when the edges have been gilt. This dataset covers examples...
Marks, P.J.M.
bindings, painting under gilt, foreedge, fanned out leaves, fore-edge painting, foreedge paintings, fore-edge paintings, hidden fore edge paintings, fore-edge, and bookbindings
-
Database
British Library Covid-19 Testimony Projects Database
A database of testimony projects in the UK that collected material during the Covid-19 pandemic, compiled by the British Library's Oral History team. The database can be downloaded as a spreadsheet and is an open resource for further research and re-use.
British Library ; Johnston, Camille ; Pinkney, Lucy ; White, Madeline
pandemic, testimony projects, coronavirus, COVID, COVID-19, and Covid-19
-
Dataset
Faber Music and Music Sales Publications 2013 to 2018
The ‘Faber Music and Music Sales Publications 2013 to 2018’ dataset is an .xlsx (Excel Workbook) file containing metadata describing 57,202 digital and printed music publications published by Faber Music and Music Sales between 2013 and 2018 and deposited at the British Library under legal deposit legislation. The data was...
Roper, Amelie ; British Library
music sales, legal deposit, digital music publications, and Faber Music
-
Dataset
Publishers’ Plate Numbers 1850-
Publisher’s plate numbers are a crucial element in dating 18th and 19th century music, which very rarely carries a publication date. MacLachlan's list supplements the publication “English music publishers' plate numbers in the first half of the nineteenth century” (London, Faber, 1965) by O.W. Neighbour and A. Tyson. He continues...
MacLachlan, David
nineteenth century, music publishers, music, and plate numbers
-
Dataset
The Liverpool Standard etc
The Liverpool Standard and General Commercial Advertiser (1832-1856, with two changes of title) was a Conservative newspaper established by local politicians to counter the rise of Radicalism and promote “Church and State” ideology.
British Library
-
Dataset
The Northern Daily Times etc
The Liverpool-based Northern Daily Times (1853-1861, with two changes of title) was the first provincial daily newspaper in England to enjoy a sustained run. It was also one of the very first one penny dailies.
British Library
-
Dataset
The Sun
The Sun was a daily evening newspaper founded in 1792 with the support of then Prime Minister, William Pitt, and his Tory government. By the mid-1830s the politics of the newspaper had shifted, and it was advocating liberal and free trade principles. Ran 1792-1871, with dataset covering 1801-1871.
British Library
-
Dataset
The Express
The Express (1846-1869) was an evening newspaper companion to the Daily News (1846-1912), published by Bradbury & Evans, and advocating reformist principles.
British Library
-
Dataset
The Press.
The Press (1853-1866) was a weekly conservative newspaper, to which Benjamin Disraeli regularly contributed.
British Library
-
Dataset
The Star
The Star (1788-1831, dataset 1801-1831) was the first daily London evening newspaper. Its circulation was facilitated by the success of the mail-coach service.
British Library
-
Dataset
National Register.
The National Register (1808-1823) was a Conservative Sunday newspaper, owned by John Browne Bell, which was hostile to parliamentary reform.
British Library
-
Dataset
The British Press; or, Morning Literary Advertiser
The British Press (1803-1826) was a daily newspaper founded in January 1803 in opposition to The Morning Post, with a conservative orientation. It printed the latest news, from home and abroad, for a London readership, and provided early journalistic employment for Charles Dickens.
British Library
-
Dataset
Colored News
Colored News (1855) was an illustrated general interest weekly newspaper. It was the first British newspaper to publish illustrations in colour.
British Library
-
Dataset
Halifax Local Opinion
The Halifax Local Opinion was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. This dataset (BLNewspapers_HalifaxLocalOpinion0003063_1892.zip) is currently unavailable due to a technical glitch when uploading larger files into the repository. Hopefully this will be resolved and the dataset will...
British Library
-
Dataset
The Blackpool Gazette & Herald
The Blackpool Gazette & Herald (1874 - 1919) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. All but one of these datasets are currently unavailable due to a technical glitch when uploading larger files into the repository. Hopefully this will...
British Library
-
Dataset
Supporting documentation for A Literature Review of Palm Leaf Manuscript Conservation: Parts 1 and 2
Part 1: a historic overview, leaf preparation, materials and media, palm leaf manuscripts at the British Library and the common types of damage. Part 2: historic and current conservation treatments, boxing and storage, religious and ethical issues, recommendations. The closure of the British Library during the 2020-2021 Covid-19 pandemic allowed...
-
Software
Peripleo
Peripleo is a prototype application for the discovery and spatial visualisation of collection data, originally an initiative of the Pelagios Network and developed early in 2022 as part of the British Library's Locating a National Collection project (LaNC). LaNC was a Foundation project within the AHRC-funded Towards a National Collection...
Simon, Rainer ; Gadd, Stephen ; Rees, Gethin ; Isaksen, Leif
location, geography, map, and cultural heritage
-
Dataset
Ground Truth transcriptions for training OCR of historical Bengali printed texts – Recognition of Early Indian Printed Documents competition - updated with improved XML coordinates
This dataset comprises 81 digitised images (TIFF files) drawn from a selection of early printed Bengali books (1713-1914) digitised through the Two Centuries of Indian Print project (https://www.bl.uk/projects/two-centuries-of-indian-print). Also contained are ground truth transcriptions (XML) for each page that can be used for training optical character recognition software on historical...
British Library ; Derrick, Tom
OCR, Indian, and transcription
-
Dataset
Dataset mapping the movement of Salkey's correspondents across the globe
Microsoft CSV file dataset created for Kepler to map the movement of Salkey's correspondents across the globe.
British Library
-
Dataset
Gephi Dataset for "Mapping Caribbean Diasporic Networks through Correspondence"
Microsoft CSV file dataset created in Gephi that can be uploaded in Gephi to create the visualisation of the network.
British Library
-
Dataset
Spatial network dataset for "Mapping the Caribbean Diaspora through Andrew Salkey"
Microsoft CSV file dataset created for Kepler, mapping the geographical movement of correspondents.
British Library
-
Dataset
Kepler Dataset for "Mapping the Caribbean Diaspora through Andrew Salkey's Correspondence"
Dataset created in Kepler to map the movement of the Caribbean diasporic network present in Andrew Salkey's correspondence files.
British Library
-
Dataset
All Data for "Mapping the Caribbean Diaspora through Andrew Salkey's Correspondence"
Microsoft Excel file of all of the metadata created by the project.
British Library
-
Dataset
UK Doctoral Thesis Metadata from EThOS
This dataset has been superseded by a more recent version: https://doi.org/10.23636/vtpx-we51. If you require access to an earlier version, please email openaccess@bl.uk, including the dataset title, date, and DOI in your request. The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS, the...
British Library ; Rosie, Heather
thesis, student, UK, dissertations, PhD, research, doctoral, EThOS, higher education, and theses
-
Dataset
UK Doctoral Thesis Metadata from EThOS
This dataset has been superseded by a more recent version: https://doi.org/10.23636/kvwc-ty06. If you require access to an earlier version, please email openaccess@bl.uk, including the dataset title, date, and DOI in your request. The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS, the...
British Library ; Rosie, Heather
thesis, student, UK, dissertations, PhD, theses, doctoral, EThOS, higher education, and research
-
Dataset
Incunabula Printed Catalogue Dataset Metadata: Volumes 1-10
This dataset includes the combined catalogue entries derived from volumes 1-10 of the "Catalogue of books printed in the 15th century now at the British Museum" (known as BMC). The BMC was published between 1908-2007 and comprises detailed descriptions of the incunabula collection at the British Library. The dataset was...
British Library
datasets, catalogues, early printing, incunabula, early printed books, metadata, and book history
-
Dataset
Incunabula Printed Catalogue Dataset: Volumes 1-10
This dataset includes the catalogue entries derived from volumes 1-10 of the "Catalogue of books printed in the 15th century now at the British Museum" (known as BMC). The BMC was published between 1908-2007 and comprises detailed descriptions of the incunabula collection at the British Library. The dataset was created...
British Library
datasets, catalogues, early printing, book history, early printed books, metadata, and incunabula
-
Dataset
Text extracted from digitised maps of eastern Africa circa 1880-1940
This dataset comprises an Excel spreadsheet of text extracted from almost 2,000 digital images of maps and documents held in the War Office Archive, covering a large part of eastern Africa between c.1880 and 1940. The items were catalogued and digitised with generous funding from Indigo Trust. The harvested text...
Dykes, Nick
War Office Archive, place names, text extraction, military maps, East Africa, computer vision, land use, colonial history, and ethnography
-
Dataset
DeezyMatch training set for OCR
Optical character recognition (OCR) is the process of automatically transcribing text from images. The presence of OCR-induced errors in digitised text is a common problem in the digital humanities. OCR errors are usually due to the misrecognition of characters, such as "h" recognised as "b", or "c" recognised as "o"....
-
Dataset
Datasets for toponym recognition and disambiguation for nineteenth-century English newspapers
We present two datasets, one for the task of toponym recognition and one for the task of toponym disambiguation. The datasets are derived from the "Dataset for Toponym Resolution in Nineteenth-Century English Newspapers" (DOI: https://doi.org/10.23636/r7d4-kw08). The toponym recognition dataset consists of two JSON files (ner_fine_train.json and ner_fine_dev.json), whereas the toponym...
Coll Ardanuy, Mariona ; Nanni, Federico
toponym disambiguation, nineteenth-century newspapers, named entity recognition, entity linking, toponym resolution, toponym recognition, and dataset