Search Constraints
Search Results
-
Dataset
OCR text derived from digitised books published 1900 - c. 1946. ALTO XML.
Unfortunately we are unable to make this dataset available due to copyright reasons. This set consists 1251 volumes, published between 1900-1946. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history,...British Library ; British Library Labs
-
Dataset
Books related to theatre derived from the Digitised 19th Century Books dataset
A dataset derived from the Digitised 19th Century Books dataset which contains books pertaining to theatre written in English. The dataset of 841 items was created by filtering by keywords which are related to different genre of play including Drama, Act, Scene, Play, Comedy, Farce, Pantomime, Tragedy and Shakespeare and...British Library ; British Library Labs
act, genre, books, metadata, bibliographic, theatre, and play
-
Dataset
Books related to India from the Digitised 19th Century Books dataset
A dataset which is derived from the Digitised 19th Century books dataset focusing on books related to India. The dataset was created by refining the book title field using keywords related to names used for India during the period, places within India, cultural terms such as 'Hindu' and another term...British Library ; British Library Labs
books, metadata, bibliographic, and India
-
Dataset
Books related to 19th Century British Colonies derived from the Digitised 19th Century books dataset
A dataset derived from the Digitised 19th Century Books dataset which contains books related to 19th Century British Colonies. The dataset of 1288 items was created using filtering by keywords of locations and then manually checked for accuracy. The data was augmented with additional columns including 'City', 'Colony Name' and...British Library ; British Library Labs
Africa, colonialism, Canada, Ceylon, metadata, bibliographic, India, Australia, books, British Colony, and British Colonies
-
Dataset
Books related to War derived from the Digitised 19th Century Books Dataset
A dataset which is derived from the Digitised 19th Century Books dataset comprising all non-fiction English language books related to armed conflicts. The dataset of 1127 items was developed by refining based on keywords such as 'war', 'battle', 'uprising', 'revolt', 'rebellion', 'invasion' and 'mutiny'. This dataset was curated by students...British Library ; British Library Labs
non-fiction, War, books, metadata, and bibliographic
-
Dataset
Books related to the Industrial Revolution derived from the Digitised 19th Century books dataset
A dataset which is a subset of the Digitised 19th Century Books dataset comprising books related to the Industrial Revolution in Britain. The subset of 354 items was refined by using keywords associated with placenames and the topic of industrialism. This dataset was curated by the Aepyi student group at...British Library ; British Library Labs
books, industrialism, metadata, bibliographic, and Industrial Revolution
-
Dataset
Latin American books in Digitised 19th century books
A dataset which is derived from the 19th Century Books dataset comprising c.1,100 books which are related to Latin America, written in Spanish, English, German, French, Italian, Swedish and Dutch.British Library ; British Library Labs
books, Latin America, metadata, and bibliographic
-
Dataset
Books divided by Genre from the Digitised 19th century books dataset
A dataset derived from the Digitised 19th Century Books dataset which classifies the books by genre (Drama, Poetry, Prose, Music and unidentified). For Drama, Music and Prose several types were identified. For Drama: comedy, play, recitation and tragedy. For Prose: novel, parody, romance, satire, story, history subset of story and...British Library ; British Library Labs
Music, Genre, Prose, books, Poetry, metadata, bibliographic, and Drama
-
Dataset
Books containing images about Finland
A dataset derived from the Digitised 19th Century books dataset comprising books with images about Finland, approximately 40 titles. This dataset was compiled by Ruby Dixon a student at Graveney School who completed work experience at British Library Labs in 2016.British Library ; British Library Labs
books, Finland, metadata, and bibliographic
-
Dataset
Russian language books in the Digitised 19th century books dataset
A dataset which is a subset of the Digitised 19th Century books dataset comprising Russian Language books. The spreadsheet contains metadata of 585 books in Russian. This dataset was compiled by Nadya Miryanova a student at Lady Eleanor Holles who completed work experience at British Library Labs in 2017.British Library ; British Library Labs
books, metadata, bibliographic, and Russia
-
Dataset
Digitised Books - Images identified as Medium Sized Images. c. 1567 - c. 1900. JPG
The dataset comprises c. 217,101 images identified as 'Medium Sized Images' from the British Library's Flickr Commons collections, dating between c. 1567 - c. 1900. The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900; Medium Sized...British Library ; British Library Labs
medium sized images, books, images, digitised, and Microsoft
-
Dataset
Digitised Books - Images of the bound covers of books. c. 1510 - c. 1900. JPG
The dataset comprises c. 61,561 images identified as 'Book Covers' from the British Library's Flickr Commons collections, dating between c. 1510 - c. 1900.British Library ; British Library Labs
digitised, books, Microsoft, images, and bookcovers
-
Dataset
Digitised Books - Images identified as Plates. c. 1528 - c. 1900. JPG
The dataset comprises c. 385,237 images identified as 'Plates' from the British Library's Flickr Commons collections, dating between c. 1528 – c. 1900. The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900; Plates have currently been...British Library ; British Library Labs
-
Dataset
Theatrical playbills from Britain and Ireland
The dataset comprises 264 volumes of digitised theatrical playbills published between 1660 – 1902 (mostly 19th century) from England, Scotland, Wales and Ireland. Digitised from the British Library's physical collection of over 500 volumes of playbills. The dataset in Portable Document Format (PDF). The playbills cover theatres in Bath (Royal),...British Library Labs ; Kirk, Tanya
singlesheet, playbill, and playbills
-
Dataset
OCR text derived from digitised books published 1890 - 1899 in ALTO XML
This set consists 14847 volumes, published between 1890-1899. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
Digitised Books - Images identified as Embellishments. c. 1510 - c. 1900. JPG
The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The images are in .JPEG format.British Library ; British Library Labs
embellishments, digitised, books, Microsoft, and images
-
Dataset
Digitised Books - Images identified as Medium Sized Images. c. 1567 - c. 1900. JPG
The dataset comprises c. 217,101 images identified as 'Medium Sized Images' from the British Library's Flickr Commons collections, dating between c. 1567 - c. 1900. The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900; Medium Sized...British Library ; British Library Labs
digitised, books, Microsoft, images, and medium sized images
-
Dataset
Theatrical playbills from Britain and Ireland (OCR text only)
The dataset comprises 264 volumes of digitised theatrical playbills published between 1660 – 1902 (mostly 19th century) from England, Scotland, Wales and Ireland. Digitised from the British Library's physical collection of over 500 volumes of playbills. The dataset contains text files (.TXT) in Optical Character Recognition (OCR) format. The playbills...British Library Labs
singlesheet, text, playbill, OCR, and playbills
-
Dataset
OCR text derived from digitised books published 1880 - 1889 in ALTO XML
This set consists 10856 volumes, published between 1880-1889. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, ALTO, and nineteenth century
-
Dataset
OCR text derived from digitised books published 1870 - 1879 in ALTO XML
This set consists 8630 volumes, published between 1870-1879. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1860 - 1869 in ALTO XML
This set consists 7498 volumes, published between 1860-1869. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1850 - 1859 in ALTO XML
This set consists 5818 volumes, published between 1850-1859. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
Digitised Books. c. 1510 - c. 1900. JSON (OCR derived text)
The dataset comprises text created by OCR from the 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in JavaScript Object Notation (JSON) text...British Library ; British Library Labs
-
Dataset
OCR text derived from digitised books published 1830 - 1839 in ALTO XML
This set consists 2639 volumes, published between 1830-1839. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
Digitised Books - Flickr Tag History - Dec 2013 to March 2016. TSV
The dataset comprises user submitted tags (with dates) added to the British Library Flickr Commons collections up to March 2016. The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900. The books cover a wide range of...British Library ; British Library Labs
tag history, microsoft, books, digitised, tags, Flickr, tagging, and TSV
-
Dataset
Digitised 19th Century Books - Metadata - 01/09/2013
The dataset holds metadata on the the books digitised within this collection, providing a quick means to connect a book identifier with some of the key bibliographic metadata about it. The metadata is held in JSON notation for ease of reuse.British Library ; British Library Labs
JSON, microsoft, books, metadata, and bibliographic
-
Dataset
OCR text derived from digitised books (unknown precise publication dates) in ALTO XML
This set consists 284 volumes. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO) Extensible Markup Language...British Library ; British Library Labs
XML, microsoft, books, digitised, metadata, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1820 - 1829 in ALTO XML
This set consists 2739 volumes, published between 1820-1829. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1810 - 1819 in ALTO XML
This set consists 2338 volumes, published between 1810-1819. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1800 - 1809 in ALTO XML
This set consists 1502 volumes, published between 1800-1809. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1700 - 1799 in ALTO XML.
The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO) Extensible Markup Language (XML) format.British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, eighteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published c. 1510 - 1699 in ALTO XML
This set consists 693 volumes, published between c. 1510 - 1699. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and...British Library ; British Library Labs
XML, sixteenth century, books, digitised, seventeenth century, Microsoft, ALTO, and metadata
-
Dataset
OCR text derived from digitised books published 1840 - 1849 in ALTO XML
This set consists 4070 volumes, published between 1840-1849. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO