Search Results
-
Dataset
UK Doctoral Thesis Metadata from EThOS
The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS, the UK's national thesis service. We estimate the data covers around 98% of all PhDs ever awarded by UK Higher Education institutions, dating back to 1787. Thesis metadata from every PhD-awarding university in...
British Library ; Rosie, Heather
higher education, student, UK, dissertations, PhD, theses, doctoral, ethos, thesis, and research
-
Dataset
al-Durr al-naqī fī fann al-mūsīqī (Add MS 23494)
This dataset is a PDF file containing the images and transcription of the manuscript titled al-Durr al-naqī fī fann al-mūsīqī الدرّ النقيّ في فنّ الموسيقي by Aḥmad ibn 'Abd al-Raḥmān al-Mawṣilī أحمد بن عبد الرحمن الموصلي. The manuscript was digitised through the British Library Qatar Foundation Partnership, and made available through...
British Library ; Keinan-Schoonbaert, Adi
transcription, Arabic, and OCR
-
Dataset
UK Doctoral Thesis Metadata from EThOS
This dataset has been superseded by a more recent version (5): https://doi.org/10.23636/1344. If you require access to an earlier version, please email openaccess@bl.uk, including the dataset title, date, and DOI in your request. The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS,...
British Library ; Rosie, Heather
higher education, ethos, dissertations, HE, research, PhD, doctoral, student, UK, theses, and thesis
-
Dataset
Books related to theatre derived from the Digitised 19th Century Books dataset
A dataset derived from the Digitised 19th Century Books dataset which contains books pertaining to theatre written in English. The dataset of 841 items was created by filtering by keywords which are related to different genres of play including Drama, Act, Scene, Play, Comedy, Farce, Pantomime, Tragedy and Shakespeare and...
British Library ; British Library Labs
act, genre, books, metadata, bibliographic, theatre, and play
-
Dataset
Books related to India from the Digitised 19th Century Books dataset
A dataset which is derived from the Digitised 19th Century books dataset focusing on books related to India. The dataset was created by refining the book title field using keywords related to names used for India during the period, places within India, cultural terms such as 'Hindu' and another term...
British Library ; British Library Labs
books, metadata, bibliographic, and India
-
Dataset
Books related to 19th Century British Colonies derived from the Digitised 19th Century books dataset
A dataset derived from the Digitised 19th Century Books dataset which contains books related to 19th Century British Colonies. The dataset of 1288 items was created by filtering on location keywords and was then manually checked for accuracy. The data was augmented with additional columns including 'City', 'Colony Name' and...
British Library ; British Library Labs
Africa, colonialism, Canada, Ceylon, metadata, bibliographic, India, Australia, books, British Colony, and British Colonies
-
Dataset
Books related to War derived from the Digitised 19th Century Books Dataset
A dataset which is derived from the Digitised 19th Century Books dataset comprising all non-fiction English language books related to armed conflicts. The dataset of 1127 items was developed by refining based on keywords such as 'war', 'battle', 'uprising', 'revolt', 'rebellion', 'invasion' and 'mutiny'. This dataset was curated by students...
British Library ; British Library Labs
non-fiction, War, books, metadata, and bibliographic
-
Dataset
Books related to the Industrial Revolution derived from the Digitised 19th Century books dataset
A dataset which is a subset of the Digitised 19th Century Books dataset comprising books related to the Industrial Revolution in Britain. The subset of 354 items was refined by using keywords associated with placenames and the topic of industrialism. This dataset was curated by the Aepyi student group at...
British Library ; British Library Labs
books, industrialism, metadata, bibliographic, and Industrial Revolution
-
Dataset
Latin American books in Digitised 19th century books
A dataset which is derived from the 19th Century Books dataset comprising c.1,100 books which are related to Latin America, written in Spanish, English, German, French, Italian, Swedish and Dutch.
British Library ; British Library Labs
books, Latin America, metadata, and bibliographic
-
Dataset
Books divided by Genre from the Digitised 19th century books dataset
A dataset derived from the Digitised 19th Century Books dataset which classifies the books by genre (Drama, Poetry, Prose, Music and unidentified). For Drama, Music and Prose several types were identified. For Drama: comedy, play, recitation and tragedy. For Prose: novel, parody, romance, satire, story, history subset of story and...
British Library ; British Library Labs
Music, Genre, Prose, books, Poetry, metadata, bibliographic, and Drama
-
Dataset
Books containing images about Finland
A dataset derived from the Digitised 19th Century books dataset comprising books with images about Finland, approximately 40 titles. This dataset was compiled by Ruby Dixon, a student at Graveney School, who completed work experience at British Library Labs in 2016.
British Library ; British Library Labs
books, Finland, metadata, and bibliographic
-
Dataset
Russian language books in the Digitised 19th century books dataset
A dataset which is a subset of the Digitised 19th Century books dataset comprising Russian-language books. The spreadsheet contains metadata of 585 books in Russian. This dataset was compiled by Nadya Miryanova, a student at Lady Eleanor Holles, who completed work experience at British Library Labs in 2017.
British Library ; British Library Labs
books, metadata, bibliographic, and Russia
-
Dataset
Ground Truth transcriptions for training OCR of historical Arabic handwritten texts
This dataset comprises 120 digitised images (TIFF files) drawn from a selection of historical Arabic scientific manuscripts (10th-19th century) digitised through the British Library Qatar Foundation Partnership. Also contained are ground truth transcriptions (XML) for each page that can be used for training optical character recognition (OCR) or handwritten text...
British Library ; Keinan-Schoonbaert, Adi
Arabic, transcription, and OCR
-
Dataset
Ground Truth transcriptions for training OCR of historical Bengali printed texts - Transkribus
This dataset comprises 74 digitised images (TIFF files) drawn from a selection of early printed Bengali books (1713-1914) digitised through the Two Centuries of Indian Print project (https://www.bl.uk/projects/two-centuries-of-indian-print). Also contained are ground truth transcriptions (XML) for each page that can be used for training optical character recognition software on historical...
British Library ; Derrick, Tom
OCR, transcription, and Indian
-
Dataset
Ground Truth transcriptions for training OCR of historical Bengali printed texts - Recognition of Early Indian Printed Documents competition
This dataset comprises 81 digitised images (TIFF files) drawn from a selection of early printed Bengali books (1713-1914) digitised through the Two Centuries of Indian Print project (https://www.bl.uk/projects/two-centuries-of-indian-print). Also contained are ground truth transcriptions (XML) for each page that can be used for training optical character recognition software on historical...
British Library ; Derrick, Tom
Indian, transcription, and OCR
-
Dataset
Digitised Books - Images identified as Medium Sized Images. c. 1567 - c. 1900. JPG
The dataset comprises c. 217,101 images identified as 'Medium Sized Images' from the British Library's Flickr Commons collections, dating between c. 1567 - c. 1900. The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900; Medium Sized...
British Library ; British Library Labs
medium sized images, books, images, digitised, and Microsoft
-
Dataset
Digitised Books - Images of the bound covers of books. c. 1510 - c. 1900. JPG
The dataset comprises c. 61,561 images identified as 'Book Covers' from the British Library's Flickr Commons collections, dating between c. 1510 - c. 1900.
British Library ; British Library Labs
digitised, books, Microsoft, images, and bookcovers
-
Dataset
AAS Card Catalogues: Chinese (Wade Giles)
This dataset contains digitised cards from the Wade-Giles card catalogue.
British Library
-
Dataset
Digitised Books - Images identified as Plates. c. 1528 - c. 1900. JPG
The dataset comprises c. 385,237 images identified as 'Plates' from the British Library's Flickr Commons collections, dating between c. 1528 – c. 1900. The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900; Plates have currently been...
British Library ; British Library Labs
-
Dataset
Volumes of performances connecting Sir Henry Irving. 1879 - 1905.
Sir Henry Irving's American and Provincial Tours 1883 - 1905; miscellaneous performances, including some given by Royal Command, 1883 - 1903; Lyceum Theatre 1879 – 1902; and Drury Lane Theatre, 1903 and 1905. The collection was formed by Bram Stoker.
British Library
-
Dataset
Volumes of Lysons Collectanea (Trades), comprising advertisements, cuttings, and illustrations relating to trades, professions, medical cures. 1660-1825.
The dataset comprises the OCR text derived from four digitised volumes of a collection of advertisements, cuttings and illustrations relating to trades, professions and medical cures from 1660 - 1825.
British Library
text, newspapers, OCR, trades, and adverts
-
Dataset
Volumes of Lysons Collectanea (Amusements), comprising broadsides, cuttings, advertisements on amusements 1660-1840
The dataset comprises nine digitised volumes of a collection of broadsides, cuttings and advertisements, relating to public exhibitions and places of amusement from 1660 - 1840 (with OCR-derived text). Part of the Lysons Collectanea collection.
British Library
amusements, text, newspapers, broadsides, OCR, and adverts
-
Dataset
Volumes of portraits and biographies of officers in the South African wars collected by John Malcolm Bulloch. 1900 - 1902.
The dataset comprises six digitised volumes (in PDF) of a collection of portraits and biographical details of some officers distinguished in the South African War (1900 - 1902) (with OCR-derived text). The collection was formed by John Malcolm Bulloch.
British Library
South Africa, text, portraits, war, army, OCR, biographies, and biography
-
Dataset
Volumes of Madden's cuttings, views, and pamphlets about the British Museum. 1755-1870.
The dataset comprises four digitised volumes of a collection of cuttings, views and pamphlets made by Sir Frederic Madden about the British Museum, dating 1755 - 1870 (with OCR-derived text).
British Library
British Museum, text, and OCR
-
Dataset
Volume of Christmas ballads and broadsides. 1750 - 1840
110-page PDF of miscellaneous Christmas ballads and prose broadsides (with OCR-derived text). The dataset comprises one digitised volume (110 pages) of a collection of Christmas ballads and prose broadsides chiefly printed in London by J. Pitts between 1750 - 1840. The dataset is in Portable Document Format (PDF).
British Library
-
Dataset
Portraits of actors, views of theatres and playbills (covering 1750 - 1821 in a single volume)
166-page PDF of collated portraits and views (with OCR-derived text). The dataset comprises one digitised volume (166 pages) of a collection of portraits of celebrated actors and actresses, views of theatres and playbills, dating 1750 - 1821. The dataset is in Portable Document Format (PDF).
British Library
text, theatres, views, portraits, actors, OCR, and playbills
-
Dataset
Volumes of signs of taverns in England and Wales. 1628 - 1858
The dataset comprises 14 digitised volumes (as PDFs) of a collection of tavern signs in England and Wales dating 1628 – 1858 (with OCR-derived text).
British Library
-
Dataset
OCR text derived from digitised books published 1890 - 1899 in ALTO XML
This set consists of 14847 volumes, published between 1890-1899. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
Digitised Books - Images identified as Embellishments. c. 1510 - c. 1900. JPG
The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1900. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The images are in .JPEG format.
British Library ; British Library Labs
embellishments, digitised, books, Microsoft, and images
-
Dataset
Pelagios Project: Portolano. Egerton MS 2855
This dataset comprises 7 images from a Portolano produced in 1473 by Grazioso Benincasa. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Update – 9/3/2018: Please note that there were technical issues with this dataset. If you downloaded before 9/3/2018, please replace with this repaired...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
Pelagios Project: Maps after Ptolemy's Geographia. Burney MS 111
This dataset comprises 68 images from a Greek manuscript edition of Ptolemy's Geography containing many diagrams and coloured maps, and produced between 1375 and 1425. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Due to UK copyright law they are technically in copyright until...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
Pelagios Project: Digitised Liber insularum Arcipelagi Cotton MS Vespasian a.XIII.art.1
This dataset comprises 82 images from the Liber insularum Arcipelagi, an illustrated account of the islands and major ports of the Mediterranean produced by Christophori Bondelmonti around 1422. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Due to UK copyright law they are technically in...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
Pelagios Project: Digitised Insularium Illustratum. Additional MS 15760
This dataset comprises 123 images from the Insularium Illustratum, an account of the islands of the Mediterranean, and of some others produced by Henricus Martellus Germanus in 1495. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Due to UK copyright law they are technically in...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
Pelagios Project: Liber insularum Cycladum. Arundel MS 93.art.7
This dataset comprises 45 images from the Liber insularum Cycladum produced by Christophori Bondelmonti around 1422. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Due to UK copyright law they are technically in copyright until 2039. However, given the age of the manuscripts and their...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
Pelagios Project: Single Sheets and Supplementary Materials
This dataset comprises 30 images selected among individual maps and non-map based medieval materials such as travel accounts. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Due to UK copyright law they are technically in copyright until 2039. However, given the age of the manuscripts...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
OCR text derived from digitised books published 1880 - 1889 in ALTO XML
This set consists of 10856 volumes, published between 1880-1889. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, ALTO, and nineteenth century
-
Dataset
OCR text derived from digitised books published 1870 - 1879 in ALTO XML
This set consists of 8630 volumes, published between 1870-1879. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1860 - 1869 in ALTO XML
This set consists of 7498 volumes, published between 1860-1869. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
OCR text derived from digitised books published 1850 - 1859 in ALTO XML
This set consists of 5818 volumes, published between 1850-1859. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO)...
British Library ; British Library Labs
XML, books, digitised, metadata, Microsoft, nineteenth century, and ALTO
-
Dataset
Pelagios Project: Digitised Cornaro Atlas. Egerton MS 73
This dataset comprises 37 images from a Portolano executed by different Venetian artists between 1489 and 1492, known as the 'Cornaro Atlas'. The digitisation was sponsored by A. W. Mellon Foundation through the Pelagios Project. Due to UK copyright law they are technically in copyright until 2039. However, given the...
British Library
XML, medieval, metadata, manuscripts, and maps
-
Dataset
AAS Card Catalogues: Sanskrit
This dataset contains digitised microfilms of Sanskrit card catalogues (1926-1983).
British Library
-
Dataset
AAS Card Catalogues: Hindi
This dataset contains digitised microfilms of Hindi card catalogues (1903-1983).
British Library
-
Dataset
AAS Card Catalogues: Telugu
This dataset contains digitised microfilms of Telugu card catalogues.
British Library
-
Dataset
AAS Card Catalogues: Chinese (Pinyin)
This dataset contains digitised cards from the Pinyin card catalogue.
British Library
-
Dataset
AAS Card Catalogues: Tamil
This dataset contains digitised microfilms of Tamil card catalogues.
British Library
-
Dataset
AAS Card Catalogues: Sinhalese
This dataset contains digitised microfilms of Sinhalese card catalogues.
British Library
-
Dataset
AAS Card Catalogues: Malayalam
This dataset contains digitised microfilms of Malayalam card catalogues.
British Library
-
Dataset
AAS Card Catalogues: Persian
This dataset contains digitised microfilms of Persian card catalogues (1921-1976).
British Library
-
Dataset
AAS Card Catalogues: Armenian
This dataset contains digitised microfilms of Armenian card catalogues.
British Library
-
Dataset
AAS Card Catalogues: Dravidian Minor Languages
This dataset contains digitised microfilms of Dravidian Minor Language card catalogues.
British Library
AAS, Minor Languages, Dravidian, catalogue, and card