In partnership with Microsoft, the British Library has digitised, and made freely available under the Public Domain Mark, over 60,000 volumes (around 25 million pages) of out-of-copyright 18th and 19th century texts. Items within this collection cover a wide range of subject areas, including geography, philosophy, history, poetry and literature, and are published in a variety of languages.
This collection, sometimes referred to as the Microsoft Books/BL 19th Century collection, and its derived datasets have been made available on various platforms under an open license.
All volumes are available to view, download and search in full text from the British Library Catalogue - https://explore.bl.uk - via the Library’s IIIF-enabled Universal Viewer. Use the search term “blmsd” in Explore to limit results to this specific collection.
Over 1 million images extracted programmatically from the book pages can be found on the British Library’s Flickr account: https://www.flickr.com/photos/britishlibrary/
The Flickr API can be used to download large sets of these images directly, along with metadata such as user-generated tag information: https://www.flickr.com/services/developer/
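As a rough illustration of the bulk-download route, the sketch below builds a request to the Flickr REST method `flickr.people.getPublicPhotos`, asking for tags alongside each photo record. The API key and the account's user ID (NSID) are placeholders that you would need to supply yourself; the function names are hypothetical and are not part of any British Library tooling.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_public_photos_url(api_key, user_id, page=1, per_page=100):
    """Build a flickr.people.getPublicPhotos request URL for one page of results.

    api_key and user_id are placeholders: obtain a key from the Flickr
    developer pages and look up the NSID of the account you want to query.
    """
    params = {
        "method": "flickr.people.getPublicPhotos",
        "api_key": api_key,
        "user_id": user_id,
        "extras": "tags",        # include user-generated tags in each record
        "per_page": per_page,
        "page": page,
        "format": "json",
        "nojsoncallback": 1,     # ask for plain JSON rather than JSONP
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


def fetch_public_photos_page(api_key, user_id, page=1):
    """Fetch one page of photo records (performs a network request)."""
    url = build_public_photos_url(api_key, user_id, page=page)
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

Looping over `page` until the reported page count is reached would walk the whole photostream; it is worth checking the Flickr API terms and rate limits before doing so at scale.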
Jisc Historical Texts holds a full copy of this collection: https://www.jisc.ac.uk/historical-texts
Wikimedia Commons offers a useful introduction to the collection, including a Synoptic Index, as well as projects to georeference maps found in the texts: https://commons.wikimedia.org/wiki/Commons:British_Library/Mechanical_Curator_collection
Title-level listings of news collections held by the British Library, comprising data extracted from the British Library catalogue, with some data cleaning and enhancements.
Full-text records of newspaper titles digitised from the British Library collection. Each file contains the newspaper’s output for one year, with OCR (Optical Character Recognition) text in XML format.
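Because the XML schema of these files is not described here, the following is only a schema-agnostic sketch: it gathers every text node in a document without assuming any particular element names. The function name and the sample markup in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_plain_text(xml_string):
    """Return all text content of an XML document as one whitespace-normalised string.

    No element names are assumed; every text node is collected in document order.
    """
    root = ET.fromstring(xml_string)
    pieces = (text.strip() for text in root.itertext())
    return " ".join(piece for piece in pieces if piece)
```

For example, `extract_plain_text("<issue><line>Hello</line><line>World</line></issue>")` returns `"Hello World"`. A file covering a full year of a newspaper may be large, so a streaming parser such as `xml.etree.ElementTree.iterparse` may be a better fit than loading the whole document at once.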
The datasets in this collection comprise snapshots in time of metadata descriptions of hundreds of thousands of PhD theses awarded by UK Higher Education institutions aggregated by the British Library's EThOS service. The data is estimated to cover around 98% of all PhDs ever awarded by UK Higher Education institutions, dating back to 1787.
Previous versions of the datasets are restricted to ensure the most accurate version of metadata is available for download. Please contact openaccess@bl.uk if you require access to an older version.
Using Convolutional Neural Networks to explore and label 400 Years of Book Illustrations by the SherlockNet team, BL Labs competition winners for 2016.
The datasets in this collection contain a number of thematic digitised collections of single-sheet items or ephemera dating from 1628 to 1902. They include information about portraits of actors, views of theatres, Christmas ballads and broadsides, signs of taverns, newspaper cuttings, performances of Sir Henry Irving, pamphlets about the British Museum, and portraits and biographies of officers in the South African wars.
The files in these datasets are derived from the British Library’s collection of bound-volume Quarterly Lists: printed catalogue records of Indian books, published quarterly and by province of British India between 1867 and 1947. The catalogues are predominantly in English, with some Indian scripts, and are mostly arranged in table format. They capture descriptive metadata about the books, including the names and addresses of printers and publishers, the number of copies printed and often the price, as well as much more. The catalogues have been made available through the British Library's Two Centuries of Indian Print project, which is also digitising rare Bengali books dating from 1713 to 1914, the datasets of which will also be made available through this website.
Images, datasets and catalogue records of maps and topographical views. The collection currently comprises the archive of manuscript maps built up and maintained by the British War Office between ca 1890 and 1940. In a project funded by the Indigo Trust, the British Library has catalogued, conserved and digitised a portion of the archive that relates to the former British East Africa (Kenya, Uganda and surrounding areas).
The India Office Medical Archives, drawn from the records of the East India Company and India Office, contain a wealth of information relating to medicine and health in India, particularly for the period 1780-1920. The scanned images are available as JPEG files, which were converted from TIFF files using IrfanView. Read more here: https://www.bl.uk/collection-guides/india-office-medical-archive-collections