Digitised Books. c. 1510 - c. 1900. JSON (OCR derived text) - British Library Research Repository
Shared Research Repository
Dataset

Digitised Books. c. 1510 - c. 1900. JSON (OCR derived text)

2014

Abstract

The dataset comprises OCR-derived text from 49,455 digitised books, equating to 65,227 volumes (more than 25 million pages), published between c. 1510 and c. 1900. The books cover a wide range of subject areas, including philosophy, history, poetry and literature. The dataset is in JavaScript Object Notation (JSON) text format. Related links: metadata, PDFs, Flickr images, digital versions

Files

File name | Date uploaded | Visibility | File size
dig19cbooksjsontext.bz2 | 18 Dec 2018 | Public | 11.4 GB
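
Because the archive is bzip2-compressed and large, it can help to inspect the start of the decompressed stream before committing to a full extraction. The following is a minimal sketch using only the Python standard library. The internal layout of the archive (a single JSON document, many files, or a tar container) is not stated on this page, so the helper name `peek_bz2`, the stand-in file `sample.json.bz2`, and the sample records are all hypothetical and serve only to show the pattern.

```python
import bz2
import json
import os
import tempfile


def peek_bz2(path, n_chars=200):
    """Return the first n_chars of decompressed text.

    bz2.open decompresses lazily, so only the start of the file is read.
    """
    with bz2.open(path, mode="rt", encoding="utf-8", errors="replace") as fh:
        return fh.read(n_chars)


# Hypothetical stand-in data: the real record structure is an assumption here.
sample = [[1, "Sample OCR text for page one."], [2, "Text for page two."]]

# Write a tiny bzip2-compressed JSON file so the sketch runs on its own.
tmp_path = os.path.join(tempfile.mkdtemp(), "sample.json.bz2")
with bz2.open(tmp_path, mode="wt", encoding="utf-8") as fh:
    json.dump(sample, fh)

# Look at the first few characters without loading everything.
head = peek_bz2(tmp_path, 20)

# For a file small enough to fit in memory, parse the JSON directly.
with bz2.open(tmp_path, mode="rt", encoding="utf-8") as fh:
    loaded = json.load(fh)
```

For the real download, `peek_bz2("dig19cbooksjsontext.bz2")` would show how the content begins, which indicates whether a plain JSON parser is appropriate or whether the stream needs to be unpacked with another tool first.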