Search Constraints
Search Results
-
Dataset
Digitised Books. c. 1510 - c. 1900. JSONL (OCR derived text + metadata)
The dataset comprises metadata and OCR generated text from 49,455 digitised books published between c. 1510 - c. 1900. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in JSON Lines (JSONL) text format.British Library Labs ; British Library
OCR and monographs