Search Results
-
Dataset
OCR and crowdsourced annotations, Language of Mechanisation, JSON files
Datasets produced through crowdsourcing tasks that the Living with Machines 'language of mechanisation' project team ran on the Zooniverse platform. Building on earlier work classifying machines by function, we asked volunteers on Zooniverse 'how did the word x change over time and place?' and presented them with options for...
-
Dataset
DeezyMatch training set for OCR
Optical character recognition (OCR) is the process of automatically transcribing text from images. The presence of OCR-induced errors in digitised text is a common problem in the digital humanities. OCR errors are usually due to the misrecognition of characters, such as "h" recognised as "b", or "c" recognised as "o"....
-
Dataset
Diachronic word embeddings from 19th-century newspapers digitised by the British Library (1800-1919)
Word vectors related to the paper "Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers" by Nilo Pedrazzini and Barbara McGillivray (2022). The embeddings were trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and specific parameters. The embeddings are divided into...
Pedrazzini, Nilo; McGillivray, Barbara
historical semantics, word-vectors, late-modern-english, newspapers, diachronic-embeddings, and word2vec
-
Dataset
The Newspaper Press Directory (1846-1920) - enriched and structured version
Mitchell's Newspaper Press Directories contained an almost complete list of newspapers published in England, Wales, Scotland and Ireland. They were published regularly from 1846 onwards and provided a detailed description of the newspaper landscape over time. This version contains a structured, tabular representation of the directories (as CSV or Excel...
C. Mitchell and Co.; British Library
-
Dataset
British Library Newspaper Title-level List: A list of catalogued newspaper titles held by the British Library
A title-level list of catalogued newspapers held by the British Library.
British Library
datasets, catalogues, media, newspapers, periodicals, and metadata
-
Dataset
Dataset for Toponym Resolution in Nineteenth-Century English Newspapers
We present a new dataset (version 2) for the task of toponym resolution in digitised historical newspapers in English. It consists of 455 annotated articles from newspapers based in four different locations in England (Manchester, Ashton-under-Lyne, Poole and Dorchester), published between 1780 and 1870. The articles have been manually annotated...
Coll Ardanuy, Mariona; Beavan, David; Beelen, Kaspar; Hosseini, Kasra; Lawrence, Jon …
nineteenth-century English, dataset, newspapers, toponym resolution, and geographic information retrieval
-
Dataset
Dataset for Toponym Resolution in Nineteenth-Century English Newspapers
We present a new dataset for the task of toponym resolution in digitised historical newspapers in English. It consists of 343 annotated articles from newspapers based in four different locations in England (Manchester, Ashton-under-Lyne, Poole and Dorchester), published between 1780 and 1870. The articles have been manually annotated with mentions...
Coll Ardanuy, Mariona; Beavan, David; Beelen, Kaspar; Hosseini, Kasra; Lawrence, Jon …
nineteenth-century English, geographic information retrieval, newspapers, toponym resolution, and dataset
-
Dataset
Living with Machines alpha and beta Zooniverse 'accident' task data
Data created through crowdsourcing tasks hosted on the Zooniverse platform. Members of the public were asked to look at a selection of articles from 19th-century newspapers that mentioned machines and decide if they described an industrial accident. A further task asked participants to transcribe personal, organisational and place names...
Zooniverse volunteers
crowdsourcing, digital history, citizen history, Living with Machines, newspapers, and digital humanities
-
Dataset
British and Irish Newspapers
A title-level list of British, Irish, British Overseas Territories and Crown Dependencies newspapers held by the British Library.
British Library
datasets, catalogues, media, newspapers, periodicals, and metadata
-
Dataset
Volumes of Lysons Collectanea (Trades), comprising advertisements, cuttings, and illustrations relating to trades, professions, medical cures. 1660-1825.
The dataset comprises the OCR text derived from four digitised volumes of a collection of advertisements, cuttings and illustrations relating to trades, professions and medical cures from 1660 to 1825.
British Library
text, newspapers, OCR, trades, and adverts