Index Catalog // British Library

2015

Dataset

Volumes of Lysons Collectanea (Trades), comprising advertisements, cuttings, and illustrations relating to trades, professions, medical cures. 1660-1825.

The dataset comprises the OCR text derived from four digitised volumes of a collection of advertisements, cuttings and illustrations relating to trades, professions and medical cures from 1660 - 1825.

British Library

text, newspapers, OCR, trades, and adverts

2015

Dataset

Volumes of Lysons Collectanea (Amusements), comprising broadsides, cuttings, advertisements on amusements 1660-1840

The dataset comprises nine digitised volumes of a collection of broadsides, cuttings and advertisements, relating to public exhibitions and places of amusement from 1660 - 1840 (with OCR-derived text). Part of the Lysons Collectanea collection.

British Library

amusements, text, newspapers, broadsides, OCR, and adverts

2019

Poster (published)

Living with Machines - Metadata model

Hobson, Tim ; Tolfo, Giorgia

Mitchell's, Living with Machines, newspapers, and directories

2020

Dataset

Living with Machines alpha and beta Zooniverse 'accident' task data

Data created through crowdsourcing tasks hosted on the Zooniverse platform. Members of the public were asked to look at a selection of articles from 19th century newspapers that mentioned machines and decide if they described an industrial accident. A further task asked participants to transcribe personal, organisational and place names...

Zooniverse volunteers

crowdsourcing, digital history, citizen history, Living with Machines, newspapers, and digital humanities

2022

Journal article

A Dataset for Toponym Resolution in Nineteenth-Century English Newspapers

We present a new dataset for the task of toponym resolution in digitized historical newspapers in English. It consists of 343 annotated articles from newspapers based in four different locations in England (Manchester, Ashton-under-Lyne, Poole and Dorchester), published between 1780 and 1870. The articles have been manually annotated with mentions...

Coll Ardanuy, Mariona ; Beavan, David ; Beelen, Kaspar ; Hosseini, Kasra ; Lawrence, Jon …

nineteenth-century English, geographic information retrieval, benchmark, newspapers, toponym resolution, and dataset

2020

Blog post

What’s in a name? The Sovietisation of the Mongolian language and the Challenges of Reversal

This blog post introduces a newly digitised collection of Mongolian newspapers and discusses how the script of the text within these newspapers highlights issues relating to the Sovietisation of the Mongolian language.

Jevon, Graham

newspapers, Russia, Central Asia, Mongolia, digitisation, China, writing, digital images, and Russian revolution

2020

Blog post

The Legacy of Slavery: A 19th Century Newspaper and 21st Century Racial Inequity

This blog post introduces a newly digitised collection of 18th/19th century Barbadian newspapers and comments on the slavery-related content of these newspapers within the context of 21st century racism.

Jevon, Graham

newspapers, Christianity, resistance, Barbados, racism, empire, Americas, colonialism, Caribbean, slavery, digital images, and British Empire

2023

Dataset

The Newspaper Press Directory (1846-1920) - enriched and structured version

Mitchell's Newspaper Press Directories contained an almost complete list of newspapers published in England, Wales, Scotland and Ireland. It was published regularly from 1846 onwards and provided a detailed description of the newspaper landscape over time. This version contains a structured, tabular representation of the directories (as CSV or Excel...

C. Mitchell and Co. ; British Library

press directories and newspapers

2022

Dataset

Diachronic word embeddings from 19th-century newspapers digitised by the British Library (1800-1919)

Word vectors related to the paper "Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers" by Nilo Pedrazzini and Barbara McGillivray (2022). The embeddings were trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and specific parameters. The embeddings are divided into...

Pedrazzini, Nilo ; McGillivray, Barbara

historical semantics, word-vectors, late-modern-english, newspapers, diachronic-embeddings, and word2vec

2023

Dataset

DeezyMatch training set for OCR

Optical character recognition (OCR) is the process of automatically transcribing text from images. The presence of OCR-induced errors in digitised text is a common problem in the digital humanities. OCR errors are usually due to the misrecognition of characters, such as "h" recognised as "b", or "c" recognised as "o"....

Coll Ardanuy, Mariona ; Nanni, Federico ; Pedrazzini, Nilo

OCR, fuzzy string matching, string variation, newspapers, digital humanities, natural language processing, DeezyMatch, and Living with Machines

Agents of Enslavement: Colonial newspapers in the Caribbean and hidden genealogies of the enslaved (Coleridge Fellowship)

User Collection

Barbados, Caribbean, slavery, newspapers, 18th century, colonialism, British Empire, and 19th century

2023

Conference paper (unpublished)

Balancing public-private partnerships with responsibilities to our communities

The Living with Machines project (2018-23) was a data science and digital history project between the British Library and The Alan Turing Institute. Its focus on the impact of mechanisation in the long 19th century was in part inspired by the Library's access to newspapers digitised for The British Newspaper...

Ridge, Mia

digitisation, research project, newspapers, and Living With Machines

2023

Dataset

OCR and crowdsourced annotations, Language of Mechanisation, JSON files

Datasets created through crowdsourcing tasks created on the Zooniverse crowdsourcing platform by the Living with Machines ‘language of mechanisation’ project team. Building on earlier work classifying machines by function, we asked volunteers on Zooniverse 'how did the word x change over time and place?' and presented them with options for...

British Library ; Vieira, Miguel ; Ong, Tiffany ; Ciula, Arianna

mechanisation, newspapers, Industrial Revolution, 19th century British English, historical newspapers, 19th century, analytics, data visualisation, crowdsourcing, transport history, and historical semantics
