Search Results
-
Research report
Living with Machines Final Report
It is with great pride that I write this end of project report, as well as some sadness. When the other investigators and I set out the vision for this project in 2017, we had some big dreams. Living with Machines was imagined at once as a data-driven history project,...
-
Conference paper (published)
MapReader: a computer vision pipeline for the semantic exploration of maps at scale
We present MapReader, a free, open-source software library written in Python for analyzing large map collections. MapReader allows users with little computer vision expertise to i) retrieve maps via web-servers; ii) preprocess and divide them into patches; iii) annotate patches; iv) train, fine-tune, and evaluate deep neural network models; and...
Hosseini, Kasra ; Wilson, Daniel C. S. ; Beelen, Kaspar ; McDonough, Katherine
maps and ordnance survey
-
Research report
Publishing updated version of ‘R for Newspaper Data’
My Living with Machines Digital Residency, which I carried out between May and July 2023, allowed me to update and publish an online book on accessing and analysing newspaper data. The goal of the book is to make available an end-to-end set of instructions and tutorials which would allow researchers,...
Ryan, Yann Ciarán
digital humanities, Victorian, historical newspapers, and Living with Machines
-
Research report
Circulations and Entangled History in 19th Century Chile
The Living with Machines Digital Residencies have offered our research team a remarkable opportunity to experiment with methods for extracting data from historical newspapers dating back 100 to 150 years. This interdisciplinary project aims to expand the scope of digital humanities and historical research by developing automated techniques for data...
Hayward, Jennifer ; Valenzuela, Gillian ; Shakib, Khandokar
digital humanities, historical newspapers, Living with Machines, and Victorian
-
Research report
An Etiquette For Minor Time Travel
A report documenting the Living with Machines Digital Residency project called "An Etiquette for Minor Time Travel" by Robert Sherman. Via an open call and running from to July 2023, six Digital Residencies were funded by the Living with Machines project to support researchers and practitioners devising creative approaches to...
Sherman, Robert
art, Living with Machines, Victorian, rail, data visualisation, digital humanities, and poetry
-
Research report
Visualising press politics in the United Kingdom
This project's main goal, as outlined in the initial proposal, was to develop an interactive, open-source web app that visualizes data from the Press Directories dataset alongside historical general election results. In this report, I will delve into the challenges encountered, the interesting findings uncovered, and the potential avenues for...
Bonato, Nicolò
historical newspapers, digital humanities, Victorian, politics, data visualisation, and Living with Machines
-
Research report
The Devils After the Fall
This is a report by Nicola Baldwin of a Digital Residency at Living With Machines, on the dataset: Crowdsourced accidents data from Newspapers, working in line with the LWM aims for Radical Collaboration, New Perspectives, Analysing at Scale. The following document is a personal account of work done, thoughts arising...
Baldwin, Nicola
Living with Machines, accidents, Victorian, film, crowdsourcing, newspaper, and digital humanities
-
Research report
UK Railway Archive (AR-UK)
Archive The Railway UK (AR-UK) is a comprehensive digital platform designed to enhance the online archives of the UK rail network. This initiative, developed in collaboration with Living with Machines, is primarily focused on research, historical preservation, and providing public access to railway history. As a centralized resource, it caters...
Sheppard, Joanne
digital humanities, Victorian, rail, data visualisation, and Living with Machines
-
Journal article
Working at scale: what do computational methods mean for research using cases, models and collections?
Open access, peer-reviewed article published in Science Museum Group Journal, as part of a double-length special issue for the AHRC TaNC discovery project, 'Congruence Engine'. The article gives a critical overview of how 'scale' operates as a keyword within computational humanities as well as reviewing a number of cognate fields,...
Wilson, Daniel C S
machine learning, AI for GLAM, STS, scale, computational humanities, history, and congruence engine
-
Dataset
OCR and crowdsourced annotations, Language of Mechanisation, JSON files
Datasets created through crowdsourcing tasks created on the Zooniverse crowdsourcing platform by the Living with Machines ‘language of mechanisation’ project team. Building on earlier work classifying machines by function, we asked volunteers on Zooniverse 'how did the word x change over time and place?' and presented them with options for...
-
Dataset
Language of Mechanisation: annotated historical newspaper articles
Datasets created through crowdsourcing tasks created on the Zooniverse crowdsourcing platform by the Living with Machines ‘language of mechanisation’ project team. Building on earlier work classifying machines by function, we asked volunteers on Zooniverse 'how did the word x change over time and place?' and presented them with options for...
-
Software
Living-with-machines/MapReader: End of LwM
This release marks the end of the current funding for MapReader during the Living with Machines (LwM) project. @kasra-hosseini @andrewphilipsmith @rwood-97 @kmcdono2 @dcsw2 @kallewesterling @kasparvonbeelen
Hosseini, Kasra ; Wood, Rosie ; Smith, Andy ; McDonough, Katie ; Wilson, Daniel C. S. …
computer vision and maps
-
Other
Models for MapReader ACM SIGSPATIAL 2023 Geohumanities Workshop paper
Collection of fine-tuned models created during research published in Kasra Hosseini, Daniel C. S. Wilson, Kaspar Beelen, and Katherine McDonough. 2022. MapReader: a computer vision pipeline for the semantic exploration of maps at scale. In Proceedings of the 6th ACM SIGSPATIAL International Workshop on Geospatial Humanities (GeoHumanities '22). Association for...
Hosseini, Kasra ; Beelen, Kaspar ; McDonough, Katherine ; Wilson, Daniel C. S.
computational humanities, computer vision, maps, models, and image classification
-
Learning object
Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 2)
This is the second of a two-part lesson introducing deep learning based computer vision methods for humanities research. This lesson digs deeper into the details of training a deep learning based computer vision model. It covers some challenges one may face due to the training data used and the importance...
Strien, Daniel van ; Beelen, Kaspar ; Wevers, Melvin ; Smits, Thomas ; McDonough, Katherine
-
Learning object
Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 1)
This is the first of a two-part lesson introducing deep learning based computer vision methods for humanities research. Using a dataset of historical newspaper advertisements and the fastai Python library, the lesson walks through the pipeline of training a computer vision model to perform image classification.
Strien, Daniel van ; Beelen, Kaspar ; Wevers, Melvin ; Smits, Thomas ; McDonough, Katherine
-
Dataset
Diachronic word embeddings from 19th-century newspapers digitised by the British Library (1800-1919)
Word vectors related to the paper "Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers" by Nilo Pedrazzini and Barbara McGillivray (2022). The embeddings were trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and specific parameters. The embeddings are divided into...
Pedrazzini, Nilo ; McGillivray, Barbara
historical semantics, word-vectors, late-modern-english, newspapers, diachronic-embeddings, and word2vec
-
Dataset
Decade-level Word2Vec models from automatically transcribed 19th-century newspapers digitised by the British Library (1800-1919)
Word embeddings trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and specific parameters. The embeddings are divided into periods of ten years each. Unlike those in this repository, these were not aligned and OCR errors skimmed from the vocabulary. See related GitHub repository for the full documentation:...
Pedrazzini, Nilo
historical semantics, British newspapers, word embeddings, word vectors, word2vec, and Late Modern English
-
Dataset
Diachronic and diatopic word embeddings from newspapers digitised by the British Library (1830-1889): North and South England
Diachronic word embeddings (decade-level) trained with Word2Vec (via Gensim) on different geographic subcorpora of the Heritage Made Digital British and the Living with Machines historical newspaper collections: North England (north.zip) and South England (south.zip). At the moment, for each subcorpus, Word2Vec models are available for each decade in the...
Pedrazzini, Nilo ; McGillivray, Barbara
historical semantics, diachronic embeddings, late modern English, word embeddings, word vectors, word2vec, and diatopic embeddings
-
Dataset
Living with Machines Zooniverse Participant Survey
Summary results from a survey of contributors to Living with Machines Zooniverse crowdsourcing projects. Responses were received between 24 May and 13 June 2022. We designed the survey so that we could align our reporting with two other audience / participant research groups. Firstly, we used the demographic categories that...
British Library
online volunteering, digital participation, citizen science, citizen history, questionnaire, crowdsourcing, survey, and audience research
-
Book chapter
Hunting for Treasure: Living with Machines and the British Library Newspaper Collection
This chapter discusses the open access digitisation programme undertaken by Living with Machines, exploring the range of constraints that inform digitisation strategies and selection priorities. Because the landscape of digitised newspaper collections is so complex, and research and digitisation processes operate on different timelines, we have focused on opportunities to...
Tolfo, Giorgia ; Vane, Olivia ; Beelen, Kaspar ; Hosseini, Kasra ; Lawrence, Jon …
interdisciplinarity, digitised newspaper collections, digital corpus, research workflows, and digitisation strategy
-
Dataset
The Newspaper Press Directory (1846-1920) - enriched and structured version
Mitchell's Newspaper Press Directories contained an almost complete list of newspapers published in England, Wales, Scotland and Ireland. It was published regularly from 1846 onwards and provided a detailed description of the newspaper landscape over time. This version contains a structured, tabular representation of the directories (as CSV or Excel...
C. Mitchell and Co. ; British Library
-
Exhibition object labels
Living with Machines: Human stories from the industrial age (exhibition board text)
‘Living with Machines: Human stories from the industrial age’ was a free exhibition at Leeds City Museum from July 2022-January 2023. It explored how machines and mechanisation changed life and work in Leeds and the surrounding regions. A collaboration between the British Library and Leeds City Museum, the exhibition was...
Ridge, Mia ; McGoldrick, John
history of science, mechanisation, data science, industrialisation, history of technology, and information visualisation
-
Dataset
Stockton Herald, South Durham and Cleveland Advertiser
Stockton Herald, South Durham and Cleveland Advertiser (1858 - 1918) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Darlington & Richmond Herald
Darlington & Richmond Herald was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Colne Valley Guardian
Colne Valley Guardian was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Widnes Examiner
Widnes Examiner (1876-1920) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
The Runcorn Examiner
The Runcorn Examiner (1870-1954) was a weekly newspaper; the years 1870-1920 have been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Warrington Examiner
Warrington Examiner (1869-1901) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Glasgow Courier
Glasgow Courier was a thrice weekly/bi-weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Nuneaton Times
Nuneaton Times was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Birkenhead News
Birkenhead News was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
British Miner and General Newsman
British Miner and General Newsman was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Swansea Journal and South Wales Liberal
Swansea Journal and South Wales Liberal was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Dorset County Express and Agricultural Gazette
Dorset County Express and Agricultural Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Poole Telegram
Poole Telegram was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Pontypridd District Herald
Pontypridd District Herald was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Northern Weekly Gazette
Northern Weekly Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Liverpool Weekly Courier
Liverpool Weekly Courier was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Kenilworth Advertiser
Kenilworth Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Brighouse & Rastrick Gazette
Brighouse & Rastrick Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Bridport, Beaminster and Lyme Regis telegram
Bridport, Beaminster and Lyme Regis telegram was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Bridlington and Quay Gazette
Bridlington and Quay Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Central Glamorgan Gazette
Central Glamorgan Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Blandford Weekly News
Blandford Weekly News was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Warwickshire Herald
Warwickshire Herald was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Weymouth Telegram
Weymouth Telegram (1860 - 1901) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Denton and Haughton Examiner
Denton and Haughton Examiner was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. Variant titles are 1873-74 The Denton, Haughton, & District Weekly News. 1874-75 Denton & Haughton Weekly News, and Audenshaw, Hooley Hill, and Dukinfield Advertiser, 1875-78 Denton Examiner, Audenshaw,...
British Library
-
Dataset
Barrow Herald and Furness Advertiser
Barrow Herald and Furness Advertiser (1863 - 1914) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Atherstone, Nuneaton, and Warwickshire Times
Atherstone, Nuneaton, and Warwickshire Times was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Swansea and Glamorgan Herald, and South Wales Free Press
Swansea and Glamorgan Herald, and South Wales Free Press (1847 - 1890) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Cradley Heath & Stourbridge Observer
Cradley Heath & Stourbridge Observer (1864 - 1888) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Potteries Examiner
Potteries Examiner (1871 - 1881) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Glasgow Chronicle
Glasgow Chronicle was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Northern Guardian (Hartlepool)
Northern Guardian (Hartlepool) (1891 - 1902) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
North Cumberland Reformer
North Cumberland Reformer (1890 - 1898) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Lancaster Standard and County Advertiser
Lancaster Standard and County Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Lancaster Herald and Town and County Advertiser
Lancaster Herald and Town and County Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Alston Herald, and East Cumberland Advertiser
Alston Herald, and East Cumberland Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Weekly Journal
The file consists of the OCR (Optical Character Recognition) text in XML format for one year of Weekly Journal (Hartlepool) 1901. The full digitised newspaper comprises no. 1–407 (29 Nov.1901 – 17 Sep.1909). The digitised page images are available on the British Newspaper Archive website, https://www.britishnewspaperarchive.co.uk/titles/weekly-journal-hartlepool The British Newspaper Archive...
British Library
-
Dataset
Tamworth Miners' Examiner and Working Men's Journal
Tamworth Miners' Examiner and Working Men's Journal (1873 - 1876) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Stretford and Urmston Examiner
Stretford and Urmston Examiner (1879 - 1880) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Stalybridge Examiner
Stalybridge Examiner (1876) was a newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
St. Helens Examiner
St. Helens Examiner was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
The Stockton Examiner
The Stockton Examiner (1878-1879) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
The Cannock Chase Examiner
The Cannock Chase Examiner (1874-1877) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Midland Examiner and Wolverhampton Times
Midland Examiner and Wolverhampton Times (1874-1878) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Shropshire Examiner
Shropshire Examiner (1874-1877) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
South Staffordshire Examiner
South Staffordshire Examiner (1874) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Forest of Dean Examiner
Forest of Dean Examiner (1873-1877) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Cotton Factory Times
Cotton Factory Times (1885-1889, 1891-1901) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Journal article
A Dataset for Toponym Resolution in Nineteenth-Century English Newspapers
We present a new dataset for the task of toponym resolution in digitized historical newspapers in English. It consists of 343 annotated articles from newspapers based in four different locations in England (Manchester, Ashton-under-Lyne, Poole and Dorchester), published between 1780 and 1870. The articles have been manually annotated with mentions...
-
Dataset
Frederick May's London Press Dictionary and Advertiser's Handbook (1883-1911)
Newspaper directories produced and published annually in 19th-century Britain by advertising agent Frederick May and successors, containing information on newspapers, magazines and periodicals and arranged in alphabetical and sometimes tabular order. Information for each title included price, publisher, office, political and religious leaning.
Frederick May & Son ; British Library
-
Dataset
The Newspaper Press Directory (1881-1920)
Newspaper directories produced and published annually in 19th-century Britain by advertising agent Charles Mitchell. Newspapers are listed primarily in alphabetical order of the town where the title was published. Information for each title included: features connected with the district such as population and trade; principal towns in...
C. Mitchell and Co. ; British Library
-
Abstract
Historic machines from 'prams' to 'Parliament': new avenues for collaborative linguistic research
Research in computational linguistics has made successful attempts at modelling word meaning at scale, but much remains to be done to put these computational models to the test of historical scholarship (see e.g. Beelen et al. 2021). More importantly, a lot of computational research looks at texts in a historical...
Ridge, Mia ; Tolfo, Giorgia ; Westerling, Kalle ; Pedrazzini, Nilo ; McGillivray, Barbara
crowdsourcing, computational linguistics, and digital humanities
-
Journal article
MapReader: A Computer Vision Pipeline for the Semantic Exploration of Maps at Scale
We present MapReader, a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital). This library transforms the way historians can use maps by turning extensive, homogeneous map sets into searchable primary sources. MapReader allows users with little or no computer vision expertise to i)...
Hosseini, Kasra ; Wilson, Daniel C.S. ; Beelen, Kaspar ; McDonough, Katherine
-
Research report
Data Study Group Final Report: The National Archives, UK: Discovering Topics and Trends in the UK Government Web Archive
The challenge we address in this report is to make steps towards improving search and discovery of resources within this vast archive for future archive users, and how the UKGWA collection could begin to be unlocked for research and experimentation by approaching it as data (i.e. as a dataset at...
Beavan, David ; Nanni, Federico
-
Journal article
Design Choices for Productive, Secure, Data-Intensive Research at Scale in the Cloud
We present a policy and process framework for secure environments for productive data science research projects at scale, by combining prevailing data security threat and risk profiles into five sensitivity tiers, and, at each tier, specifying recommended policies for data classification, data ingress, software ingress, data egress, user access, user...
Arenas, Diego ; Atkins, Jon ; Austin, Claire ; Beavan, David ; Cabrejas Egea, Alvaro …
-
Journal article
Neural Language Models for Nineteenth-Century English
We present four types of neural language models trained on a large historical dataset of books in English, published between 1760-1900 and comprised of ~5.1 billion tokens. The language model architectures include static (word2vec and fastText) and contextualized models (BERT and Flair). For each architecture, we trained a model instance...
Hosseini, Kasra ; Beelen, Kaspar ; Colavizza, Giovanni ; Coll Ardanuy, Mariona
-
Conference paper (published)
When Time Makes Sense: A Historically-Aware Approach to Targeted Sense Disambiguation
As languages evolve historically, making computational approaches sensitive to time can improve performance on specific tasks. In this work, we assess whether applying historical language models and time-aware methods help with determining the correct sense of polysemous words. We outline the task of time-sensitive Targeted Sense Disambiguation (TSD), which aims...
Beelen, Kaspar ; Nanni, Federico ; Coll Ardanuy, Mariona ; Hosseini, Kasra ; Tolfo, Giorgia …
-
Journal article
Maps of a Nation? The Digitized Ordnance Survey for New Historical Research
Although the Ordnance Survey has itself been the subject of historical research, scholars have not systematically used its maps as primary sources of information. This is partly for disciplinary reasons and partly for the technical reason that high-quality maps have not until recently been available digitally, geo-referenced, and in color....
Hosseini, Kasra ; McDonough, Katherine ; van Strien, Daniel ; Vane, Olivia ; Wilson, Daniel C.S.
-
Conference paper (published)
Living Machines: A study of atypical animacy
This paper proposes a new approach to animacy detection, the task of determining whether an entity is represented as animate in a text. In particular, this work is focused on atypical animacy and examines the scenario in which typically inanimate objects, specifically machines, are given animate attributes. To address it,...
Coll Ardanuy, Mariona ; Nanni, Federico ; Beelen, Kaspar ; Hosseini, Kasra ; Ahnert, Ruth …
nineteenth-century English, living machines, BERT, and animacy
-
Conference paper (unpublished)
A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching
Recognizing toponyms and resolving them to their real-world referents is required for providing advanced semantic access to textual data. This process is often hindered by the high degree of variation in toponyms. Candidate selection is the task of identifying the potential entities that can be referred to by a toponym...
-
Book chapter
How we got here
Wilson, Daniel C.S.
-
Book chapter
Text Meets Space: Geographic Content Extraction, Resolution and Information Retrieval
In this half-day tutorial, we will review the basic concepts of, methods for, and applications of geographic information retrieval, also showing some possible applications in fields such as the digital humanities. The tutorial is organized in four parts. First we introduce some basic ideas about geography, and demonstrate why text...
Leidner, Jochen ; Martins, Bruno ; McDonough, Katherine ; Purves, Ross
-
Research report
Data Study Group Final Report: Smart monitoring for conservation areas
WWF (World Wide Fund for Nature) monitors over 250,000 protected areas (e.g. national parks and nature reserves) and thousands of other sites and critical habitats. These sites are the foundation of global natural assets and are central to the preservation of biodiversity and human well-being. Unfortunately, they face increasing pressures...
-
Abstract
Using smart annotations to map the geography of newspapers
Geographic information is a key component in the description of collection objects, and yet its format is often unsuited for use with methods of geographic analysis. Catalogue entries are often inconsistent, in plain text, and without geographic coordinates (much less coordinates linked to authority records). Georesolution of the relevant fields...
Ryan, Yann ; Coll Ardanuy, Mariona ; van Strien, Daniel ; Hosseini, Kasra ; Beelen, Kaspar …
-
Conference paper (unpublished)
Contextualizing Victorian Newspapers
Beelen, Kaspar ; Ahnert, Ruth ; Beavan, David ; Coll Ardanuy, Mariona ; Hosseini, Kasra …
-
Conference paper (unpublished)
Defoe: A Spark-Based Toolbox for Analysing Digital Historical Textual Data
This work presents defoe, a new scalable and portable digital eScience toolbox that enables historical research. It allows for running text mining queries across large datasets, such as historical newspapers and books in parallel via Apache Spark. It handles queries against collections that comprise several XML schemas and physical representations....
-
Journal article
Babbage among the insurers: Big 19th-century data and the public interest
This article examines life assurance and the politics of ‘big data’ in mid-19th-century Britain. The datasets generated by life assurance companies were vast archives of information about human longevity. Actuaries distilled these archives into mortality tables – immensely valuable tools for predicting mortality and so pricing risk. The status of...
Wilson, Daniel C.S.
big data, Thomas Rowe Edmonds, Charles Babbage, public interest, and insurance
-
Dataset
Dataset for Toponym Resolution in Nineteenth-Century English Newspapers
We present a new dataset (version 2) for the task of toponym resolution in digitised historical newspapers in English. It consists of 455 annotated articles from newspapers based in four different locations in England (Manchester, Ashton-under-Lyne, Poole and Dorchester), published between 1780 and 1870. The articles have been manually annotated...
Coll Ardanuy, Mariona ; Beavan, David ; Beelen, Kaspar ; Hosseini, Kasra ; Lawrence, Jon …
nineteenth-century English, dataset, newspapers, toponym resolution, and geographic information retrieval
-
Research report
Living with Machines Delivery Plan version 1, 2019
Living with Machines is a five-year collaborative project. It aims to generate new perspectives on the effects of the mechanisation of labour on the lives of ordinary people in Britain during the 'long nineteenth century' (c.1780-1918), by developing computational and historical techniques and research questions for working with historical sources....
Ahnert, Ruth ; Beavan, David ; Colavizza, Giovanni ; Farquhar, Adam ; Griffin, Emma …
-
Dataset
May's British and Irish Press Guide and Advertiser's Handbook & Dictionary etc. (1871-1880)
Newspaper directories produced and published annually in 19th-century Britain by advertising agent Frederick May and successors, containing information on newspapers, magazines and periodicals and arranged in alphabetical and sometimes tabular order. Information for each title included price, publisher, office, political and religious leaning.
Frederick May & Son ; British Library
-
Dataset
The Newspaper Press Directory (1846-1880)
Newspaper directories produced and published annually in 19th-century Britain by advertising agent Charles Mitchell. Newspapers are listed primarily in alphabetical order of the town where the title was published. Information for each title included: features connected with the district such as population and trade; principal towns in...
C. Mitchell and Co. ; British Library
-
Dataset
Dataset for Toponym Resolution in Nineteenth-Century English Newspapers
We present a new dataset for the task of toponym resolution in digitised historical newspapers in English. It consists of 343 annotated articles from newspapers based in four different locations in England (Manchester, Ashton-under-Lyne, Poole and Dorchester), published between 1780 and 1870. The articles have been manually annotated with mentions...
Coll Ardanuy, Mariona ; Beavan, David ; Beelen, Kaspar ; Hosseini, Kasra ; Lawrence, Jon …
nineteenth-century English, geographic information retrieval, newspapers, toponym resolution, and dataset
-
Dataset
Ordnance Survey Old / First series England and Wales 1:63360 (georeferenced sheet images)
Map sheet images for the Ordnance Survey Old Series / First Series England and Wales 1:63360, georeferenced and cropped at the neatline (can be viewed together as a seamless composite). GeoTIFF format. The original (ungeoreferenced) sheet images can be found at: https://commons.wikimedia.org/wiki/Category:Ordnance_Survey_Old/First_series_England_and_Wales_1:63360_(full_sheets). The sheets were georeferenced by relating the sheet...
Vane, Olivia
England, First Series, Old Series, maps, Ordnance Survey, and Wales
-
Conference paper (published)
DeezyMatch: A Flexible Deep Learning Approach to Fuzzy String Matching
We present DeezyMatch, a free, open-source software library written in Python for fuzzy string matching and candidate ranking. Its pair classifier supports various deep neural network architectures for training new classifiers and for fine-tuning a pretrained model, which paves the way for transfer learning in fuzzy string matching. This approach...
Hosseini, Kasra ; Nanni, Federico ; Coll Ardanuy, Mariona
Natural Language Processing, string matching, toponym matching, machine learning, and digital humanities
-
Dataset
Living Machines atypical animacy dataset
Atypical animacy detection dataset, based on nineteenth-century sentences in English extracted from an open dataset of nineteenth-century books digitized by the British Library (available via https://doi.org/10.21250/db14, British Library Labs, 2014). This dataset contains 598 sentences containing mentions of machines. Each sentence has been annotated according to the animacy and humanness...