Search results
-
Research report
Publishing updated version of ‘R for Newspaper Data’
My Living with Machines Digital Residency, which I carried out between May and July 2023, allowed me to update and publish an online book on accessing and analysing newspaper data. The goal of the book is to make available an end-to-end set of instructions and tutorials which would allow researchers,...
Ryan, Yann Ciarán
digital humanities, Victorian, historical newspapers, and Living with Machines
-
Research report
An Etiquette For Minor Time Travel
A report documenting the Living with Machines Digital Residency project called "An Etiquette for Minor Time Travel" by Robert Sherman. Via an open call and running from to July 2023, six Digital Residencies were funded by the Living with Machines project to support researchers and practitioners devising creative approaches to...
Sherman, Robert
art, Living with Machines, Victorian, rail, data visualisation, digital humanities, and poetry
-
Research report
Visualising press politics in the United Kingdom
This project's main goal, as outlined in the initial proposal, was to develop an interactive, open-source web app that visualizes data from the Press Directories dataset alongside historical general election results. In this report, I will delve into the challenges encountered, the interesting findings uncovered, and the potential avenues for...
Bonato, Nicolò
historical newspapers, digital humanities, Victorian, politics, data visualisation, and Living with Machines
-
Research report
The Devils After the Fall
This is a report by Nicola Baldwin of a Digital Residency at Living With Machines, on the dataset: Crowdsourced accidents data from Newspapers, working in line with the LWM aims for Radical Collaboration, New Perspectives, Analysing at Scale. The following document is a personal account of work done, thoughts arising...
Baldwin, Nicola
Living with Machines, accidents, Victorian, film, crowdsourcing, newspaper, and digital humanities
-
Research report
UK Railway Archive (AR-UK)
Archive The Railway UK (AR-UK) is a comprehensive digital platform designed to enhance the online archives of the UK rail network. This initiative, developed in collaboration with Living with Machines, is primarily focused on research, historical preservation, and providing public access to railway history. As a centralized resource, it caters...
Sheppard, Joanne
digital humanities, Victorian, rail, data visualisation, and Living with Machines
-
Research report
Circulations and Entangled History in 19th Century Chile
The Living with Machines Digital Residencies have offered our research team a remarkable opportunity to experiment with methods for extracting data from historical newspapers dating back 100 to 150 years. This interdisciplinary project aims to expand the scope of digital humanities and historical research by developing automated techniques for data...
Hayward, Jennifer ; Valenzuela, Gillian ; Shakib, Khandokar
digital humanities, historical newspapers, Living with Machines, and Victorian
-
Software
Living-with-machines/MapReader: End of LwM
This release marks the end of the current funding for MapReader during the Living with Machines (LwM) project. @kasra-hosseini @andrewphilipsmith @rwood-97 @kmcdono2 @dcsw2 @kallewesterling @kasparvonbeelen
Hosseini, Kasra ; Wood, Rosie ; Smith, Andy ; McDonough, Katie ; Wilson, Daniel C. S. …
computer vision and maps
-
Dataset
The Newspaper Press Directory (1846-1920) - enriched and structured version
Mitchell's Newspaper Press Directories contained an almost complete list of newspapers published in England, Wales, Scotland and Ireland. It was published regularly from 1846 onwards and provided a detailed description of the newspaper landscape over time. This version contains a structured, tabular representation of the directories (as CSV or Excel...
C. Mitchell and Co. ; British Library
-
Other
Models for MapReader ACM SIGSPATIAL 2023 Geohumanities Workshop paper
Collection of fine-tuned models created during research published in Kasra Hosseini, Daniel C. S. Wilson, Kaspar Beelen, and Katherine McDonough. 2022. MapReader: a computer vision pipeline for the semantic exploration of maps at scale. In Proceedings of the 6th ACM SIGSPATIAL International Workshop on Geospatial Humanities (GeoHumanities '22). Association for...
Hosseini, Kasra ; Beelen, Kaspar ; McDonough, Katherine ; Wilson, Daniel C. S.
computational humanities, computer vision, maps, models, and image classification
-
Dataset
Diachronic and diatopic word embeddings from newspapers digitised by the British Library (1830-1889): North and South England
Diachronic word embeddings (decade-level) trained with Word2Vec (via Gensim) on different geographic subcorpora of the Heritage Made Digital British and the Living with Machines historical newspaper collections: - North England (north.zip) - South England (south.zip) At the moment, for each subcorpus, Word2Vec models are available for each decade in the...
Pedrazzini, Nilo ; McGillivray, Barbara
historical semantics, diachronic embeddings, late modern English, word embeddings, word vectors, word2vec, and diatopic embeddings
-
Dataset
Decade-level Word2Vec models from automatically transcribed 19th-century newspapers digitised by the British Library (1800-1919)
Word embeddings trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and specific parameters. The embeddings are divided into periods of ten years each. Unlike those in this repository, these were not aligned and OCR errors skimmed from the vocabulary. See related GitHub repository for the full documentation:...
Pedrazzini, Nilo
historical semantics, British newspapers, word embeddings, word vectors, word2vec, and Late Modern English
-
Dataset
Language of Mechanisation: annotated historical newspaper articles
Datasets created through crowdsourcing tasks created on the Zooniverse crowdsourcing platform by the Living with Machines ‘language of mechanisation’ project team. Building on earlier work classifying machines by function, we asked volunteers on Zooniverse 'how did the word x change over time and place?' and presented them with options for...
-
Dataset
OCR and crowdsourced annotations, Language of Mechanisation, JSON files
Datasets created through crowdsourcing tasks created on the Zooniverse crowdsourcing platform by the Living with Machines ‘language of mechanisation’ project team. Building on earlier work classifying machines by function, we asked volunteers on Zooniverse 'how did the word x change over time and place?' and presented them with options for...
-
Journal article
Working at scale: what do computational methods mean for research using cases, models and collections?
Open access, peer-reviewed article published in Science Museum Group Journal, as part of a double-length special issue for the AHRC TaNC discovery project, 'Congruence Engine'. The article gives a critical overview of how 'scale' operates as a keyword within computational humanities as well as reviewing a number of cognate fields,...
Wilson, Daniel C S
machine learning, AI for GLAM, STS, scale, computational humanities, history, and congruence engine
-
Dataset
Glasgow Courier
Glasgow Courier was a thrice weekly/bi-weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Kenilworth Advertiser
Kenilworth Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Blandford Weekly News
Blandford Weekly News was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Bridlington and Quay Gazette
Bridlington and Quay Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
British Miner and General Newsman
British Miner and General Newsman was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Swansea Journal and South Wales Liberal
Swansea Journal and South Wales Liberal was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Birkenhead News
Birkenhead News was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Atherstone, Nuneaton, and Warwickshire Times
Atherstone, Nuneaton, and Warwickshire Times was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Weekly Journal
The file consists of the OCR (Optical Character Recognition) text in XML format for one year of Weekly Journal (Hartlepool) 1901. The full digitised newspaper comprises no. 1–407 (29 Nov. 1901 – 17 Sep. 1909). The digitised page images are available on the British Newspaper Archive website, https://www.britishnewspaperarchive.co.uk/titles/weekly-journal-hartlepool The British Newspaper Archive...
British Library
-
Dataset
North Cumberland Reformer
North Cumberland Reformer (1890 - 1898) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Tamworth Miners' Examiner and Working Men's Journal
Tamworth Miners' Examiner and Working Men's Journal (1873 - 1876) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Alston Herald, and East Cumberland Advertiser
Alston Herald, and East Cumberland Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Lancaster Herald and Town and County Advertiser
Lancaster Herald and Town and County Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Shropshire Examiner
Shropshire Examiner (1874-1877) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Colne Valley Guardian
Colne Valley Guardian was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
South Staffordshire Examiner
South Staffordshire Examiner (1874) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Stalybridge Examiner
Stalybridge Examiner (1876) was a newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Abstract
Historic machines from 'prams' to 'Parliament': new avenues for collaborative linguistic research
Research in computational linguistics has made successful attempts at modelling word meaning at scale, but much remains to be done to put these computational models to the test of historical scholarship (see e.g. Beelen et al. 2021). More importantly, a lot of computational research looks at texts in a historical...
Ridge, Mia ; Tolfo, Giorgia ; Westerling, Kalle ; Pedrazzini, Nilo ; McGillivray, Barbara
crowdsourcing, computational linguistics, and digital humanities
-
Dataset
Warrington Examiner
Warrington Examiner (1869-1901) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Stockton Herald, South Durham and Cleveland Advertiser
Stockton Herald, South Durham and Cleveland Advertiser (1858 - 1918) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Central Glamorgan Gazette
Central Glamorgan Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Barrow Herald and Furness Advertiser
Barrow Herald and Furness Advertiser (1863 - 1914) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
The Runcorn Examiner
The Runcorn Examiner (1870-1954) was a weekly newspaper; the years 1870-1920 have been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Weymouth Telegram
Weymouth Telegram (1860 - 1901) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Living with Machines Zooniverse Participant Survey
Summary results from a survey of contributors to Living with Machines Zooniverse crowdsourcing projects. Responses were received between 24 May and 13 June 2022. We designed the survey so that we could align our reporting with two other audience / participant research groups. Firstly, we used the demographic categories that...
British Library
online volunteering, digital participation, citizen science, citizen history, questionnaire, crowdsourcing, survey, and audience research
-
Dataset
The Newspaper Press Directory (1881-1920)
Newspaper directories produced and published annually in 19th-century Britain by advertising agent Charles Mitchell. Newspapers are listed primarily in alphabetical order of the town where the title was published. Information for each title included: features connected with the district such as population and trade; principal towns in...
C. Mitchell and Co. ; British Library
-
Dataset
Frederick May's London Press Dictionary and Advertiser's Handbook (1883-1911)
Newspaper directories produced and published annually in 19th-century Britain by advertising agent Frederick May and successors, containing information on newspapers, magazines and periodicals and arranged in alphabetical and sometimes tabular order. Information for each title included price, publisher, office, political and religious leaning.
Frederick May & Son ; British Library
-
Dataset
Stretford and Urmston Examiner
Stretford and Urmston Examiner (1879 - 1880) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Exhibition object labels
Living with Machines: Human stories from the industrial age (exhibition board text)
‘Living with Machines: Human stories from the industrial age’ was a free exhibition at Leeds City Museum from July 2022 to January 2023. It explored how machines and mechanisation changed life and work in Leeds and the surrounding regions. A collaboration between the British Library and Leeds City Museum, the exhibition was...
Ridge, Mia ; McGoldrick, John
history of science, mechanisation, data science, industrialisation, history of technology, and information visualisation
-
Dataset
St. Helens Examiner
St. Helens Examiner was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Brighouse & Rastrick Gazette
Brighouse & Rastrick Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
The Stockton Examiner
The Stockton Examiner (1878-1879) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Warwickshire Herald
Warwickshire Herald was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Pontypridd District Herald
Pontypridd District Herald was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Darlington & Richmond Herald
Darlington & Richmond Herald was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Nuneaton Times
Nuneaton Times was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Poole Telegram
Poole Telegram was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Midland Examiner and Wolverhampton Times
Midland Examiner and Wolverhampton Times (1874-1878) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Potteries Examiner
Potteries Examiner (1871 - 1881) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Cotton Factory Times
Cotton Factory Times (1885-1889, 1891-1901) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Bridport, Beaminster and Lyme Regis telegram
Bridport, Beaminster and Lyme Regis telegram was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Northern Guardian (Hartlepool)
Northern Guardian (Hartlepool) (1891 - 1902) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Lancaster Standard and County Advertiser
Lancaster Standard and County Advertiser was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Forest of Dean Examiner
Forest of Dean Examiner (1873-1877) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Glasgow Chronicle
Glasgow Chronicle was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
The Cannock Chase Examiner
The Cannock Chase Examiner (1874-1877) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Book chapter
Hunting for Treasure: Living with Machines and the British Library Newspaper Collection
This chapter discusses the open access digitisation programme undertaken by Living with Machines, exploring the range of constraints that inform digitisation strategies and selection priorities. Because the landscape of digitised newspaper collections is so complex, and research and digitisation processes operate on different timelines, we have focused on opportunities to...
Tolfo, Giorgia ; Vane, Olivia ; Beelen, Kaspar ; Hosseini, Kasra ; Lawrence, Jon …
interdisciplinarity, digitised newspaper collections, digital corpus, research workflows, and digitisation strategy
-
Learning object
Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 1)
This is the first of a two-part lesson introducing deep learning based computer vision methods for humanities research. Using a dataset of historical newspaper advertisements and the fastai Python library, the lesson walks through the pipeline of training a computer vision model to perform image classification.
Strien, Daniel van ; Beelen, Kaspar ; Wevers, Melvin ; Smits, Thomas ; McDonough, Katherine
-
Dataset
Northern Weekly Gazette
Northern Weekly Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Swansea and Glamorgan Herald, and South Wales Free Press
Swansea and Glamorgan Herald, and South Wales Free Press (1847 - 1890) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Liverpool Weekly Courier
Liverpool Weekly Courier was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Dorset County Express and Agricultural Gazette
Dorset County Express and Agricultural Gazette was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Denton and Haughton Examiner
Denton and Haughton Examiner was a weekly newspaper which has been digitised by the British Library for the Living with Machines project. Variant titles are 1873-74 The Denton, Haughton, & District Weekly News. 1874-75 Denton & Haughton Weekly News, and Audenshaw, Hooley Hill, and Dukinfield Advertiser, 1875-78 Denton Examiner, Audenshaw,...
British Library
-
Dataset
Cradley Heath & Stourbridge Observer
Cradley Heath & Stourbridge Observer (1864 - 1888) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Widnes Examiner
Widnes Examiner (1876-1920) was a weekly newspaper which has been digitised by the British Library for the Living with Machines project.
British Library
-
Dataset
Diachronic word embeddings from 19th-century newspapers digitised by the British Library (1800-1919)
Word vectors related to the paper "Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers" by Nilo Pedrazzini and Barbara McGillivray (2022). The embeddings were trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and specific parameters. The embeddings are divided into...
Pedrazzini, Nilo ; McGillivray, Barbara
historical semantics, word-vectors, late-modern-english, newspapers, diachronic-embeddings, and word2vec
-
Journal article
A Dataset for Toponym Resolution in Nineteenth-Century English Newspapers
We present a new dataset for the task of toponym resolution in digitized historical newspapers in English. It consists of 343 annotated articles from newspapers based in four different locations in England (Manchester, Ashton-under-Lyne, Poole and Dorchester), published between 1780 and 1870. The articles have been manually annotated with mentions...
-
Conference paper (published)
MapReader: a computer vision pipeline for the semantic exploration of maps at scale
We present MapReader, a free, open-source software library written in Python for analyzing large map collections. MapReader allows users with little computer vision expertise to i) retrieve maps via web-servers; ii) preprocess and divide them into patches; iii) annotate patches; iv) train, fine-tune, and evaluate deep neural network models; and...
Hosseini, Kasra ; Wilson, Daniel C. S. ; Beelen, Kaspar ; McDonough, Katherine
maps and ordnance survey
-
Dataset
Ordnance Survey Old / First series England and Wales 1:63360 (georeferenced sheet images)
Map sheet images for the Ordnance Survey Old Series / First Series England and Wales 1:63360, georeferenced and cropped at the neatline (can be viewed together as a seamless composite). GeoTIFF format. The original (ungeoreferenced) sheet images can be found at: https://commons.wikimedia.org/wiki/Category:Ordnance_Survey_Old/First_series_England_and_Wales_1:63360_(full_sheets). The sheets were georeferenced by relating the sheet...
Vane, Olivia
England, First Series, Old Series, maps, Ordnance Survey, and Wales
-
Dataset
May's British and Irish Press Guide and Advertiser's Handbook & Dictionary etc. (1871-1880)
Newspaper directories produced and published annually in 19th-century Britain by advertising agent Frederick May and successors, containing information on newspapers, magazines and periodicals and arranged in alphabetical and sometimes tabular order. Information for each title included price, publisher, office, political and religious leaning.
Frederick May & Son ; British Library
-
Dataset
The Newspaper Press Directory (1846-1880)
Newspaper directories produced and published annually in 19th-century Britain by advertising agent Charles Mitchell. Newspapers are listed primarily in alphabetical order of the town where the title was published. Information for each title included: features connected with the district such as population and trade; principal towns in...
C. Mitchell and Co. ; British Library
-
Conference paper (published)
When Time Makes Sense: A Historically-Aware Approach to Targeted Sense Disambiguation
As languages evolve historically, making computational approaches sensitive to time can improve performance on specific tasks. In this work, we assess whether applying historical language models and time-aware methods helps with determining the correct sense of polysemous words. We outline the task of time-sensitive Targeted Sense Disambiguation (TSD), which aims...
Beelen, Kaspar ; Nanni, Federico ; Coll Ardanuy, Mariona ; Hosseini, Kasra ; Tolfo, Giorgia …
-
Journal article
MapReader: A Computer Vision Pipeline for the Semantic Exploration of Maps at Scale
We present MapReader, a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital). This library transforms the way historians can use maps by turning extensive, homogeneous map sets into searchable primary sources. MapReader allows users with little or no computer vision expertise to i)...
Hosseini, Kasra ; Wilson, Daniel C.S. ; Beelen, Kaspar ; McDonough, Katherine
-
Research report
Data Study Group Final Report: Smart monitoring for conservation areas
WWF (World Wide Fund for Nature) monitors over 250,000 protected areas (e.g. national parks and nature reserves) and thousands of other sites and critical habitats. These sites are the foundation of global natural assets and are central to the preservation of biodiversity and human well-being. Unfortunately, they face increasing pressures...
-
Abstract
Using smart annotations to map the geography of newspapers
Geographic information is a key component in the description of collection objects, and yet its format is often unsuited for use with methods of geographic analysis. Catalogue entries are often inconsistent, in plain text, and without geographic coordinates (much less coordinates linked to authority records). Georesolution of the relevant fields...
Ryan, Yann ; Coll Ardanuy, Mariona ; van Strien, Daniel ; Hosseini, Kasra ; Beelen, Kaspar …
-
Conference paper (unpublished)
Assessing the Impact of OCR Quality on Downstream NLP Tasks
A growing volume of heritage data is being digitized and made available as text via optical character recognition (OCR). Scholars and libraries are increasingly using OCR-generated text for retrieval and analysis. However, the process of creating text through OCR introduces varying degrees of error to the text. The impact of...
-
Conference paper (published)
DeezyMatch: A Flexible Deep Learning Approach to Fuzzy String Matching
We present DeezyMatch, a free, open-source software library written in Python for fuzzy string matching and candidate ranking. Its pair classifier supports various deep neural network architectures for training new classifiers and for fine-tuning a pretrained model, which paves the way for transfer learning in fuzzy string matching. This approach...
Hosseini, Kasra ; Nanni, Federico ; Coll Ardanuy, Mariona
Natural Language Processing, string matching, toponym matching, machine learning, and digital humanities
-
Dataset
Living with Machines alpha and beta Zooniverse 'accident' task data
Data created through crowdsourcing tasks hosted on the Zooniverse platform. Members of the public were asked to look at a selection of articles from 19th century newspapers that mentioned machines and decide if they described an industrial accident. A further task asked participants to transcribe personal, organisational and place names...
Zooniverse volunteers
crowdsourcing, digital history, citizen history, Living with Machines, newspapers, and digital humanities
-
Conference paper (published)
Living Machines: A study of atypical animacy
This paper proposes a new approach to animacy detection, the task of determining whether an entity is represented as animate in a text. In particular, this work is focused on atypical animacy and examines the scenario in which typically inanimate objects, specifically machines, are given animate attributes. To address it,...
Coll Ardanuy, Mariona ; Nanni, Federico ; Beelen, Kaspar ; Hosseini, Kasra ; Ahnert, Ruth …
nineteenth-century English, living machines, BERT, and animacy
-
Dataset
Living Machines atypical animacy dataset
Atypical animacy detection dataset, based on nineteenth-century sentences in English extracted from an open dataset of nineteenth-century books digitized by the British Library (available via https://doi.org/10.21250/db14, British Library Labs, 2014). This dataset contains 598 sentences containing mentions of machines. Each sentence has been annotated according to the animacy and humanness...
-
Conference paper (unpublished)
Defoe: A Spark-Based Toolbox for Analysing Digital Historical Textual Data
This work presents defoe, a new scalable and portable digital eScience toolbox that enables historical research. It allows for running text mining queries across large datasets, such as historical newspapers and books in parallel via Apache Spark. It handles queries against collections that comprise several XML schemas and physical representations....
-
Conference paper (published)
Resolving places, past and present: toponym resolution in historical British newspapers using multiple resources
Newspapers and their metadata are richly geographical, not only in their distribution but also their content. Attending to these spatial features is a prerequisite in newspaper research. Following other projects to have geoparsed place names in newspapers, we describe our approach to linking historical geospatial information in text to real-world...
Coll Ardanuy, Mariona ; McDonough, Katherine ; Krause, Amrey ; Wilson, Daniel C.S. ; Hosseini, Kasra …
-
Journal article
Babbage among the insurers: Big 19th-century data and the public interest
This article examines life assurance and the politics of ‘big data’ in mid-19th-century Britain. The datasets generated by life assurance companies were vast archives of information about human longevity. Actuaries distilled these archives into mortality tables – immensely valuable tools for predicting mortality and so pricing risk. The status of...
Wilson, Daniel C.S.
big data, Thomas Rowe Edmonds, Charles Babbage, public interest, and insurance