Search Constraints
Search Results
-
Conference paper (published)
MapReader: a computer vision pipeline for the semantic exploration of maps at scale
We present MapReader, a free, open-source software library written in Python for analyzing large map collections. MapReader allows users with little computer vision expertise to i) retrieve maps via web-servers; ii) preprocess and divide them into patches; iii) annotate patches; iv) train, fine-tune, and evaluate deep neural network models; and...Hosseini, Kasra ; Wilson, Daniel C. S. ; Beelen, Kaspar ; McDonough, Katherine
maps and ordnance survey
-
Other
Models for MapReader ACM SIGSPATIAL 2023 Geohumanities Workshop paper
Collection of fine-tuned models created during research published in Kasra Hosseini, Daniel C. S. Wilson, Kaspar Beelen, and Katherine McDonough. 2022. MapReader: a computer vision pipeline for the semantic exploration of maps at scale. In Proceedings of the 6th ACM SIGSPATIAL International Workshop on Geospatial Humanities (GeoHumanities '22). Association for...Hosseini, Kasra ; Beelen, Kaspar ; McDonough, Katherine ; Wilson, Daniel C. S.
computational humanities, computer vision, maps, models, and image classification
-
Learning object
Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 2)
This is the second of a two-part lesson introducing deep learning based computer vision methods for humanities research. This lesson digs deeper into the details of training a deep learning based computer vision model. It covers some challenges one may face due to the training data used and the importance...Strien, Daniel van ; Beelen, Kaspar ; Wevers, Melvin ; Smits, Thomas ; McDonough, Katherine
-
Learning object
Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 1)
This is the first of a two-part lesson introducing deep learning based computer vision methods for humanities research. Using a dataset of historical newspaper advertisements and the fastai Python library, the lesson walks through the pipeline of training a computer vision model to perform image classification.Strien, Daniel van ; Beelen, Kaspar ; Wevers, Melvin ; Smits, Thomas ; McDonough, Katherine
-
Book chapter
Hunting for Treasure: Living with Machines and the British Library Newspaper Collection
This chapter discusses the open access digitisation programme undertaken by Living with Machines, exploring the range of constraints that inform digitisation strategies and selection priorities. Because the landscape of digitised newspaper collections is so complex, and research and digitisation processes operate on different timelines, we have focused on opportunities to...Tolfo, Giorgia ; Vane, Olivia ; Beelen, Kaspar ; Hosseini, Kasra ; Lawrence, Jon …
interdisciplinarity, digitised newspaper collections, digital corpus, research workflows, and digitisation strategy
-
Journal article
A Dataset for Toponym Resolution in Nineteenth-Century English Newspapers
We present a new dataset for the task of toponym resolution in digitized historical newspapers in English. It consists of 343 annotated articles from newspapers based in four different locations in England (Manchester, Ashton-under-Lyne, Poole and Dorchester), published between 1780 and 1870. The articles have been manually annotated with mentions... -
Journal article
MapReader: A Computer Vision Pipeline for the Semantic Exploration of Maps at Scale
We present MapReader, a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital). This library transforms the way historians can use maps by turning extensive, homogeneous map sets into searchable primary sources. MapReader allows users with little or no computer vision expertise to i)...Hosseini, Kasra ; Wilson, Daniel C.S. ; Beelen, Kaspar ; McDonough, Katherine
-
Journal article
Neural Language Models for Nineteenth-Century English
We present four types of neural language models trained on a large historical dataset of books in English, published between 1760-1900 and comprised of ~5.1 billion tokens. The language model architectures include static (word2vec and fastText) and contextualized models (BERT and Flair). For each architecture, we trained a model instance...Hosseini, Kasra ; Beelen, Kaspar ; Colavizza, Giovanni ; Coll Ardanuy, Mariona
-
Conference paper (published)
When Time Makes Sense: A Historically-Aware Approach to Targeted Sense Disambiguation
As languages evolve historically, making computational approaches sensitive to time can improve performance on specific tasks. In this work, we assess whether applying historical language models and time-aware methods help with determining the correct sense of polysemous words. We outline the task of time-sensitive Targeted Sense Disambiguation (TSD), which aims...Beelen, Kaspar ; Nanni, Federico ; Coll Ardanuy, Mariona ; Hosseini, Kasra ; Tolfo, Giorgia …
- « Previous
- Next »
- 1
- 2
- 3