Search results
-
Conference paper (published)
MapReader: a computer vision pipeline for the semantic exploration of maps at scale
We present MapReader, a free, open-source software library written in Python for analyzing large map collections. MapReader allows users with little computer vision expertise to i) retrieve maps via web-servers; ii) preprocess and divide them into patches; iii) annotate patches; iv) train, fine-tune, and evaluate deep neural network models; and...
Hosseini, Kasra ; Wilson, Daniel C. S. ; Beelen, Kaspar ; McDonough, Katherine
maps and ordnance survey
-
Conference paper (published)
Los libros españoles del Dr. William Bates (1625-1699) en la Dr. Williams’s Library de Londres
Taylor, Barry
-
Conference paper (published)
Resolving places, past and present: toponym resolution in historical British newspapers using multiple resources
Newspapers and their metadata are richly geographical, not only in their distribution but also their content. Attending to these spatial features is a prerequisite in newspaper research. Following other projects to have geoparsed place names in newspapers, we describe our approach to linking historical geospatial information in text to real-world...
Coll Ardanuy, Mariona ; McDonough, Katherine ; Krause, Amrey ; Wilson, Daniel C.S. ; Hosseini, Kasra …
-
Conference paper (published)
DeezyMatch: A Flexible Deep Learning Approach to Fuzzy String Matching
We present DeezyMatch, a free, open-source software library written in Python for fuzzy string matching and candidate ranking. Its pair classifier supports various deep neural network architectures for training new classifiers and for fine-tuning a pretrained model, which paves the way for transfer learning in fuzzy string matching. This approach...
Hosseini, Kasra ; Nanni, Federico ; Coll Ardanuy, Mariona
Natural Language Processing, string matching, toponym matching, machine learning, and digital humanities
-
Conference paper (published)
Living Machines: A study of atypical animacy
This paper proposes a new approach to animacy detection, the task of determining whether an entity is represented as animate in a text. In particular, this work is focused on atypical animacy and examines the scenario in which typically inanimate objects, specifically machines, are given animate attributes. To address it,...
Coll Ardanuy, Mariona ; Nanni, Federico ; Beelen, Kaspar ; Hosseini, Kasra ; Ahnert, Ruth …
nineteenth-century English, living machines, BERT, and animacy
-
Conference paper (published)
When Time Makes Sense: A Historically-Aware Approach to Targeted Sense Disambiguation
As languages evolve historically, making computational approaches sensitive to time can improve performance on specific tasks. In this work, we assess whether applying historical language models and time-aware methods help with determining the correct sense of polysemous words. We outline the task of time-sensitive Targeted Sense Disambiguation (TSD), which aims...
Beelen, Kaspar ; Nanni, Federico ; Coll Ardanuy, Mariona ; Hosseini, Kasra ; Tolfo, Giorgia …
-
Conference paper (published)
“Webcomics Archive? Now I'm Interested”: Comics Readers Seeking Information in Web Archives
There is a longstanding tradition of understanding information needs and interaction behavior across different user groups to inform the design of digital products and services. There is a gap in such research of comics readers, specifically how they seek and interact with the information and interfaces of web-based archives provided...
Berube, Linda ; Makri, Stephann ; Cooke, Ian ; Priego, Ernesto ; Wisdom, Stella
-
Conference paper (published)
Locating a National Collection through Audience Research DH2022 Long Abstract
This abstract was submitted to the DH2022 conference where I presented a long paper. It explores how geography can help to engage the public with digital cultural heritage collections. It draws on audience research that examined values and motivations in the UK alongside the use of location-based interfaces such as...
Rees, Gethin ; Vitale, Valeria ; Hunt, Alex ; Horgan, John ; Strachan, Peter
location, metadata, web maps, geography, cultural heritage, and interface design
-
Conference paper (published)
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
As language models grow ever larger, the need for large-scale high-quality text datasets has never been more pressing, especially in multilingual settings. The BigScience workshop, a 1-year international and multidisciplinary initiative, was formed with the goal of researching and training large language models as a values-driven undertaking, putting issues of...
Laurençon, Hugo ; Saulnier, Lucile ; Wang, Thomas ; Akiki, Christopher ; Villanova del Moral, Albert …
-
Conference paper (published)
Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0
In this work, we explore whether the recently demonstrated zero-shot abilities of the T0 model extend to Named Entity Recognition for out-of-distribution languages and time periods. Using a historical newspaper corpus in 3 languages as test-bed, we use prompts to extract possible named entities. Our results show that a naive...
De Toni, Francesco ; Akiki, Christopher ; De La Rosa, Javier ; Fourrier, Clémentine ; Manjavacas, Enrique …