Search results
-
Dataset
OCR and crowdsourced annotations, Language of Mechanisation, JSON files
Datasets produced through crowdsourcing tasks run on the Zooniverse platform by the Living with Machines ‘language of mechanisation’ project team. Building on earlier work classifying machines by function, we asked volunteers on Zooniverse 'how did the word x change over time and place?' and presented them with options for...
-
Dataset
Language of Mechanisation: annotated historical newspaper articles
Datasets produced through crowdsourcing tasks run on the Zooniverse platform by the Living with Machines ‘language of mechanisation’ project team. Building on earlier work classifying machines by function, we asked volunteers on Zooniverse 'how did the word x change over time and place?' and presented them with options for...
-
Dataset
UK Doctoral Thesis Metadata from EThOS
The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS, the UK's national thesis service. We estimate the data covers around 98% of all PhDs ever awarded by UK Higher Education institutions, dating back to 1787. Thesis metadata from every PhD-awarding university in...
British Library; Rosie, Heather
higher education, student, UK, dissertations, PhD, theses, doctoral, ethos, thesis, and research
-
Software
Hybrid Correspondence Network Processing Script
The Python code was developed to interrogate the ways in which digital and analogue correspondence files (letters and e-mails) function within the Archive of Harold Pinter, reflecting upon what these patterns might mean for archivists, curators and researchers working with hybrid correspondence collections. This code is collection agnostic and...
Mckean, Callum
Harold Pinter, data science, hybrid archives, and visualisations
-
Dataset
EAP031 Catalogue Metadata
This Excel spreadsheet contains the metadata that describes the archival collection digitised in Mongolia by the EAP031 "The Treasures of Danzan Ravjaa" project team. The metadata was originally created by the EAP031 project team that digitised the archive in 2005. The project team was led by Professor Caroline Humphrey. This...
EAP031 Project Team
metadata, manuscripts, and Tibetan
-
Dataset
EAP696 Catalogue Metadata
This Excel spreadsheet contains the metadata that describes the archival collection digitised in Bulgaria by the EAP696 "Minority press in Ottoman Turkish in Bulgaria" project team. The metadata was originally created by the EAP696 project team that digitised the archive in 2014. The project team was led by Mr Stoyan...
EAP696 Project Team
-
Geographical dataset
Sarah FitzGerald's PhD placement project folder
This dataset is a zip file that contains the complete folder structure that Sarah used to manage this project. The content includes her planning, work, and outcomes, in the form of reports, presentations and blog posts. In addition to the data visualisations on the projects relating to Africa, Sarah also...
FitzGerald, Sarah
West Africa, research collaboration, projects, Africa, humanities, digital scholarship, and data visualisation
-
Dataset
Datasets for toponym recognition and disambiguation for nineteenth-century English newspapers
We present two datasets, one for the task of toponym recognition and one for the task of toponym disambiguation. The datasets are derived from the "Dataset for Toponym Resolution in Nineteenth-Century English Newspapers" (DOI: https://doi.org/10.23636/r7d4-kw08). The toponym recognition dataset consists of two JSON files (ner_fine_train.json and ner_fine_dev.json), whereas the toponym...
Coll Ardanuy, Mariona; Nanni, Federico
toponym disambiguation, nineteenth-century newspapers, named entity recognition, entity linking, toponym resolution, toponym recognition, and dataset
-
Dataset
DeezyMatch training set for OCR
Optical character recognition (OCR) is the process of automatically transcribing text from images. The presence of OCR-induced errors in digitised text is a common problem in the digital humanities. OCR errors are usually due to the misrecognition of characters, such as "h" recognised as "b", or "c" recognised as "o"....
-
Dataset
Incunabula Printed Catalogue Dataset: Volumes 1-10 copy of github repository
This dataset includes the GitHub repository used to derive catalogue entries from volumes 1-10 of the "Catalogue of books printed in the 15th century now at the British Museum" (known as BMC). The BMC was published between 1908 and 2007 and comprises detailed descriptions of the incunabula collection at the British Library....
British Library
book history, metadata, catalogues, datasets, incunabula, early printed books, and early printing