Search results
-
Dataset
Diachronic word embeddings from 19th-century newspapers digitised by the British Library (1800-1919)
Word vectors related to the paper "Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers" by Nilo Pedrazzini and Barbara McGillivray (2022). The embeddings were trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and specific parameters. The embeddings are divided into...
Pedrazzini, Nilo; McGillivray, Barbara
historical semantics, word-vectors, late-modern-english, newspapers, diachronic-embeddings, and word2vec
-
Dataset
Decade-level Word2Vec models from automatically transcribed 19th-century newspapers digitised by the British Library (1800-1919)
Word embeddings trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and specific parameters. The embeddings are divided into periods of ten years each. Unlike those in this repository, these were not aligned, and OCR errors were not skimmed from the vocabulary. See related GitHub repository for the full documentation:...
Pedrazzini, Nilo
historical semantics, British newspapers, word embeddings, word vectors, word2vec, and Late Modern English
-
Dataset
Diachronic and diatopic word embeddings from newspapers digitised by the British Library (1830-1889): North and South England
Diachronic word embeddings (decade-level) trained with Word2Vec (via Gensim) on different geographic subcorpora of the Heritage Made Digital British and the Living with Machines historical newspaper collections:
- North England (north.zip)
- South England (south.zip)
At the moment, for each subcorpus, Word2Vec models are available for each decade in the...
Pedrazzini, Nilo; McGillivray, Barbara
historical semantics, diachronic embeddings, late modern English, word embeddings, word vectors, word2vec, and diatopic embeddings
-
Dataset
UK Doctoral Thesis Metadata from EThOS
This dataset has been superseded by a more recent version (https://doi.org/10.23636/j278-4b96). If you require access to an earlier version, please email openaccess@bl.uk, including the dataset title, date, and DOI in your request. The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS, the...
British Library; Rosie, Heather
higher education, student, EThOS, research, doctoral, thesis, PhD, UK, dissertations, and theses
-
Dataset
UK Doctoral Thesis Metadata from EThOS
This dataset has been superseded by a more recent version (https://doi.org/10.23636/ybpt-nh33). If you require access to an earlier version, please email openaccess@bl.uk, including the dataset title, date, and DOI in your request. The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS, the...
British Library; Rosie, Heather
higher education, ethos, dissertations, thesis, research, PhD, doctoral, student, UK, and theses
-
Dataset
IMPACT Digitisation Centre of Competence Dataset
The Impact Centre of Competence dataset contains more than half a million representative text-based images compiled by a number of major European libraries. Covering texts from as early as 1500, and containing material from newspapers, books, pamphlets and typewritten notes, the dataset is an invaluable resource for future research into...
Universitat d’Alacant; Instituut voor de Nederlandse Taal; Koninklijke Bibliotheek; Bibliothèque Nationale de France; British Library …
-
Dataset
UK Doctoral Thesis Metadata from EThOS
This dataset has been superseded by a more recent version (https://doi.org/10.23636/1188). If you require access to an earlier version, please email openaccess@bl.uk, including the dataset title, date, and DOI in your request. The data in this collection comprises the bibliographic metadata for all UK doctoral theses listed in EThOS, the...
British Library; Rosie, Heather
-
Dataset
JISC UK Web Domain Dataset Format Profile. 1996 - 2010.
The dataset is a format profile, summarising media type (MIME type) data formats contained within all of the HTTP 200 OK responses in the 1996 - 2010 tranche of the JISC UK Web Domain Dataset. In partnership with the Internet Archive and JISC, UKWA had obtained access to the subset...
UK Web Archive
archive, 1996-2010, web domain dataset, JISC UK, UKWA Open Data, and format profile
-
Dataset
JISC UK Web Domain Dataset Host Link Graph. 1996 - 2010. TSV.
The dataset comprises ~2.5 billion 200 OK responses from the 1996 - 2010 tranche of the JISC UK Web Domain Dataset which have been scanned for hyperlinks. For each link, UKWA extracts the host that the link targets, and uses this to build up a picture of which hosts have...
UKWA Open Data
archive, 1996-2012, web domain dataset, JISC UK, host link graph, and UKWA Open Data