Diachronic word embeddings from 19th-century newspapers digitised by the British Library (1800-1919)
Creator
Pedrazzini, Nilo
McGillivray, Barbara
2022
Abstract
Word vectors related to the paper "Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers" by Nilo Pedrazzini and Barbara McGillivray (2022).
The embeddings were trained with Word2Vec, using a specific set of parameters, on a 4.2-billion-word corpus of 19th-century British newspapers.
The embeddings are divided into ten-year periods, with the vectors from each decade aligned to those of the most recent decade (the 1910s) using Orthogonal Procrustes.
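As an illustration of the alignment step, the following is a minimal sketch of Orthogonal Procrustes in NumPy. It is not the authors' code. The function name `procrustes_align` and the toy matrices are hypothetical, and the sketch assumes that both matrices have already been restricted to a shared vocabulary so that row i refers to the same word in each decade.

```python
import numpy as np


def procrustes_align(source, target):
    """Rotate `source` onto `target` with the best orthogonal map.

    Both arrays have shape (n_words, dim); row i must be the same word in both.
    """
    # The SVD of the cross-covariance matrix yields the orthogonal matrix R
    # that minimises the Frobenius norm of (source @ R - target).
    u, _, vt = np.linalg.svd(source.T @ target)
    rotation = u @ vt
    return source @ rotation, rotation


# Toy data standing in for two decades (hypothetical, not the released vectors).
rng = np.random.default_rng(0)
target = rng.normal(size=(50, 8))              # e.g. the reference decade
true_rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))
source = target @ true_rotation.T              # an earlier decade, rotated away

aligned, rotation = procrustes_align(source, target)
```

Because the map is orthogonal, distances and cosine similarities within each decade are preserved. Only the orientation of the space changes, which is what makes vectors from different decades directly comparable after alignment.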
See related GitHub repository for the full documentation: https://github.com/Living-with-machines/DiachronicEmb-BigHistData.
Project webpage (Living with Machines): https://livingwithmachines.ac.uk/