Data Study Group Final Report: The National Archives, UK: Discovering Topics and Trends in the UK Government Web Archive
Creator
Beavan, David
Nanni, Federico
2021
Abstract
This report addresses two related challenges: taking steps towards improving search and discovery of resources within the vast UK Government Web Archive (UKGWA) for future archive users, and exploring how the UKGWA collection could begin to be unlocked for research and experimentation by approaching it as data (i.e. as a dataset at scale). The UKGWA has independently begun to examine the usefulness of modelling the hyperlinked structure of its collection for advanced corpus exploration; the aim of this collaboration is to test algorithms capable of searching for documents via the topics that they cover (e.g. ‘climate change’), envisioning a future convergence of these two research frameworks. The UKGWA is a diachronic corpus, which makes it ideal for studying how topics emerge and how they feature across government websites over time; such a study will in turn indicate engagement priorities and how these change over time.