Station to Station: Linking and Enriching Historical British Railway Data
Creator
Coll Ardanuy, Mariona
Beelen, Kaspar
Lawrence, Jon
McDonough, Katherine
Nanni, Federico
Rhodes, Joshua
Tolfo, Giorgia
Wilson, Daniel C.S.
2021
Abstract
The transformative impact of the railway on nineteenth-century British society has been widely recognized, but understanding that process at scale remains challenging because the Victorian rail network was both vast and in a state of constant flux. Michael Quick’s reference work Railway Passenger Stations in Great Britain: a Chronology offers a uniquely rich and detailed account of Britain’s changing railway infrastructure. Its listing of over 12,000 stations allows us to reconstruct the coming of rail at both micro- and macro-scales; however, because it was originally published as a book, this resource was not well suited for systematic linking to other geographical data. This paper shows how such a minimally structured historical directory can be transformed into an openly available, structured and linked dataset, named StopsGB (Structured Timeline of Passenger Stations in Great Britain), which will be of widespread interest across the historical, digital library and semantic web communities. To achieve this, we use traditional parsing techniques to convert the original document into a structured dataset of railway stations, with attributes containing information such as operating companies and opening and closing dates. We then identify a set of potential Wikidata candidates for each station using DeezyMatch, a deep neural approach to fuzzy string matching, and use a supervised classification approach to determine the best matching entity.
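The candidate-selection step described in the abstract can be illustrated with a minimal sketch. It does not use DeezyMatch or reproduce the authors' pipeline: Python's standard-library `difflib.SequenceMatcher` stands in for the neural fuzzy matcher, the function names `similarity` and `rank_candidates` are hypothetical, and the candidate identifiers are placeholders rather than real Wikidata IDs. The supervised classification step that picks the final entity is omitted.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in for a neural matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_candidates(station_name: str, candidates: dict, top_k: int = 3) -> list:
    """Return up to top_k (candidate_id, label, score) tuples, best score first."""
    scored = [
        (cid, label, similarity(station_name, label))
        for cid, label in candidates.items()
    ]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_k]


# Illustrative labels keyed by placeholder identifiers (assumption: not real Wikidata IDs).
candidates = {
    "Q_PLACEHOLDER_1": "Abbey Wood railway station",
    "Q_PLACEHOLDER_2": "Abbey Town railway station",
    "Q_PLACEHOLDER_3": "Aberdeen railway station",
}

# A station name as it might appear after parsing the directory (assumed form).
ranked = rank_candidates("ABBEY WOOD", candidates)
for cid, label, score in ranked:
    print(f"{cid}\t{label}\t{score:.3f}")
```

In a fuller pipeline, the ranked list would be passed to a classifier that also weighs attributes such as location or opening dates before committing to a single entity.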