Conference paper (unpublished)
Defoe: A Spark-Based Toolbox for Analysing Digital Historical Textual Data
This work presents defoe, a new scalable and portable digital eScience toolbox that enables historical research. It runs text mining queries in parallel via Apache Spark across large datasets, such as historical newspapers and books. It handles queries against collections that comprise several XML schemas and physical representations....