Text and Data Mining in EThOS - British Library Research Repository
Research report

Text and Data Mining in EThOS

October 2014


EThOS (https://ethos.bl.uk) is the UK's national thesis service, listing over 500,000 doctoral theses and providing immediate access to over 300,000 digital theses. In 2014 a new copyright exception for non-commercial text and data mining (TDM) came into force, permitting certain acts of copying that would otherwise constitute copyright infringement, notably the copying of an entire work rather than only copying that qualifies as fair dealing with the work. This report explores the opportunities, challenges, risks and workflows that the British Library and researchers would need to consider in order to support text and data mining of the EThOS corpus of doctoral theses. The author undertook the work as part of the National Compound Collection project, supported by the Royal Society of Chemistry, which investigated the identification and re-use of chemical compounds reported in doctoral theses.


Date uploaded: 17 Feb 2020
File size: 852 kB