Search Results
-
Research report
UK Web Archive Quarterly Report: July, August and September 2020
This is the Web Archiving Statistics 2nd Quarter Report for 2020/2021. It presents statistics about targets (titles) created, 'Save a UK website' nominations, UKWA scope and usage. It is our intention to distribute this report quarterly (July, October, January, and April) with a more comprehensive report at the end of...
Webber, Jason
-
Research report
UK Web Archive Quarterly Report: April, May and June 2020
This is the first Web Archiving Statistics Quarterly Report for 2020/2021. It is our intention to distribute this report quarterly (July, October, January, and April) with a more comprehensive report at the end of the financial year.
Webber, Jason
-
Research report
UK Web Archive Annual Report 2019
This report collates the quarterly web archiving statistical reports from the preceding year (1st April 2019 to 31st March 2020). It mostly covers headline statistics but also highlights other notable areas of interest, such as collection development and projects that have either been completed or are still ongoing. The report...
UK Web Archive
-
Dataset
DUKweb (Diachronic UK web)
We present DUKweb, a set of large-scale resources useful for the diachronic analysis of contemporary English. The dataset is derived from the JISC UK Web Domain Dataset (1996-2013), which collects resources from the Internet Archive that were hosted on domains ending in ‘.uk’. The dataset includes co-occurrence matrices for each year...
Basile, Pierpaolo; Tsakalidis, Adam
-
Dataset
UK Selective Web Archive Classification Dataset. 1996 - 2010. TSV.
The dataset comprises a manually curated selective archive produced by UKWA which includes the classification of sites into a two-tiered subject hierarchy. In partnership with the Internet Archive and JISC, UKWA had obtained access to the subset of the Internet Archive’s web collection that relates to the UK. The JISC...
UK Web Archive
archive, web domain dataset, JISC UK, classification dataset, UKWA Open Data, and 1996-2014
-
Dataset
JISC UK Web Domain Dataset Format Profile. 1996 - 2010.
The dataset is a format profile, summarising media type (MIME type) data formats contained within all of the HTTP 200 OK responses in the 1996 - 2010 tranche of the JISC UK Web Domain Dataset. In partnership with the Internet Archive and JISC, UKWA had obtained access to the subset...
UK Web Archive
archive, 1996-2010, web domain dataset, JISC UK, UKWA Open Data, and format profile
-
Dataset
JISC UK Web Domain Dataset Host Link Graph. 1996 - 2010. TSV.
The dataset comprises ~2.5 billion 200 OK responses from the 1996 - 2010 tranche of the JISC UK Web Domain Dataset which have been scanned for hyperlinks. For each link, UKWA extracts the host that the link targets, and uses this to build up a picture of which hosts have...
UKWA Open Data
archive, 1996-2012, web domain dataset, JISC UK, host link graph, and UKWA Open Data
-
Dataset
JISC UK Web Domain Dataset Geoindex. 1996 - 2010. TSV.
The dataset comprises ~2.5 billion 200 OK responses in the 1996 - 2010 tranche of the JISC UK Web Domain Dataset which have been scanned for geographic references - specifically postcodes. This set of postcode citations, found at particular URLs and crawled at particular times, forms an historical geoindex...
UK Web Archive
archive, 1996-2011, JISC UK, geoindex, UKWA Open Data, and web domain dataset
-
Dataset
JISC UK Web Domain Dataset Crawled URL Index. 1996 - 2013. CDX.
The dataset comprises original compound index (CDX) files that have been re-assembled into 18 separate CDX files, one for each year of crawling activity represented (1996 - 2013). Please note that the individual CDX files are not sorted. In order to enable access to web archives, UKWA uses CDX files to...
UKWA Open Data
archive, 1996-2013, crawled URL index, web domain dataset, JISC UK, and UKWA Open Data