Big UK Domain Data for the Arts and Humanities builds on the work of two pilot projects, supported by Jisc under the 16/11 funding call, Strand B.
Analytical Access to the Domain Dark Archive (AADDA)
AADDA was a collaboration between the British Library and the Institute of Historical Research. The main aim of the project was to enhance the sustainability of the dataset derived from UK web space for the period 1996–2010, and to improve access to and understanding of web archives more generally. The project helped to increase our understanding of the UK web domain dataset: its gaps and inconsistencies, the differences in the timing and nature of the collection process, the various elements which constitute a page, the relationship between developing technologies and content, and even the problems that arise from archiving networked data at the page level. It also developed a beta interface that allows researchers to use the dataset more effectively. This work has informed scholarly access arrangements at the domain level, demonstrated the value of the data for researchers, and helped to provide a business case for further funding.
Project blog: http://domaindarkarchive.blogspot.co.uk/
Big Data: Demonstrating the Value of the UK Web Domain Dataset for Social Science Research
This project, a collaboration between the British Library and the Oxford Internet Institute, aimed to increase the visibility, accessibility, and ease of use of the JISC UK Web Domain Dataset, a 30-terabyte web archive of the .uk country-code top-level domain (ccTLD) collected from 1996 to 2010. The project extracted link graphs from the data, assessed the feasibility and impact of using the .uk ccTLD as a boundary for UK web presence, and conducted and disseminated high-quality social science research examples using the collection. It also trialled tools and procedures to make the data more easily accessible.
Project blog: http://www.oii.ox.ac.uk/research/projects/?id=88