This article arises from work by the Digital Curation Centre (DCC) Working Group examining mechanisms to roll out audit and certification services for digital repositories in the United Kingdom. Our attempt to develop a program for applying audit and certification processes and tools took as its starting point the RLG-NARA Audit Checklist for Certifying Digital Repositories. Our intention was to appraise the checklist critically and conceive a means of applying its mechanics within a diverse range of repository environments. We were struck by the realization that while a great deal of effort has been invested in determining the characteristics of a 'trusted digital repository', far less effort has concentrated on the ways in which the presence of the attributes can be demonstrated and their qualities measured. With this in mind we sought to explore the role of evidence within the certification process, and to identify examples of the types of evidence (e.g., documentary, observational, and testimonial) that might be desirable during the course of a repository audit1.
Digital repositories have become a focal point of institutional development. Numerous activities from eprint services to institutional repositories to developments at national libraries and archives reflect the recognized need to develop, deploy, and maintain digital repositories that are worthy of trust. At the same time, the outlook within our community appears to be shifting from optimistic trust behaviour to pessimistic trust behaviour. That is, a healthy level of pessimism has emerged that has led many to recognise that digital repositories are only worthy of trust if they can demonstrate that they have the properties of trustworthiness. The desire to construct a robust audit, certification, and accreditation programme for digital repositories arises because as a community we acknowledge the uncertainty as to whether repositories can secure the authenticity, integrity, and accessibility of digital materials over the long term. As a result we aim to put in place mechanisms to reduce uncertainty and the anxiety associated with it; establishing evidence of trust is an approach to handling uncertainty. Programmes such as audit and certification allow us to restrain uncertainty and to transform it into risks that can be measured and managed. The reasons why independently measuring and validating the trustworthiness of repositories is essential have been the focus of earlier discussions [5, 9, 10]. For the purposes of this study we have taken as axiomatic that certification is one marker that helps users to establish the level of trust that they might reasonably have in a particular digital repository. Audit is a critical step in establishing whether certification of a particular repository should be granted. Here we aim to open the debate on the types of evidence needed if digital repositories are to be effectively and transparently audited.
Funded jointly by the Joint Information Systems Committee (JISC)2 and the core e-Science Programme, the Digital Curation Centre (DCC) aims to support and promote continuing improvement in the quality of data curation and digital preservation within the United Kingdom. The four partners, the University of Edinburgh,3 HATII4 at the University of Glasgow,5 UKOLN6 at the University of Bath,7 and the Council for the Central Laboratory of the Research Councils (CCLRC),8 have collaborated to build the DCC on the international expertise and renown of the partners in research, development, service, and training delivery. The DCC has four fundamental priorities: to establish a vibrant research programme, to build and foster strong community relationships, to explore innovative development activities that lead to tangible, heavily used services, and to achieve a virtuous circle whereby every aspect of our own outputs and community input feeds into and informs our existing activities and shapes emerging ones.
Since it is anticipated that the successful development of accreditation, audit, and certification will depend on international consensus, the DCC has developed relationships with the leading audit and certification efforts in this area. Much of the effort to date appears to have concentrated on defining the characteristics of a 'trusted digital repository'; considerably less effort has been committed to establishing a context in which these characteristics can be shown to be present and, if they are present, how their qualities can be measured and evaluated. Although here we focus on approaches to audit and certification within the archives and library community and how they might be enhanced, we have recognised elsewhere that this work must not be conducted in isolation and that there is much to be gained from building on the work of other audit and certification organisations and their methods and approaches (e.g., ISACA,9 Information Systems Audit and Control Association). As we argued earlier, 'digital curation and preservation is a risk management activity at all stages of the longevity pathway'. Many other communities have developed strategies for identifying, monitoring, and managing classes of risk that are directly relevant to our work.10
2. Defining Activities
The generally accepted starting point for much of this work is the 2002 Research Libraries Group (RLG)11 and Online Computer Library Center (OCLC)12 Working Group paper, Trusted Digital Repositories: Attributes and Responsibilities. Subsequently, the RLG and US National Archives and Records Administration (NARA)13 Digital Repository Certification Task Force published in late 2005 a draft Audit Checklist for Certifying Digital Repositories, comprising just under ninety criteria for determining whether a digital repository should be trusted. These criteria are organised into four categories: organisation; functions, processes and procedures; the designated community and information usability; and technologies and technical infrastructure. The principles, terminology, and functional characteristics outlined in the Reference Model for an Open Archival Information System (OAIS), published by the Consultative Committee for Space Data Systems (CCSDS) and subsequently adopted as international standard ISO 14721, form the bedrock on which the checklist, at least in its draft form, is built. Together, these three documents have provided a foundation for activities that are being undertaken within the Center for Research Libraries' (CRL)14 Certification of Digital Archives15 project with support from the Andrew W. Mellon Foundation. This project is currently conducting pilot audits of digital archives including the Koninklijke Bibliotheek (KB),16 the Inter-University Consortium for Political and Social Research (ICPSR),17 and Portico.18 In addition, the CRL team will audit the distributed archiving system LOCKSS19 ("Lots of Copies Keep Stuff Safe"). The overall aim of the CRL work is to evaluate in as formal a way as possible the audit and certification checklist proposed by the RLG-NARA Certification Task Force, and to offer some insight into the applicability of their proposed metrics.
The DCC collaborated with the CRL on the KB audit in April 2006 as a way to ensure comparability of practice during the pilot between DCC and CRL approaches.
A comparable German initiative, the Network of Expertise in Long-term Storage of Digital Resources (nestor), aims to raise awareness of digital preservation issues, promote best practices, and foster the development of an associated community of expertise.20 A nestor working group is exploring the development of a procedure for certification as well as a criteria catalogue for characterising "trustworthy archives".21 Nestor has conceived its own checklist for repository audit; this work covers the technical, organisational, and financial characteristics of a digital repository. Stefan Strathmann, of Göttingen State and University Library (Germany), provided us with access to an early draft of the nestor checklist. One strength of the nestor work is that the criteria are well founded on broad thinking in digital preservation, and the criteria catalogue itself has been built on the rich literature base related to digital preservation. Recent discussions with Susanne Dobratz indicate that nestor is working to develop mechanisms to assess a repository's fulfilment of the criteria.22
Another German example, the Deutsche Initiative für Netzwerkinformation (DINI),23 has established a certification approach covering institutional document and publication repositories focused on examining quality of service, visibility, interoperability, and reliance on standards [3, 4]. The DINI certification process starts with a repository completing an audit template. Following submission of the completed questionnaire, an information specialist and a technical expert evaluate the responses and assess whether certification should be granted; this process often involves communication with the repository seeking certification to ask it for additional information. The DINI certificate, launched in 2003 by the Electronic Publishing working group, established a minimum set of requirements for repositories and the institutions that administered them, covering such issues as server policies, legal matters, and long-term availability and sustainability. Although so far restricted to a single class of repository, DINI runs, as of 2006, the only functional digital repository certification scheme.24
3. Planning Pilot Audits
To further clarify how repository audit and certification should be conducted, the DCC is engaged in a series of pilot audits that will complement ongoing work in Germany and the USA. DCC audits will take place during July and August of 2006 at three UK organizations, and the outcome of these audits will be published later in 2006. In preparation for these audits we recognise the need to establish an understanding of what represents an evidence base for repository audit, define a list of classes of individuals who should participate in the process of gathering and presenting the evidence, assist other initiatives in defining the metrics and strategies that should be used to evaluate documentary, system and testimonial evidence, and contribute to the refining of the thinking on audit criteria and processes.
The evaluation process begins prior to the site visit, with initial research into the institutional infrastructure of the repository, the nature of its collections, and the demographics of its depositors and consumers. As a second stage in audit planning, a pre-visit questionnaire will be sent to and returned by the target repository to provide auditors with a profile of the institution's technical architecture, organisational structure, and financial position. The questionnaire will give auditors information concerning such areas as security, performance, and management control. Supporting documentation will be requested and reviewed in advance of the on-site audit. These materials will, for instance, give auditors material to support decisions about where and how to probe during a visit, and in some cases evidence to ascertain where processes, procedures, and practices are adequate. This advance review will enable the audit team to establish an 'Audit Plan'. The plan will facilitate the identification of areas where observation of practice, interviews, checking of documentation, and testing (e.g., disaster recovery tests, evaluation of stratified random samples of digital objects at different points in their lifecycle) should be used. The scope and nature of these data collection instruments and the types of documentation requested will be refined as the DCC and others working in this area, such as RLG/NARA and nestor, develop a richer understanding of the information requirements necessary to assess repositories as an outcome of pilot audits.
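The stratified random sampling mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each digital object carries a label for its lifecycle stage; the function name `stratified_sample`, the stage names, and the identifiers are hypothetical and are not drawn from any audit checklist.

```python
import random
from collections import defaultdict


def stratified_sample(objects, stage_of, per_stratum, seed=None):
    """Draw up to `per_stratum` objects at random from each lifecycle stage."""
    rng = random.Random(seed)  # a fixed seed lets the same sample be drawn again
    strata = defaultdict(list)
    for obj in objects:
        strata[stage_of(obj)].append(obj)
    sample = {}
    for stage in sorted(strata):
        members = strata[stage]
        # Small strata are taken whole rather than raising an error.
        k = min(per_stratum, len(members))
        sample[stage] = rng.sample(members, k)
    return sample


# Hypothetical holdings: (identifier, lifecycle stage) pairs.
stages = ["ingest"] * 40 + ["archival storage"] * 50 + ["dissemination"] * 2
holdings = [(f"obj-{i:03d}", stage) for i, stage in enumerate(stages)]

audit_sample = stratified_sample(
    holdings, stage_of=lambda obj: obj[1], per_stratum=5, seed=2006
)
for stage, chosen in audit_sample.items():
    print(stage, [identifier for identifier, _ in chosen])
```

Sampling within each stage, rather than across the whole collection, means that a small stratum (here, the two objects at the dissemination stage) is still represented in what the auditors examine.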
Each of the three DCC pilot audits will produce three types of output, each meeting a particular need:
4. What Is the Evidence Base?
Significant intellectual effort has been committed to the identification of the necessary technological, organisational, and financial characteristics repositories must have if they are to be granted a kite-mark of trustworthiness. This is perhaps realised most notably within the nestor and RLG-NARA audit checklists [6, 8]. The issue of the categories of evidence necessary to facilitate audits and enable certification needs to be given adequate consideration: any tool that omits to describe the evidence that will contribute to the audit process is incomplete. If an audit checklist has aspirations of practical applicability, its criteria must detail not only the expected and required standards, but also the means by which their attainment can be demonstrated and assessed. Similarly, if such tools are to promote self-assessment of a rigorous and reliable kind, and to be likely to provide a good predictor for the outcome of an independent external audit, then they must be comprehensive, either independently or in combination with one or more linked resources. With no indication of its acceptable evidence base a checklist for structuring and guiding the audit and certification of a repository has mainly theoretical value. It lacks practical applicability and does not support unbiased measurement. It becomes too open to interpretation, and a risk arises that it will be extrapolated to endorse even those repositories with recognisable shortcomings. Current work does not, so far, focus adequately on the evidence base; a further development stage is necessary to conceive a document that is practically useful within an audit. Efforts must probe for evidence of concrete processes, structures, and functionality.
In reviewing the audit tools that are being developed [6, 8] we have identified and reported a gap in the documentation requirements necessary to provide an evidence base for measuring repository compliance with the expectations for best practices as outlined in the checklists.
The types of evidence likely to be of value to an auditor will be drawn from a range of sources: information services, finance, human resources, and many others. The methods for selecting and evaluating the evidence need to be regularized. For example, presence or absence of a particular class of evidence is not necessarily a sufficient metric. Here we only seek to highlight the relevant questions and concerns, offer a series of common sense solutions, and prompt further exploration; we do not aspire to examine the issue as comprehensively as it needs to be. So we have not suggested methods for evaluating the documentary evidence. The most immediate barrier is establishing an understanding of the kinds of documentary and testimonial evidence that an auditor would seek to accumulate in considering a repository's case for certification. From this, a series of sub-questions follow:
The starting point, though, is the evidence itself; we accept the checklist format that has become de rigueur, but propose that at all times evidence requirements ought to be detailed inline alongside each certification criterion. Needless to say, the means of their satisfaction will be determined in part by the character and services of the particular repository undergoing audit. We favour a simple system of classification of the evidence, with conformance information categorised as documentary evidence, observation of practice evidence, or testimonial evidence. Here we would propose that observation can be much more than a passive activity; it might include such proactive steps as sampling, scenario sequencing, tests, and simulations.
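One way to picture a criterion that carries its evidence requirements inline is the following sketch. The three evidence categories are those proposed above; the class names, the identifier `X1`, and the example criterion and evidence items are hypothetical illustrations and do not come from the RLG-NARA or nestor checklists.

```python
from dataclasses import dataclass, field
from enum import Enum


class EvidenceType(Enum):
    DOCUMENTARY = "documentary"
    OBSERVATION = "observation of practice"
    TESTIMONIAL = "testimonial"


@dataclass
class EvidenceRequirement:
    kind: EvidenceType
    description: str


@dataclass
class Criterion:
    identifier: str
    statement: str
    evidence: list = field(default_factory=list)

    def missing_types(self, gathered):
        """Evidence types this criterion asks for that have not yet been gathered."""
        required = {requirement.kind for requirement in self.evidence}
        return required - set(gathered)


# Hypothetical criterion with its evidence requirements stated alongside it.
criterion = Criterion(
    identifier="X1",
    statement="The repository has planned how it will handle disasters.",
    evidence=[
        EvidenceRequirement(EvidenceType.DOCUMENTARY, "Disaster recovery plan"),
        EvidenceRequirement(EvidenceType.OBSERVATION, "Disaster recovery test"),
        EvidenceRequirement(EvidenceType.TESTIMONIAL, "Interview with systems staff"),
    ],
)

outstanding = criterion.missing_types([EvidenceType.DOCUMENTARY])
print(sorted(kind.value for kind in outstanding))
```

The sketch records only whether a class of evidence has been gathered. As noted earlier, presence or absence alone is not necessarily a sufficient metric, so any evaluation of the evidence itself would sit on top of a structure like this.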
5. Documentary Evidence
Some repository characteristics can be objectively assessed through the provision by the repository of documentary evidence and its analysis by the auditors. Insights into technical infrastructure, financial management, resource allocation, and user relationships can all be gained from the existence and analysis of a range of documentary evidence. Numerous types of documentation of value to the audit and certification process exist within repositories; for some, their presence alone will be encouraging, and in other cases their content will require scrutiny if its role in fostering organisational compliance is to be assessed. To promote an improved understanding of the kinds of documentation that might be used to support audit and certification we suggest that the following be considered as an initial list:
Even the processes by which these documents are managed will provide auditors with valuable insights into the running of the repository. It will be useful to know how decisions are taken to draft them, how their change is reviewed, how new versions are approved, and how staff are made aware of changes to procedures and policies.
Associated with many forms of physical documentary evidence will be fears over confidentiality and privacy: every organisation has documentation that it regards as sensitive, whether for financial or business planning reasons, or those more attributable to relationships with depositors. Auditing teams need to put appropriate non-disclosure agreements in place to secure the confidence of repository representatives that privacy and confidentiality agreements will not be breached by the audit process. The processes and practices of conducting audits are, moreover, governed by a range of professional practices.
These types of documentation must be subjected to consistent and unbiased evaluation. The methods for evaluating documentation and reporting on their evaluation require further consideration. Moreover the evaluation of the submitted documentation may result in subsequent requests for additional written documentation, or the collection of evidence through observation of practice or by means of interviews.
6. Observation of Practice Evidence
Witness accounts describing existing processes within the repository represent a key means of determining whether certification criteria are being met. In most cases this represents the most straightforward way in which repository workflow and good practice can be evidenced. Within the context of an institutional audit, auditors themselves will expect to be exposed to the processes of ingest, archiving, and dissemination. This might be most fruitfully achieved by following the passage of a single digital object throughout the full process, or through the selection of different evidence points related to the passage of different objects through the process while observing what happens at each of these points. As well as processes, technical characteristics of the repository can be assessed in this way. While observation may appear less objectively quantifiable than documentary evidence, it nonetheless represents an important part of the organisational assessment and is used in other types of audit. In such areas as procedures and workflow models, auditors can test how well the repository understands what it does and does what it says it does, its relationship with its users, and how it has planned to handle disasters. These observations might include walkthroughs or testing and measurement of the presence of essential characteristics of digital objects against anticipated or projected characteristic survival after preservation action (e.g., after regularisation, migration, emulation) has been carried out. Here again our community needs to define what practices it should adopt to document these audits; there are practices we can adopt from other communities.
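The comparison of essential characteristics before and after a preservation action might look like the sketch below. It assumes that characteristics can be recorded as simple name and value pairs; the characteristic names and values are hypothetical, and a real audit would need an agreed way of extracting them from the objects.

```python
def compare_characteristics(before, after, expected_to_survive):
    """Return characteristics that were expected to survive but changed or vanished.

    Each failure maps the characteristic name to a (before, after) pair of values.
    """
    failures = {}
    for name in expected_to_survive:
        if before.get(name) != after.get(name):
            failures[name] = (before.get(name), after.get(name))
    return failures


# Hypothetical measurements of one object before and after a migration.
before = {"file_format": "format A", "page_count": 12,
          "word_count": 4310, "embedded_fonts": True}
after = {"file_format": "format B", "page_count": 12,
         "word_count": 4310, "embedded_fonts": False}

# The format is meant to change; the other characteristics are meant to survive.
expected = ["page_count", "word_count", "embedded_fonts"]

failures = compare_characteristics(before, after, expected)
print(failures)  # {'embedded_fonts': (True, False)}
```

Keeping the list of characteristics expected to survive separate from the measurements lets auditors check the outcome against what the repository itself projected for the preservation action.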
7. Testimonial Evidence
Interviews with stakeholders and repository staff will allow the auditor to assess internal mechanisms and organisational processes. Inevitably, documentary evidence offers incomplete insights. In most organisations there is a degree of knowledge that is locked away 'in the heads' of experienced repository staff. This in itself is a concern in terms of certification, and every repository should have mechanisms in place to mitigate risks posed by this situation. Interviews are an effective means of highlighting the omissions that exist within formal documentation and of validating whether the aspirations of the documents are achieved in reality.
A key step is to identify the particular individuals to be included in the interview process. Initial considerations suggest that interviews with staff fulfilling a representative sample of roles within the repository are sensible. This could stretch from Director to janitors (cleaners). For instance, a casual chat with a janitor might reveal that, although the repository's documentation states that to avoid data leakage all media from CDs to tapes are shredded or crushed before disposal, this does not happen in practice. Of course, in many smaller organisations different activities that might in larger organisations be handled by different individuals might be handled by a single person. The discussion should certainly be structured in terms of roles, and not people. A series of example roles are described below:
As vital as determining the list of potential interviewees is the identification of the core set of questions that will direct the dialogue and effectively marry it with the checklist being used. Current DCC research is committed to the development of a semi-structured interview template to facilitate the process of interview and personal engagement. This will be designed in a way that makes it extensible so that when issues arise as part of the pre-audit planning or the documentary review, the template can be adjusted on an audit-by-audit basis to allow the auditors to probe for the necessary information.
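As a rough illustration only, and not the DCC template itself, an extensible interview template could pair each question with a checklist criterion and with the roles to which it is addressed, and keep audit-specific additions separate from the core set. All class names, identifiers, and question texts below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Question:
    text: str
    criterion_id: str  # hypothetical link to a checklist criterion
    roles: tuple       # roles the question is addressed to


@dataclass
class InterviewTemplate:
    core: list
    audit_specific: list = field(default_factory=list)

    def extend_for_audit(self, questions):
        """Return a new template with extra questions; the original is unchanged."""
        return InterviewTemplate(
            core=list(self.core),
            audit_specific=self.audit_specific + list(questions),
        )

    def questions_for(self, role):
        """All questions, core and audit-specific, addressed to the given role."""
        return [q for q in self.core + self.audit_specific if role in q.roles]


base = InterviewTemplate(core=[
    Question("How is the succession of key staff planned?", "X2", ("director",)),
    Question("What happens to discarded CDs and tapes before disposal?",
             "X3", ("systems manager", "janitor")),
])

# A question added after the documentary review for one particular audit.
tailored = base.extend_for_audit([
    Question("Who empties the media disposal bins, and how often?", "X3", ("janitor",)),
])

for question in tailored.questions_for("janitor"):
    print(question.criterion_id, question.text)
```

Organising questions by role rather than by named person follows the suggestion that the discussion be structured in terms of roles, and returning a copy from `extend_for_audit` keeps the core template stable from one audit to the next.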
8. Conclusion and Next Steps
Evidence will play a crucial role in the process of repository certification. Without an agreed base of evidence against which to validate the checklist criteria, audits are likely to lack consistency and will depend too much on judgement(s) that may prove difficult to replicate, substantiate, or validate. Unless, therefore, a checklist is associated with a defined evidence base, its usefulness is diminished. Here we have considered the kinds of evidence that might provide auditors with necessary information to assess the levels of risk associated with a particular repository and to determine whether it should be certified as worthy of trust. In order to conceive an 'objective' and usable resource it is vital that any checklist offers repositories and auditors the means to understand the criteria necessary to achieve a 'worthy of trust' status in measurable (although not necessarily quantifiable) terms, and offers clear insights into how they might determine whether their own institution meets the criteria. This approach will not only facilitate the audit process, but will also assist institutions engaged in establishing new repositories in defining the processes and types of documentation they should put in place and maintain if they are to ensure that their organisation is 'working smart' and that it is 'audit ready'. Even existing repositories may benefit from guidance on the kinds of documentation that auditors are likely to seek when assessing levels of trust. Although the community is a year or more away from spinning out audit and certification procedures, it is not too early to consider the kinds of documentation that a repository should be keeping.
While we have suggested kinds of evidence that might underpin the use of repository checklists, metrics for measuring compliance or the level25 at which a repository meets a particular criterion require more research. As a corollary we might consider Chapin and Akridge's reflection that '[t]raditional security metrics are haphazard at best; at worst they give a false impression of security that leads to inefficient or unsafe implementation of security measures'. This is a scenario that the repository community should wish to avoid. And, if we are to avoid it, we need to establish a secure evidence base and agree metrics for evaluating it.
There is a downside to considering the repository audit and certification process from the point of view of evidence: it makes readily apparent, in a way that checklists alone do not, how much effort will be involved in the audit process and how high the cost is likely to be. On the other hand, by considering the evidence and underlying processes at an early stage, repositories will be able to contain costs through adopting best practices.
Finally as a community we need to consider how other audit and certification tools might be integrated with our emerging checklists or tailored to meet our needs. As Hans Hofman, of the Dutch National Archives, has observed on many occasions, any new methods need to be placed in the larger audit context that includes such approaches as the COSO framework for audit,26 COBIT framework (Control Objectives for Information and Related Technologies),27 ITIL (IT Infrastructure Library) service management,28 ISO 9000 family of quality management and assurance standards,29 and ISO 17799 for information security.30 The digital repository building and management community, such as libraries and archives, does not appear so far to have paid sufficient attention to these other strands of activity and tools. They do have much to offer us and, we believe, we have much to learn from them. This work should be undertaken alongside further refinement of evidence requirements and development of metrics for assessing and measuring checklist compliance.
Both authors participated in the definition of this work, analysis and synthesis, and drafting of the manuscript. They agreed the final version of the manuscript.
10. Conflicts of Interest
We declare that we have no conflict of interest.
The Digital Curation Centre is supported by a grant from the UK Joint Information Systems Committee (JISC) and the UK e-Science Core Programme of the Engineering and Physical Sciences Research Council (EPSRC). We are grateful to Niklaus Bütikofer, formerly of the Swiss Federal Archives, who is working with us to define the DCC's audit and certification working practices, for discussions. We wish to thank Robin Dale of RLG, and David Giaretta of the DCC and Principal Investigator in CASPAR,31 for discussion and for providing us with early access to the RLG/NARA Checklist. We are grateful to Stefan Strathmann, of Göttingen State and University Library (Germany), who kindly supplied us with an early draft of the nestor audit and certification guidelines. Hans Hofman of the Dutch National Archives has over the past couple of years shared crucial thinking with us in this area, and we are beginning to work with him in the context of the European Union funded Sixth Framework Programme activity DigitalPreservationEurope32 (DPE) (FP6 priority IST-2005-2.5.10 contract number: IST 034762) to take repository audit and certification initiatives forward in the European context. Some DPE partners are also partners in DCC and nestor. Helen Hockx-Yu (JISC) and Helen Tibbo of the University of North Carolina kindly offered valuable comments on the penultimate draft of the article. The opinions are those of the authors.
12. Web Site Citations
All citations of websites were validated on 18 July 2006.
1. Presented at the JCDL Workshop on Digital Curation and Institutional Repositories, 15 June 2006, Chapel Hill, North Carolina, © University of Glasgow for the Digital Curation Centre (DCC).
10. See presentations at the ERPANET workshop on Audit and Certification in Digital Preservation held in Antwerpen from 14-16 April 2004, <http://www.erpanet.org/events/2004/antwerpen/index.php> and in particular the report of that meeting <http://www.erpanet.org/events/2004/antwerpen/Workshop_Antwerpen_report.pdf>.
22. See the paper by Dobratz, S, A Schoger, and S Strathmann, 2006, 'Repository Evaluation and Certification' offered at the JCDL Workshop on Digital Curation and Institutional Repositories, 15 June 2006, Chapel Hill, North Carolina <http://www.ils.unc.edu/tibbo/JCDL2006/Dobratz-JCDLWorkshop2006.pdf>.
25. It might even be worth asking about 'the way' in which checklist criteria are satisfied.
27. COBIT, <http://www.isaca.org/cobit/>, a reference framework for measuring performance, ascertaining success factors and using maturity models for benchmarking. It was released by the IT Governance Institute (ITGI).
29. See for instance, <http://www.iso.org/iso/en/iso9000-14000/understand/selection_use/selection_use.html>.
30. ISO/IEC 17799:2005: Information technology - Security techniques - Code of practice for information security management, <http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER
 S. Anderson and R. Heery, 2005, Digital Repositories Review. <http://www.jisc.ac.uk/uploaded_documents/digital-repositories-review-2005.pdf>.
 D. A. Chapin, and S. Akridge, 2005, 'How Can Security Be Measured?', Information Systems Control Journal, Volume 2 2005, <http://www.isaca.org/Template.cfm?Section=Home&CONTENTID=24174&TEMPLATE=
 DINI Working Group "Electronic Publishing", 2003, DINI-Certificate: Document and Publication Repositories, <http://www.dini.de/zertifikat/dini_certificate.pdf>.
 ERPANET, 2004, 'The Role of Audit and Certification in Digital Preservation', Stadsarchief Antwerpen, Belgium, 14-16 April 2004.
 Die nestor-Arbeitsgruppe 'Vertrauenswürdige Archive Zertifizierung', 2006, Kriterienkatalog vertrauenswürdige digitale Langzeitarchive--ENTWURF, (March 2006), Berlin and München. (privately circulated).
 Reference Model for an Open Archival Information System (OAIS) ISO 14721, 2002. <http://public.ccsds.org/publications/archive/650x0b1.pdf>.
 RLG/NARA Task Force on Digital Repository Certification, 2005, Audit Checklist for Certifying Digital Repositories, <http://www.rlg.org/en/pdfs/rlgnara-repositorieschecklist.pdf>.
 S. Ross and A. McHugh, 2005, 'Audit and Certification: Creating a Mandate for the Digital Curation Centre', Diginews, Vol. 9 No. 5, ISSN 1093-5371, <http://www.rlg.org/en/page.php?Page_ID=20793#article1>.
 C. Rusbridge, P. Burnhill, S. Ross, P. Buneman, D. Giaretta, L. Lyon, M. Atkinson, 2005, 'The Digital Curation Centre: A Vision for Digital Curation', In Proceedings IEEE's Mass Storage and Systems Technology Committee Conference on From Local to Global: Data Interoperability--Challenges and Technologies, an online version is at: <http://eprints.erpanet.org/archive/00000082/01/DCC_Vision.pdf>.
Copyright © 2006 Seamus Ross and Andrew McHugh