Volume 18, Number 11/12
CurateGear: Enabling the Curation of Digital Collections
Alex H. Poole, Christopher A. Lee, and Angela P. Murillo
University of North Carolina at Chapel Hill
Point of contact for this article: Alex H. Poole, email@example.com
CurateGear: Enabling the Curation of Digital Collections took place on January 6, 2012, in Chapel Hill, North Carolina. It was a highly interactive, day-long event focusing on digital curation tools and methods, involving presentations and demonstrations by a variety of experts. It will be followed by a second interactive day-long public symposium, "CurateGear 2013", continuing the same themes, to be held on January 9, 2013 at the William and Ida Friday Center for Continuing Education in Chapel Hill, North Carolina. The event had the support of the Institute of Museum and Library Services (IMLS), the Andrew W. Mellon Foundation, and the School of Information and Library Science at the University of North Carolina at Chapel Hill. CurateGear 2013 will not only inform participants, but also provide numerous opportunities to encounter and discuss the latest world-wide developments in, and applications of, digital curation in diverse professional contexts. It is hoped that this summary of the highlights of CurateGear 2012, and introduction to CurateGear 2013, will spark further interest in attending this event.
CurateGear: Enabling the Curation of Digital Collections took place on January 6, 2012, in Chapel Hill, North Carolina. It was a highly interactive day-long event focusing on digital curation tools and methods and involved presentations and demonstrations by a variety of experts. The Institute of Museum and Library Services (RE-05-08-0060-08), the Andrew W. Mellon Foundation (BitCurator project), and the School of Information and Library Science at the University of North Carolina at Chapel Hill provided support for the event. Approximately 60 students (doctoral and master's), faculty, and professionals attended. Reflecting the growing sense of community among digital curation professionals and the broader awareness of digital curation, attendees traveled from as close by as Chapel Hill and as far away as the University of California, Los Angeles. Speakers came from across the United States, as well as Australia and Canada.
CurateGear 2012's presentations covered four broad topics (Curation Needs and Behaviors; Repository Management Environments; Metadata and Documentation; and Data Transformation, Processing, and Access) and were interspersed with relevant demonstration-based and discussion-based workshops. CurateGear 2013, to take place on January 9, 2013, will follow a similar interactive format.
Session 1: Digital Curation Needs and Behaviors
After introductory remarks by Helen Tibbo and Cal Lee of the University of North Carolina at Chapel Hill, Carolyn Hank (McGill University) described scholars' beliefs and practices in preserving their blogs in whole or in part, whether for their own or for the public's later access and use. Matt Kirschenbaum of the Maryland Institute for Technology in the Humanities next underscored the risk of losing early drafts, notes, and related documents written by popular authors such as Stephen King and Tom Clancy because of changes in word processing formats over time, and urged the digital curation community to embrace the responsibility of preserving these literary resources. Doug Reside of the New York Public Library then concentrated on digital curation in the performing arts, underscoring curators' responsibility to make digital collections available online, provide contextual information and software tools to facilitate the use of digital collections, and refine methods for preserving and providing access to born-digital materials. Hank and Reside will return to CurateGear in 2013.
Session 2: Specific Repository Management Environments
Jonathan Crabtree (Odum Institute for Research in Social Science) dealt with the challenge of ensuring data preservation through replication and thus the challenge of managing copies. He described an audit system (SafeArchive) and highlighted ways to facilitate collaborative preservation, define policy, and generate audit reports. Next, Mike Thuman of Tessella presented on Safety Deposit Box (SDB), which provides services to ingest, identify, characterize, extract metadata, effect basic safety checks, store, preserve, and analyze a collection for risks. Mark Evans, also of Tessella, illustrated various features of SDB during the demonstration section.
Chien-Yi Hou (University of North Carolina at Chapel Hill) explored "Policy-Driven Data Management at Scale." He concentrated on the capabilities that a preservation environment should have, the salient policies that should be implemented and how to make them easy to follow, and the metadata that should be recorded. Last, Peter Van Garderen spoke about Artefactual Systems' digital preservation consulting service and its provision of open-source software for archives and libraries. Van Garderen's presentation segued into a demonstration and discussion session on repository management environments. Crabtree, Thuman, Evans, Hou, and Van Garderen will return to stimulate further conversation in 2013.
Session 3: Metadata and Documentation
Doug White of the National Institute of Standards and Technology (NIST) gave a presentation on the National Software Reference Library (NSRL). Comprising a software library, a metadata database, a publication, and a research environment, the NSRL collects metadata that describes every file on all media in a given physical collection. White discussed a variety of NSRL activities of potential relevance to digital curation professionals. During the demonstration session, Barbara Guttman from NIST joined White to answer questions and discuss potential areas of collaboration.
In "Accessioning-Based Metadata Extraction and Iterative Processing: Notes from the Field," Yale University's Mark Matienzo focused on metadata extraction as part of an accessioning workflow. He advocated the use of Digital Forensics XML (DFXML) and described his current efforts with Gumshoe, a prototype based on Blacklight.
David Pearson talked about recording and sharing digital preservation knowledge on formats, software, and dependencies at the National Library of Australia (NLA). Over the past three years, the NLA system has concentrated on preservation intent, significance, and the level of support for formats, and therefore on overall access to the content. Seamus Ross from the University of Toronto concluded the session with a description and a demonstration of the Digital Repository Audit Method Based on Risk Assessment (DRAMBORA) toolkit and the Data Audit Framework. White, Guttman, Matienzo, and Pearson will describe their continuing progress at CurateGear 2013.
Session 4: Data Transformation, Processing, and Access
Greg Jansen from the University of North Carolina described Curator's Workbench, which helps users capture and stage files, generate manifests with fixity information, arrange folders and objects, migrate custom metadata, and export submission packages. Next, Trevor Owens of the Library of Congress presented on Viewshare.org. Viewshare is a free and open platform that allows the creation of dynamic and interactive interfaces to cultural heritage collections through bibliographic data.
Duke University's Seth Shaw explained his progress on a four-year effort to streamline the accessioning process at Duke's Special Collections and Archives. He spoke specifically about adding fixity checking, generating machine-processable metadata, and exploiting additional tools like the JSTOR/Harvard Object Validation Environment (JHOVE) and Digital Record Object Identification (DROID).
In "Tools for File Format Identification, Validation and Characterization," Georgia Tech Research Institute's Bill Underwood pointed out the limitations of tools such as the Linux file command and its magic file. He then described extensions that could remedy these limitations, namely the File Format Library and Magic for individual file formats.
Kam Woods (University of North Carolina at Chapel Hill) addressed "BitCurator: Tools for Digital Forensics Methods and Workflows in Real-World Collecting Institutions," a collaboration between the University of North Carolina's School of Information and Library Science and the Maryland Institute for Technology in the Humanities (MITH) supported by the Andrew W. Mellon Foundation. BitCurator is constructing tools, methods, and approaches for collecting professionals to exploit open source digital forensics tools, nurture professional connections and community building, and generate and disseminate supporting documentation. Jansen, Owens, Shaw, Underwood, and Woods will present once again in 2013.
Building on CurateGear 2012, CurateGear 2013: Enabling the Curation of Digital Collections will be another interactive day-long public symposium and will take place on January 9, 2013. It will draw similarly diverse attendees. It has also received support from the IMLS, the Andrew W. Mellon Foundation, and the School of Information and Library Science at the University of North Carolina at Chapel Hill.
Again focusing on current digital curation tools and methods, it will not only inform participants but also encourage conversation about the latest worldwide developments in, and applications of, digital curation in diverse professional contexts. Nineteen returning experts will provide updates on their digital curation projects and describe other current work. Five new experts will also contribute to the program: Lisa Gregory, Digital Collections Manager at the State Library of North Carolina; Leslie Johnston, Chief of Repository Development at the Library of Congress; Richard Marciano, Professor at the School of Information and Library Science at the University of North Carolina at Chapel Hill and Director of the Sustainable Archives & Leveraging Technologies group (SALT); Ryan Scherle, Digital Data Repository Architect at Duke University; and Katherine Skinner, Executive Director of the Educopia Institute. To obtain further information on CurateGear 2013, please visit http://ils.unc.edu/digccurr/curategear2013.html, where details will be posted.
About the Authors
Alex H. Poole is a third-year doctoral student at the School of Information and Library Science at the University of North Carolina at Chapel Hill. A Fellow on the DigCCurr II Project: Extending an International Digital Curation Curriculum to Doctoral Students and Practitioners, he focuses on digital curation, the digital humanities, and all things archival.
Christopher A. Lee is Associate Professor at the School of Information and Library Science at the University of North Carolina at Chapel Hill. His primary area of research is the long-term curation of digital collections. He is particularly interested in the professionalization of this work and the diffusion of existing tools and methods into professional practice. Lee edited and contributed several chapters to I, Digital: Personal Collections in the Digital Era. He is Principal Investigator of the BitCurator project, which is developing and disseminating open-source digital forensics tools for use by archivists and librarians.
Angela P. Murillo is a third-year doctoral student in the School of Information and Library Science at the University of North Carolina at Chapel Hill. Her research areas include digital curation and preservation, scientific data management, reuse of data, and scientific metadata. She is the project manager and a doctoral fellow of the DigCCurr II Project: Extending an International Digital Curation Curriculum to Doctoral Students and Practitioners, University of North Carolina at Chapel Hill.