D-Lib Magazine
The Magazine of Digital Library Research

September/October 2013
Volume 19, Number 9/10


2013 Open Repositories Conference Highlights: Repository Island in Sea of Research Data

Carol Minton Morris






The Eighth International Conference on Open Repositories 2013 was held July 8 - 12, 2013 on Prince Edward Island, Canada. The annual conference offers attendees an opportunity to learn about new ways to access information, innovative repository tools, and emerging community initiatives. More than 300 attendees came to OR2013 to meet with colleagues, keep up with fast-paced development goals, and hear expert speakers who are attuned to current repository issues.



If life is a beach then the gentle landscape, red sands and ideal climate of Prince Edward Island, site of the Eighth International Conference on Open Repositories 2013 in July, was an ideal summertime setting for sharing ideas in fields related to repositories, archives and scholarly communities. The annual conference continues to offer attendees a forum to learn about new ways to access information, innovative repository tools, and emerging community initiatives. More than 300 attendees came to OR2013 to hear about formative best practices and technologies, visit with new and old friends who were gathered to share ideas, make progress towards fast-paced development goals, and be inspired by speakers attuned to current repository issues.


Clockwise from top left: red sands on a PEI beach; OR2013 program co-chairs Jon Dunn and Sarah Shreeves, OR2013 keynote speaker Victoria Stodden, and OR2013 host committee chair Mark Leggott; fields in bloom; Hydra Project poster; Green Gables Heritage Place farmstead; and OR2013 DSpace User Group session.

In collaboration with host committee chair Mark Leggott (University of Prince Edward Island and Discovery Garden), Open Repositories 2013 Conference program co-chairs Jon Dunn (Indiana University) and Sarah Shreeves (University of Illinois at Urbana-Champaign) coordinated a full week of presentations, panels, posters, demonstrations, social events and user group sessions of interest to anyone working with repositories and the digital information lifecycle.

Results from a survey of 92 conference attendees indicate overall satisfaction with key conference components. Pater Audio, which provided audio and visual support for OR2013, supplied a statistic, shared via Twitter, that may indicate the high level of activity in and around conference venues:

"Last #OR2013 stat via our awesome A/V guys Pater Audio: 550 meters of tape to secure cords."


Data collected by University of Prince Edward Island OR2013 host committee; analysis by Richard Green, University of Hull.


Research Results from Repository Data

The conference theme, "Use, Reuse, Reproduce," was aligned with questions around the role of repositories in managing, preserving and reproducing research results from repository assets. Verifying the results of research—the ability to make the same experiment turn out the same way using the same data—separates proven scientific fact from speculative reporting. Reproducing research from data that is held in a repository is the gold standard in the world of data repositories. Curating data for replication to meet that standard is a complex process that can hold up repository workflows. (See also iBlog, "the role of data repositories in reproducible research".)


Plenary Sessions

Re-use of repository data was on everyone's mind as Victoria Stodden offered the opening plenary presentation on computational methods for utilizing research data held in repositories as a way of addressing what she views as a credibility crisis in science.

Dr. Stodden is an assistant professor of Statistics at Columbia University and co-founder of RunMyCode, an open platform for disseminating code and data. A computational scientist, she set the stage for several conference sessions about research data in repositories with her presentation on the central role of algorithms and code in the reproducibility of science, entitled "Reuse and Reproducibility: Opportunities and Challenges".

She challenged the audience to do something about the current state of data in repositories by becoming active partners in the scientific process as it performs an "internal validity" check on data and analysis. This process requires a tighter integration of how we communicate our results. Fortunately there are many new tools to help accomplish this. She believes that without access to the data and computer code that underlie scientific discoveries, published findings are not verifiable. Without open data there can be no scientific verification to perfect the scholarly record.

Stodden is an advocate for science policy that would require not only open publications and data, but also open code. "With many eyeballs, all bugs are shallow," Stodden explained. As a reminder of the relationship between published papers and the data that underlie research results, Peter Ruijgrok offered this tweet:

"Victoria Stodden: A publication is actually an advertisement. Data and software code is what it is about as proof/reproducing."

Robin Rice, Data Librarian at University of Edinburgh, reflected on Dr. Stodden's keynote address in a blog post entitled Making Research 'Really Reproducible'.

In the closing plenary session Jean-Claude Guédon, professor of comparative literature, University of Montreal, made a case for the role open repositories could play in restoring quality in science. By looking beyond ranking systems to document and add real value to science, repositories can leverage communities, networks and open data to support researchers and scientific publishing. This interview with Dr. Guédon from Casa da Cultura Digital provides background.


Conference Sessions & Workshops

A wide range of content was presented in 35 main conference sessions and several workshops. Topics included aspects of repository management, future directions of core technologies, curation strategies and tools, rich media solutions, open access to research use cases, linked data examples, analytics techniques, identifiers, collaborative persistent access initiatives, and more. Interest in the care, handling, preservation and significance of research data in repositories was reflected in several sessions.

A Zotero collection of OR2013 presentations and related resources is available from the University of Toronto Digital Scholarship Unit. Please refer to other reports and blog posts by conference participants included in this report for additional information on specific conference sessions.

Every time a camera turns on and off there are new files, which is why preserving and providing access to media files in repositories is a challenge. The "Repository Solutions for Time-based Media" panel discussion was led by Claire Stewart, Northwestern University, Karen Cariani, WGBH, Declan Fleming, University of California San Diego, Todd Grappone, University of California Los Angeles, and Brian Tingle, California Digital Library. Panelists focused on explaining criteria and issues around repository solutions for time-based media management and delivery at several institutions. Film, video, and audio files can be large, hard to manage and store, and come in a variety of formats that change and morph. To keep these files accessible over time, both technical and descriptive metadata are required. Karen Cariani suggested that the global addition of metadata, which can often be captured in the camera, makes the most sense. Panelists seemed to agree that a convergence in media asset management among institutions would be a good idea.

Advancing knowledge in all fields of research now requires curation, collection, management, access and long-term preservation of digital data sets. Research libraries are currently planning and experimenting with how to put digital data policies, workflows and economic models in place to ensure that data will persist to serve researchers and institutions into the future by adding value to data throughout its lifecycle. In the panel discussion "Institutional approaches to Research Data and Repositories," Mark Leggott, University Librarian, University of Prince Edward Island and Discovery Garden, Sarah Shreeves, University of Illinois at Urbana-Champaign Library, Coordinator for the Illinois Digital Environment for Access to Learning and Scholarship (IDEALS), Andrew Bell, University of Southampton, EPrints Services, Dean Krafft, Cornell University Library Chief Technology Strategist, and Jill Sexton, University of North Carolina Head of Digital Repository Services, discussed techniques and practices at their institutions for curating, collecting, managing, providing access to and preserving digital data sets. See also Yale ISPS blog.


Developer Challenge

The Developer Challenge provides opportunities for software developers to sharpen their skills, showcase their work at a community event, demonstrate repository solutions and compete for more than $8,000 in prizes. The 2013 version of the Open Repositories Developer Challenge competition was judged on evolving criteria that echoed the values behind the Open Repositories Conference:
  • Transparent, fun, open collaboration in diversely constituted teams over individual brilliance and/or groups of like individuals in cutthroat competition.
  • The creation of new professional networks over the ossification of old ones.
  • Effective engagement of non-developers (researchers, repository managers) in development over purely developer driven projects.
  • Work done at the conference over presentation of something prepared earlier.
  • Innovative ideas expressed in running code over wire frames, hand waving and elevator pitches.
  • The development of the Open Repositories movement as a whole over siloed development on particular repository platforms.
  • Entertaining live presentation of challenge projects in a relaxed setting over formal submissions.

Winners included Team Ravens for the "PDF/Eh" project. Kevin Bowrin, Carleton College, explained a RESTful API software solution that enabled varying levels of PDF/A compliance. Team Orcid demonstrated a generic API integration with EPrints that allowed for creation and management of IDs from the repository.

Establishing new networks that facilitate working together as a community is a valuable part of the Developer Challenge event. Developer Challenge enthusiast Peter Sefton said in his blog, "It was impossible not to network unless you stayed in your hotel room."


User Groups

Following the main conference, DSpace, EPrints and Fedora User Group Meetings on July 11 and 12 rounded out the week. Community user group presentations echoed the conference focus on supporting research and research data as well as interoperability with other platforms. A Community Discussion on the Foundations of Enhanced Metadata Support for DSpace affirmed the approach of the proposal to establish a foundation for future improvements to DSpace metadata. Fedora presentations on Hydra, Fedora Futures and Islandora offered users a range of solutions and use cases for improving access and digital object workflows, echoing the conference theme of "Use, Reuse, Reproduce".

Program co-chair Jon Dunn offered this view on the overall significance of the conference, "The most exciting thing for me about OR2013 was seeing the wide range of collaboration taking place within the repository community—at local, national, and international levels—to advance the state of the art and develop new tools and services to meet the needs of an increasingly diverse set of users and types of content".


Open Repositories 2014

Next year's conference will be held June 9 - 13, 2014 in Helsinki, Finland and hosted by Helsinki University Library and the National Library of Finland. More information will be posted on the OR2014 web site as it becomes available. The "Welcome Presentation" given at OR2013 can be viewed at the OR2014 site.


About the Author


Carol Minton Morris is Director of Marketing and Communications for DuraSpace, and is past Communications Director for the National Science Digital Library (2000-2009) and Fedora Commons (2007-2009). She leads editorial content and materials development and dissemination for DuraSpace publications, web sites, initiatives and online events, and helps connect open access, open source and open technologies people, projects and institutions to relevant news and information. She was the founding editor of NSDL Whiteboard Report (2000-2009) featuring information from National Science Digital Library (NSDL) projects and programs nationwide. She is chair of the Open Repositories Conference Steering Committee. Follow her at http://twitter.com/DuraSpace.
