The growing trend towards online scholarly communication has made digital repositories more important for the preservation and management of our shared knowledge [1,2,3]. Along with this growth has been a corresponding increase in the development of repository standards and protocols [4,5,6]. While this research is important, the ability of repositories to interoperate on a systems level is not sufficient to provide improved information discovery and access; the human element must also be addressed [7,8,9,10]. For the majority of users, the primary means of interaction with a repository is through its default user interface, which today typically means a web-based application.
Repositories often have requirements that are unique to their content. A repository must be able to adapt its user interface to these specific requirements while also satisfying the individual needs of its users. Manakin is designed to address these issues by creating an abstract framework that provides for the creation of individual, customized repository interfaces.
This article begins with a broad overview of Manakin and its role in the digital library landscape. A more technical discussion then considers Manakin's three major architectural components: Aspects, Themes, and the DRI Schema. Finally, we examine several use cases that have employed Manakin to solve difficult real-world problems. At the conclusion of the article, the reader will understand how Manakin can be used at their institution to adapt a repository's interface to the individual requirements of their users and the unique properties of their content.
2 What can Manakin do?
Manakin introduces a modular interface layer, enabling an institution to easily customize the interface according to the specific needs of the particular repository, community or collection. By modifying the user interface, a repository service can be separated from its implementation. It is often desirable to hide or at least not highlight the underlying implementation of the repository from the end user. The ability to separate the user interface of the repository service from its implementation makes it possible to change repository platforms in the future while maintaining a consistent set of functionalities and "look-and-feel."
Currently, Manakin provides an implementation for DSpace [11,12,13], a popular open-source repository platform that organizes its content to match the hierarchical structure of an academic institution. The current DSpace user interface, based upon JSP technology, is difficult and expensive to modify, and reinforces a cookie-cutter approach to the user interface. Manakin introduces a fundamental change to this approach by adding the ability for each community and collection within DSpace to establish a unique look-and-feel, brand content, visualize metadata, and share newly implemented features with others.
2.1 Modifying look-and-feel
At the heart of Manakin is the ability to change the look-and-feel of the repository through the use of themes. A theme can be applied to a specific item, collection, or group of collections. The cascading nature of themes and their ability to radically change the repository interface provides the mechanism to adapt the presentation of content. This ability allows Manakin to easily accomplish at least two major goals: inheriting theme definitions through the organizational structure and obscuring a particular repository platform from the end user.
Digital repositories are frequently used at academic institutions to store the scholarly output of the university. In this use case the repository may be organized along the lines of the institution's structure: colleges, departments, centers, labs, etc. Each of these units may have several collections arranged into hierarchical groups. A theme defined at the college level can cascade downward (if desired) through units such as departments, centers, or labs.
2.2 Branding Content
Repositories often contain content that is combined from multiple sources. The organizational identity behind these sources is important and needs to be preserved in the presentation of the repository. Content owners will be more likely to accept the library's role in managing their content if they are able to preserve their branding in the repository.
Additionally, many colleges and departments typically maintain a separate web presence that may already include highlighted research or other scholarly work from the unit. Allowing for content branding in the repository enables the existing website to extend into the repository, providing a nearly seamless experience for the end user. Preserving a unit's identity allows the library to provide a service that is valuable to the unit while still fulfilling the library's role to preserve the scholarly output of the organization.
2.3 Visualizing Metadata
Metadata is a critical component of any repository, enabling faster and easier access to content. Leveraging metadata through visualization enables the user to better understand metadata and make connections that are not otherwise easily made. Manakin is able to leverage the potential of atypical metadata such as geospatial metadata or complex item relationships.
For example, items containing geospatial metadata can be plotted on a map while items with dates as the significant characteristic can be depicted along a timeline. Manakin allows the interface to adapt so that any particular facet such as time or spatial relationships can become a central component of a discovery tool. These types of visualization techniques improve the user's ability to understand the content in the repository, while also aiding in serendipitous discovery, traditionally one of the weakest features of a digital repository.
2.4 Sharing Extensions
Manakin gives developers several new tools that allow creation of modular extensions to the repository. This extension framework allows existing features to be modified cleanly, or entirely new features to be created. These features can range from new workflow and ingestion capabilities to minor modifications regarding the display of content. For example, a shopping cart could be added to the repository to enable users to purchase content. Using Manakin's extension framework enables these modifications to be packaged together and shared with other members in the community.
3 How does Manakin work?
Manakin is an abstract framework for building repository interfaces that currently provides an implementation for DSpace. The Manakin framework introduces three unique concepts: the DRI schema, Aspects, and Themes. These are the basic components a Manakin developer will use in creating new functionality for a repository or modifying the repository's look-and-feel.
Manakin is built on the Apache Cocoon web development framework, which uses an XML-based pipeline architecture. The pipelined architecture means that an individual page is generated through the combination of many components arranged together along a "pipeline", each feeding into another until the final page is produced. Using this technique, websites are built through the arrangement of these pipeline components, an approach that is sometimes described as "Lego-like". Manakin builds upon these basic Cocoon concepts to create the DRI schema, aspects, and themes.
3.1 DRI Schema
The Digital Repository Interface (DRI) is an XML schema that describes an abstract representation of a repository page. Since repositories fundamentally interact with artifacts and their metadata, the DRI schema must be able to both encode structural concepts and natively represent metadata in various forms. The structural portions of the schema are derived from the Text Encoding Initiative (TEI) schema for its simplicity and expressiveness. The metadata portions of DRI utilize the METS schema for packaging and encoding relationships between an item's components.
The TEI schema, which was originally developed for digitally representing a variety of literary and linguistic texts, has several features that make it advantageous for use in Manakin. Since DRI is an abstract representation, many different output formats are possible: HTML, PDF, SVG, and many others. Other popular encoding schemas such as XHTML lacked necessary expressiveness due to an inability to explicitly represent non-hierarchical relationships. For example, if a heading precedes a paragraph in an XHTML document, the heading is only related to the paragraph by convention because it happens to precede it, not explicitly in the syntax of the language.
Other schemas such as TEI or DocBook are domain-specific and were therefore not suitable for our purposes. The decision was made to create a new domain-specific schema meeting the particular needs of repository interfaces. Although TEI was heavily used as a design inspiration, many of its advanced features were removed in order to reach a suitable degree of simplicity in the schema's design. This means that while the TEI schema is not explicitly used within the DRI schema, the design patterns, common structures, and several of the elements are clearly derived from TEI.
The metadata portions of the DRI schema utilize existing standards with METS as a packaging schema to represent an item. Instead of encoding the metadata for an item inside the structural markup of a page, a simple reference is encoded to a METS document for the item. The descriptive metadata contained within the METS document can be in any one of numerous formats. At the present time Manakin supports DIM (DSpace Intermediate Metadata Format), MODS, and qualified or simple Dublin Core. In the future more advanced or content-specific formats might also be supported.
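The idea of referencing metadata rather than embedding it can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `reference` element, its `url` attribute, the `METADATA_STORE` dictionary, and the toy `mets` documents are hypothetical stand-ins, and real METS documents are namespaced and considerably richer.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata documents, keyed by the URL a page would reference.
# Real METS documents are namespaced and far more structured than this.
METADATA_STORE = {
    "/metadata/item/1": "<mets><title>Example folio</title></mets>",
    "/metadata/item/2": "<mets><title>Example thesis</title></mets>",
}

# The page carries only references to items, not the metadata itself.
PAGE = """
<document>
  <reference url="/metadata/item/1"/>
  <reference url="/metadata/item/2"/>
</document>
""".strip()


def resolve_references(page_xml, store):
    """Return the parsed metadata document for every reference on the page."""
    page = ET.fromstring(page_xml)
    return [ET.fromstring(store[ref.get("url")]) for ref in page.iter("reference")]


titles = [doc.findtext("title") for doc in resolve_references(PAGE, METADATA_STORE)]
print(titles)
```

Keeping the page markup free of metadata means the same page structure could point at items described in different formats, which is the property the paragraph above relies on.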
3.2 Aspects
Extensions in Manakin are called aspects. Aspects are interactive components that modify existing features or provide new features for the digital repository. Aspects may provide functionality such as specialized searches, custom workflows, or even a shopping cart for the repository.
In Aspect-Oriented Programming, programs are broken down into distinct parts (aspects) that overlap as little as possible. Manakin Aspects are the arrangement of distinct and non-overlapping Cocoon components that combine to form the interactive features of Manakin. Each aspect expects a DRI document as input and produces a modified DRI document as output. Through this process, aspects are "chained together" so that whenever the system generates a page, every aspect is able to modify the page by adding its own content; this use of DRI documents as both input and output is what makes aspect chaining possible. Aspect chaining allows new features to be overlaid onto an existing system while eliminating the need to patch or merge files, because all aspects are kept structurally separate.
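The chaining described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Manakin's implementation: real aspects are Cocoon components, and the element names (`document`, `body`, `options`, `div`, `item`) and function names used here are simplified, hypothetical stand-ins for the DRI schema.

```python
import xml.etree.ElementTree as ET
from functools import reduce


def empty_page():
    """Build a hypothetical, heavily simplified stand-in for a DRI document."""
    doc = ET.Element("document")
    ET.SubElement(doc, "body")
    ET.SubElement(doc, "options")
    return doc


# Each "aspect" takes a document and returns the modified document.
def browse_aspect(doc):
    div = ET.SubElement(doc.find("body"), "div", {"n": "browse"})
    div.text = "Recent submissions"
    return doc


def cart_aspect(doc):
    item = ET.SubElement(doc.find("options"), "item", {"n": "cart"})
    item.text = "View shopping cart"
    return doc


def run_chain(aspects, doc):
    """Feed the output of each aspect into the next one."""
    return reduce(lambda d, aspect: aspect(d), aspects, doc)


page = run_chain([browse_aspect, cart_aspect], empty_page())
print(ET.tostring(page, encoding="unicode"))
```

Because every function has the same input and output type, an aspect can be added to or dropped from the list without touching the others; leaving `cart_aspect` out of the chain produces a page with no trace of the cart.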
Manakin Aspects are responsible for the repository's interactive features and may query the DSpace API, possibly changing the state of DSpace. The standard distribution of Manakin/DSpace includes four 'core' aspects:
To gain a better understanding of aspect interaction within Manakin, consider the example of creating an aspect to add a shopping cart to DSpace. This new shopping cart should provide several new features to the system, including the ability to add an item to the user's cart, the ability to manage the cart (adding or removing items), and the ability to purchase and grant access to the item(s).
Once the aspect has been added to the repository, there are three ways in which the aspect can alter the repository's interface: specific pages could be modified, new content could be added uniformly to all pages, or entirely new pages could be created.
The shopping cart aspect only deals with the features necessary to implement a shopping cart, so when the aspect is turned off, there would be no indication of the shopping cart anywhere: no dead links, no dead pages, etc. However, when the aspect is turned on, the links to the shopping cart, the "Buy This Item" button, and checkout pages would all be present without necessitating the merging or patching of any files. The shopping cart would also be packaged into a self-contained JAR library, with all necessary components bundled together.
3.3 Themes
Manakin themes stylize the look-and-feel of the repository, community, or collection and are distributed as self-contained packages. A Manakin/DSpace installation may have multiple themes installed and available to be used in different parts of the repository. Themes can be applied either to the entire repository or to specific communities, collections or items within the repository. The theme application rules cascade downward, so when a theme is applied to a community, all collections and items contained within the community also inherit the theme's look-and-feel. These collections and items can either use the inherited theme or provide one of their own.
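The cascading rule can be pictured as a walk up the repository hierarchy. The following Python sketch makes that concrete; the object names, theme names, and the `PARENT` and `THEMES` tables are hypothetical, and the way a real installation configures theme rules is not shown here.

```python
# Hypothetical repository hierarchy: each object names its parent (None at the top).
PARENT = {
    "repository": None,
    "college-of-science": "repository",
    "geology-dept": "college-of-science",
    "atlas-collection": "geology-dept",
}

# Hypothetical theme assignments; objects not listed here inherit from an ancestor.
THEMES = {
    "repository": "default",
    "college-of-science": "science",
    "atlas-collection": "atlas-maps",
}


def resolve_theme(obj):
    """Walk up the hierarchy until an object with its own theme is found."""
    while obj is not None:
        if obj in THEMES:
            return THEMES[obj]
        obj = PARENT[obj]
    raise LookupError("no theme configured for this object or its ancestors")


for name in PARENT:
    print(name, "->", resolve_theme(name))
```

In this sketch the department has no theme of its own and so inherits the college's, while the collection below it overrides the inherited theme with its own.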
The central component of a theme is an XSL stylesheet that translates a DRI document into a display format. While the format is typically XHTML, nothing prevents the use of a print-oriented format such as PDF or a graphical format like SVG. Since the primary user interface is through a web browser, Manakin provides a base XSL library that fully implements translating a DRI document into XHTML. The library has been designed in a modular fashion that allows theme designers to create new themes by leveraging the existing library. Instead of duplicating previous work, a new theme can import the base library and merely 'override' specific XSL templates when necessary.
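The import-and-override pattern is not specific to XSL, and a small Python analogy may help readers unfamiliar with it. The sketch below is an analogy only: the `BASE` table plays the role of the base library, `CUSTOM` plays the role of a new theme, and the element names and the `branded` class are hypothetical. Text escaping is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def render(node, templates):
    """Apply the template registered for this element, or the fallback '*'."""
    template = templates.get(node.tag, templates["*"])
    return template(node, templates)


def children(node, templates):
    """Render all child elements in document order."""
    return "".join(render(child, templates) for child in node)


# Hypothetical "base library": one template per element name.
BASE = {
    "*": lambda n, t: children(n, t),
    "div": lambda n, t: f"<div>{children(n, t)}</div>",
    "head": lambda n, t: f"<h2>{n.text}</h2>",
    "p": lambda n, t: f"<p>{n.text}</p>",
}

# A new theme reuses the base library and overrides only the heading template.
CUSTOM = {**BASE, "head": lambda n, t: f'<h2 class="branded">{n.text}</h2>'}

page = ET.fromstring("<div><head>Geologic Atlas</head><p>227 folios</p></div>")
print(render(page, BASE))
print(render(page, CUSTOM))
```

Only the heading differs between the two outputs; every other element is still rendered by the inherited templates, which is the saving the base library is meant to provide.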
Incorporated into the base XSL library is the ability to support multiple metadata formats. Each format has an associated 'metadata handler' that can display the metadata to the user. While, by default, Manakin/DSpace uses the DSpace Intermediate Metadata (DIM) format, it may be configured to use other formats. The default distribution of Manakin contains metadata handlers for DIM, MODS, and qualified or simple Dublin Core. Items referenced in the DRI document occur in one of four contexts, dictated by a need for a summary or detailed view and whether the reference appears individually or in a set. Metadata handlers are libraries of XSL templates that format the display of items in these various contexts.
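The four display contexts and the per-format handlers can be sketched as a small dispatch table. This Python sketch is illustrative only: actual metadata handlers are libraries of XSL templates, and the handler function, the `HANDLERS` table, the context labels, and the sample item are all hypothetical.

```python
from itertools import product

VIEWS = ("summary", "detail")
GROUPINGS = ("single", "set")

# The four display contexts: every combination of view and grouping.
CONTEXTS = list(product(VIEWS, GROUPINGS))


def dublin_core_handler(item, view, grouping):
    """Hypothetical handler: format one item for one of the four contexts."""
    if view == "summary":
        text = item["title"]
    else:
        text = f'{item["title"]} ({item["date"]}), {item["creator"]}'
    # Items shown as part of a set are rendered as list entries.
    return f"- {text}" if grouping == "set" else text


# One handler per metadata format; only one format is sketched here.
HANDLERS = {"DC": dublin_core_handler}


def display(item, fmt, view, grouping):
    """Pick the handler for the item's metadata format and render the item."""
    if (view, grouping) not in CONTEXTS:
        raise ValueError(f"unknown display context: {(view, grouping)}")
    return HANDLERS[fmt](item, view, grouping)


item = {"title": "Example folio", "creator": "Example author", "date": "1900"}
for view, grouping in CONTEXTS:
    print(view, grouping, "=>", display(item, "DC", view, grouping))
```

Supporting another metadata format in this sketch would mean registering one more handler in `HANDLERS`, leaving the rest of the display logic unchanged.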
3.4 Putting it all together
These three components are used by Manakin to produce each view of the interface. In fact, basic repository functionality consists of a few base aspects chained together in a single pipeline. One aspect implements the ability to search and access content, another handles submission and workflow, and yet another aspect handles the curatorial features of the repository. These aspects are combined along with a theme to form a complete interface to the DSpace digital repository.
Two basic stages create a repository view: content generation, implemented through an aspect chain, and style application, handled by a single theme (Figure 1). As each aspect operates on the DRI document, it adds a new set of features to the repository. This result is then handed off to a theme to be transformed with a unique look-and-feel. The three major components of Manakin (DRI, aspects, and themes) all combine in a pipelined architecture to produce a complete modular interface to the digital repository.
This component-based architecture utilizing themes and aspects enables development tiers. The tier concept allows those who are not experts in the intricacies of Manakin, Cocoon, and XSL to adapt Manakin to their needs. Each tier requires more advanced skills in order to manipulate the interface and allows for progressively more complex manipulation of the repository interface.
4 Who is using Manakin?
Although early adopters around the world have been using Manakin since its pre-beta stage, a stable release was announced in January at the Open Repositories 2007 conference in San Antonio, Texas. Manakin's flexibility and modularity have already proven valuable in the following cases.
4.1 Texas Digital Library
The Texas Digital Library (TDL) is providing a digital infrastructure for the scholarly activities of faculty and students at Texas universities. Manakin/DSpace is one of the key technologies being used by TDL to accomplish this goal. One of the services offered is the federated TDL repository (Figure 2a), which hosts collections from across the state of Texas. Currently the flagship collection focuses on Electronic Theses and Dissertations (ETDs) gathered from four schools within three separate university systems: The Texas A&M University System, The Texas Tech University System, and The University of Texas System. Manakin allows TDL to brand each thesis or dissertation with the originating school's logo.
Manakin has also played a role in establishing a common look-and-feel extended across all the services provided by TDL. The main TDL website (Figure 2b) seamlessly integrates with both the repository service running Manakin and the TDL journal service (Figure 2c) using Open Journal Systems. The common look-and-feel enables all the TDL services to be tightly integrated, allowing users to switch between their various functions without feeling as if they are leaving the TDL experience.
4.2 Geologic Atlas of the United States
Texas A&M University Library digitized a complete set of the Geologic Atlas of the United States (Figure 3a), which consists of 227 map folios published by the USGS between 1894 and 1945. Each folio consists of mixed content including maps, text, and photographs focusing on the economic geology and geography of the coverage region. Each folio is between 10 and 40 pages in length and is represented in DSpace as a single complex item.
Manakin allowed for a complete overhaul of the user interface when presenting these folios. It was determined that a map-based interface for browsing and searching would enable a user to visually determine the coverage area of a particular folio, as well as place the title in its geographic context. Manakin's access to the native geospatial metadata allowed the integration of the Yahoo! Maps interface directly into the repository.
In addition to the map-based browsing techniques, the interface also modified the way items are displayed (Figure 3b). This new view has thumbnails for each page and lower-resolution surrogates for screen viewing. This combination and the ability to see all pages of a folio at once serve to increase the ease with which the collection is navigated and understood.
Both of these enhancements allow the user to understand the scope of the collection and easily browse through the folios. This application is an example of Manakin's ability to leverage the latest web techniques, such as mashups, and was recognized as an Editor's Pick by Yahoo.com's Developer Network for its use of their Maps API.
4.3 Manakin across the globe
Several repositories outside of Texas have also adopted Manakin to address various challenges. One of Manakin's earliest experimenters is the Instituto Antonio Carlos Jobim (Figure 4a). The Institute used Manakin to support the musical- and image-based collections that span Antonio Jobim's career. Massachusetts Institute of Technology is using Manakin for their internal image collection: Rotch Visual Collections Online (Figure 4b). In Europe, the Doria repository (Figure 4c) is using Manakin for the dissemination of electronic theses and dissertations from Finnish universities and has adapted Manakin to the multi-lingual needs of its users.
5 Future work
Moving forward from the initial release of Manakin and its incorporation into the standard distribution of DSpace, many interesting areas exist for further development.
First, as DSpace moves closer to adopting Manakin as the default user interface, we expect to see a community formed around the tradition of sharing themes and aspects. As new features are developed, they can be easily shared, thus improving the user experience for everyone by allowing repositories to incorporate the latest developments from around the world.
Second, theme development could be further simplified through the addition of another development tier, tier zero. This could further simplify the interface to a point where no technical skills are required for creating a Manakin Theme. A graphical user interface would allow anyone to customize his or her digital collections easily on the fly.
Finally, we believe Manakin has the potential for application beyond DSpace. Manakin could be the tool with which several repository implementations are combined through a single interface, improving a user's ability to discover and access information.
References
[1] "Budapest Open Access Initiative." Accessed on July 24, 2007, available at: <http://www.soros.org/openaccess/>.
[2] Leslie Chan. "Supporting and enhancing scholarship in the digital age: the role of open access institutional repositories", Canadian Journal of Communications, Volume 29, Number 3, 2004. p277-300.
[3] Clifford Lynch. "Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age." ARL Bimonthly Report, Number 226, February 2003. <http://www.arl.org/resources/pubs/br/br226/br226ir.shtml>.
[4] Patrick Hochstenbach, Henry Jerez, and Herbert Van de Sompel. "The OAI-PMH static repository and static repository gateway". In Proceedings of the 3rd ACM/IEEE-CS Joint Conference on Digital Libraries, Houston, Texas, May 27 - 31, 2003. p210-217.
[5] Library of Congress. "METS: An Overview and Tutorial". Metadata Encoding & Transmission Standard Home Page, accessed on July 24, 2007, available at: <http://www.loc.gov/standards/mets/METSOverview.v2.html>.
[6] Jeroen Bekaert, Patrick Hochstenbach, and Herbert Van de Sompel. "Using MPEG-21 DIDL to Represent Complex Digital Objects in the Los Alamos National Laboratory Digital Library". D-Lib Magazine, Volume 9, Number 11, November 2003. <doi:10.1045/november2003-bekaert>.
[7] Vannevar Bush. "As we may think". The Atlantic Monthly, July 1945, p101-108. <http://www.theatlantic.com/doc/194507/bush>.
[8] SIGCHI's conference series on Human Factors in Computing Systems, 1995 to present, SIGCHI website available at: <http://sigchi.org/>.
[9] Ellysa Cahoy. "Finding Time in the Penn State Libraries". This YouTube video shows a repository interface that has not considered the human element in its design. Accessed on July 24, 2007, available at: <http://www.youtube.com/watch?v=tKvR0OC4nYc>.
[10] Jihyun Kim. "Finding Documents in a Digital Institutional Repository: DSpace and Eprints." In Proceedings 68th Annual Meeting of the American Society for Information Science and Technology (ASIST), Charlotte, NC, USA, October 28 - November 2, 2005.
[11] MacKenzie Smith, Mary Barton, Mike Bass, Margret Branschofsky, Greg McClellan, Dave Stuve, Robert Tansley, and Julie Harford Walker. "An Open Source Dynamic Digital Repository", D-Lib Magazine, Volume 9, Number 1, January 2003. <doi:10.1045/january2003-smith>.
[12] Robert Tansley, Mike Bass, Dave Stuve, Margret Branschofsky, Daniel Chudnov, Greg McClellan, and MacKenzie Smith. "The DSpace institutional digital repository system: current functionality". In Proceedings of the 3rd ACM/IEEE-CS Joint Conference on Digital Libraries, Houston, Texas, May 27 - 31, 2003, p87-97.
[13] Robert Tansley, Mike Bass, and MacKenzie Smith. "DSpace as an Open Archival Information System: Current Status and Future Directions". Lecture Notes in Computer Science, Volume 2769, 2004. p446-460.
[14] Clifford Lynch and Joan Lippincott. "Institutional Repository Deployment in the United States as of Early 2005". D-Lib Magazine, Volume 11, Number 9, September 2005. <doi:10.1045/september2005-lynch>.
[15] Hanna Stelmaszewska and Ann Blandford. "From physical to digital: a case study of computer scientists' behaviour in physical libraries." In Proceedings of the 4th ACM/IEEE-CS Joint Conference on Digital Libraries, Tucson, Arizona, June 7 - 11, 2004, p82-92.
[16] Stefano Mazzocchi. "Introducing Cocoon 2.0". O'Reilly xml.com homepage, accessed on July 24, 2007, available at: <http://www.xml.com/pub/a/2002/02/13/cocoon2.html>.
[17] Alexey Maslov, Cody Green, Adam Mikeal, and John Leggett. "DRI Schema Reference". Accessed on July 24, 2007, available at: <http://di.tamu.edu/projects/xmlui/schemaReference>.
[18] C. M. Sperberg-McQueen and Lou Burnard. "Guidelines for Electronic Text Encoding and Interchange". Accessed on July 24, 2007, available at: <http://www.tei-c.org/P4X/>.
[19] Library of Congress. "MODS: Uses and Features". Metadata Object Description Schema homepage. Accessed on July 24, 2007, available at: <http://www.loc.gov/standards/mods/mods-overview.html>.
[20] Gregor Kiczales, John Lamping, Anurag Mendhekar, Chris Maeda, Cristina Lopes, Jean-Marc Loingtier, and John Irwin. "Aspect-oriented programming". In Proceedings of ECOOP'97, Lecture Notes in Computer Science, Volume 1241, June 1997. p220-242.
[21] Scott Phillips, Cody Green, Alexey Maslov, Adam Mikeal, and John Leggett. "Manakin Developer's Guide". Accessed on July 24, 2007. <http://di.tamu.edu/projects/xmlui/resources/DevelopersGuide.pdf>.
[22] Scott Phillips, Cody Green, Alexey Maslov, Adam Mikeal, and John Leggett. "Introducing Manakin: Overview and Architecture". Open Repositories, San Antonio, Texas, January 23 - 26, 2007. <http://txspace.tamu.edu/handle/1969.1/5690>.
[23] John Leggett, Mark McFarland, and Drew Racine. "The Texas Digital Library: A Business Case". Prepared for and published by the Texas Digital Library, July 2005, revised July 2006. <https://sharepoint.lib.utexas.edu/texasdigitallibrary/Shared%20Documents/
[24] R.M.S. da Fonseca. "Open Journal Systems", ICCC 8th International Conference on Electronic Publishing, Brasilia, June 23 - 26, 2004.
[25] Katherine Weimer, Rusty Kimball, Steen Bereyso, Brian Surratt, Adam Mikeal, and Alexey Maslov. "Access and Preservation of Scientific and Cartographic Literature Using an Institutional Repository". Documents to the People, Volume 34, Number 4, Winter 2006.
[26] Christiaan Gerard Kortekaas. "Making Fedora easier to implement with Fez". Open Repositories, San Antonio, Texas, January 23 - 26, 2007. <http://espace.library.uq.edu.au/view.php?pid=UQ:11924>.
Copyright © 2007 Scott Phillips, Cody Green, Alexey Maslov, Adam Mikeal, and John Leggett
D-Lib Magazine Access Terms and Conditions