
D-Lib Magazine
October 1999

Volume 5 Number 10

ISSN 1082-9873

Reference Linking in a Hybrid Library Environment

Part 3: Generalizing the SFX solution in the "SFX@Ghent & SFX@LANL" experiment


Herbert Van de Sompel
Los Alamos National Laboratory - Research Library
herbert.vandesompel@rug.ac.be

Patrick Hochstenbach
Automation Department of the Central Library
University of Ghent, Belgium
patrick.hochstenbach@rug.ac.be

 

Abstract

This is the third in our series of papers on reference linking in a hybrid library environment. The first part described the state of the art in reference linking and contrasted various approaches to the problem. It identified static and dynamic linking solutions, open and closed linking frameworks as well as just-in-case and just-in-time linking. The second part introduced SFX, a dynamic, just-in-time linking solution we built for our own purposes. However, we suggested that the underlying concepts were sufficiently generic to be applied in a wide range of digital libraries.

In this third part we show how this has been demonstrated conclusively in the "SFX@Ghent & SFX@LANL" experiment. In this experiment, local as well as remote distributed information resources of the digital library collections of the Research Library of the Los Alamos National Laboratory and the University of Ghent Library have been used as starting points for SFX-links into other parts of the collections. The SFX-framework has further been generalized in order to achieve a technology that can easily be transferred from one digital library environment to another and that minimizes the overhead in making the distributed information services that make up those libraries interoperable with SFX.

This third part starts with a presentation of the SFX problem statement in light of the recent discussions on reference linking. Next, it introduces the notion of global and local relevance of extended services as well as an architectural categorization of open linking frameworks, also referred to as frameworks that are supportive of selective resolution. Then, an in-depth description of the generalized SFX solution is given.

Rephrasing the SFX problem statement

The problem statement

It is relevant to rephrase the SFX problem statement in the context of the meetings and the subsequent reports and publications on reference linking organized by the Digital Library Federation (DLF), the National Information Standards Organization (NISO), the National Federation of Abstracting and Indexing Services (NFAIS), and the Society for Scholarly Publishing (SSP) (Caplan 1999a; Caplan 1999b; Caplan & Arms 1999; Needleman 1999).

The generic statement of the reference linking problem, as defined by the working group on reference linking was (Caplan 1999a; Caplan & Arms 1999):

Given the information in a standard citation, how does one get to the thing to which it refers?

However, the working group concentrated on a specific variation on this:

Given the information in a citation to a journal article, how does a user get from the citation to an appropriate copy of the article?

The SFX research also addresses these problems, but only as an instance of a more general problem that can be formulated as:

Given bibliographic metadata, how does one present relevant extended services for it?

Bibliographic metadata as a starting point

Clearly, the SFX research is not only concerned with information in a standard citation. Its starting point is bibliographic metadata in general. As such, information entities originating from typical scholarly resources such as records from abstracting & indexing databases, OPAC systems and preprint archives can be used as a starting point in the SFX problem statement. This is also the case for citations to both journal articles and books found in journal articles or books. But even fractional bibliographic metadata such as an author name taken from an e-mail message is a valid starting point in the SFX problem statement.

Extended services as a goal

A similar generalization holds for the target of the problem statement, since the SFX research is not only concerned with linking to the full-text that corresponds to a citation in a journal article. It aims at the presentation of a variety of extended services for whichever metadata is used as a starting point. Extended services are services that present an information entity in a digital library -- defined as the link-source -- in the context of the entire information environment (Van de Sompel & Hochstenbach 1999a). For instance, for a given link-source record from an abstracting & indexing database, extended services can -- amongst others -- be the presentation of:

Global and local relevance of extended services

The adjective relevant is of particular importance in the notion relevant extended services as used in the SFX problem statement. It actually has two meanings: relevance as a global notion and relevance as a local notion. In order to explain this, the following types of extended services are considered:

Relevant as a global notion must be interpreted as being opposed to irrelevant in every context. Certain aspects of extended services are independent of the context of an individual collection; they actually apply on a global level:

Relevant as a local notion refers to the fact that other aspects of extended services are dependent on the boundaries of a certain digital library collection. Local relevance has two manifestations:

While certain services are relevant in a global sense, they can become irrelevant if the digital library collection does not contain the information resource(s) required to implement them. Even if a full-text service is globally relevant for a certain link-source, it might be considered to be irrelevant in the context of a certain digital library collection if the journal referred to by the link-source is not part of that collection. In the same way, an abstract service pointing to a particular abstracting & indexing database for a given link-source can be globally relevant, as described above. Still, such a service is of no local relevance if the user’s digital library does not provide access to an implementation of that particular database, while it can be of local relevance if the digital library does.

The relevance of extended services will also depend on the technical implementation of the information resource(s) required to create the services. When a full_text service is globally relevant -- an electronic edition of an article exists -- as well as relevant in relation to the content of a certain collection -- the users of the digital library are authorized to access the electronic edition -- it can be regarded as inappropriate to let the full_text service link to a full-text instance at a publisher’s site when the digital library holds an instance in its local storage. In the DLF reference linking discussion, this issue was given the name of "the Harvard problem" (Caplan 1999a). Similar problems occur in the broader scope of extended services. For instance, as shown before, an abstract service can be globally relevant -- the journal in which an article was published is abstracted in a particular abstracting & indexing database -- as well as relevant in relation to the content of the collection -- the local digital library does provide access to the particular database. Still, the service might be irrelevant in relation to the implementation, if the actual implementation of the database does not support a mechanism to link into it using the parameters required to do an abstract look-up.

Systems supportive of selective resolution

Both issues regarding the local relevance of extended services indicate the need for open linking solutions that take the context of the local collection into account when links are presented to a user (Van de Sompel & Hochstenbach 1999a). When addressing the Harvard problem the DLF reference linking discussions have referred to open linking solutions as being supportive of selective resolution (Caplan & Arms 1999). From the above, it can be seen that the problem of local relevance of extended services is actually a generalization of certain aspects of the Harvard problem. As such, when a framework is able to present an approach to deal with the broader problem, the approach will also contain valuable elements to address the narrower Harvard problem.

 

Figure 1: Systems supportive of selective resolution

In relation to the Harvard problem, Caplan and Arms divide systems that support selective resolution into two categories:

This categorization can further be generalized by:

CATEGORY        SERVICE COMPONENT    REDIRECTION ORDER
Category 1      central              central
Category 2 a    central & local      local => central
Category 2 b    central & local      central => local
Category 3      local                local

Table 1: categorization of systems supportive of selective resolution

The resulting categorization is represented in Table 1, where 3 main categories of systems supporting selective resolution are shown, based on the nature of the service component and the redirection order:

The "SFX@Ghent & SFX@LANL" experiment

In the "SFX@Ghent & SFX@LANL" experiment (April 1999 - June 1999; henceforth referred to as Ghent&LANL), the Library Without Walls team of the Research Library at the Los Alamos National Laboratory (LANL) and the Automation Department of the Central Library at the University of Ghent have cooperated to illustrate the feasibility of the SFX approach as a means to provide extended services in a realistic and complex information environment.

The information environment in which Ghent&LANL has been conducted is dramatically different from the one of the first Elektron SFX experiment. To illustrate, Table 2 presents an overview of the information resources used in Ghent&LANL. The rows show the names of the information resources used in the experiment, the columns refer to the digital library collection. For each resource/collection combination the table indicates:

                             GHENT                        LANL
RESOURCE            Type     Authority  Source  Target    Authority  Source  Target
Advance             OPAC     -          -       -         LANL       yes     yes
Aleph 500           OPAC     Ghent      yes     yes       -          -       -
Amazon.com          WWW      Amazon     no      yes       Amazon     no      yes
Antilope            OPAC     UA         no      yes       -          -       -
APS PROLA           FTXT     APS        yes     yes       APS        yes     yes
the arXiv           FTXT     LANL       yes     yes       LANL       yes     yes
BIOSIS              A&I      Ghent      yes     no        LANL       yes     no
Books in Print      A&I      Ghent      yes     yes       Ghent      yes     yes
Compendex           A&I      Ghent      yes     no        LANL       yes     no
Current Contents    A&I      Ghent      yes     yes       Ghent      yes     yes
EconLit             A&I      Ghent      yes     no        -          -       -
Genome base         A&I      NCBI       no      yes       NCBI       no      yes
Inspec              A&I      -          -       -         LANL       yes     no
                             SP         no      yes       SP         no      yes
Ulrich’s            A&I      Ghent      yes     yes       -          -       -
LiSa                A&I      Ghent      yes     yes       -          -       -
MathSci             A&I      Ghent      yes     no        -          -       -
Medline             A&I      Ghent      yes     no        -          -       -
                             NCBI       no      yes       NCBI       no      yes
SciSearch           A&I      LANL       yes     yes       LANL       yes     yes
ScienceServer       FTXT     LANL       no      yes       LANL       no      yes
Various             FTXT     various    no      yes       various    no      yes
Wiley InterScience  FTXT     Wiley      yes     yes       Wiley      yes     yes

Table 2: information resources in Ghent&LANL

Some considerations regarding Table 2:

From the above, it can be concluded that, given the number of resources involved, their distributed nature and the availability of multiple SFX service components, Ghent&LANL is a realistic experiment.

The need for a generalization of the SFX components

Although the fundamental concepts of SFX -- dynamic linking, just-in-time linking and conceptual services (see Van de Sompel & Hochstenbach 1999b) -- have been left untouched for the Ghent&LANL experiment, the nature of its working environment and its goals have led to a strong generalization of the SFX components. The main impulses that inspired such a generalization and that distinguish the Ghent&LANL project from the Elektron experiment are:

The redesign of the SFX solution for Ghent&LANL leads to an architecture with a clear separation between the redirection component and the service component. Both components obviously interoperate in order to achieve a functional system. But the redirection component can potentially operate in an environment with non-SFX service components, while the SFX service component can equally function with another redirection mechanism, as long as that mechanism supports delivery of link-source metadata to the SFX service component. Several functional building blocks in both components have also been generalized in order to address the problems that arise from the complexity of the Ghent&LANL environment.

The overall approach of the generalized solution is shown in Figure 2 and will be explained in more detail in the remainder of this paper. Information resources that can interoperate with SFX -- from now on referred to as SFX-aware systems -- insert an SFX-button for each link-source in the result set of a query. The just-in-time approach of SFX requires the user to click such an SFX-button when requesting extended services for a specific link-source record. In response to this click, the local SFX redirection component will fetch link-source metadata -- usually from the origin resource -- using whichever protocol it takes to do so. Next, link-source metadata as well as information on its origin will be converted into an interfacing format. At this point, the local redirection mechanism has fulfilled its task and is able to deliver this information in a consistent representation to the local SFX service component.

Figure 2: the local redirection and service components of the generalized SFX solution

The first task of the local service component is to parse the information, handed over by the local redirection component, into a normalized internal representation object. During this process, the original content can be enhanced and/or augmented. The resulting information object is then fed into the SFX evaluation process in which it will be compared to the SFX-database. The SFX-database is a special kind of linking database. Unlike traditional linking services, it does not contain any static links between "documents" (records/citations/full-text/etc.) of a collection. Rather, it contains a collection of conceptual services that express potential inter-relationships between documents at the level of the resource from which they originate. The SFX evaluation process determines the relevance of each of these conceptual services using the -- lack of -- content in the information object. Next, the resulting bundle of relevant services is sent back to the user in the SFX-menu-screen. Consistent with the just-in-time approach of SFX, only when the user decides to use a service from the bundle, will the service be resolved into a URL to which the user is being redirected.
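The evaluation and just-in-time steps described above can be illustrated with a minimal Python sketch. It is not the SFX implementation: the class, the function names, the target name `ExampleAbstracts` and the `example.org` URL patterns are all hypothetical, and relevance is reduced to a simple check that the fields a service needs are present in the information object.

```python
from dataclasses import dataclass
from typing import Callable, List
from urllib.parse import urlencode


@dataclass(frozen=True)
class ConceptualService:
    name: str                        # e.g. "abstract", "author"
    target: str                      # resource the service links into
    required: tuple                  # fields the request must contain
    resolver: Callable[[dict], str]  # builds the URL; called only on demand


def evaluate(request: dict, services: List[ConceptualService]) -> List[ConceptualService]:
    """Keep the services whose required fields are all present and non-empty."""
    return [s for s in services if all(request.get(f) for f in s.required)]


def resolve(service: ConceptualService, request: dict) -> str:
    """Just-in-time step: turn the chosen service into a URL."""
    return service.resolver(request)


# Hypothetical target and URL patterns, for illustration only.
SERVICES = [
    ConceptualService(
        "abstract", "ExampleAbstracts", ("ISSN", "volume", "startPage"),
        lambda r: "http://abstracts.example.org/lookup?" + urlencode(
            {"issn": r["ISSN"], "vol": r["volume"], "page": r["startPage"]}),
    ),
    ConceptualService(
        "author", "ExampleAbstracts", ("authLast",),
        lambda r: "http://abstracts.example.org/author?" + urlencode(
            {"name": r["authLast"]}),
    ),
]

request = {"ISSN": "0028-4793", "volume": "330", "startPage": "691"}
menu = evaluate(request, SERVICES)      # what the SFX-menu-screen would list
chosen_url = resolve(menu[0], request)  # computed only after the user's click
```

In this sketch the author service drops out of the menu because the request carries no author name, and no URL is built for any service until one is selected.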

The SFX mechanism for local redirection

The task of the local redirection mechanism is to transport link-source metadata to the local redirection component, which interfaces with the local service component. To interoperate with the SFX redirection mechanism, information resources need to be enhanced by the authorities running them in order to make them SFX-aware. The aim is to enable information resources to insert an SFX-button, targeted at the local redirection component, for each link-source in the result set of a query into the resource. In the context of Ghent&LANL, the following are important considerations in this regard:

  1. Many information resources that are involved in the experiment are also used in normal production at the very same time. This means that they are also approached by users that do not have access to an SFX service component. In order to prevent such a user from seeing an irrelevant SFX-button, an SFX-aware resource must be able to recognize whether the user has access to an SFX service component or not. Based on that information, the resource can insert an SFX-button or not.
  2. Some information resources are approached by users from both digital library environments, hence with access to different SFX service components. An SFX-aware resource must be able to target the SFX-button at the appropriate local redirection component, in order for it to be able to deliver the link-source metadata from the origin information resource to the doorstep of the appropriate service component. This means that an SFX-aware resource must be able to parameterize the target of an SFX-button.
  3. Upon receipt of a request for extended services from a user, the local redirection component must be able to fetch the link-source metadata from its origin resource. This means that the local redirection component has to be informed about the origin and the identity of the link-source in order to be able to take the appropriate steps. Given the number, distribution and diversity of the SFX-aware resources in Ghent&LANL, a consistent manner to communicate such information to the local redirection components is required.
  4. Link-source metadata must be fetched from a wide variety of distributed information resources that support different access protocols. In addition, those resources will respond by sending link-source metadata formatted according to different metadata schemes. In order for the local redirection component to be able to interface in a generic manner with the local service component, a single metadata interchange format is desirable.

As will be shown in the detailed description below, these issues are approached by:

Making information resources SFX-aware

The authorities running information resources need to enhance their systems in order to make them SFX-aware. The complexity of the Ghent&LANL environment has called for a thoughtful exploration of ways to make resources SFX-aware, since only approaches that minimize the overhead for the authorities running the resources can be acceptable and workable. In the current implementation of the SFX redirection mechanism, they have to do this by:

The CookiePusher

The CookiePusher script is a pragmatic solution introduced to dynamically notify an information resource about the existence and location of a local SFX redirection component in the environment of the user consulting the resource. The underlying idea is that an information resource could at any time access the location of a local redirection component, if its URL were written as a cookie in the browser of the user consulting the resource. The availability of this URL is essential, since the resource must be able to dynamically target the SFX-button at the appropriate local component. However, for reasons of security and privacy, such browser cookies can at most be read within the Internet domain of the server that has set the cookie (see Shishir 1996, pages 203-204). As such, it is impossible to set such a cookie so that it can be read by all information systems in a digital library collection when it consists of resources distributed over several domains, typically resources that are local and remote to the user’s institution.

In order to solve this problem, the first step in connecting to a resource is to request a server in the domain of the information resource to create an HTTP cookie. This detour is called the CookiePusher. The very simple CookiePusher script is installed in the domain of the information resource that has to be made SFX-aware. Rather than connecting immediately to the desired URL in the information resource, a connection is made to the resource’s CookiePusher first, sending values for the two parameters of the CookiePusher script:

Upon receipt of these parameters, the CookiePusher will first read the URL of the local redirection component and will use it to set a cookie in the user’s browser. Since the CookiePusher is in the domain of the resource, that cookie will be readable by the resource. Next, the CookiePusher will redirect the user to the desired URL in the resource.

As such, once the CookiePusher has been installed for a resource, the URL to connect to that resource will be changed to:

CookiePusher_URL?SFX_location=local_SFX&Redirect=service_URL

Where

For instance:

http://publish.aps.org/edaccess/prolatest/cookiepusher?
SFX_location=http%3A%2F%2Fisiserv.rug.ac.be%2Fcgi-bin%2Fsfx%2Fbin%2Fmenu.cgi
&Redirect=http%3A%2F%2Fpublish.aps.org%2Fedaccess%2Fprolatest%2Ftext%2FPRD%2Fv52%2Fi1%2Fp15_1  

is the URL used to connect to an item in the APS/PROLA domain. The APS/PROLA CookiePusher will read the location of the local redirection component from the SFX_location parameter and will use this to set a cookie named local_SFX with value:

http%3A%2F%2Fisiserv.rug.ac.be%2Fcgi-bin%2Fsfx%2Fbin%2Fmenu.cgi

which is the encoded location of the Ghent local SFX redirection component. Next, it will redirect the user to the desired location in the APS/PROLA:

http://publish.aps.org/edaccess/prolatest/text/PRD/v52/i1/p15_1  

From now on, at any point in the consultation, APS/PROLA will be able to read this cookie and use it to target -- in this case -- the Ghent redirection component.
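The detour can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the script used in the experiment: the function names are hypothetical, the handler simply returns a status and a header list instead of talking to a real web server, and the `Path=/` cookie attribute is an assumption. The parameter names `SFX_location` and `Redirect` and the cookie name `local_SFX` are taken from the text.

```python
from urllib.parse import parse_qs, quote, urlencode


def cookiepusher_url(cookiepusher: str, local_sfx: str, service_url: str) -> str:
    """Build CookiePusher_URL?SFX_location=local_SFX&Redirect=service_URL."""
    query = urlencode({"SFX_location": local_sfx, "Redirect": service_url})
    return f"{cookiepusher}?{query}"


def cookiepusher_response(query_string: str, cookie_name: str = "local_SFX"):
    """Return (status, headers): set the cookie, then redirect to the resource."""
    params = parse_qs(query_string)       # parse_qs also URL-decodes the values
    local_sfx = params["SFX_location"][0]
    redirect = params["Redirect"][0]
    headers = [
        # The cookie stores the encoded location of the redirection component.
        ("Set-Cookie", f"{cookie_name}={quote(local_sfx, safe='')}; Path=/"),
        ("Location", redirect),
    ]
    return "302 Found", headers


url = cookiepusher_url(
    "http://publish.aps.org/edaccess/prolatest/cookiepusher",
    "http://isiserv.rug.ac.be/cgi-bin/sfx/bin/menu.cgi",
    "http://publish.aps.org/edaccess/prolatest/text/PRD/v52/i1/p15_1",
)
status, headers = cookiepusher_response(url.split("?", 1)[1])
```

Because the response is issued from within the domain of the resource, the browser will send the cookie back on later requests to that resource. A production script would presumably also check that the `Redirect` value points into its own resource before redirecting.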

The consistent SFX-URL structure

The essence of the detour made via the CookiePusher is the ability it creates for an information resource to know at any point whether the consulting user has access to a selective resolution system and, if so, what the location of its redirection component is. Based on that information, the resource can dynamically decide whether or not to insert an SFX-button for search results and, if it does, which redirection component to target with the SFX-button. In order to make the many systems involved in the Ghent&LANL experiment interoperable with SFX, authorities running the systems have been asked to make the URL targeted by the SFX-button -- the SFX-URL -- compliant with the following format:

GENERAL     target?serviceDesc&objectDesc
DETAILED    local_SFX?vendorId=<theVendor>&databaseId=<theBase>&objectDesc=<theIdentifier>

Table 3: the syntax of the SFX-URL

In Table 3

vendorId=<theVendor>&databaseId=<theBase>.

serviceDesc information will play a crucial role at later stages of the SFX local redirection mechanism, as well as in the SFX-base which is central to the SFX service component.

Figure 3 to Figure 6 show examples of link-sources taken from Sources in the Ghent and/or LANL collections, mentioning their SFX-URL. For reasons of readability, the parameter values are not shown as being URL-encoded. Rather, it is mentioned that parts should be URL-encoded by enclosing them in a URLencode function.
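As an illustration of how a resource might assemble such an SFX-URL, the following Python sketch reads the `local_SFX` cookie, decides whether an SFX-button should be shown, and URL-encodes the objectDesc. The function names and the representation of cookies as a plain dictionary are assumptions made for the example; only the URL syntax of Table 3 and the cookie name come from the text.

```python
from typing import Optional
from urllib.parse import quote, unquote


def sfx_url(local_sfx: str, vendor_id: str, database_id: str, object_desc: str) -> str:
    """Assemble target?serviceDesc&objectDesc with URL-encoded parameter values."""
    return (f"{local_sfx}?vendorId={quote(vendor_id, safe='')}"
            f"&databaseId={quote(database_id, safe='')}"
            f"&objectDesc={quote(object_desc, safe='')}")


def sfx_button_target(cookies: dict, vendor_id: str, database_id: str,
                      object_desc: str) -> Optional[str]:
    """Return the SFX-URL for a link-source, or None if no button should be shown."""
    encoded_location = cookies.get("local_SFX")
    if not encoded_location:
        return None  # the user has no SFX service component: insert no button
    return sfx_url(unquote(encoded_location), vendor_id, database_id, object_desc)


cookies = {"local_SFX": "http%3A%2F%2Fvole.lanl.gov%2Fcgi-bin%2Fsfx%2Fbin%2Fmenu.cgi"}
button = sfx_button_target(
    cookies, "LANLTopic", "arXiv",
    "fetchId=phys-9811004&objectId=physics/9811004",
)
no_button = sfx_button_target({}, "LANLTopic", "arXiv", "fetchId=phys-9811004")
```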

SFX-URL for this link-source, pointing at the Ghent local redirection component:

http://isiserv.rug.ac.be/cgi-bin/sfx/bin/menu.cgi?vendorId=ERL&databaseId=BX

&objectDesc=URLencode(BX02 A:199900063465 I:0008-543X V:00085 S:000001 P:000065 Y:1999)

In the serviceDesc part of the URL, ERL refers to the SilverPlatter ERL implementation of BIOSIS, while BX is the family name of BIOSIS databases in the ERL environment. The objectDesc component contains several information elements in a tagged, fixed-length representation. BX02 is the volume of the BIOSIS database where the link-source originates, while 199900063465 is the accession number, a unique record number of the link-source in BIOSIS. Other elements in the objectDesc are ISSN, volume, issue, starting page and publication year.

Figure 3: a link-source from the Ghent ERL implementation of BIOSIS and its SFX-URL
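A small Python sketch shows how the tagged, fixed-length objectDesc of Figure 3 could be unpacked. The mapping of the one-letter tags to field names is inferred from the order in which the text lists the elements and is therefore an assumption, as are the function name and the stripping of zero padding.

```python
# Tag-to-field mapping inferred from the description of Figure 3 (an assumption).
ERL_TAGS = {
    "A": "accession",
    "I": "ISSN",
    "V": "volume",
    "S": "issue",
    "P": "startPage",
    "Y": "year",
}


def parse_erl_object_desc(object_desc: str) -> dict:
    """Split an ERL objectDesc into the database volume and its tagged fields."""
    database_volume, *fields = object_desc.split()
    record = {"databaseVolume": database_volume}
    for field in fields:
        tag, _, value = field.partition(":")
        name = ERL_TAGS.get(tag)
        if name is None:
            continue  # ignore tags this sketch does not know about
        if name in ("volume", "issue", "startPage"):
            value = value.lstrip("0") or "0"  # drop the fixed-length zero padding
        record[name] = value
    return record


record = parse_erl_object_desc(
    "BX02 A:199900063465 I:0008-543X V:00085 S:000001 P:000065 Y:1999"
)
```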

SFX-URL for this link-source, pointing at the LANL local redirection component:

http://vole.lanl.gov/cgi-bin/sfx/bin/menu.cgi?vendorId=ADVANCE&databaseId=Biosis

&objectDesc=URLencode(fetchId=21179970&objectId=PREV199800135979&SICI=0016-6731(1998)148:2<645:TIOCTA>2.0.TX\;2-P)

The serviceDesc part of this URL is self-explanatory. The objectDesc component is tagged and fields can have variable lengths. The fetchId is the unique number of the link-source in the LANL implementation of BIOSIS, while the part of objectId after "PREV" is the BIOSIS accession number which is comparable to the A field in the SilverPlatter objectDesc of Figure 3. The SICI part contains a SICI for the link-source, from which ISSN, volume, issue, pagination and publication year can be derived.

Figure 4: a link-source from the LANL Advance implementation of BIOSIS and its SFX-URL
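To illustrate how ISSN, volume, issue, starting page and publication year might be derived from a SICI such as the one in Figure 4, the following Python sketch matches only the leading segments of the identifier. The regular expression covers the simple pattern seen in this example (ISSN, chronology in parentheses, volume:issue, and a contribution segment in angle brackets); it is not a complete SICI parser, and the function and field names are hypothetical.

```python
import re
from typing import Optional

# Matches the start of a SICI like 0016-6731(1998)148:2<645:TIOCTA>...
SICI_PREFIX = re.compile(
    r"^(?P<ISSN>\d{4}-\d{3}[\dX])"                # ISSN
    r"\((?P<chronology>[^)]*)\)"                  # chronology, starts with the year
    r"(?P<volume>\d+):(?P<issue>\d+)"             # enumeration as volume:issue
    r"<(?P<startPage>\d+):(?P<titleCode>[^>]*)>"  # contribution segment
)


def parse_sici(sici: str) -> Optional[dict]:
    """Return the leading SICI elements as a dict, or None if they do not match."""
    match = SICI_PREFIX.match(sici)
    if match is None:
        return None
    parts = match.groupdict()
    parts["year"] = parts.pop("chronology")[:4]
    return parts


parts = parse_sici(r"0016-6731(1998)148:2<645:TIOCTA>2.0.TX\;2-P")
```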

 

SFX-URL for the third reference as a link-source, pointing at the Ghent local redirection component:

http://isiserv.rug.ac.be/cgi-bin/sfx/bin/menu.cgi?vendorId=Wiley&databaseId=WIS

&objectDesc=URLencode(TYPE=JCIT&SNM=Saven&FNM=A&SNM=Piro&FNM=L&ATL=The newer purine analogues for the treatment of hairy-cell leukemia.&JTL=N Engl J Med&PYR=1994&VID=330&PPF=691&PPL=7)

The serviceDesc component now refers to the Wiley InterScience collection. The objectDesc is tagged and starts with an indication on the material type of the reference -- journal citation in this case -- followed by a tagged repetition of the full citation.

Figure 5: a link-source from Wiley InterScience and its SFX-URL

 

SFX-URL for the first link-source in the above result screen, pointing at the LANL local redirection component:

http://vole.lanl.gov/cgi-bin/sfx/bin/menu.cgi?vendorId=LANLTopic&databaseId=arXiv

&objectDesc=URLencode(fetchId=phys-9811004&objectId=physics/9811004)

The serviceDesc refers to the LANL Topic implementation of the Ginsparg e-print archive. The fetchId is the unique key for the record in that implementation, while the -- very similar -- objectId is the unique record number in Ginsparg’s implementation of the archive. No further metadata is available in the objectDesc.

Figure 6: a link-source from the arXiv and its SFX-URL

Fetching link-source metadata from an SFX-aware information resource with SourceParsers

The CookiePusher mechanism enables a resource to insert an SFX-button for each of the link-sources that are transferred to a user consulting the resource. The structure of the SFX-URL targeted by these SFX-buttons has been made consistent across resources and takes the form target?serviceDesc&objectDesc. When a user requests extended services by clicking such an SFX-button, a request is sent to the user's local SFX redirection component, which will receive serviceDesc and objectDesc values as parameters for the target script. The local component holds a collection of SourceParser scripts with names corresponding to valid serviceDesc values (see Table 4). Having analyzed the serviceDesc information, the target script will launch the appropriate SourceParser. This serviceDesc-specific SourceParser uniquely implements:

RESOURCE     serviceDesc               SourceParser           Fetch protocol   Fetch key
             vendorId     databaseId
the arXiv    LANLTopic    arXiv        S::LANLTopic::arXiv    HTTP             fetchId
BIOSIS       ERL          BX           S::ERL::BX             Z39.50           A
BIOSIS       ADVANCE      Biosis       S::ADVANCE::Biosis     Z39.50           fetchId
Wiley        Wiley        WIS          S::Wiley::WIS          none             none

Table 4: Some SFX-aware resources with their serviceDesc, Fetch protocol and Fetch key
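The selection of a SourceParser from the serviceDesc can be sketched as a simple registry look-up. The Python below is a hypothetical illustration: the `SourceParser` tuple, the `REGISTRY` dictionary and the function name are assumptions, and no metadata is actually fetched. The registry entries repeat the rows of Table 4.

```python
from typing import NamedTuple, Optional, Tuple
from urllib.parse import parse_qs, urlsplit


class SourceParser(NamedTuple):
    name: str                 # name of the SourceParser script
    protocol: Optional[str]   # protocol used to fetch link-source metadata
    fetch_key: Optional[str]  # element of the objectDesc used as the fetch key


# One entry per (vendorId, databaseId) pair, following Table 4.
REGISTRY = {
    ("LANLTopic", "arXiv"): SourceParser("S::LANLTopic::arXiv", "HTTP", "fetchId"),
    ("ERL", "BX"): SourceParser("S::ERL::BX", "Z39.50", "A"),
    ("ADVANCE", "Biosis"): SourceParser("S::ADVANCE::Biosis", "Z39.50", "fetchId"),
    ("Wiley", "WIS"): SourceParser("S::Wiley::WIS", None, None),
}


def select_source_parser(sfx_url: str) -> Tuple[SourceParser, str]:
    """Pick the SourceParser for an SFX-URL and return it with the decoded objectDesc."""
    query = parse_qs(urlsplit(sfx_url).query)
    key = (query["vendorId"][0], query["databaseId"][0])
    parser = REGISTRY.get(key)
    if parser is None:
        raise LookupError(f"no SourceParser for serviceDesc {key}")
    return parser, query["objectDesc"][0]


parser, object_desc = select_source_parser(
    "http://vole.lanl.gov/cgi-bin/sfx/bin/menu.cgi?vendorId=LANLTopic&databaseId=arXiv"
    "&objectDesc=fetchId%3Dphys-9811004%26objectId%3Dphysics%2F9811004"
)
# For this resource the objectDesc is itself a tagged key=value string.
fetch_value = parse_qs(object_desc)[parser.fetch_key][0]
```

A `None` protocol, as for Wiley, stands for the case in which nothing is fetched because the objectDesc already carries the full citation.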

The SFX service component

The task of the local SFX service component starts at the point where the local redirection mechanism hands over the metadata container that contains, in a consistent representation:

It is the task of the SFX service component to deliver extended services based on this information. The following are important considerations regarding the SFX service component in Ghent&LANL:

  1. The amount and quality of link-source metadata that becomes available in the metadata container is dependent on the type of resource from which its link-source originated and on the amount of information that the authority running the origin resource allows and/or supports to be fetched. In some cases such metadata can be corrupt or lack information that is essential for the SFX evaluation process to adequately perform its task;
  2. The SFX service component must be easily transportable between different digital library environments and remain easily manageable;
  3. The SFX service component must ultimately deliver service links in a just-in-time manner.

As can be seen from a detailed description of the SFX service component, these problems have been approached by:

The GenericRequest object

The service component will take the metadata container delivered by the local redirection mechanism as input and turn it into a normalized internal representation, called the GenericRequest object. Table 5 shows a representation of the GenericRequest object for the third citation in Figure 5. The GenericRequest object is an intelligent object that is able to self-check the validity of its information elements based on pre-configured rules. It can also augment or enhance its content using information from a supporting database. For instance, the citation of Figure 5 contains neither an ISSN nor a full journal title, but only an abbreviated journal title. In this case, the GenericRequest object augments its content by adding the missing information via communication with a supporting database. Obviously, the GenericRequest object also contains a normalized version of the link-source metadata, as well as information about its origin.

At the time of the experiment, interoperability between the SFX local service component and non-SFX local redirection mechanisms was not an issue, since none existed. As such, for reasons of simplicity, the metadata scheme of the GenericRequest object has fulfilled the role of interfacing metadata scheme between the local redirection and the local service component in Ghent&LANL.

<perldata>
<hash>
<item key="rec$vendorId">Wiley</item>
<item key="rec$databaseId">WIS</item>
<item key="rec$dbId">Wiley::WIS</item>
<item key="objectType">JOURNAL</item>
<item key="@abbrevTitle">
<array>
<item key="0">N ENGL J MED</item>
</array>
</item>
<item key="journalTitle">NEW ENGLAND JOURNAL OF MEDICINE</item>
<item key="ISSN">0028-4793</item>
<item key="year">1994</item>
<item key="volume">330</item>
<item key="startPage">691</item>
<item key="endPage">7</item>
<item key="@authLast">
<array>
<item key="0">Saven</item>
<item key="1">Piro</item>
</array>
</item>
<item key="@authInit">
<array>
<item key="0">A</item>
<item key="1">L</item>
</array>
</item>
<item key="articleTitle">The newer purine analogues for the treatment of hairy-cell leukemia.</item>
</hash></perldata>       

Table 5: Representation of an augmented GenericRequest object for the link-source of Figure 5
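The behaviour of the GenericRequest object (normalization, self-checking and augmentation) can be approximated in Python as follows. This is a sketch, not the Perl object used in the experiment: the class layout, the method names, the in-memory `JOURNALS` dictionary standing in for the supporting database, and the two validity rules are assumptions. The reading of the Wiley tags (`JTL`, `SNM`, `PYR`, `VID`, `PPF`) is inferred from Figure 5 and Table 5.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import parse_qsl

# Stand-in for the supporting database; the single entry mirrors Table 5.
JOURNALS = {"N ENGL J MED": ("NEW ENGLAND JOURNAL OF MEDICINE", "0028-4793")}


def issn_is_valid(issn: str) -> bool:
    """Verify the ISSN check digit (weights 8 down to 2, modulus 11)."""
    digits = issn.replace("-", "").upper()
    if len(digits) != 8 or not digits[:7].isdigit():
        return False
    total = sum(int(d) * w for d, w in zip(digits[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    return digits[7] == ("X" if check == 10 else str(check))


@dataclass
class GenericRequest:
    vendor_id: str
    database_id: str
    object_type: str = ""
    abbrev_title: List[str] = field(default_factory=list)
    journal_title: Optional[str] = None
    issn: Optional[str] = None
    year: Optional[str] = None
    volume: Optional[str] = None
    start_page: Optional[str] = None
    auth_last: List[str] = field(default_factory=list)

    def augment(self, journals=JOURNALS) -> None:
        """Fill in journal title and ISSN from the abbreviated title, if known."""
        for abbrev in self.abbrev_title:
            hit = journals.get(abbrev)
            if hit:
                self.journal_title = self.journal_title or hit[0]
                self.issn = self.issn or hit[1]

    def problems(self) -> List[str]:
        """Self-check against two illustrative rules."""
        found = []
        if self.issn and not issn_is_valid(self.issn):
            found.append("invalid ISSN")
        if self.year and not (self.year.isdigit() and len(self.year) == 4):
            found.append("invalid year")
        return found


def from_wiley(object_desc: str) -> GenericRequest:
    """Normalize a decoded Wiley objectDesc into a GenericRequest."""
    req = GenericRequest("Wiley", "WIS")
    for tag, value in parse_qsl(object_desc):
        value = value.strip()
        if tag == "TYPE" and value == "JCIT":
            req.object_type = "JOURNAL"
        elif tag == "JTL":
            req.abbrev_title.append(value.upper())
        elif tag == "SNM":
            req.auth_last.append(value)
        elif tag == "PYR":
            req.year = value
        elif tag == "VID":
            req.volume = value
        elif tag == "PPF":
            req.start_page = value
    return req


req = from_wiley("TYPE=JCIT&SNM=Saven&FNM=A&SNM=Piro&FNM=L"
                 "&JTL=N Engl J Med&PYR=1994&VID=330&PPF=691&PPL=7")
req.augment()
```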

The SFX linking service and the SFX-base

As a result of the above, an instance of the GenericRequest object for the link-source for which extended services have been requested has become available to the SFX service component. It will be the task of this component to deliver the extended services to the user that has requested them. In this sense, the SFX service component is a linking service that, given a certain input "document", outputs "documents" related to the input. The SFX linking service is special, however, since it does not store static relationships between individual documents. Rather, it stores relationships between the resources from which the documents originate. In SFX, these relationships are called conceptual services and they are stored in the SFX-base. The SFX evaluation process will determine the relevance of each of these conceptual services based upon the information and origin of a link-source.

The requirement imposed on the Ghent&LANL implementation of the SFX service component to be easily transportable between different digital library environments has led to an important generalization of the design of the SFX-base. This has been achieved by explicitly reflecting the notion of global and local relevance of services in the implementation. A synthesized representation of the lay-out of the Ghent&LANL SFX-base is given in Figure 7.

Figure 7: Simplified lay-out of the SFX-base

Splitting the Colli table

As in the Elektron version of the SFX-base, the Source table contains the information resources that can be origins for link-sources. They are SFX-aware resources. In the Elektron version, the Colli table contained conceptual services directly coupled with the Target resources (see Table 2 in Van de Sompel & Hochstenbach 1999b). Such a set-up was not sufficiently generic and, in the current design, this Colli table has been split. One table has kept the name Colli, the other has been named the Target table. The Target table contains those resources into which linking is possible. The Colli table that connects the Source and Target tables now expresses the type of service that relates Source with Target resources. Table 6 shows the type of services implemented in Ghent&LANL.

COLLI SERVICES   FUNCTION
abstract         look-up of abstract information in an abstracting & indexing database for the item represented by the GenericRequest object
author           look-up of references by an author of the item represented by the GenericRequest object in an abstracting & indexing database
cited_author     look-up of citations to work by an author mentioned in the GenericRequest object
cited_reference  look-up of works citing the item represented by the GenericRequest object
full_text        link to the full-text of the item represented by the GenericRequest object
genome           look-up of sequence information found in the GenericRequest object
holding          holdings look-up in an OPAC system for the item represented by the GenericRequest object
review           look-up of a book review for the item represented by the GenericRequest object

Table 6: Services in the Colli and their function

Taking advantage of the global relevance of conceptual services

It is not a coincidence that the resources shown as Source and/or Target carry their globally common names rather than those of their local implementations in Ghent or LANL. This is actually a reflection of the conclusion that services relating Source and Target resources have global relevance. It is globally relevant to deliver an abstract service that, given a link-source from BIOSIS, shows the corresponding abstract from Medline. Such a conceptual service can be imagined regardless of the implementations of each of these resources in a specific digital library. Therefore, the Ghent&LANL SFX-base expresses the relationships between Sources and Targets at the level of global relevance: there is an abstract service connecting BIOSIS and Medline, regardless of their local implementations. A small number of examples of how such services of global relevance connect Source and Target are shown in Table 7.

COLLI
SOURCE            SERVICE          TARGET
APS/PROLA         abstract         Inspec
the arXiv         author           Inspec
BIOSIS            abstract         Medline
BIOSIS            genome           Genome Base
Current Contents  abstract         LiSa
EconLit           review           Books in Print
Inspec            full_text        Springer
Wiley             abstract         Medline
Wiley             cited_reference  Science Cit. Base

Table 7: Examples of service relationships between Sources and Targets
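
The relationships in Table 7 can be pictured as a small list of Source-service-Target triples. The following is a minimal sketch only, not the SFX-base implementation: the names ColliEntry, GLOBAL_COLLI and conceptual_services are hypothetical, and the rows are copied from Table 7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColliEntry:
    """One conceptual service relating a global Source to a global Target."""
    source: str   # globally common name of the Source resource
    service: str  # service type, as listed in Table 6
    target: str   # globally common name of the Target resource


# Rows copied from Table 7; the variable name is illustrative only.
GLOBAL_COLLI = [
    ColliEntry("APS/PROLA", "abstract", "Inspec"),
    ColliEntry("the arXiv", "author", "Inspec"),
    ColliEntry("BIOSIS", "abstract", "Medline"),
    ColliEntry("BIOSIS", "genome", "Genome Base"),
    ColliEntry("Current Contents", "abstract", "LiSa"),
    ColliEntry("EconLit", "review", "Books in Print"),
    ColliEntry("Inspec", "full_text", "Springer"),
    ColliEntry("Wiley", "abstract", "Medline"),
    ColliEntry("Wiley", "cited_reference", "Science Cit. Base"),
]


def conceptual_services(source: str) -> list[tuple[str, str]]:
    """Return the (service, target) pairs defined for a global Source name."""
    return [(e.service, e.target) for e in GLOBAL_COLLI if e.source == source]


if __name__ == "__main__":
    # List the globally relevant services for link-sources from BIOSIS.
    for service, target in conceptual_services("BIOSIS"):
        print(f"BIOSIS --{service}--> {target}")
```

Because the triples refer only to global names, the same list could in principle be shared between digital libraries; nothing in it depends on a local implementation.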

Localization of services of global relevance

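While the services shown in Table 7 are of global relevance, they do not take into account issues of relevance in relation to the local digital library collection. This localization of services of global relevance is achieved in two ways, described in turn below: adding information on the local implementations of the Source and Target resources, and setting an active or inactive flag on each service.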

As shown in Table 8 and Table 9, a key reflecting the serviceDesc values of the local implementations of resources -- found in the rec$dbId field of the GenericRequest object -- is added next to the global common name of the Sources. In the same way, at the Target side, the name of a local TargetParser is added next to the global name of which the local Target is an implementation. The TargetParser procedure implements the link-to syntax into the local implementation of the Target resource. It can be seen from Table 8 and Table 9 that Ghent and LANL use a different SourceParser for BIOSIS, which reflects that they have a different implementation. However, they share a TargetParser to provide the abstract service into Medline, since both have chosen the PubMed implementation as a Target to achieve this.

When the Source or Target resource required to implement a certain service is not available in the digital library collection, when the local implementation of the Target resource does not support the link mechanism required to implement the service, or when local librarians decide the service to be of no use to their end-users, its flag will be set to inactive. The service will no longer be taken into account in the SFX evaluation process deciding on the local relevance of conceptual services. In Table 8 this is the case for services with Inspec as a Source since Ghent does not have an Inspec implementation in its collection. In Table 9, this is the case for services with LiSa as a Target, since LANL does not have access to a LiSa implementation.

SOURCE                                COLLI            TARGET
local               global                             global             local
S::APS::PROLA       APS/PROLA         abstract         Inspec             T::ERL::IN
S::LANLTopic:arXiv  the arXiv         author           Inspec             T::ERL::IN
S::ERL::BX          BIOSIS            abstract         Medline            T::NCBI::PubMed
S::ERL::BX          BIOSIS            genome           Genome Base        T::NCBI::Genome
S::ERL::CCO         Current Contents  abstract         LiSa               T::ERL:LI
S::ERL::EC          EconLit           review           Books in Print     T::ERL::BOIP
inactive            Inspec            full_text        Springer           T::Springer::LINK
S::Wiley::WIS       Wiley             abstract         Medline            T::NCBI::PubMed
S::Wiley::WIS       Wiley             cited_reference  Science Cit. Base  T::CIC15:SciSearch

Table 8: Localization of services from Table 7 for Ghent

SOURCE                                COLLI            TARGET
local               global                             global             local
S::APS::PROLA       APS/PROLA         abstract         Inspec             T::ERL::IN
S::LANLTopic:arXiv  the arXiv         author           Inspec             T::ERL::IN
S::Advance::Biosis  BIOSIS            abstract         Medline            T::NCBI::PubMed
S::Advance::Biosis  BIOSIS            genome           Genome Base        T::NCBI::Genome
S::ERL::CCO         Current Contents  abstract         LiSa               inactive
inactive            EconLit           review           Books in Print     T::ERL::BOIP
S::Advance::Inspec  Inspec            full_text        Springer LINK      T::Springer::LINK
S::Wiley::WIS       Wiley             abstract         Medline            T::NCBI::PubMed
S::Wiley::WIS       Wiley             cited_reference  Science Cit. Base  T::CIC15:SciSearch

Table 9: Localization of services from Table 7 for LANL
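
The localization shown in Table 8 and Table 9 can be illustrated with a short sketch in which each globally relevant service carries a local Source key and a local TargetParser name. This is an illustration under stated assumptions, not the actual SFX-base design: the names LocalService, GHENT, LANL and candidate_services are hypothetical, an "inactive" table cell is modelled as None, and the match between the Source key and the rec$dbId value is assumed to be simple string equality. Only a few rows of each table are copied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LocalService:
    """A service of global relevance plus its local implementation details."""
    source_key: Optional[str]     # local Source key, compared with rec$dbId; None = inactive
    source: str                   # global Source name
    service: str                  # Colli service type
    target: str                   # global Target name
    target_parser: Optional[str]  # local TargetParser name; None = inactive

    @property
    def active(self) -> bool:
        # Assumption: a service is inactive when either local side is missing.
        return self.source_key is not None and self.target_parser is not None


# A few rows copied from Table 8 (Ghent).
GHENT = [
    LocalService("S::ERL::BX", "BIOSIS", "abstract", "Medline", "T::NCBI::PubMed"),
    LocalService("S::ERL::BX", "BIOSIS", "genome", "Genome Base", "T::NCBI::Genome"),
    LocalService("S::ERL::CCO", "Current Contents", "abstract", "LiSa", "T::ERL:LI"),
    LocalService(None, "Inspec", "full_text", "Springer", "T::Springer::LINK"),
]

# A few rows copied from Table 9 (LANL).
LANL = [
    LocalService("S::Advance::Biosis", "BIOSIS", "abstract", "Medline", "T::NCBI::PubMed"),
    LocalService("S::Advance::Biosis", "BIOSIS", "genome", "Genome Base", "T::NCBI::Genome"),
    LocalService("S::ERL::CCO", "Current Contents", "abstract", "LiSa", None),
    LocalService("S::Advance::Inspec", "Inspec", "full_text", "Springer LINK", "T::Springer::LINK"),
]


def candidate_services(colli, db_id):
    """Return (service, target, target_parser) for the active services
    whose local Source key equals the given rec$dbId value."""
    return [
        (row.service, row.target, row.target_parser)
        for row in colli
        if row.active and row.source_key == db_id
    ]


if __name__ == "__main__":
    # Both sites reach Medline through the same TargetParser,
    # although their BIOSIS Source keys differ.
    print(candidate_services(GHENT, "S::ERL::BX"))
    print(candidate_services(LANL, "S::Advance::Biosis"))
```

In this sketch the inactive rows stay in the list but are skipped during evaluation, which mirrors the idea that a service of global relevance remains defined even when a local collection cannot offer it.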

Global and local Thresholds

The relationships between Source and Target resources expressed by a service connection in the Colli are made subject to restrictions called Thresholds. These Thresholds are the means to fine-tune conceptual services so that services considered inappropriate are presented as little as possible. To illustrate this concept, two types of Thresholds are described: