Recombinant Catalog Metadata

This activity is now closed. The information on this page is provided for historical purposes only.

Goals

One of the Metadata Switch activities, this project has two aspects:

  1. Identify recombinant pieces of metadata (e.g. personal names) across bibliographic databases
  2. Create Web services to expose these pieces and their relationships

Overview

  • Analyze and extract metadata from existing databases, both OCLC's and harvested
  • Identify implicit relationships and make them explicit
  • Bring up experimental services based on existing and newly mined relationships
    • names
      • Name Authority Service would:
        • Receive metadata record or full text, and
        • send back, for each name:
          • Authorized form
          • URI representing persona
        • If in interactive mode, would provide a list of possibilities
    • subject classification
      • Service would receive metadata or full text, and
      • send back:
        • List of DDC numbers & captions
        • Subject headings
    • FRBR
      • work
      • expression
      • manifestation
      • item
    • whole/part
    • citation analysis

The concentration, especially for the services, will be on names. Other relationships, such as FRBR and whole/part, will also be identified.
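The behavior outlined for the Name Authority Service can be sketched as a small lookup. This is only an illustration under stated assumptions, not the project's implementation: the `Persona` type, the `normalize` and `lookup` functions, the in-memory `AUTHORITY` table, and the `example.org` URIs are all hypothetical. A real service would extract names from a metadata record or full text and consult mined authority data.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    authorized_form: str
    uri: str  # hypothetical identifier, not a real service URI


# Hand-built, illustrative authority data (assumption, not mined data).
TWAIN = Persona("Twain, Mark, 1835-1910", "http://example.org/persona/1")
AUTHORITY = {
    "twain mark": [TWAIN],
    "clemens samuel langhorne": [TWAIN],
    # Two made-up personas sharing one name form, to show ambiguity.
    "smith john": [
        Persona("Smith, John (hypothetical persona A)", "http://example.org/persona/2"),
        Persona("Smith, John (hypothetical persona B)", "http://example.org/persona/3"),
    ],
}


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    kept = (c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in kept)
    return " ".join(cleaned.lower().split())


def lookup(name: str, interactive: bool = False):
    """Resolve one name.

    Non-interactive: return the single matching Persona, or None when the
    name is unknown or ambiguous. Interactive: return the list of candidates.
    """
    candidates = AUTHORITY.get(normalize(name), [])
    if interactive:
        return candidates
    return candidates[0] if len(candidates) == 1 else None
```

In this sketch an ambiguous name resolves to nothing unless the caller asks for the candidate list, which mirrors the interactive mode described in the outline.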

For example, identifying articles held in common across FirstSearch databases would reveal variations in how journal titles are recorded. This information could then be made available through a Web service that, given a journal name, returns all known variant forms to help with searching. FirstSearch might also use it directly to normalize journal names across our databases.
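One way to picture the journal-title example is to group titles that appear on the same article in different databases. The sketch below assumes the articles have already been matched and given a shared key. The record layout, the sample titles, and the function names `build_variant_index` and `variants` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical records: (database, article key, journal title as recorded).
RECORDS = [
    ("db_a", "article-1", "Journal of Documentation"),
    ("db_b", "article-1", "J. Doc."),
    ("db_b", "article-2", "J. Doc."),
    ("db_c", "article-2", "Jnl of Documentation"),
    ("db_a", "article-3", "Library Trends"),
]


def build_variant_index(records):
    """Map each journal title to the set of titles linked through shared articles."""
    parent = {}

    def find(title):
        # Union-find lookup with path halving.
        parent.setdefault(title, title)
        while parent[title] != title:
            parent[title] = parent[parent[title]]
            title = parent[title]
        return title

    first_title = {}  # first journal title seen for each article key
    for _database, article, title in records:
        root = find(title)
        if article in first_title:
            # Same article, so both titles name the same journal: merge groups.
            parent[root] = find(first_title[article])
        else:
            first_title[article] = title

    groups = defaultdict(set)
    for title in list(parent):
        groups[find(title)].add(title)
    return {title: groups[find(title)] for title in list(parent)}


def variants(index, journal_name):
    """Return all known forms of a journal name, or an empty list if unknown."""
    return sorted(index.get(journal_name, ()))
```

With the sample records, `variants(build_variant_index(RECORDS), "J. Doc.")` returns the three linked forms, even though "Journal of Documentation" and "Jnl of Documentation" never appear on the same article.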

Methodology

Collaboration is important to correctly identify the types of services and relationships that users require. We will be working with the e-Prints UK project and Herbert van de Sompel at LANL to identify these, along with the protocols needed for communication.

Much of the work will be analyzing metadata extracted from databases to bring together and label otherwise buried information so that it can be of use in database creation and searching.

Timing

Work will start in September 2002 and finish in December 2003.

Research team
