No.14
ISSN: 1559-0011
January 2010

Classify: a FRBR-based research prototype for applying classification numbers

A user interface and a machine service for assigning classification numbers and subject headings

By Diane Vizine-Goetz

Classification schemes are used by libraries to provide a systematic arrangement of materials. The classification numbers applied to books and other materials are used to arrange items physically on shelves and to support browsing, filtering and retrieval of bibliographic information in online systems. The Classify prototype is designed to help users apply classification numbers.

A recent scan of WorldCat reveals that nearly 100 million classification numbers have been applied to bibliographic records in the database. The majority are from the Dewey Decimal Classification (DDC), the Library of Congress Classification (LCC) and the National Library of Medicine Classification (NLM) systems. The Classify prototype takes advantage of this vast quantity of classification data. The September 2009 update of the Classify database provides access to over 37 million work-based summaries of classification information. Nearly 66 million bibliographic records, representing many different editions, formats and languages, were grouped using the OCLC FRBR Work-Set algorithm to form the database.
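To make the idea of work-based grouping concrete, here is a minimal Python sketch that clusters bibliographic records under a normalized author/title key. It is only an illustration: the OCLC FRBR Work-Set algorithm is not specified in this article, and the record fields (`id`, `author`, `title`, `format`), the function names and the sample records are all assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", without_accents.lower())
    return " ".join(cleaned.split())


def work_key(record):
    """Hypothetical work key: normalized author plus normalized title."""
    return (normalize(record["author"]), normalize(record["title"]))


def group_into_work_sets(records):
    """Group records that share a work key, whatever their format or edition."""
    work_sets = defaultdict(list)
    for record in records:
        work_sets[work_key(record)].append(record)
    return dict(work_sets)


# Made-up sample records, for illustration only.
records = [
    {"id": 1, "author": "Melville, Herman", "title": "Moby-Dick", "format": "Book"},
    {"id": 2, "author": "MELVILLE, HERMAN", "title": "Moby Dick", "format": "eBook"},
    {"id": 3, "author": "Austen, Jane", "title": "Emma", "format": "Audiobook"},
]

work_sets = group_into_work_sets(records)
for key, members in work_sets.items():
    print(key, [member["format"] for member in members])
```

A production algorithm would need to handle far more variation (translations, uniform titles, multiple authors); the sketch only shows why records for different editions and formats can end up in one work set.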

The information in Classify is accessible through a user interface and through a machine service. The user interface is ideal for day-to-day cataloging tasks. The machine service is suited to batch processing, and several users have already run it successfully in that mode. The user interface is available at classify.oclc.org. Technical information about the machine service is accessible from the user interface.

The Classify database is searchable by many of the control numbers associated with books, magazines, journals and music and video recordings. These numbers include the ISBN (International Standard Book Number), ISSN (International Standard Serial Number) and UPC (Universal Product Code). The database is also indexed by OCLC record number, by title, author or both, and by FAST heading. The prototype is logging about 30,000 unique searches per month. The most common search type is ISBN, followed by title, title/author and FAST subject heading.
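Because ISBN is the most common search type, anyone preparing a batch of searches may want to screen out mistyped numbers first. The sketch below applies the standard ISBN-10 and ISBN-13 check-digit rules. The function names are illustrative, and the article does not say whether Classify performs any such validation itself.

```python
import re


def is_valid_isbn10(isbn):
    """Check an ISBN-10: weights 10 down to 1, sum divisible by 11."""
    digits = isbn.replace("-", "").replace(" ", "").upper()
    if not re.fullmatch(r"[0-9]{9}[0-9X]", digits):
        return False
    total = 0
    for position, ch in enumerate(digits):
        value = 10 if ch == "X" else int(ch)  # "X" stands for 10
        total += (10 - position) * value
    return total % 11 == 0


def is_valid_isbn13(isbn):
    """Check an ISBN-13: alternating weights 1 and 3, sum divisible by 10."""
    digits = isbn.replace("-", "").replace(" ", "")
    if not re.fullmatch(r"[0-9]{13}", digits):
        return False
    total = sum(
        int(ch) * (1 if index % 2 == 0 else 3)
        for index, ch in enumerate(digits)
    )
    return total % 10 == 0


def is_valid_isbn(isbn):
    """Accept either form of ISBN."""
    return is_valid_isbn10(isbn) or is_valid_isbn13(isbn)


for candidate in ["0-306-40615-2", "978-0-306-40615-7", "0-306-40615-3"]:
    print(candidate, is_valid_isbn(candidate))
```

ISSN and UPC carry their own check digits with different weights, so each would need a separate function.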

A Classify record for a work contains the most frequently assigned DDC, LCC and NLM class numbers, as applicable, based on holdings counts. The user interface presents a tabular summary of this information and pie charts containing the top ten classes for each scheme. The pie charts often highlight cases where multiple class numbers, or a choice among class numbers, may be appropriate for the work. The interface also presents summary information for the work as a whole, including the number of bibliographic records in the set (labeled ‘Editions’ in the interface), the sum of the holdings for all records in the set, and a list of the different formats represented in the work set. For example, print books, eBooks and audiobooks are grouped together.
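The summary described above can be sketched in a few lines of Python. This is a guess at the general approach, not the prototype's actual code: it assumes that "most frequently assigned" means summing the holdings of every record that carries a given class number, and the field names (`holdings`, `format`, `ddc`, `lcc`) and sample values are invented for the example.

```python
from collections import Counter


def rank_classes(records, scheme, top_n=10):
    """Rank class numbers for one scheme by the holdings of records using them."""
    totals = Counter()
    for record in records:
        for class_number in record.get(scheme, []):
            totals[class_number] += record["holdings"]
    return totals.most_common(top_n)


def summarize_work(records):
    """Work-level summary: edition count, total holdings and formats."""
    return {
        "editions": len(records),  # one bibliographic record per "edition"
        "holdings": sum(record["holdings"] for record in records),
        "formats": sorted({record["format"] for record in records}),
    }


# Made-up work set, for illustration only.
work_set = [
    {"holdings": 1200, "format": "Book", "ddc": ["813.3"], "lcc": ["PS2384"]},
    {"holdings": 300, "format": "eBook", "ddc": ["813.3"], "lcc": []},
    {"holdings": 150, "format": "Audiobook", "ddc": ["813"], "lcc": ["PS2384"]},
]

print(rank_classes(work_set, "ddc"))
print(rank_classes(work_set, "lcc"))
print(summarize_work(work_set))
```

When the ranked list holds two or more numbers with substantial holdings, that is the situation the pie charts make visible: more than one class number may be defensible for the work.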

The Classify prototype also provides access to the set of Faceted Application of Subject Terminology (FAST) headings associated with each work. FAST is a controlled vocabulary based on the Library of Congress Subject Headings (LCSH). FAST headings provide additional subject information about a work and enable users to search the Classify database by subject heading.

The application also provides detailed information for each record in a work set. Each entry includes title and author, language, format, holdings count and all syntactically valid LCC, DDC and NLM class numbers. For DDC numbers, Dewey edition information is also given. Class numbers assigned to records by the Library of Congress and the National Library of Medicine are marked in the interface.

The research team responsible for Classify is currently testing a new interface. The plan is to release the interface with the next update of the database. The update will reflect the contents of WorldCat at the end of December 2009.

 

