Classify: a FRBR-based research prototype for applying classification numbers
A user interface and a machine service for assigning classification numbers and subject headings
By Diane Vizine-Goetz
Classification schemes are
used by libraries to provide
a systematic arrangement of
materials. The classification
numbers applied to books and other
materials are used to arrange items
physically on shelves and to support
browsing, filtering and retrieval of
bibliographic information in online systems.
The Classify prototype is designed to help
users apply classification numbers.
A recent scan of WorldCat reveals that
nearly 100 million classification numbers
have been applied to bibliographic
records in the database. The majority are
from the Dewey Decimal Classification (DDC), the Library of Congress
Classification (LCC) and the National
Library of Medicine Classification
(NLM) systems. The Classify prototype
takes advantage of this vast quantity of
classification data. The September 2009
update of the Classify database provides
access to over 37 million work-based
summaries of classification information.
Nearly 66 million bibliographic records,
representing many different editions,
formats and languages, were grouped
using the OCLC FRBR Work-Set
algorithm to form the database.
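To illustrate what grouping records into work sets involves, here is a minimal sketch. It is not the OCLC FRBR Work-Set algorithm, whose details are not given here. It assumes, purely for illustration, that two records belong to the same work when their normalized author and title match. The record fields (`author`, `title`) and the function names are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9]+", " ", stripped.lower())
    return cleaned.strip()


def work_key(record):
    """Illustrative grouping key: normalized (author, title).

    Assumption: a real work-set algorithm uses more evidence than this.
    """
    return (normalize(record["author"]), normalize(record["title"]))


def group_into_work_sets(records):
    """Group bibliographic records that share the same work key."""
    work_sets = defaultdict(list)
    for record in records:
        work_sets[work_key(record)].append(record)
    return dict(work_sets)


# Hypothetical sample records: two editions of one work, plus another work.
sample_records = [
    {"author": "Melville, Herman", "title": "Moby-Dick", "format": "Book"},
    {"author": "Melville, Herman.", "title": "Moby Dick", "format": "Audiobook"},
    {"author": "Austen, Jane", "title": "Emma", "format": "Book"},
]
work_sets = group_into_work_sets(sample_records)
```

In this sketch the print book and the audiobook of the same title fall into one work set, which is the behavior that lets different editions, formats and languages share one classification summary.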
The information in Classify is accessible
through a user interface and through a
machine service. The user interface is ideal
for day-to-day cataloging tasks. The machine
service is better suited to batch processing,
and several users have already used it
successfully in that mode. The user interface is
available at classify.oclc.org. Technical
information about the machine service is
accessible from the user interface.
The Classify database is searchable by
many of the control numbers associated
with books, magazines, journals and music
and video recordings. These numbers
include the ISBN (International Standard
Book Number), ISSN (International
Standard Serial Number) and UPC (Universal Product Code). The database
is also indexed by OCLC record number,
by title and/or author, and by FAST heading. The
prototype is logging about 30,000 unique
searches per month. The most common
search type is ISBN, followed by title, title/author and FAST subject heading.
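Because ISBN, ISSN and UPC each carry a check digit, a batch of control numbers can be screened before any search is attempted. The sketch below applies the standard check-digit rules for ISBN-10, ISBN-13, ISSN and 12-digit UPC-A. The function names are hypothetical, and nothing here describes how Classify itself validates its input.

```python
import re


def _clean(value):
    """Remove hyphens and spaces; uppercase a possible 'X' check character."""
    return re.sub(r"[\s-]", "", value).upper()


def _mod11_ok(chars):
    """Weighted mod-11 check used by ISBN-10 and ISSN (weights n..1)."""
    n = len(chars)
    total = sum((n - i) * (10 if ch == "X" else int(ch))
                for i, ch in enumerate(chars))
    return total % 11 == 0


def is_valid_isbn10(value):
    s = _clean(value)
    return bool(re.fullmatch(r"\d{9}[\dX]", s)) and _mod11_ok(s)


def is_valid_isbn13(value):
    s = _clean(value)
    if not re.fullmatch(r"97[89]\d{10}", s):
        return False
    # Weights alternate 1, 3, 1, 3, ... across all 13 digits.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(s))
    return total % 10 == 0


def is_valid_issn(value):
    s = _clean(value)
    return bool(re.fullmatch(r"\d{7}[\dX]", s)) and _mod11_ok(s)


def is_valid_upc(value):
    s = _clean(value)
    if not re.fullmatch(r"\d{12}", s):
        return False
    # Weights alternate 3, 1, 3, 1, ... across all 12 digits.
    total = sum(int(ch) * (3 if i % 2 == 0 else 1) for i, ch in enumerate(s))
    return total % 10 == 0


def guess_identifier_type(value):
    """Return 'isbn', 'issn', 'upc', or None if no check passes."""
    if is_valid_isbn10(value) or is_valid_isbn13(value):
        return "isbn"
    if is_valid_issn(value):
        return "issn"
    if is_valid_upc(value):
        return "upc"
    return None
```

The four identifier forms have different lengths (10, 13, 8 and 12 characters), so a number that passes one check cannot pass another, and the order of the tests in `guess_identifier_type` does not affect the result.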
A Classify record for a work contains
the most frequently assigned DDC, LCC
and NLM class numbers, as applicable,
based on holdings counts. The user
interface presents a tabular summary of
this information and pie charts showing
the ten most frequently assigned classes for each scheme.
The pie charts often highlight cases
where multiple class numbers, or a
choice among class numbers, may be
appropriate for the work. The interface
also presents summary information for the
work as a whole, including the number of
bibliographic records in the set (labeled ‘Editions’ in the interface), the sum of the
holdings for all records in the set,
and a list of the different formats
represented in the work set. For
example, print books, eBooks and
audiobooks are grouped together.
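The holdings-based summary described above can be sketched as a simple aggregation. This is an illustration under stated assumptions, not the prototype's implementation: each record is assumed to be a dictionary with a `holdings` count, a `format`, and a list of class numbers per scheme, and all field names, function names and sample values are hypothetical.

```python
from collections import Counter


def summarize_classes(records, scheme, top_n=10):
    """Rank class numbers for one scheme by total holdings across the work set."""
    totals = Counter()
    for record in records:
        # dict.fromkeys drops duplicates within a record, keeping first-seen order.
        for number in dict.fromkeys(record.get(scheme, [])):
            totals[number] += record["holdings"]
    return totals.most_common(top_n)


def summarize_work(records):
    """Work-level figures: edition count, total holdings, distinct formats."""
    return {
        "editions": len(records),
        "holdings": sum(record["holdings"] for record in records),
        "formats": sorted({record["format"] for record in records}),
    }


# Hypothetical work set with made-up holdings counts.
work_set = [
    {"holdings": 500, "format": "Book", "ddc": ["813.54"], "lcc": ["PS3552"]},
    {"holdings": 120, "format": "eBook", "ddc": ["813.54"]},
    {"holdings": 40, "format": "Audiobook", "ddc": ["823.914"]},
]

ddc_ranking = summarize_classes(work_set, "ddc")
summary = summarize_work(work_set)
```

The first entry in `ddc_ranking` corresponds to the most frequently assigned class number, and the remaining entries are the kind of data a top-ten pie chart would display. A ranking with more than one substantial entry signals that a choice among class numbers may be appropriate.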
The Classify prototype provides
access to a set of Faceted Application
of Subject Terminology (FAST)
headings that has been associated
with each work. FAST is a controlled
vocabulary based on the Library of
Congress Subject Headings (LCSH).
FAST headings provide additional
subject information about a work and
enable users to search the Classify
database by subject heading.
The application also provides
detailed information for each record
in a work set. Each entry includes
title and author, language, format,
holdings count and all syntactically
valid LCC, DDC and NLM class
numbers. For DDC numbers, Dewey
edition information is also given.
Class numbers assigned to records
by the Library of Congress and the
National Library of Medicine are
marked in the interface.
The research team responsible for
Classify is currently testing a new
interface. The plan is to release the
interface with the next update of the
database. The update will reflect the
contents of WorldCat at the end of
December 2009.