Audience Level

This activity is now closed. The information on this page is provided for historical purposes only.

Library materials can be characterized in many ways, one of which is the type of reader likely to be interested in a particular item. This indicator, generally known as the audience level, is potentially useful for several activities, including improving the relevance of retrieved information, reference services (including readers' advisory), and collection development. Audience-level filters could be implemented in existing retrieval systems to help users find content suited to their information needs.

The Audience Level prototype and its related research project are part of a broader data mining activity at OCLC Research, which seeks to explore various ways to leverage intelligence from system files, and "make data work harder." Determining a monograph's audience level is a challenge because cataloging rules generally do not require inclusion of this information. Thus, many bibliographic records have no explicit indicator of target audience. OCLC researchers hypothesized that audience level could be inferred from the types of library (such as Association of Research Libraries (ARL), non-ARL academic, public, and school) holding the material.

The data presented in the Audience Level prototype are current as of January 2008.

Background

Determining a monograph's audience level is difficult because no bibliographic practice or standard requires this information in the bibliographic record. The exceptions are the fixed field in the Machine-Readable Cataloging (MARC) record and the Library of Congress Subject Headings (LCSH) subdivisions often used to identify juvenile literature and fiction. Thus, many bibliographic records have no direct indication of the target audience for the item represented.

Impact

The findings from this research will support new ways to improve information relevance for retrieval, reference services (including readers' advisory), and collection development.

Audience level filters could be implemented in existing retrieval systems to assist users in finding content based on their information needs.

This effort is one of several data mining projects whereby OCLC Research seeks to extract intelligence from the data we have, and use it in different ways that provide value to libraries.

About the Audience Level Prototype

This prototype system, developed in conjunction with the Audience Level research project, uses library holdings data in WorldCat to calculate audience levels for books represented in the WorldCat database.

The audience level is then expressed as a decimal between 0.01 (juvenile books) and 1.00 (scholarly research works).

The Audience Level prototype is accessible in two ways:

  • a user interface
  • web services that accept an OCLC number, ISBN, or ISSN

An initial experiment with Greasemonkey scripts for Firefox proved to be exciting but high-maintenance, so it is no longer supported.

Try out the user interface

Access the Audience Level prototype and input an OCLC WorldCat number, an ISBN (International Standard Book Number), or an ISSN (International Standard Serial Number) for a periodical. (See sidebar for additional information on how to find one of these numbers.)

The system will return an assessment of the likely audience level of the item based on the holding patterns and bibliographic characteristics of the item, as described in the WorldCat record.

This assessment is represented numerically, along with title, author, and a summary of the WorldCat holdings used to calculate the audience level of the item.

The audience-level assessment also is represented graphically by a bar chart.

More information about the audience-level calculation is available by clicking on the "Manifestations" link that appears on the chart. This will display a list of all the different physical realizations of the work used to calculate its audience level.

(Be aware! Some works—such as those near the top of the OCLC Top 1000 list—have thousands of manifestations. Worksets such as these can take several moments to load into your browser.)

Manifestation-level data displayed include OCLC number for each manifestation, language and date of the manifestation, and number of libraries holding the manifestation.

In addition to the stand-alone Audience Level prototype, aggregate data from the OCLC Audience Level will be available in the prototype WorldCat Publisher Pages.

In addition to the user interface described above, the Audience Level prototype is available as a web service:

Web service

The Audience Level web service is available from:

http://audiencelevel.oclc.org/AudienceLevel/webServ/

for returning XML from OCLC database number inputs, and from:

http://audiencelevel.oclc.org/AudienceLevel/webISBNServ/

for returning XML from ISBNs.

Example:

The URL string:

http://audiencelevel.oclc.org/AudienceLevel/webISBNServ/0716601036

will produce the audience-level assessment (0.08) for The World Book Encyclopedia workset in response to the input of the ISBN 0-7166-0103-6.
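As a minimal sketch, a request URL of the kind shown above can be assembled by appending the identifier to the relevant base path. The helper name `audience_level_url` and the hyphen-stripping step are illustrative assumptions, not part of the OCLC service. Because this activity is closed, the sketch only builds the URL and does not attempt to fetch it.

```python
# Base paths are copied from the text above; everything else is illustrative.
OCLC_NUMBER_BASE = "http://audiencelevel.oclc.org/AudienceLevel/webServ/"
ISBN_BASE = "http://audiencelevel.oclc.org/AudienceLevel/webISBNServ/"


def audience_level_url(identifier: str, kind: str = "isbn") -> str:
    """Build a request URL for an ISBN or an OCLC number (hypothetical helper)."""
    # Assumption: the service expects the bare identifier, as in the example
    # URL above, so hyphens and spaces are stripped before appending.
    cleaned = identifier.replace("-", "").replace(" ", "")
    if kind == "isbn":
        return ISBN_BASE + cleaned
    if kind == "oclc":
        return OCLC_NUMBER_BASE + cleaned
    raise ValueError(f"unsupported identifier kind: {kind!r}")


print(audience_level_url("0716601036"))
```

Keeping the two base paths as named constants makes it clear which endpoint a given identifier type is sent to.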

Methodology

Recognizing that different types of libraries typically serve different populations, OCLC researchers considered whether library types could be related to audience levels. They decided to explore whether the pattern of holdings of materials in WorldCat might be leveraged to provide an audience-level indicator.

OCLC researchers hypothesized that audience level could be inferred from the types of library holding the material, if the holdings symbols were weighted by a numeric code for library type.

OCLC's WorldCat database provides an excellent data source for this project because it contains more than 50 million bibliographic records and a billion holding locations.

The fixed field in the Machine-Readable Cataloging (MARC) record includes a "Target Audience" indicator (008/22), described as: "The intellectual level of the audience for which the item is intended." The following table lists these codes and the audiences they represent, along with the weight we assigned to each code.

If the Target Audience indicator exists in a title's MARC record, the title is assigned the Audience Level as indicated in this table.

MARC code   Description                                   Audience Level
a           preschool                                     0.0
b           primary (K - 3)                               0.1
c           elementary and junior high (grades 4 - 8)     0.15
j           juvenile (through age 15 or grade 9)          0.15
d           secondary (grades 9 - 12)                     0.25
e           adult                                         N/A
f           specialized                                   N/A
g           general                                       N/A
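The table lookup can be sketched as a small dictionary, shown here purely for illustration. The names `MARC_TARGET_AUDIENCE_LEVELS` and `level_from_marc_code` are hypothetical. How the prototype handled the codes marked N/A is not stated on this page, so returning `None` for them (meaning "no level assigned from the fixed field") is an assumption.

```python
from typing import Optional

# Levels copied from the table above; None marks the codes listed as N/A.
MARC_TARGET_AUDIENCE_LEVELS = {
    "a": 0.0,   # preschool
    "b": 0.1,   # primary (K - 3)
    "c": 0.15,  # elementary and junior high (grades 4 - 8)
    "j": 0.15,  # juvenile (through age 15 or grade 9)
    "d": 0.25,  # secondary (grades 9 - 12)
    "e": None,  # adult
    "f": None,  # specialized
    "g": None,  # general
}


def level_from_marc_code(code: Optional[str]) -> Optional[float]:
    """Return the table's level for a 008/22 code, or None if the code is
    absent, unrecognized, or listed as N/A (an assumed convention)."""
    if code is None:
        return None
    return MARC_TARGET_AUDIENCE_LEVELS.get(code.strip().lower())


print(level_from_marc_code("b"))
print(level_from_marc_code("e"))
```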

If the Target Audience indicator does not exist, an audience level is calculated for the title based on the library holdings data attached to the bibliographic record.

Each bibliographic record in OCLC has some number of holdings symbols attached to it. These symbols represent the individual libraries that are said to "hold" the item represented by the record.

Researchers determined the type of library for each holdings symbol in the database. They used four main categories: Association of Research Libraries (ARL) members, academic (non-ARL), public, and school. Library symbols that did not fit into one of these groups were discarded.

After the library type of each holdings symbol was determined, researchers assigned a weight to each library type:

Library type Weight
ARL 1.0
Academic 0.67
Public 0.33
School 0.0

Once the weights were assigned, researchers constructed an indication of audience level by averaging the weights of the holdings symbols on the record. The formula for this averaging is:

[ (Number of ARL holdings symbols on the record * 1.0)
+ (Number of academic-library holdings symbols on the record * 0.67)
+ (Number of public-library holdings symbols on the record * 0.33)
+ (Number of school-library holdings symbols on the record * 0.0) ]
/ (Total number of holdings symbols on the record)
= The average library-type weight of libraries holding the item.

For example, say we have a record with the following holdings symbols:

1 ABC DEF GHI JKL MNO

where 1 is the OCLC number for the item, and ABC, DEF, etc. are the holdings symbols. Suppose ABC, DEF, and GHI are ARL libraries, JKL is a non-ARL academic library, and MNO is a public library. The formula used to determine audience level for this item would be:

[(3 * 1.0) + (1 * 0.67) + (1 * 0.33)] / 5 = 0.8.
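The averaging step can be sketched in a few lines of Python. This is an illustration of the arithmetic above (three holdings at weight 1.0, one at 0.67, and one at 0.33), not the prototype's actual code. The function name `audience_level` and the string labels for library types are hypothetical, and leaving discarded symbols out of the total is an assumption.

```python
# Weights copied from the library-type table above.
LIBRARY_TYPE_WEIGHTS = {"ARL": 1.0, "Academic": 0.67, "Public": 0.33, "School": 0.0}


def audience_level(library_types):
    """Average the library-type weights of the holdings on one record."""
    # Holdings whose type is not one of the four categories are discarded.
    # Assumption: discarded holdings are also excluded from the total.
    weights = [
        LIBRARY_TYPE_WEIGHTS[t] for t in library_types if t in LIBRARY_TYPE_WEIGHTS
    ]
    if not weights:
        return None  # no usable holdings, so no level can be computed
    return sum(weights) / len(weights)


# Five holdings: three at weight 1.0, one at 0.67, one at 0.33.
example = ["ARL", "ARL", "ARL", "Academic", "Public"]
print(round(audience_level(example), 2))  # 0.8
```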

Furthermore, we can use this method to determine the audience level of a FRBR work by finding all of the items in that work and computing the average (weighted by holdings) of each of their respective audience levels. For example, consider a workset containing the following items:

1 5 .8
2 10 .76
3 7 .94

Where {1,2,3} are the OCLC numbers, {5,10,7} are the holdings counts that were used to compute the audience level, and {.8,.76,.94} are the respective audience levels of each item. The average audience level for the work would then be computed by:

[(5 * 0.8) + (10 * 0.76) + (7 * 0.94)] / (5 + 10 + 7) = 0.826
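The work-level calculation is a holdings-weighted mean, which can be sketched as follows. The function name `work_audience_level` and the pair-based input format are illustrative choices, not taken from the prototype.

```python
def work_audience_level(items):
    """Holdings-weighted mean of item-level audience levels.

    `items` is a list of (holdings_count, audience_level) pairs, one per
    item in the workset.
    """
    total_holdings = sum(count for count, _ in items)
    if total_holdings == 0:
        return None  # nothing to average
    return sum(count * level for count, level in items) / total_holdings


# The three-item workset from the example above.
workset = [(5, 0.8), (10, 0.76), (7, 0.94)]
print(round(work_audience_level(workset), 3))  # 0.826
```

Weighting by holdings means that widely held manifestations dominate the work-level figure, while a manifestation with few holdings has little effect.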

This approach can be used to calculate overall audience-level measures for collections or other groups of records.

The overall audience-level assessment for the WorldCat database itself is 0.63.

A wrinkle

We believe this approach produces interesting and usable results. For example:

Title Author ISBN Audience Level
Operations Research for Libraries and Information Agencies Kraft & Boyce 012424520X 0.78
The Kite Runner Khaled Hosseini 1573222453 0.43
The Da Vinci Code Dan Brown 0385504209 0.43
Harry Potter and the Sorcerer's Stone J.K. Rowling 0590353403 0.15

These values, which are for the FRBR work, are approximately what one would expect.

Of course, we need to remember what this approach measures. For example, if one were to assign a 'reading level' to Nietzsche's Thus Spake Zarathustra (ISBN 0394608089), one might expect it to be high, perhaps 0.8 or higher. However, we return a score of 0.61.

As a classic of philosophy this title has a wide potential audience, and is widely represented in public, academic and ARL collections. The manifestation-level records display audience-level measures ranging from 0.33 to 1.0.

OCLC researchers continue to explore ways to account for and manage such distributional effects.

Feedback

This approach gives an indication of audience level. Is it useful? How could it be used? We are interested in your ideas! Please let us know what you think.

Outputs

  • Lynn Silipigni Connaway and Timothy J. Dickey. 2008. "Beyond Data Mining: Delivering the Next Generation of Service from Library Data." Presented on the panel "Transforming Data into Services: Delivering the Next Generation of User-Oriented Collections and Services" at the American Society for Information Science & Technology 2008 Annual Meeting, Columbus, OH, October 28, 2008.
  • Edward T. O'Neill, Lynn Silipigni Connaway, and Timothy J. Dickey. 2008. "Estimating the Audience Level for Library Resources." Journal of the American Society for Information Science & Technology, 59(13), 2042-2050.
  • Lynn Silipigni Connaway. 2004. "Estimating Audience Level of Monographs Using Holding Patterns in WorldCat." Presented at Library Research Seminar III: Learning and Growing; Inquiry into librarianship, October 14-16, 2004, Kansas City, Missouri. Available online at: http://www.oclc.org/research/presentations/connaway/lrsIII_audience.ppt (PowerPoint: 32 MB, 29 slides)
  • OCLC Research Data Mining activities

Most recent updates: page content 25 January 2010, prototype 11 February 2008.

ResearchWorks

This activity is part of ResearchWorks. Use of our prototypes is subject to OCLC's terms and conditions. By continuing past this point, you agree to abide by these terms.

Try the online demo

Project lead: Lynn Silipigni Connaway

Where can I find an ISBN, ISSN, or OCLC WorldCat number?

  • ISBNs and ISSNs appear on records in many library catalogs.
  • ISBNs and ISSNs also are displayed on many WorldCat records, which can be located through WorldCat.org or WorldCat partner sites.
  • Online bookstores frequently display ISBNs for specific titles.
  • OCLC numbers appear on records in WorldCat and other FirstSearch databases.
  • Some library catalogs may display OCLC numbers for individual titles.