Data Science

The Web is the native environment of information seekers. OCLC Research recognizes that to be integrated into the Web, traditional library data must be transformed in various ways. We are analyzing the data in WorldCat and other sources to derive new meaning, insights, and services for use by libraries and others on the Web.

Current Projects

CatVis: Visual Analytics for the World's Library Data

The CatVis project addresses the following questions: How can librarians use data visualizations to manage, analyze, and present library collections? How can visualizations of large bibliographic datasets and other complex data help researchers in the e-Humanities to ask and answer new research questions?
Learn more »


Ariadne's Thread: Interactive Context Explorer for Bibliographic Data

Ariadne's Thread visualizes the networks of entities associated with bibliographic records and lets users interactively explore the local context of the entities that interest them.
Learn more »


Multilingual Bibliographic Structure

This activity is designed to leverage the multilingual content of WorldCat® so that bibliographic information can be presented in the preferred language and script of the user.
Learn more »


Measuring Up: Assessing Accuracy of Reported Use and Impact of Digital Repositories

This project aims to improve data collection and information sharing for institutional repositories and digitized collections. "Measuring Up" is led by Montana State University and includes partnerships with OCLC Research, the Association of Research Libraries, and the University of New Mexico.
Learn more »


Cookbook Finder

Cookbook Finder is a works-based application that provides access to thousands of cookbooks and other works about food and nutrition described in library records. You can search by person, place, or topic (e.g., course, ingredient, method, and more) and browse related works by author and topic (supplied by the Kindred Works/Recommender API). Results include links to the full text when it is available from HathiTrust or Project Gutenberg.
Learn more »


MARC Usage in WorldCat

This project will study the use of MARC tags and subfields in WorldCat and produce reports to inform decisions about where we go from here.
Learn more »


assignFAST

A Web interface for FAST Subject selection, assignFAST explores automating the manual selection of the Authorized and Use For headings based on autosuggest technology.
Learn more »


Kindred Works

Kindred Works is a demonstration interface built upon an experimental content-based recommender service. Various characteristics associated with a sample resource, such as classification numbers, subject headings, and genre terms, are matched to WorldCat to provide a list of recommendations.
Learn more »
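The matching idea behind Kindred Works can be illustrated with a minimal sketch. This is not the service's actual algorithm, which the description does not specify; it assumes each resource is reduced to a set of feature strings (classification numbers, subject headings, genre terms) and ranks candidates by Jaccard overlap with the sample. The function names, the feature encoding, and the tiny catalog are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of features the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(sample: Set[str],
              catalog: Dict[str, Set[str]],
              top_n: int = 3) -> List[Tuple[str, float]]:
    """Rank catalog records by feature overlap with the sample resource.

    Records with no shared features are dropped. Ties are broken by
    record id so the ordering is deterministic.
    """
    scored = []
    for record_id, features in catalog.items():
        score = jaccard(sample, features)
        if score > 0:
            scored.append((record_id, score))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical sample resource and catalog, for illustration only.
sample = {"class:641.5", "subject:Cooking", "genre:Cookbooks"}
catalog = {
    "a": {"class:641.5", "subject:Cooking"},
    "b": {"subject:Gardening"},
    "c": {"genre:Cookbooks", "subject:Baking"},
}
result = recommend(sample, catalog)
```

Here record "a" shares two of three combined features with the sample and ranks first, "c" shares one of four, and "b" shares none and is excluded. A production recommender would likely weight feature types differently rather than treat them as equal.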


FAST Converter

The FAST Converter is a Web interface for the conversion of LCSH headings to FAST headings. Either single headings or small sets of bibliographic records can be converted. The intent of this Web site is to provide a learning tool to help familiarize users with FAST and the differences between FAST and LCSH.
Learn more »


OCLC Linked Data Research

OCLC production units and OCLC Research are supporting the collaborative and the larger community with Linked Data-related research and standards activities, and are exploring Linked Data activities and applications.
Learn more »


WorldCat Identities Network

The WorldCat Identities Network gives users the opportunity to visually explore the interconnectivity and relationships between WorldCat Identities.
Learn more »


Scholars' Contributions to VIAF

This activity explores the potential benefits of collaborating with scholars to enrich the Virtual International Authority File (VIAF) with new names and additional script forms for names already represented. The experience and knowledge gained from working with diverse files may inform third parties’ development of authority tools used by scholars.
Learn more »


Work Records in WorldCat

View rich descriptions for books and other library materials.
Learn more »


Classify

Classify is a FRBR-based prototype designed to support the assignment of classification numbers and subject headings for books, DVDs, CDs, and other types of materials.
Learn more »


WorldCat Identities

WorldCat Identities has a summary page for every name in WorldCat.
Learn more »


FAST (Faceted Application of Subject Terminology)

FAST is an enumerative faceted subject heading schema derived from the Library of Congress Subject Headings (LCSH). FAST is easier to apply and can be successfully used by non-professionals.
Learn more »


Past Projects

Getting Found: SEO for Digital Repositories

This activity is part of an IMLS-funded project to develop strategies for improving the visibility of library digital repositories in Internet search engines. It does so by developing an RDF model based on Schema.org for the people, places, organizations, and objects associated with an institutional repository and its contents.
Learn more »


Europeana Innovation Pilots

This collaborative initiative aims to pilot the use of existing and newly developed OCLC Research methods and techniques for cleansing and enriching large aggregations of metadata, in order to identify and create semantic links between related heterogeneous objects.
Learn more »


Sharing and Aggregating Social Metadata

This activity identifies the user contributions that would enrich the descriptive metadata created by libraries, archives, and museums, and the issues that need to be resolved to communicate and share user contributions at the network level.
Learn more »


WorldCat Genres

Genre profiles allow users to browse genre terms for hundreds of titles, authors, subjects, characters, places, and more, ranked by popularity in WorldCat.
Learn more »


mapFAST

mapFAST is a Google Maps mashup prototype designed to provide map-based access to bibliographic records using FAST geographic authorities.
Learn more »


Name Extraction

This project attempts to develop tools that advance the state of the art in extracting names from unstructured text and disambiguating them using authority files developed in the library community.
Learn more »


Terminology Services

This project provides Web-based services for controlled vocabularies.
Learn more »


Metadata Schema Transformation Services

The goal of the Metadata Schema Transformation project is to develop a simple, web-accessible service that translates metadata records from one publicly defined format into another.
Learn more »


OAICat

OAICat is a Java Servlet implementation of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) v2.0.
Learn more »
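For readers unfamiliar with OAI-PMH, the following minimal sketch shows how a harvesting request to a repository such as one served by OAICat is formed. The six verbs and the `metadataPrefix` argument come from the OAI-PMH v2.0 specification; the base URL and the helper name `build_oai_request` are hypothetical, and nothing here reflects OAICat's internal code.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH v2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_oai_request(base_url: str,
                      verb: str,
                      args: Optional[Dict[str, str]] = None) -> str:
    """Return the GET URL for an OAI-PMH request.

    base_url is the repository endpoint (hypothetical in this example);
    args holds protocol arguments such as metadataPrefix, set, from, until.
    """
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    params.update(args or {})
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint; oai_dc is the Dublin Core format every
# OAI-PMH repository is required to support.
url = build_oai_request("https://example.org/oai",
                        "ListRecords",
                        {"metadataPrefix": "oai_dc"})
```

The resulting URL can be fetched with any HTTP client. The response is an XML document that a harvester would parse, following `resumptionToken` values for large result sets.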


SRW/U

The SRW/U Open Source project offers software that implements both the SRW Web Service and the SRU REST model interface to databases. Included are interfaces that support DSpace and Lucene implementations and OCLC's Pears database.
Learn more »
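As an illustration of the SRU REST model mentioned above, the sketch below builds a `searchRetrieve` request URL. The parameter names (`operation`, `version`, `query`, `startRecord`, `maximumRecords`) are those defined by SRU 1.1; the endpoint URL, the helper name `build_sru_search`, and the sample CQL query are assumptions made for the example and are not taken from the SRW/U software.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_sru_search(base_url: str,
                     cql_query: str,
                     start_record: int = 1,
                     maximum_records: int = 10,
                     version: str = "1.1") -> str:
    """Return the GET URL for an SRU searchRetrieve request.

    cql_query is a CQL search string; urlencode takes care of escaping
    spaces, quotes, and the equals sign inside it.
    """
    if start_record < 1:
        raise ValueError("startRecord is 1-based")
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,
        "startRecord": str(start_record),
        "maximumRecords": str(maximum_records),
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and query, for illustration only.
sru_url = build_sru_search("https://example.org/sru",
                           'dc.title = "cookbook"',
                           maximum_records=5)

# Round-trip the query string to confirm the CQL survives encoding.
decoded = parse_qs(urlsplit(sru_url).query)
```

The server answers with an XML `searchRetrieveResponse`. SRW carries the same request and response in a SOAP envelope instead of a URL.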