Data strategy

Support research where it happens.

Research habits have changed. It’s no longer enough for libraries to simply maintain information resources for users to access. They now have to make the information visible to seekers where they begin their research. WorldCat provides libraries with an opportunity to display their holdings on websites beyond library catalogs in a format that search engines, citation management systems, campus platforms, research portals and other information websites can read and repurpose.

“First, we had a building and information, and we expected people to come to our building. Then, we digitized, and we expected people to come to our website. There's no entry point anymore. …It's our job to adapt.”

Matthijs van Otegem
Managing Director
Erasmus University Rotterdam
Rotterdam, South Holland, Netherlands

WorldCat’s unique role with linked data

“Historically, libraries are among the most trusted sources of accurate information. As we’ve moved toward a linked data future, that institutional legacy of authoritative data has become increasingly important. At OCLC, my colleagues and I take that deeply to heart. We don’t succeed in every case, but our goal is and always has been the continual improvement of bibliographic and authority data for a world increasingly reliant on those data.”

Jay Weitz
Senior Consulting Database Specialist, Data Services and WorldCat Quality Management
OCLC
Dublin, Ohio, United States

The library data in WorldCat encodes some of the most important, unique and authoritative information sources in the world. When this information can be referenced on the web as “entities” with interconnected relationships, the data can be read and embedded in more websites and online tools than traditionally formatted bibliographic data. Librarians can do more with their data to drive attention back to libraries from more sources, increasing their relevance within the wider information ecosystem. WorldCat houses more library data than any other source and is constantly evolving to keep up with the changing nature of online research habits.

OCLC’s research and experimentation, combined with projects underway by libraries across the world, are revealing that many improvements are possible with linked data. Showing what linked data can do requires a true cooperative effort. We are dedicated to working with OCLC member libraries and partners to reach that goal.

Get started with linked data

Using linked data to share the world’s knowledge

OCLC has remained at the forefront of discussions and activities that affect linked data and the future of data integration throughout the web. OCLC Research continues to explore the opportunities that linked data can bring to researchers and libraries.

We conducted an international survey of more than 150 library projects to understand how libraries use linked data and how they want to use it in the future. With this information, we have developed new experimental services, such as WorldCat Identities, that create linked data for exposure to search engines. OCLC Research team members have also published various books, chapters and articles on this topic.

In addition, we have published more than 20 billion triples in the Resource Description Framework (RDF) model. This is the largest aggregation of library linked data resources in the world, made possible by our work with VIAF, ISNI and other name authority files. For example, FAST (Faceted Application of Subject Terminology), a faceted subject heading schema, is derived from the Library of Congress Subject Headings (LCSH) and links to LCSH Authorities as well as to other authoritative sources such as VIAF, GeoNames and Wikipedia.
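To make the idea of a triple concrete, here is a minimal sketch in Python of how a subject heading might be expressed as subject-predicate-object statements that point to equivalent entries in other authority files. Every URI below is a made-up placeholder under example.org, and the choice of `skos:prefLabel` and `schema:sameAs` as predicates is an assumption made for illustration, not a description of how FAST or WorldCat data is actually published.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers, used only for this illustration.
HEADING = "http://example.org/fast/0000001"
PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
SAME_AS = "http://schema.org/sameAs"

triples = [
    Triple(HEADING, PREF_LABEL, "Example heading"),
    Triple(HEADING, SAME_AS, "http://example.org/lcsh/sh00000001"),
    Triple(HEADING, SAME_AS, "http://example.org/viaf/000000"),
]


def linked_sources(subject: str, data: list) -> list:
    """Return every URI the subject is declared equivalent to."""
    return [t.obj for t in data if t.subject == subject and t.predicate == SAME_AS]


def to_ntriples(t: Triple) -> str:
    """Serialize one triple in N-Triples style: URIs in <>, literals quoted."""
    obj = f"<{t.obj}>" if t.obj.startswith("http") else f'"{t.obj}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


for t in triples:
    print(to_ntriples(t))
```

Because each statement names its subject and object by URI, a consumer can follow the `sameAs` links from one authority file to another without parsing a bibliographic record format.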

Learn more about OCLC Research’s linked data efforts

Initiatives to take data into the future

OCLC continues to enhance products, taking full advantage of the power of WorldCat. Based on our linked data research and pilot programs, we are actively exploring ways to embed linked data relationships into WorldCat to ensure that user searches deliver rich and relevant results. To learn more about WorldCat’s downloadable data sets, visit the OCLC Developer Network.

WorldCat entities are a way to group all versions of a work, place, concept, person, organization, event or other type of data together. WorldCat work entities are available now for developers, and WorldCat person entities are in development.

WorldCat works

In April 2014, OCLC released 197 million WorldCat work entities that bring together multiple manifestations of a resource into one authoritative record. As of March 2017, more than 215 million WorldCat work entities are available. WorldCat work entities connect all descriptions of a work, despite variations in titles, publishers, authors’ names, subject headings and other bibliographic information. Linking these descriptions makes library resources more discoverable on the popular websites where information seekers begin their searches.

See an example of a WorldCat work entity

WorldCat persons

WorldCat person entities connect related information about specific people into a brief description that includes various forms of the person’s name, creative works that the person has produced, and biographical sources of information about the person. As of March 2017, WorldCat persons include more than 117 million descriptions of authors, directors, musicians and others, which have been mined directly from WorldCat. OCLC recently conducted a linked data pilot program in which libraries used WorldCat persons in their regular workflows.

See an example of a WorldCat person entity

Linked data as a cooperative effort

OCLC works closely with the Library of Congress, W3C and other data standards groups, participating in linked data discussions and initiatives to ensure that library data are included on the web. We believe that MARC will eventually be replaced by linked data representations. By cooperating with other organizations and making WorldCat data available to them, we both enhance the value of WorldCat and ensure that libraries have a voice in the future of information management.

BIBFRAME

OCLC remains committed to working with the Library of Congress and the library community to help finalize the BIBFRAME standard, an evolving model for sharing and connecting bibliographic data. As multiple variants evolve, we will keep evaluating BIBFRAME data to inform our linked data planning, with the goal of allowing all OCLC members to continue registering their collections in WorldCat.

Schema.org

As a result of OCLC's work with W3C, WorldCat entities are marked up in the schema.org vocabulary to allow search engines and other systems to mine and retrieve information from library data. This standard vocabulary, developed and sponsored by leading technology companies, serves as the data language that modern search engines understand best. WorldCat's use of schema.org has helped drive traffic to library websites and improved selection, acquisition, and licensing workflows.
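As a rough illustration of what schema.org markup looks like, the following Python sketch builds a description of a book using the schema.org `Book` and `Person` types and serializes it as JSON-LD, one of the formats search engines read from web pages. The title, author and URL are invented placeholders, and the property selection is an assumption for this example; it is not the actual markup WorldCat emits.

```python
import json


def book_jsonld(title: str, author: str, url: str) -> str:
    """Build a schema.org Book description serialized as JSON-LD."""
    description = {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "author": {"@type": "Person", "name": author},
        "url": url,
    }
    return json.dumps(description, indent=2)


def script_tag(jsonld: str) -> str:
    """Wrap JSON-LD in the script element used to embed it in an HTML page."""
    return f'<script type="application/ld+json">\n{jsonld}\n</script>'


# Placeholder values, for illustration only.
markup = script_tag(
    book_jsonld("An Example Title", "Jane Example", "https://example.org/works/1")
)
print(markup)
```

A page carrying a block like this exposes the same facts a catalog record holds, but in a vocabulary that general-purpose crawlers already recognize.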

PCC-URI task group

OCLC actively works with national libraries, universities and publishers on the Program for Cooperative Cataloging (PCC) URI task group to explore ways to efficiently convert MARC to linked data while preserving the cataloger’s original intent. This work includes exploring opportunities to isolate URIs in separate subfields that are currently unused.
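The idea of isolating a URI in its own subfield can be sketched in Python as follows. This is a simplified model under stated assumptions: a MARC field is represented as a tag plus an ordered list of (code, value) pairs, and the subfield code `"1"` is a placeholder chosen for this sketch, not a statement of what the task group recommends. The names `Field`, `entity_uri` and `to_statement` are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Field(NamedTuple):
    """Simplified MARC field: a tag and ordered (code, value) subfields."""
    tag: str
    subfields: list


URI_CODE = "1"  # placeholder subfield code, assumed for this sketch only


def entity_uri(field: Field, uri_code: str = URI_CODE) -> Optional[str]:
    """Return the first URI found in the dedicated subfield, if any."""
    for code, value in field.subfields:
        if code == uri_code and value.startswith(("http://", "https://")):
            return value
    return None


def to_statement(field: Field, predicate: str,
                 uri_code: str = URI_CODE) -> Tuple[str, str]:
    """Link to the entity URI when present; otherwise keep the
    transcribed text so the cataloger's wording is not lost."""
    uri = entity_uri(field, uri_code)
    if uri is not None:
        return (predicate, uri)
    text = " ".join(value for code, value in field.subfields if code != uri_code)
    return (predicate, text)


# Placeholder data: a personal name field with and without a URI.
with_uri = Field("100", [("a", "Example, Jane,"), ("d", "1900-1980"),
                         ("1", "http://example.org/person/1")])
without_uri = Field("100", [("a", "Example, Jane")])

print(to_statement(with_uri, "creator"))
print(to_statement(without_uri, "creator"))
```

Keeping the URI separate from the transcribed subfields means a converter never has to guess an identity from a text string, and records without a URI still convert using the text the cataloger entered.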