Data sets & services
Most OCLC services are powered by WorldCat bibliographic data collected, improved and expanded on by the cooperative’s member libraries. Increasingly, OCLC is making that data available as data services or downloadable data sets, giving programmers and linked data practitioners access to the tools and data they need to create and improve their own applications.
WorldCat Linked Data
Data for WorldCat entities, such as Works, are published following Linked Data principles. The identifier (URI) for a resource provides direct, content-negotiated access to a description of that resource in several serializations (HTML, RDF/XML, Turtle, JSON-LD, N-Triples); the default is HTML with embedded RDFa markup.
Direct access to an individual RDF serialization is also possible by appending the appropriate suffix to the URI (.rdf, .ttl, .jsonld, .nt). For example, the Turtle serialization of a Work is available at: http://www.worldcat.org/entity/work/id/1151002411.ttl
Examples of current WorldCat Entities available as Linked Data:
- WorldCat – OCLC Numbers: http://www.worldcat.org/oclc/41266045
- WorldCat Works: http://www.worldcat.org/entity/work/id/1151002411
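The two access routes described above (a format suffix, or content negotiation on the bare URI) can be sketched in Python as follows. This is a minimal illustration, not OCLC client code: the helper names `serialization_url` and `negotiated_request` are hypothetical, and the media types are the standard registered ones for each format, which the service is assumed (not confirmed here) to honor in an Accept header. Only the suffixes and the Work URI come from the text above.

```python
from urllib.request import Request

# Work URI taken from the examples above.
WORK_URI = "http://www.worldcat.org/entity/work/id/1151002411"

# Suffixes listed in the text above.
SUFFIXES = {
    "rdfxml": ".rdf",
    "turtle": ".ttl",
    "jsonld": ".jsonld",
    "ntriples": ".nt",
}

# Standard media types for each format; assumed to be what the
# service expects during content negotiation.
MEDIA_TYPES = {
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
    "jsonld": "application/ld+json",
    "ntriples": "application/n-triples",
}


def serialization_url(entity_uri: str, fmt: str) -> str:
    """Build the direct URL for one serialization by appending its suffix."""
    return entity_uri.rstrip("/") + SUFFIXES[fmt]


def negotiated_request(entity_uri: str, fmt: str) -> Request:
    """Build (but do not send) a request that asks for a format via Accept."""
    return Request(entity_uri, headers={"Accept": MEDIA_TYPES[fmt]})


if __name__ == "__main__":
    print(serialization_url(WORK_URI, "turtle"))
    print(negotiated_request(WORK_URI, "jsonld").get_header("Accept"))
```

Neither helper touches the network; passing the `Request` object to `urllib.request.urlopen` would send it. The suffix route is convenient for links and quick inspection, while content negotiation keeps a single canonical URI per entity.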
Schema.org mark-up in WorldCat.org
The Schema.org vocabulary is used to mark up (using RDFa 1.1) the HTML pages of the Linked Data Explorer and WorldCat.org for the entire collection (primarily book and journal titles, not journal articles added through third-party providers). This benefits our search partners and those creating widgets that require structured data exposed within HTML pages. Using Schema.org (with extensions, currently being evolved within the Schema Bib Extend W3C Community Group) as a basis for modelling, this markup enables the fullest coverage and consumption of WorldCat library data by search engines and other systems. The Schema.org vocabulary is applied consistently across all RDF data serializations available via content negotiation.
Making WorldCat data available to programs as well as humans means that services can process library data and make more effective connections between it and other Web resources. Please note that this is not a static program, but an initial experimental project that will change over time as we receive feedback from the library and wider Web communities. Read more here.
Download Data Sets
In addition to WorldCat Linked Data, some OCLC data is available as downloadable files:
|Data Set (download)||Licenses||More information|
|FAST (Faceted Application of Subject Terminology) downloads available as single MARC XML and RDF (Linked Data Format N-Triples) .zip files and by facet||ODC-By for FAST||Research Works page|
|VIAF (Virtual International Authority File) data source (6 data files available in .gz format)||ODC-By for VIAF||Service information|
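As an illustration of working with a downloaded N-Triples .zip file such as the FAST RDF download, the sketch below streams triples out of an archive without unpacking it to disk. It rests on assumptions: the function name `iter_ntriples` is hypothetical, the archive members are assumed to end in `.nt`, and the line splitting is a deliberately naive reading of N-Triples (subject and predicate contain no whitespace, the object is everything up to the final dot), not a full parser.

```python
import io
import zipfile
from typing import Iterator, Tuple


def iter_ntriples(zip_path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (subject, predicate, object) strings from each .nt file in a zip."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Assumption: N-Triples members carry a .nt extension.
            if not name.endswith(".nt"):
                continue
            with archive.open(name) as raw:
                for line in io.TextIOWrapper(raw, encoding="utf-8"):
                    line = line.strip()
                    # Skip blank lines and comments.
                    if not line or line.startswith("#"):
                        continue
                    # Subject and predicate never contain whitespace,
                    # so the object is whatever remains after them.
                    subject, predicate, rest = line.split(None, 2)
                    obj = rest.rstrip()
                    # Drop the statement-terminating dot.
                    if obj.endswith("."):
                        obj = obj[:-1].rstrip()
                    yield subject, predicate, obj
```

A caller might write `for s, p, o in iter_ntriples("fast-download.zip"): ...`, where the file name is a placeholder for whichever archive was downloaded. For anything beyond a quick count or filter, a dedicated RDF library would be a safer choice than this line-based approach.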
OCLC Web Services
These tools deliver access to data and services on different platforms and through a variety of applications. OCLC’s Web services support functions such as OpenURL resolution and consolidated access to multiple versions of published works. Direct linking provides centralized access to library ownership information and a cooperatively built repository of data about library collections and services. OCLC’s collection of Web services currently includes access to the WorldCat Search, knowledge base and Metadata APIs, the WorldCat Registry, and connections to the xISBN and xISSN data services. You can also view an application gallery of apps and mash-ups that OCLC Developer Network participants and others have created with these tools.
Dewey.info is an experimental space for DDC linked data, providing tools and information that relate to applying linked data principles to parts of the Dewey Decimal Classification system. You can read more about the project here, or dive in by clicking on one of the ten main classes below:
- 000 Computer science, information & general works
- 100 Philosophy & psychology
- 200 Religion
- 300 Social sciences
- 400 Language
- 500 Science
- 600 Technology
- 700 Arts & recreation
- 800 Literature
- 900 History & geography
FAST (Faceted Application of Subject Terminology)
The Library of Congress Subject Headings schema (LCSH) is by far the most commonly used and widely accepted subject vocabulary for general application. However, LCSH's complex syntax and rules for constructing headings restrict its application by requiring highly skilled personnel and limit the effectiveness of automated authority control. Recent trends, driven to a large extent by the rapid growth of the Web, are forcing changes in bibliographic control systems to make them easier to use, understand, and apply. The purpose of adapting the LCSH with a simplified syntax to create FAST is to retain the very rich vocabulary of LCSH while making the schema easier to understand, control, apply, and use. Read more about FAST here or access the FAST database. A user-friendly search interface is also available, as is user documentation.
The Virtual International Authority File (VIAF®)
VIAF is an international service designed to provide convenient access to the world's major name authority files. Its creators envision the VIAF as a building block for the Semantic Web to enable switching of the displayed form of names for persons to the preferred language and script of the Web user. Learn more about VIAF here, review the list of contributor organizations, or apply to become a contributor.
Global Library Statistics
These statistics include data, if available, for the total number of libraries, librarians, volumes, expenditures, and users for every country and territory in the world broken down into the major library types: academic, public, school, special and national. Learn more about the statistics, see an interactive view of the data, or download the data files.