The OCLC Research Identities work showed how bibliographic data can be mined for information about the people and organizations that create library materials or serve as their subjects.
We have used the findings of that research effort to begin building the WorldCat Entities data. These new Entities and their persistent URIs will serve as the foundation for future linked data services similar to those explored in the Identities Research work.
Visit the WorldCat Entities Project
Original Project Description
WorldCat Identities has a summary page for every name in WorldCat (currently some 30 million names), including named persons, organizations and fictitious characters. The pages include information derived from WorldCat and other sources (VIAF, FAST), plus unique data derived or created through a variety of special processing activities (e.g., WorldCat Identities provides statistical data about how widely held a work is). A typical WorldCat Identities page will include a list of the works by and about the identity that are most widely held by libraries, a list of variant forms of name by which the identity has been known, a FAST tag cloud of places, topics, etc. closely related to works by and about the identity, links to co-authors, and more. Titles listed are linked to WorldCat.org, and many popular WorldCat Identities pages also link to the corresponding English-language Wikipedia article.
In WorldCat.org displays, the "Find more information about:" feature provides links to WorldCat Identities pages for the named persons, organizations and fictitious characters associated with the material being described.
To allow OCLC Research to experiment with new features, we maintain a Research version in addition to the pages that are available via WorldCat.org.
WorldCat Identities arose from a "what if?" exploration led by Chief Scientist Thom Hickey with other members of what would become the WorldCat Identities team. The idea was to automatically assemble as much information as possible from WorldCat and other resources stewarded by OCLC about a given person, organization or fictitious character, and to present that information in an algorithmically built page. This work was significantly aided by OCLC Research's work on FRBRizing WorldCat (i.e., identifying and clustering variant editions of individual works), which allowed OCLC Research to do data mining to identify, say, the most widely held work authored by a person, and more.
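As a rough illustration of why clustering editions into works matters for this kind of data mining, the sketch below groups edition-level records by work and sums their holdings before ranking. It is not OCLC's implementation: the record layout, the `work_id` values, the holdings counts, and the function name `most_widely_held` are all hypothetical, and simply summing edition holdings is an assumed simplification (a library holding two editions would be counted twice).

```python
from collections import defaultdict

# Hypothetical edition-level records: (author, work_id, edition_title, holdings).
# work_id stands in for the cluster assigned by a FRBR-style grouping step.
records = [
    ("Author A", "work-1", "Title One (1st ed.)", 120),
    ("Author A", "work-1", "Title One (2nd ed.)", 80),
    ("Author A", "work-2", "Title Two", 150),
    ("Author B", "work-3", "Title Three", 40),
]


def most_widely_held(records):
    """Return {author: (work_id, total_holdings)} for each author's top work."""
    # Sum holdings over all editions clustered under the same work.
    totals = defaultdict(lambda: defaultdict(int))
    for author, work_id, _title, holdings in records:
        totals[author][work_id] += holdings
    # Pick the work with the largest combined holdings for each author.
    return {
        author: max(works.items(), key=lambda item: item[1])
        for author, works in totals.items()
    }


result = most_widely_held(records)
for author, (work_id, holdings) in result.items():
    print(author, work_id, holdings)
```

In this made-up data, no single edition of "work-1" is held as widely as "Title Two", yet the clustered work ranks first for Author A once its editions are combined, which is the kind of result that edition-level counts alone would miss.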
OCLC Research creates the pages used by WorldCat.org, and we plan to experiment with allowing splits and merges of Identity pages in our version of the service.
We create quarterly updates of Identity pages.