Managing Ambiguity In VIAF
by: Thomas B. Hickey and Jenny A. Toves
The Virtual International Authority File (VIAF) is built from tens of millions of names represented in more than 130 million authority and bibliographic records expressed in multiple languages, scripts and formats. In a series of automated steps these names are linked and combined into VIAF clusters. VIAF does not replace the source authority data, but creates something new built upon the relations mined from it. A common use of VIAF is in the creation of new 'local' authority records for authors, based on information already in VIAF about the entity. VIAF can also be used as an authority file in its own right; for instance, OCLC is now using VIAF as part of its identification of works and expressions. Ambiguity occurs at several stages in VIAF, from the initial matching to cluster creation. VIAF's approach to managing this ambiguity gives us a great deal of flexibility to deal with additions, deletions and changes to the underlying authority data. The approach to clustering has several novel aspects. First, the clustering proceeds in multiple stages, in what could be called progressive refinement: fairly loose matching brings in candidates, which are then gradually brought into the finished clusters, using information gleaned from the rough groupings to make more informed decisions than could be made a priori. Second, all the information from all the records is used during the clustering. This results in a more fluid view of identity than hand-built authority files provide, while giving VIAF the ability to react to refinements in the clustering algorithms and to new data on a regular basis. Finally, the sheer scale of VIAF provides opportunities the library community has not previously had to analyze and use authority data in machine processing. The problems and approaches used by VIAF may have implications for the use of linked data in other information services.
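The two-stage idea in the abstract (loose matching to gather candidates, then refinement using what the rough grouping reveals) can be illustrated with a minimal sketch. This is not VIAF's actual algorithm: the record fields (`id`, `name`, `birth`, `titles`), the function names, the normalization rule, the evidence tests and the sample records are all assumptions invented for illustration.

```python
import unicodedata
from collections import defaultdict


def loose_key(name):
    """Stage 1 key: strip diacritics and punctuation, lowercase, ignore word order."""
    decomposed = unicodedata.normalize("NFKD", name)
    letters = "".join(c for c in decomposed if c.isalpha() or c.isspace())
    return " ".join(sorted(letters.lower().split()))


def conflicts(a, b):
    """Assumed veto rule: two known but different birth years."""
    return a["birth"] is not None and b["birth"] is not None and a["birth"] != b["birth"]


def supports(a, b):
    """Assumed positive evidence: a shared title or the same known birth year."""
    same_birth = a["birth"] is not None and a["birth"] == b["birth"]
    return bool(a["titles"] & b["titles"]) or same_birth


def cluster(records):
    # Stage 1: loose matching collects candidate groups by normalized name.
    groups = defaultdict(list)
    for rec in records:
        groups[loose_key(rec["name"])].append(rec)

    # Stage 2: refine each candidate group using evidence from its members.
    clusters = []
    for candidates in groups.values():
        refined = []
        for rec in candidates:
            for members in refined:
                no_conflict = not any(conflicts(rec, m) for m in members)
                if no_conflict and any(supports(rec, m) for m in members):
                    members.append(rec)
                    break
            else:
                refined.append([rec])  # no suitable cluster: start a new one
        clusters.extend(refined)
    return clusters


# Invented sample records, not real authority data.
records = [
    {"id": "A", "name": "Müller, Hans", "birth": 1901, "titles": {"Alpine Flora"}},
    {"id": "B", "name": "Hans Muller", "birth": None, "titles": {"Alpine Flora"}},
    {"id": "C", "name": "Muller, Hans", "birth": 1955, "titles": {"Network Design"}},
]
for members in cluster(records):
    print([rec["id"] for rec in members])
```

The greedy second stage depends on the order in which records arrive, a simplification chosen to keep the sketch short. A production system would presumably weigh far more evidence and revisit earlier decisions as new records are added.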
Hickey, Thomas B., and Jenny A. Toves. 2014. "Managing Ambiguity In VIAF." D-Lib Magazine 20 (July/August). doi:10.1045/july2014-hickey. http://www.dlib.org/dlib/july14/hickey/07hickey.html.