ArchiveGrid

ArchiveGrid is a collection of over seven million archival material descriptions, including MARC records from WorldCat and finding aids harvested from the web. It is supported by OCLC Research as the basis for our experimentation and testing in text mining, data analysis, and discovery system applications and interfaces, and it provides a foundation for our collaboration and interactions with the archival community.

ArchiveGrid provides access to detailed descriptions of archival collections, which include documents, personal papers, family histories, and other archival materials held by thousands of libraries, museums, historical societies, and archives. It also provides contact information for the institutions where these collections are kept.

The majority of archival material descriptions in ArchiveGrid are from WorldCat and primarily represent archival collections held by institutions in the United States. This reflects the contribution patterns for descriptions of materials under archival control in WorldCat. We may extend ArchiveGrid beyond its current scope if it is necessary to support OCLC Research experimental objectives.

Background

ArchiveGrid was offered as an OCLC subscription-based discovery service from 2006 until it was discontinued in 2012. At that time, OCLC Research released this freely available ArchiveGrid interface, which shares some of the same attributes as the original subscription service. Although it is not a full production service, researchers can expect to use it for discovery of archival materials, and archives can work with OCLC Research to have their materials represented in the aggregation in a reliable and persistent way.

Impact

From our work with ArchiveGrid, we expect to share the results of MARC and EAD tag analysis, provide discovery system analytics for contributors, document investigations of text mining and data visualization, participate in community working groups pursuing improvements to description and discovery, and more. To support those interests and objectives, we'll continue to build this extensive and current aggregation of archival material descriptions, within the constraints of OCLC Research's committed and on-going support for this project.

Updates

The ArchiveGrid index is updated roughly every six weeks with a fresh extraction of MARC records from WorldCat and with finding aids that we harvest from contributor websites.

With the ArchiveGrid index update in early October 2013, the steadily increasing number of archival material descriptions passed the two million mark. This is largely due to the on-going support of the archival community adding their material descriptions to WorldCat (which forms the basis of most of ArchiveGrid’s content) and, in some cases, supplementary finding aids that are directly harvested into ArchiveGrid.

The ArchiveGrid user interface also got a makeover in October 2013 and now uses the Twitter Bootstrap front-end framework. Bootstrap's "mobile first" approach enables ArchiveGrid to work well on smartphones and tablets (which currently represent about 15% of ArchiveGrid visitors) and provides many other responsive design and layout features.

The most noticeable change in the new ArchiveGrid design is that the layout of individual collection pages has been modified to provide a more engaging user experience. Access points in these descriptions now include more information about how to get in touch with the archival institution. They also promote links to related materials such as finding aids or digital images. Since 76% of all ArchiveGrid visitors see a single archival material record page first, rather than ArchiveGrid’s default home page, these collection pages essentially serve as the "home page" for most ArchiveGrid visitors. 

We're evaluating ArchiveGrid’s analytics to learn how these changes are improving its visibility and utility, and we're also evaluating additional new features that could extend its reach.

Outputs

Publications

  • "Thresholds for Discovery: EAD Tag Analysis in ArchiveGrid, and Implications for Discovery Systems."
    Code4Lib Journal, 22 (2013-10-14)
    by: Marc Bron, Merrilee Proffitt and Bruce Washburn

    The ArchiveGrid discovery system is made up in part of an aggregation of EAD (Encoded Archival Description) encoded finding aids from hundreds of contributing institutions. In creating the ArchiveGrid discovery interface, the OCLC Research project team has long wrestled with what we can reasonably do with the large (120,000+) corpus of EAD documents. This paper presents an analysis of the EAD documents (the largest analysis of EAD documents to date). The analysis is paired with an evaluation of how well the documents support various aspects of online discovery. The paper also establishes a framework for thresholds of completeness and consistency to evaluate the results. We find that, while the EAD standard and encoding practices have not offered support for all aspects of online discovery, especially in a large and heterogeneous aggregation of EAD documents, current trends suggest that the evolution of the EAD standard and the shift from retrospective conversion to new shared tools for improved encoding hold real promise for the future.
  • Social Media and Archives: A Survey of Archive Users, An OCLC Research Report by Bruce Washburn, Ellen Eckert, and Merrilee Proffitt, OCLC Research

    This report details findings from a survey of users of archives to learn more about how researchers find out about systems like ArchiveGrid, and the role that social media, recommendations, reviews, and other forms of user-contributed annotation play in archival research. It will be of interest to those working with archival discovery services, or those investigating the utility of social media in discovery environments.

Presentations

  • The Problem of Interoperability: Archive Grid as an Archival Discovery Platform.
    1-26-2018
    Amy H Chen, University of Iowa
    Bruce Washburn, OCLC

    All too often, archive websites dissuade users from recognizing, let alone using, digital content. Those of us who work within the archive and library sector may say that we are service-oriented or patron-driven, but in practice what that usually means is that we value the accuracy and timeliness of our answers and the warmth of our interactions, not how users engage with us online. After all, we are not software and web designers. As a result, we rely on external web-based platforms like ArchiveGrid, internal discovery platforms like Aeon, and internal IT teams to facilitate digital access to our content. But poor UX results in discovery that requires more support by archivists and librarians, if users even make it to us to ask for help. While extensive UX testing is needed to provide better access on ArchiveGrid, Aeon, and individual repository websites, this contribution will specifically discuss the complexities of finding born-digital content through ArchiveGrid, the closest thing the United States has to a national archival discovery tool. Structured as a mock conversation between Bruce Washburn, a software engineer at OCLC, and Amy Chen, a researcher and librarian at the University of Iowa, this conversation will show what it would take to improve ArchiveGrid by improving extent data, providing uniform file types, fostering linked open data, and more.