CatVis: Visual Analytics for the World's Library Data

OCLC Research, the Technical University of Eindhoven (Faculty of Mathematics and Computer Science) and the University of Amsterdam (Faculty of Humanities) are conducting a four-year project entitled "CatVis: Visual Analytics for the World’s Library Data," running from 1 September 2015 through 2019. The project is supported by a Creative Industry grant from the Netherlands Organisation for Scientific Research (NWO).

The CatVis project addresses the following questions:

  • How can librarians use data visualizations to manage, analyze, and present library collections?
  • How can visualizations of large bibliographic datasets and other complex data help researchers in the e-Humanities to ask and answer new research questions?

The purpose of the project is to develop a cutting-edge visual analytics toolkit that answers both the pressing needs of humanities researchers and the concrete demands of the library industry. Our tools will provide visual interfaces for:

  1. data cleaning, clustering, and enrichment,
  2. data analysis, and
  3. intuitive and interactive (geographic) representation of search results.

We will accompany the toolkit development with extensive user testing involving humanities researchers and other expert users.

You can learn more about the CatVis project at the Technical University of Eindhoven website.

Event

Visualization Techniques for Librarians and eHumanities Researchers - 26 November 2015

CatVis kickoff - This workshop brings together librarians and researchers within the fields of algorithms, visual analytics and humanities to discuss the design, utility and practical applications of visualization within the library sector and the eHumanities.

Outputs

Anticipated deliverables

Semantic indexing/similarity clustering and visualization algorithms, use cases, user evaluations, prototypes, demonstrations, presentations and publications.

Impact

  • The outcomes of the research project will feed into the further development of the GlamMap and Ariadne’s Thread visualization prototypes, which are designed to improve users' browsing experience through geographical and semantic mapping, respectively.
  • The multidisciplinary research collaboration across computer science, visual analytics, data science and e-Humanities will be mutually beneficial and will open up new avenues for collaboration.
  • The use cases from e-Humanities researchers and library data scientists will inform the development of these tools and will support follow-up activities that benefit academia, the OCLC enterprise and the wider library community.

Background

WorldCat is an ever-growing, heterogeneous data collection that becomes richer in information as new libraries join but, at the same time, more difficult for its users to comprehend. The textual search interfaces are no longer adequate: result lists are too long to scroll through and provide insufficient insight into the collections. The enrichment of bibliographic descriptions with semantic structures, such as Linked Data, is lost in traditional presentations. It is becoming painfully obvious how limited the interfaces are for delving into the rich data collections of cultural heritage institutions.

Data visualization promises to overcome these limitations. Graphical visualizations can present bibliographic data at a glance. Scrolling through lists is replaced by sophisticated and intuitive interfaces that facilitate the identification of relevant metadata and display a collection from different angles. Handling massive amounts of data requires new, sophisticated algorithms. The research areas of Information Visualization, Visual Analytics and Algorithms in Computer Science develop computer-based, interactive and visual methods that enable users to extract meaning from large and heterogeneous data sets. Although such visual techniques have become essential in the sciences, they are still little used by libraries, despite comparable increases in the volume of data libraries work with on a daily basis.

OCLC research scientists in Leiden (Rob Koopman and Shenghui Wang) and researchers from the TU/e (Prof. Bettina Speckmann) and the UvA (Prof. Arianna Betti) decided to join forces to address these issues. They successfully applied for an NWO Creative Industry grant.

Related work

Ariadne's Thread – an interactive context explorer designed to visualize the networks of entities associated with bibliographic records.

Team members

OCLC Research

Rob Koopman

Titia van der Werf

Dr. Shenghui Wang

Technical University of Eindhoven

Prof. Dr. Bettina Speckmann
Principal investigator

Dr. Michel Westenberg

Dr. Kevin Verbeek

Thom Castermans
Ph.D. candidate
Algorithms for visual analytics

University of Amsterdam

Prof. Dr. Arianna Betti

Dr. Hein van den Berg
Digital Humanities use cases and testing