Too much metadata?

Stephen Hearn

As a metadata manager, much of my career has been focused on catalog management and authority control. Or, to put it another way, on the connections and commonalities that records share. I’ve observed the slow emergence of standards for describing authority control entities—topics, places, persons, bodies, works, etc.—as entities in their own right, with their own descriptions and their own connections to other entities.

Part of what makes my job interesting—and challenging—is that it’s not something I can do in a vacuum, on my own. Metadata without good standards is almost useless. And standards require cooperation.

That’s what I love about the Metadata Managers Focus Group of OCLC’s Research Library Partnership. I get a chance to meet with others excited by metadata challenges and really dive deep into the issues that are at the forefront of our daily working lives.

For example, while one problem that we often face is a lack of good metadata, sometimes—just like with holiday eggnog or Halloween candy—we can get too much of a good thing. So how much is “too much” when it comes to metadata?

Information overload

In the digital realm, we tend to think of our search tools as potentially limitless. But how much information is useful and worth our time? During a recent meeting of the group, my colleagues discussed a record for a five-page scientific article with more than 300 author access points and no authority review. Should all of these authors be included in the catalog record? If so, why not also include every person involved in a film’s production or in an orchestral recording?

It really comes down to how valuable it would be to have this information surface in the library catalog. Keep in mind that some discovery systems show only the first two lines of contributor names anyway, so these additions may not be visible to most library users. At the same time, adding so many uncontrolled names tends to make search results less precise. Because libraries have different local requirements, one size is unlikely to fit all. But should we consider standards to guide these decisions?

Same name, different text

We discussed how having an excessive number of contributors can be especially challenging when you’re cataloging with authorized access points, that is, the preferred name and spelling of a person or organization, formulated to be unique. In a browse-index-based world, these unique access points perform an important role in differentiating between contributors. But today, contextual information might better differentiate authors. Displaying “Smith, John; born 1950; field of activity: Geology; associated with …” etc. could be more useful than just “Smith, John, 1950-” for someone searching for a particular John Smith.
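To make the idea of a contextual display concrete, here is a minimal Python sketch. It assumes a hypothetical Contributor record with a few optional attributes; the field names and the contextual_label function are illustrative only and do not come from any cataloging standard or discovery system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Contributor:
    """Hypothetical contributor description (field names are assumptions)."""
    name: str
    birth_year: Optional[int] = None
    field_of_activity: Optional[str] = None
    affiliations: List[str] = field(default_factory=list)


def contextual_label(c: Contributor) -> str:
    """Build a display string from whatever identifying context is available."""
    parts = [c.name]
    if c.birth_year is not None:
        parts.append(f"born {c.birth_year}")
    if c.field_of_activity:
        parts.append(f"field of activity: {c.field_of_activity}")
    if c.affiliations:
        parts.append("associated with " + ", ".join(c.affiliations))
    return "; ".join(parts)


# A made-up record for illustration.
geologist = Contributor("Smith, John", birth_year=1950, field_of_activity="Geology")
print(contextual_label(geologist))
```

Because each attribute is optional, the label degrades gracefully: a contributor known only by name is displayed as the name alone, and every extra piece of context helps a searcher tell one John Smith from another.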

We will rely on systems like VIAF to provide the “same as” relationships that link different forms of a name, and different identifiers, to a single identity. As more contributors get unique identifiers and are associated with identifying metadata, the problem gets a little easier, but we’ll still need more systems to aggregate these identifiers so we can know who we’re talking about.
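As a rough illustration of what aggregating identifiers involves, the Python sketch below groups identifiers into clusters from a list of “same as” statements, using a simple union-find. The identifier strings are invented for the example, and the cluster_same_as function is hypothetical; it does not call or imitate the interface of VIAF or any other real service.

```python
from collections import defaultdict


def cluster_same_as(pairs):
    """Group identifiers into clusters given (a, b) 'same as' statements."""
    parent = {}

    def find(x):
        # Locate the representative of x's cluster, shortening paths as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(set)
    for identifier in list(parent):
        clusters[find(identifier)].add(identifier)
    return sorted(sorted(members) for members in clusters.values())


# Made-up identifiers: two statements chain three identifiers into one identity.
statements = [
    ("local:1", "viaf:A"),
    ("viaf:A", "isni:X"),
    ("local:2", "viaf:B"),
]
for cluster in cluster_same_as(statements):
    print(cluster)
```

The point of the sketch is that “same as” statements are transitive: once a local record is linked to one shared identifier, it inherits every other link that identifier already has.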

That’s the conclusion I came away with from the Metadata Managers Focus Group: let’s focus on ways to make the relationships between items more apparent to help information seekers find what they need.

The joy of collaboration

Metadata managers from more than 50 institutions in ten countries meet in person and have online conversations about questions like these all the time. Other recent topics have included

  • creating metadata for equity, diversity, and inclusion;
  • metadata for audio and video; and
  • coverage of Identity Management work.

It’s a great opportunity to exchange ideas with people like me who are fascinated by library metadata.

We don’t always leave these discussions with clear-cut solutions. But as we talk through the issues together, we approach a better understanding of how our challenges overlap and interrelate. That matters, because making connections between libraries and librarians has always been the most reliable way to hone best practices for working with library metadata.