RLG Programs 2008 Annual Partners Meeting
Modeling New Service Infrastructures
Breakout Session Recap Sent to Participants

June 2-3, 2008

Discussion notes by Bruce Washburn and Roy Tennant

On the first day, attendees (over 20) introduced themselves and said why they had selected this breakout. Roy gave an overview of the next day and what we hoped to accomplish. The floor was then opened for an initial set of ideas and concerns that could be followed up the next day.

Flip chart notes from Monday's discussion:

  • Cooperative development through common interest
  • Engaging other disciplines where appropriate
  • Model for use of human resources—team building of not just libraries
  • Services framework, data services layer
  • Define the problem
  • ROI, assessment tools for services
  • What level? Local, group, global
  • What's the next big thing we can do better than anyone
  • Capacity & culture barriers

The following day, we had 30 participants and it was clear that we needed to be explicit that there were two quite different views of "services" in this context. Roy explained that this part of the work agenda was meant to focus on the new kinds of software and procedural infrastructure needed for new modes of working, but that the desire of a number of attendees to discuss public services was acknowledged and would be accommodated. Any projects coming out of those discussions might end up in another area of the work agenda, but they would be welcome here.

Roy gave an overview of existing work in this part of the agenda, followed by presentations on specific projects.

Enhancing disclosure at the network level (Roy Tennant, OCLC)

This includes projects to improve linking/ranking of WorldCat records in Google as well as enhancing the discovery of partner content that may not be easily discoverable through web search engines. Related to this are WorldCat Grid Services, which enable new methods of exposing and using WorldCat records. Roy is drafting best practices on making library content, sometimes segregated in local databases, visible at the network level.

Action: Partner institutions with segregated data that could be promoted to the network level were asked to provide usage statistics before and after engaging in specific activities to increase exposure of that content. The following institutions expressed interest in participating: UCLA (Sarah Watstein); CDL (Patti Martin); Penn State (Jack Sulzer); Univ. of Minnesota (John Butler); CalTech (Eric Van de Velde); NYU (Lucinda Covert-Vail)

Collective services, Collective collections and the standards needed to realize this (Janifer Gatenby, OCLC)

Q: Standards appear to still be library-centric. Are there plans to accommodate archives and museums?
Need to think more about LAM (libraries, archives, and museums). Nancy Gwinn: the problem is discovery. Records of items held are less discoverable for archives and museums. The Museum Data Exchange Project (MDEP) was referenced as one group effort to learn more about museum data, how it can be shared, and how it could be utilized. In a subsequent discussion, it was suggested that the MDEP data, once aggregated, might be tested as a discovery base with LibraryFind (rebranded) or something like it.

Possible services for archives at the delivery stage: Unique objects, once digitized, are no longer unique, and can be dispersed. We need an authentication service for a digitized archival object.

There was also an expressed need for linking to reproduction services for digital masters. Jim Neal: the authentication issue is not limited to digitized archival materials. Version control is needed for text as well. What does authenticity mean? Look to the recent Educause article on this topic.

Danuta: increased visibility and improved delivery request mechanisms will have an impact on traditional ILL; more work is needed on working functional models for responding to increased delivery demands. This includes improved access to information about users and their roles.

Partner institutions interested in helping to determine where library data should logically reside (local, group, global) were asked to get in touch with Janifer.

Terminology Services: Experimental Services for Controlled Vocabularies (Diane Vizine-Goetz, OCLC)

The service has been generally available to the OCLC developer network for about a month. Google is using it; others, not so much. HTML is Google's preferred format. It includes links to other Google services (Books, Scholar). Google is searching with identifiers. We're not sure of their intent; it might just be Google experimentation.

Q: What are prime use cases?
Broaden or narrow a user's search terms. Lead to a preferred term. Statistical mappings based on co-occurrence of terms in bib records. Providing alternatives to users in a tagging system.
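
As an illustration of the "statistical mappings based on co-occurrence" use case, here is a minimal Python sketch. It is not the Terminology Services implementation; the function names and the sample records are hypothetical, and it only assumes that each bib record carries a list of subject terms.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(records):
    """Count how often each unordered pair of terms shares a record."""
    counts = Counter()
    for terms in records:
        # Deduplicate and sort so each pair is counted once per record
        # and always appears in the same order.
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return counts


def related_terms(term, counts, limit=3):
    """Return the terms that co-occur most often with `term`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == term:
            scores[b] += n
        elif b == term:
            scores[a] += n
    return [t for t, _ in scores.most_common(limit)]


# Hypothetical data: subject terms taken from three bib records.
sample_records = [
    ["Architecture", "Bridges", "Engineering"],
    ["Architecture", "Bridges"],
    ["Bridges", "Rivers"],
]
counts = cooccurrence_counts(sample_records)
```

Calling related_terms("Bridges", counts) on this sample ranks "Architecture" first, since the two terms share two records. A real mapping would likely normalize by term frequency rather than use raw counts.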

Q: Jeremy: how many sites have systems that could be modifiable to incorporate the terminology services layer?
Scattered show of hands.

Q: What about maintenance of evolving vocabularies? Is there an update service?
LC headings are updated about weekly, others are more static.

Q: Are there translations available?
Diane said we are expecting to bring in a multi-lingual vocabulary, and are thinking about how to implement it in the Terminologies architecture. However, this question may have been about translating vocabulary terms to other languages, rather than about including sources already representing multiple languages.

Q: Jim Neal described similar work for the Avery Index at Columbia. Is there potential for overlap?
This needs a follow up discussion.

Merrilee reported on partner involvement with Terminologies: Calisphere is interested in integrating Terminologies. Indiana has integrated it, and was able to do so in short order as they were familiar with SRU. End user search expansion was their use case and it seems effective for that. Behind the scenes, a text-encoding markup workflow could use this service for quality control for TGN, and could automate inclusion of authority file IDs. If you're interested and have ideas, talk to us.
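
For those less familiar with SRU, the sketch below shows roughly what building a searchRetrieve request looks like in Python. The request parameters (operation, version, query, maximumRecords) come from the SRU 1.1 specification; the base URL and the function name are hypothetical placeholders, not the actual Terminologies endpoint or Indiana's code.

```python
from urllib.parse import urlencode


def build_sru_url(base_url, term, max_records=10):
    """Build an SRU 1.1 searchRetrieve URL for a single search term."""
    # Escape embedded double quotes so the CQL string stays well-formed.
    safe_term = term.replace('"', '\\"')
    # cql.serverChoice lets the server decide which index to search.
    cql = f'cql.serverChoice = "{safe_term}"'
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql,
        "maximumRecords": max_records,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, for illustration only.
url = build_sru_url("https://example.org/sru", "bridges")
```

A search-expansion feature would send this request, read the preferred and related terms from the XML response, and offer them to the user as broader or narrower searches.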

Action: Jim Neal to follow up with Diane on Avery integration.

LibraryFind (Jeremy Frumkin, Oregon State University)

Jeremy presented on this project of OSU, and provided an opportunity for partner institutions to get involved.

Jeremy noted the need for more consistency across library sites in the way that results are returned and how behaviors and actions are modeled. He suggested we should work together towards consistency that allows habits and expectations from one system to be applied in another.

Relevancy remains an issue in metasearch implementations, as not all metasearch targets support relevance ranking, and those that do will certainly not all define it in the same way.

Some collaboration is already underway with Internet Archive, Univ. of Houston, and others. Ways to be involved: develop, implement/test, document, support, use it. Jeremy suggested that an implementation for a more specific, non-undergrad audience, or for visual materials, etc., may be interesting.

Actions: Participants were encouraged to follow up directly with Jeremy. Jeremy said that formal usability study results are available and that he could direct us to these.

Yale may have a cross-collection search service in development or in place. More information about that would be of interest to the group.

Modeling New Service Infrastructure
Suggested Actions
The floor was thrown open to discussion and suggestions for new projects. The following potential projects were suggested. The results of a subsequent voting round (each attendee could vote up to four times for their top priorities) are listed in parentheses following the brief title of each suggestion. Institutions willing to take a specific suggestion further by writing up a draft project prospectus are listed for items that emerged as top priority, with the individual to contact in parentheses. Items requiring further action are in bold.

  • Content mining (13)—There are concerns about the prospective domination of the commercial sector for data mining and analysis.
    CDL (Patti Martin), IISH (Titia van der Werf)
  • Content verification services (12)—Scholarship relies on authenticity, verifiability. When scholarly work is deposited in multiple places, and manipulation of that data can occur online, the ability of scholars to reliably turn to information sources is subject to the risk that original works cannot be found and verified.
    IISH (Titia van der Werf), Columbia (Jim Neal), George Washington University Law Library (Deborah Norwood)
  • Delivery of digital masters (0)—Low-res digital images are available online. The process for ordering high-res images is not automated. Some would like a general-purpose web service to automate ordering.
  • User profile management and exposure beyond the institution (1)—Is there an xml schema that could describe affinities and would transcend institutional boundaries?
  • Database recommendation services (5) [Subsumed under metasearching below]
  • Common UI patterns: display & use (4)—Share use-based best practices for the design of systems for undergraduate, cross-collection searching, leading to greater consistency across LAM systems, efficiency of the design process, and portability of users' searching habits and expertise.
  • Metasearching for LAMs—what needs to happen? (17)—Includes database recommendation services
    Cleveland Museum of Art (Betsy Lantz), Oregon State University (Jeremy Frumkin), Smithsonian Institution (Martin Kalfatovic)
  • ROI assessment tools for services (8)—Build on the work from UIUC on assessing the value of collections. We don't have a model for demonstrating the value of services. There is a body of ROI literature, but it is mostly about public libraries, leaving a gap for academic libraries. We'll need to be cautious about whom we partner with. Duplication of research in other environments is important.
    UCLA (Sarah Watstein), CDL (Patti Martin)
  • Rigorous inter-library loan user id support across institutions and systems (0)—For some, there is currently a good service but on an aging and fragile platform.
  • Web curation and archiving (10)—UCLA (Sarah Watstein)
  • Artificial intelligence metadata generation / searching (15)—There is too much demand on staff. Are there tools that can examine unstructured data and do some of the spade work to produce metadata?
    Univ. of Minnesota (John Butler), Penn State Univ. (Jack Sulzer)
  • Develop a discovery and retrieval mechanism for non-text, visual multi-media (7)—This could include query by image content, etc.
  • Large-scale digital object storage and preservation model (5)—This should include service models.

For more information

Roy Tennant

Senior Program Officer

roy_tennant@oclc.org