Metasearch Survey Among RLG Members
Objective: This survey had three goals: to determine how RLG members are using metasearching; to learn more about their expectations for these searches; and to use this information to help make RLG databases good metasearch targets.
Overview: In May-June 2005 we surveyed a representative cross-section of RLG member institutions. Most respondents were enthusiastic about metasearch. Although their definitions of it varied, most tended to focus on undergraduate students, use of a simple search box, and full-text resources in the results. (Metasearches allow users to search across multiple catalogs, search engines, and commercial databases. Frequently, these searches merge and de-duplicate results and unify access to a variety of information resources.)
The survey report includes:
- What do institutions and users expect from metasearch?
- How are these members implementing metasearch now?
- Do current implementations meet expectations?
- Can we quantify the likely spread of various tools for types of information?
- What might make existing metasearch implementations moot?
In May and June 2005, RLG conducted an informal survey in order to learn more about our members' expectations and experiences of metasearch. RLG staff conducted guided discussions with ten institutions drawn from RLG's members. The selected respondents represented a mix of institutional types, metasearch tools, and services targeted. At the time, five described their metasearch implementations as in production, and five described them as tests; since then, one has moved from test to production.
This sample provides only a limited basis for generalization. (MetaLib may be underrepresented relative to its adoption by our members and customers. In addition, we did not talk with faculty, graduate students, or undergraduates, the intended beneficiaries of metasearch.) Nevertheless, these discussions did give us a valuable view of members' concerns, and will help guide our choices about how to design and develop RLG services. We hope they will be of interest to others.
We set out to answer these questions:
- What are institutions and users expecting from a metasearch capability?
- How are they implementing it now?
- Are the current implementations fulfilling their expectations?
- Can we quantify the likely spread of various tools within our user base by users of different types of service—bibliographic, citation, digital?
- Are there other ways of providing the same functionality or new ways of researching that might make existing metasearch implementations moot?
Most of our respondents were very enthusiastic about metasearch, although they had various ideas about what it is. Most of their definitions had in common a focus on undergraduate students, use of a simple search box, and full-text resources in the results. At the same time, the present level of satisfaction is low, success measures are not yet clear, and implementers are glad to look to vendors and others to define strategies and goals; few felt able to be very active participants in that process. Interest on the part of librarians, administrators, or system vendors could dissipate if results aren't more satisfactory. Interest on the part of undergraduate students may be difficult to attract if students have more familiar alternatives they perceive as adequate. We think the future of these metasearch efforts is still uncertain.
From this survey, we concluded:
- Undergraduates using metasearch tools will get value from RLG citation resources. RLG should work with leading vendors of metasearch tools to make them convenient targets. What is critical for targeted services is ability to provide fast-enough keyword searching.
- Customers who are implementing metasearch tools need support from target systems such as RLG.
- The RLG Union Catalog is a lower priority. We didn't hear it cited as a desirable target, which surprised us. Institutions would rather target their own OPAC (online public access catalog).
- Making RLG Archival Resources a good target is a low priority because its audience—advanced researchers, genealogists—isn't a good fit with the current audience for metasearch efforts. This resource is better exposed through search engines and genealogy sites.
- Interoperability with other image collections, rather than metasearching per se, will be most useful for image aggregations. This, in addition to exposing images to Web search engines (through services like Trove.net™), is what will serve image searchers now.
- Deduplication and ranking were not among the most important considerations for respondents.
- A simplified interface was mentioned more often as a goal than merged results or a single search against multiple resources.
- The most important criterion in tool selection was an established relationship with the vendor.
- Despite their enthusiasm, most respondents indicated they had limited time and attention to invest in either shaping or appraising metasearch efforts at their institutions; expediency mattered more than standards.
- It's about full text. Citations are a step along the way, not the destination.
- No respondents regarded services like Google as the way of reaching their metasearch goal, despite the fact that most said their own users see them that way. Students are moving to search engines to get metasearching done. Can librarians change that behavior? Can they offer an environment students regard as better?
What are institutions and users expecting from a metasearch capability?
Undergraduates were identified as the principal audience for metasearch by most respondents (8/10). "We really see this as a tool to help students get started finding stuff—it's not a tool for advanced research." Two mentioned graduate students. A few mentioned that faculty or librarians need metasearch in order to discover relevant resources. There were suggestions from some respondents that more advanced users need more advanced interfaces. Our other studies suggest to us that faculty too may need ways to discover what licensed resources are available to them; however, that wasn't ordinarily viewed as a purpose of metasearch.
A simplified user interface was the goal most often identified by respondents (7/10). Half mentioned promotion of lesser-used resources. Somewhat surprisingly, fewer than half (4/10) named merged results, or a single search against multiple resources, as their goal ("all scholarly resources" or "image and other databases" or "local and licensed resources").
Nearly all respondents (9/10) mentioned keyword as the preferred form of search. ("That's what metasearch is.") The tenth respondent said that the preferred form of search depends on the target. No respondents identified parametric searching by title, author, or subject as important, although one lamented, "we're missing an opportunity to teach students about using advanced interfaces."
More than half of respondents (6/10) expect users to stay in the metasearch tool environment, rather than seeing the target system. The question of what they meant by "target interface" needs further investigation. Is it an interface (like RLG's Eureka®) that a metadata vendor provides for searching specific sets of metadata? Our respondents might have thought so. But is that what a student thinks of as the "target?" The only target worth going to might be the interface for data—that is, full text—rather than for metadata.
Few respondents (2/10) mentioned ranking as a factor in their selection of a metasearch product. None mentioned the ranking algorithm of the target system as a factor in selecting targets. Few respondents (3/10) mentioned deduplication as an important feature in a metasearch product. Of these, one felt successful deduplication was unlikely in the near term. Another felt deduplication would depend on local loading of data. Poor handling of ranking or deduplication by metasearch products is not an important failing from a customer point of view.
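For readers unfamiliar with the term, deduplication here means collapsing records from different targets that describe the same item. The following is only a minimal sketch of one possible approach, matching on a normalized title plus year. The record fields (`title`, `year`, `source`) and the matching rule are assumptions made for illustration and do not describe how any metasearch product surveyed actually works.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())


def merge_results(result_sets):
    """Merge per-target result lists into one list without duplicates.

    Records are hypothetical dicts with 'title', 'year', and 'source' keys.
    Two records count as duplicates when normalized title and year match.
    """
    merged = {}
    for records in result_sets:
        for rec in records:
            key = (normalize(rec["title"]), rec.get("year"))
            if key in merged:
                # Same item seen again: remember the additional source.
                merged[key]["sources"].append(rec["source"])
            else:
                merged[key] = {
                    "title": rec["title"],
                    "year": rec.get("year"),
                    "sources": [rec["source"]],
                }
    # Dicts preserve insertion order, so first-seen order is kept.
    return list(merged.values())


# Illustrative data only.
catalog_hits = [{"title": "Moby-Dick", "year": 1851, "source": "local OPAC"}]
citation_hits = [
    {"title": "Moby Dick", "year": 1851, "source": "citation database"},
    {"title": "Typee", "year": 1846, "source": "citation database"},
]
merged = merge_results([catalog_hits, citation_hits])
```

Even this toy rule shows why one respondent doubted near-term success: records that differ in edition, subtitle, or transcription would not match without much more careful normalization.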
How are institutions implementing metasearch now?
MetaLib from Ex Libris claims dominance among respondents belonging to the Association of Research Libraries, but there was some level of dissatisfaction with all metasearch vendors. Still, most respondents (6/10) said they'd work with their current vendor if the current implementation doesn't meet expectations. Two said they'd select a new tool, two said they'd try a new approach, and two said they'd develop a tool on their own, though one of those described this as the "worst-case scenario." A few of our respondents see themselves as partners in development efforts with vendors. Most, however, didn't seem to feel they could devote many resources to such efforts.
Some respondents (4/10) preferred Z39.50 among search protocols. A larger number (5/10) stated that it doesn't matter to them because the tool vendor "takes care of everything." Another said that aggregators ought to supply gateways/connectors. In addition to Z39.50, two other protocols were mentioned: one respondent mentioned SRU/SRW, and one mentioned HTTP. None identified NISO working group recommendations as something that would affect their decisions regarding implementation.
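To give a sense of what the SRU option looks like in practice, an SRU keyword search is an ordinary HTTP GET request whose query string carries a CQL query. The sketch below builds such a URL. The base URL is a placeholder, and the choice of version 1.1 and the `dc` record schema are assumptions, since the versions and schemas supported vary by server.

```python
from urllib.parse import urlencode


def build_sru_url(base_url, keywords, max_records=10):
    """Build an SRU searchRetrieve URL for a simple keyword search."""
    # Quote the keywords so a multi-word phrase is a single CQL term.
    cql_query = '"' + keywords.replace('"', '\\"') + '"'
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",              # assumed; depends on the server
        "query": cql_query,
        "maximumRecords": str(max_records),
        "recordSchema": "dc",          # assumed short name for Dublin Core
    }
    return base_url + "?" + urlencode(params)


# Placeholder endpoint, not a real service.
url = build_sru_url("https://sru.example.org/catalog", "whale fisheries")
```

A metasearch tool would send one such request per SRU target and then map each XML response into its own record format, which is the mapping work most respondents left to the tool vendor.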
More than half of respondents (6/10) felt that the record format doesn't matter to them, and that mapping is the tool vendor's responsibility. Two mentioned Dublin Core. One mentioned MARC.
Among other ways of reaching the metasearch goal, four mentioned leaving it to vendors to innovate. An equal number mentioned aggregating metadata locally. (Another characterized this approach as impossible.) Only two mentioned looking beyond the library sector for solutions. Asked if their users saw other ways of reaching the metasearch goal, however, six mentioned Google Scholar, three mentioned Google, and two mentioned Google Print.
The process for tool selection involved systems staff or committees without library representation (6/10) slightly more often than library staff (5/10). Most respondents (8/10) understood metasearch as a natural extension of their access objective. The most important criterion in tool selection was an established relationship with the vendor (3 respondents). Adherence to standards was an important criterion in vendor selection for only one respondent. Only one respondent mentioned student demand as a driver for metasearch initiatives as a whole, and only one mentioned student demand as a factor in target selection.
Can we assess the spread of various tools for different types of service?
Can we quantify the likely spread of various tools within RLG's user base by users of different types of service—bibliographic, citation, digital?
Nearly all respondents (9/10) are targeting citation databases. The one that isn't is focusing on full-text e-journals. Half are targeting bibliographic databases. One that isn't initially targeted both the RLG Union Catalog and OCLC's WorldCat, but found students were confused by the results—books mixed with articles, and virtually no full text. Half are targeting image databases. Most of those (3/5) target local rather than third-party image databases. Since ARTstor content can't be federated, federation of other image resources may become a lower priority for them. Lack of support for thumbnail display—critical for image resource discovery—was also mentioned as a limitation of MetaLib and other federated search tools.
Some respondents (3/10) resisted our citation/bibliographic/image classification, and specifically mentioned full text (e-books and full text electronic journals).
Most respondents had organized their metasearch efforts around disciplines, rather than making them cross-disciplinary or comprehensive. They reported that this was not because they saw this as preferable, but because they saw it as expedient: the relevant sets of resources were already identified. Some (generally smaller institutions) had not yet been selective at all.
Most felt selection would change, though they felt it was too soon to say how. One said that if they had it to do all over again, they'd include fewer targets and focus more on the needs of undergraduates. All respondents identified subject specialists or selectors as the people responsible for selecting targets for metasearch. Three equated user demand with what librarians demand. None mentioned students or faculty.
Half of the respondents mentioned site licenses or unlimited searching as important factors in selecting targets. One asserted that limits based on license restrictions (simultaneous users, for example) are already having a serious impact on those wishing to search directly in a particular target system interface.
Are the current implementations fulfilling expectations?
Only three respondents were satisfied with their current implementation. Four said they were dissatisfied (though one of those was hopeful) and four thought it was too early to say.
The success measure most often mentioned was a shift in traffic from direct access (4/10). Half the respondents said the vendor-supplied statistics that matter to them are a report of searches and a report of sessions in the target interface. Half were not yet looking at statistics. One noted that they'll need to weigh searches on preconfigured profiles differently from searches based on intentional database selection by users.
Everyone is still struggling with the definition of success and how to measure it. It may be that the important measure of success will not be a shift in how citation metadata is used, but an increase in discovery and use of full-text resources.
Other ways of providing the same functionality?
Are there other ways of providing the same functionality or new ways of researching that might make existing metasearch implementations moot?
If metasearch efforts seek to provide a comprehensive, general-purpose tool for undergraduate or beginning research, such local efforts may be in a contest with emerging services such as Google Scholar and Google Print, provided that these services can offer access to local licensed resources (which is already happening through Google's use of OpenURL) and that more abstracting and indexing data is made available to Google. This would be a difficult contest for library metasearch systems to win, either in terms of inclusiveness or in terms of visibility.
This difficulty did not seem to be very much on the minds of librarians we spoke to: they planned further steps down the road they're on. But the unevenness of this contest may lead to a change in direction from the administrators who drive many metasearch efforts, or from the tool vendors who shape them. Alternatively, if metasearch efforts end up differentiating themselves from Google Scholar by focusing on various specific disciplines or specific audiences, then these efforts would be a good vehicle for exposing resources like RLG's to the researchers who need them.
All of these conversations were instructive for us. We thank all our respondents for their time and candor.
How some RLG members have applied metasearch:
- "Libraries Australia: Metasearching the RLG Union Catalog from Down Under"
(RLG Focus, August 2005)
- "Stanford Groks RLG Union Catalog"
(RLG Focus, August 2005)
Please note: Archived versions of RLG Focus are available from the OCLC Corporate Library Collection in the OCLC Digital Archive. Choose the preferred index and browse from here.
- "Mad About Metasearching"
(RLG TopShelf, March 2005)
Archived versions of RLG TopShelf are available from here.