It’s cloudy in research, but with little chance of rain

New tool brings clarity to Web sites with on-demand tag clouds

By Diane Vizine-Goetz and J. D. Shipengrover

Love them or hate them, tag clouds are everywhere. OCLC Research has been exploring ways to create and use these popular visualizations in the OCLC environment. One outcome of this work is a new interactive tool that allows users to generate tag clouds on demand.

The Web-based tool, Tag Cloud, provides a quick and easy method for building clouds from a collection of terms. With the Tag Cloud tool, almost any text can be used to create a cloud—books, papers, presentations, Web pages, blogs, social tags, search terms—as long as the text can be typed or pasted into an input box or harvested from a public URL.

A tag cloud lets you see common terms in a text by grouping like terms together and emphasizing frequent terms. The cloud shown here, based on an excerpt from Alice’s Adventures in Wonderland, is a good example. It is easy to see that ‘cat’ and ‘said’ are frequent words. One staff member noted of the excerpt, “the cat does a lot of talking.” The resulting cloud highlights this point and underscores an additional benefit of tag clouds: they often reveal broad themes and patterns that might otherwise go unnoticed.

When the ‘Cloud-it’ button is clicked, the software generates the cloud by removing punctuation, calculating term frequencies and selecting a font size for each term. The terms are presented alphabetically in paragraph style, with more frequent terms in larger fonts. The tool provides options for controlling the font colors and the number of terms to display. There are also options for grouping similar terms and for ignoring common words. Once created, the cloud can be printed or saved.
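
The steps described above can be illustrated with a minimal Python sketch. It is not the Tag Cloud tool's actual code, which the article does not show. The function name build_cloud, the stop-word list, the font-size range and the linear scaling from frequency to font size are all assumptions made for illustration. Lowercasing stands in for the tool's option to group similar terms, which is not otherwise reproduced here.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the article does not say which
# common words the real tool ignores.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "it", "is"}


def build_cloud(text, max_terms=50, min_size=10, max_size=36,
                stop_words=STOP_WORDS):
    """Return (term, font_size) pairs in alphabetical order.

    All parameter names and defaults are illustrative assumptions.
    """
    # Remove punctuation by keeping only runs of letters
    # (with an optional internal apostrophe, as in "don't").
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

    # Calculate term frequencies, ignoring common words.
    counts = Counter(w for w in words if w not in stop_words)

    # Limit the cloud to the most frequent terms.
    top = counts.most_common(max_terms)
    if not top:
        return []

    freqs = [n for _, n in top]
    lo, hi = min(freqs), max(freqs)
    span = hi - lo

    cloud = []
    for term, n in top:
        if span == 0:
            # Every term is equally frequent: use one size for all.
            size = min_size
        else:
            # Assumed linear mapping from frequency to font size.
            size = min_size + round((n - lo) * (max_size - min_size) / span)
        cloud.append((term, size))

    # Present the terms alphabetically, as the article describes.
    return sorted(cloud)


if __name__ == "__main__":
    sample = "The cat said hello. The cat said, 'Cat!' Alice said nothing."
    for term, size in build_cloud(sample):
        print(f"{term} ({size}pt)")
```

In this sketch the most frequent terms in the sample ('cat' and 'said') receive the largest size and the rest the smallest. A real tool might prefer logarithmic scaling so that a few very frequent terms do not dwarf everything else.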

A set of experimental cloud services was also developed for clouds that cannot be easily created with the interactive tool. The cloud services are used for clouds that involve large amounts of WorldCat data, and for clouds that require interaction with other systems, such as search interfaces. Examples include FictionFinder, WorldCat Identities, DeweyBrowser and WorldCat languages.

The clouds built for FictionFinder and WorldCat Identities are interactive, which means that clicking on a term in the cloud leads to resources associated with the term.

For the DeweyBrowser, clouds are generated dynamically from searches conducted against the database. The clouds contain current searches and searches over time and are interactive as well. These clouds allow users to see readily what is being searched and how the DeweyBrowser is being used.

The WorldCat languages cloud is the largest cloud produced so far. It presents 470 languages and dialects found in WorldCat and their associated frequencies.

As WorldCat grows, OCLC researchers will continue to look for new ways to analyze WorldCat data and to share those methods with the community.

Project team: J. D. Shipengrover, Diane Vizine-Goetz, Harry Wagner

