EMEA Regional Council Meeting connects members to the latest in library data research

The 2013 OCLC EMEA Regional Council Meeting in Strasbourg, France, was the product of the EMEA Regional Council Executive Committee, chaired this year by Anja Smit, Library Director, Utrecht University. The committee developed the two-day event, which combined the business of membership with compelling keynotes on topics of interest to the community. Attendees explored opportunities that data aggregations present under the theme “Dynamic Data: A world of possibilities.” The committee was pleased to announce that attendance had doubled to some 300 attendees from 28 countries since the first meeting in Leiden three years ago.

Mining the gems of the library world

Roy Tennant, Senior Program Officer, OCLC Research, discussed insights from his research on mining WorldCat’s 290 million records. A list of the most widely held works in libraries around the world points to significant cultural influences and how they are distributed. He discussed the shift from “cataloguing” to “catalinking,” with the growing influence of Linked Data. “Instead of having all the data in the record,” said Roy, “you have linkages out to these authority sources, and you can pull data in, process it, index it, but you do not manufacture it.” He shared the video, “Cataloging Unchained,” which shows how Linked Data can make library data work harder. “We believe that widespread collaboration is absolutely essential,” said Roy. “We’re moving into a whole new world now; we have the tools to do widespread collaboration well, but we also have the imperative to do it well.”

Mining insights from 50 million books

In the opening keynote, Harvard Fellow Jean-Baptiste Michel spoke about his research with collaborator Erez Lieberman Aiden. He began by comparing two cartoons, one from 100 years ago, to show that the verb form ‘burnt’ has gradually regularized to ‘burned’ over the past century. By consulting two grammar textbooks, they concluded that, over time, very important verbs remain irregular, but infrequently used verbs tend to become regular. “The more a verb is used, the more it’s protected against change,” said Jean-Baptiste.

Having used only two books to uncover something new about how language evolves, they asked Google for access to all the books it has digitized, which resulted in a databank of “50 million books, 12 percent of all books ever written anywhere, a huge chunk of human culture.” They slimmed the dataset down to 5 million (those with annotated publication dates and attributed authors) and extracted ‘n-grams,’ chunks of words and phrases, counting the occurrences of each in books published between 1500 and 2008. The data is available at

“Although the telephone was invented in 1848,” said Jean-Baptiste, “several years elapsed before the word started to appear in books. By 1895, when the radio was invented, that time lag had reduced to two years.”
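The n-gram counting described above can be illustrated with a minimal Python sketch. It is not the researchers' actual pipeline: the function names `ngrams` and `count_by_year` and the two-book toy corpus are hypothetical, invented here for illustration, and the tokenization is a plain whitespace split that ignores punctuation handling.

```python
from collections import Counter, defaultdict


def ngrams(text, n):
    """Return the n-grams (tuples of n consecutive lowercased words) in text."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def count_by_year(books, n=1):
    """Count n-gram occurrences per publication year.

    books is an iterable of (year, text) pairs, a hypothetical stand-in
    for a corpus of digitized books with known publication dates.
    """
    counts = defaultdict(Counter)
    for year, text in books:
        counts[year].update(ngrams(text, n))
    return counts


# Toy corpus, invented for illustration only.
books = [
    (1900, "the candle burnt low"),
    (2000, "the candle burned low and the toast burned"),
]

counts = count_by_year(books, n=1)
print(counts[1900][("burnt",)], counts[2000][("burned",)])
```

Comparing the count of a word or phrase across years, as in the last line, is the kind of lookup that makes it possible to trace when a term such as “telephone” first shows up in print.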

He noted that the word ‘sustainable’ did not appear until the late 20th century, and used the openly available xkcd tool to show that if usage continued to increase at its current rate, by 2061 it would be the only word in use in the English language. “It’s an instrument to help prove or disprove historical debates,” said Jean-Baptiste, who is now extending the data to include newspapers, in his quest to cover cultural history from as many perspectives as possible. “Libraries and text repositories are at the front lines of a real revolution in the social sciences and humanities, which changes the way we’re approaching questions about the human experience.”

Libraries in the vanguard of the Linked Data revolution

OCLC’s Technology Evangelist, Richard Wallis, began his presentation, “For any Linked Data representation, you start with an identifier. The URL uniquely identifies that resource in any dataset. We can then interrogate that identifier to find out more about the resource.” In a Linked Data world, multiple datasets covering similar areas can happily coexist; meaningful interlinking denotes relationships and disambiguates entities with identical names. Wallis added that all WorldCat records now have embedded linked data.

Silver Oliver, Information Architect, told the audience about the BBC’s adoption of Linked Data. The real starting point was the Programmes Project, which established a Web page for every programme broadcast, generating the URL identifiers that underpin Linked Data developments. “People started to realise that they could point to them, talk about them, share them and link to them from both inside and outside the organisation,” said Silver. This has led to surprisingly pre-Google patterns of navigation, where “people arrive at one page and navigate to the areas that they’re interested in, starting their journeys from within the site.”

Markus Geipel from the German National Library vocalized how radical the Linked Data approach is. “We are witnessing a paradigm shift,” he said. “To apply metadata to knowledge today, we connect entities together to form a web of knowledge.”

The Web is where our users are

“The Web is the system,” said Titia Van der Werf, Senior Program Officer at OCLC Research. “The Web is where our users are.” She emphasised, “The paths users choose yield powerful usage data that we can mine to better understand their behaviour and meet their needs. So we should not try to attract our users back to our own systems. No, we need to be a bigger part of the system itself.”

Klaus Ceynowa shared the latest location-based services from the Bavarian State Library. He started with the Treasures of the Bavarian Library application, which provides mobile access to 50 medieval manuscripts. The Library has combined its content with augmented reality browsers and similarity-based image searching to deliver an app that accesses historical images of buildings that previously stood where users are currently located. “Content is king, and librarians have the content,” said Klaus, “but context is queen, and we must focus the delivery of content to the contexts in which our users find themselves.” Klaus’ presentation brought the OCLC EMEA Regional Council Meeting to a close.