Books as Expressions of Global Cultural Diversity

A Data Mining Research Project

Data from the global bibliographic database WorldCat reveal transnational patterns in literary publishing, the preservation of individual countries’ literary heritage, and the cultural diversity present in the books.

Background

Globally and nationally, books represent a central kind of cultural heritage. The UNESCO Institute for Statistics has been exploring library statistics for worldwide book consumption, and helped to found the European Expert Meeting on Book and Library Statistics. These bodies, as well as the International Federation of Library Associations, are especially interested in any global patterns in the book world as expressions of cultural diversity and heritage. Such data, however, are not widely collected by national publishing organizations or library statistics agencies. UNESCO maintains a database (http://www.unesco.org/culture/xtrans) of translated works worldwide, but is unable on its own to access worldwide monographic statistics.

The increasingly global reach of the WorldCat database, on the other hand, makes it an obvious source from which to mine such data. OCLC’s bibliographic database represents more than 142 million items, with 1.43 billion copies held by libraries worldwide (numbers which are ever increasing); in addition, the database is becoming increasingly global in scope with the ingest of dozens of national libraries’ bibliographic data. OCLC researchers have already produced a prototype application which graphically displays worldwide patterns in bibliographic holdings (http://worldmap.oclc.org).

Impact

The output will include a large set of statistics compared by country and an accompanying report of findings. These results will provide a global overview of the transnational literary arts, and a wealth of case studies in single countries’ practices in both literary publishing and the preservation of their literary heritage.
Details

The basic objectives of the project, then, are to mine WorldCat's overwhelmingly monographic records and to parse these data by date, country of publication, and language. On the importance of language, there is an axiomatic concept in Cognitive Anthropology that (in the words of Benjamin Lee Whorf), “Language shapes the way we think, and determines what we think about.” In other words, the language(s) spoken by a culture help to determine that culture’s perception of the world, and its expression of itself within that world.

Within these broad scopes, we set the following limits. We are gathering details of non-serial textual materials (also excluding dissertations and government documents). The date of publication must be a valid number less than 2010. We exclude works with publication dates of, for instance, “19xx,” since we cannot fold them reliably into the rest of the data; we include books whose date of publication is as early as 1000 A.D., believing that archival and special collections may be well represented in WorldCat. A pre-test that profiled six countries (deliberately highlighting non-English works and non-English cataloging) was followed by refinements to the data extraction techniques, and then by extraction of the rest of the world’s data.

The project is producing a rich data portrait of the global literary arts (as reflected in the WorldCat database), with emphasis on cultural literary heritage by country and region. Researchers are able to track the overall annual publishing for every country of the world, the libraries that collect and even import a country’s works, the “foreign” monographs their libraries import, and the proportion of publications in various official and native languages; in addition, the interaction between a culture’s languages and the rest of the world is captured in data on translated works across time.
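As a rough illustration of the limits described above, the following Python sketch filters a handful of made-up records and tallies them by country, language, and year. It is not the project's extraction code: the record fields (`kind`, `is_text`, `date`, `country`, `language`), the category labels, and the function names are all hypothetical, and treating 1000 as a hard lower bound is an assumed reading of the text.

```python
from collections import Counter

# Assumed reading of the text: dates from 1000 A.D. up to, but not including, 2010.
MIN_YEAR = 1000
MAX_YEAR = 2010

# Hypothetical labels for the material types the project leaves out.
EXCLUDED_KINDS = {"serial", "dissertation", "government document"}


def parse_year(raw):
    """Return the year as an int, or None if the date is not fully numeric."""
    text = raw.strip()
    # Partial dates such as "19xx" contain non-digits and are rejected.
    if not (text.isascii() and text.isdigit()):
        return None
    return int(text)


def in_scope(record):
    """Apply the material-type and date limits to one (hypothetical) record."""
    if not record["is_text"] or record["kind"] in EXCLUDED_KINDS:
        return False
    year = parse_year(record["date"])
    return year is not None and MIN_YEAR <= year < MAX_YEAR


def tally(records):
    """Count in-scope records by (country, language, year)."""
    counts = Counter()
    for record in records:
        if in_scope(record):
            key = (record["country"], record["language"], parse_year(record["date"]))
            counts[key] += 1
    return counts


# Made-up sample records, for illustration only.
sample = [
    {"kind": "book", "is_text": True, "date": "1987", "country": "fr", "language": "fre"},
    {"kind": "book", "is_text": True, "date": "1987", "country": "fr", "language": "fre"},
    {"kind": "book", "is_text": True, "date": "19xx", "country": "fr", "language": "fre"},
    {"kind": "book", "is_text": True, "date": "2010", "country": "de", "language": "ger"},
    {"kind": "dissertation", "is_text": True, "date": "1999", "country": "de", "language": "ger"},
    {"kind": "book", "is_text": True, "date": "1000", "country": "it", "language": "lat"},
]

counts = tally(sample)
```

With these sample records, only the two 1987 books and the book dated 1000 are counted; the “19xx” date, the 2010 date, and the dissertation are dropped.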
The results provide a global overview of the publishing arts, and a wealth of case studies in single countries’ practices in both literary publishing and the preservation of their literary heritage.

Relationship to Other OCLC Research Activities

This effort is one of several data mining projects whereby OCLC Research seeks to extract intelligence from the data we have, and use it in different ways that provide value to libraries.

Duration
Outputs
Team Members

Last update: 11 August 2009.