CONTENTdm®

Member stories

Pull your archives out of the basement and share them with the world

Discover how South Perth Public Library made its archive accessible to the public while still protecting the original artwork and historic photographs.

Increase accessibility to digitized special collections

Hear how The Seattle Public Library balanced online accessibility for its digital collections with limited IT and staff resources.

Frequently asked questions

Database and metadata

Is CONTENTdm a relational database? What is the underlying management system?

CONTENTdm uses a text-based search engine built using Internet standards and protocols. It is optimized for fast text querying capability. This provides great flexibility in metadata support and fast performance for large collections. CONTENTdm supports text searches within or across multiple text-based metadata fields, enabling rich metadata searching within or across collections. You do not need to purchase or support any additional databases to run CONTENTdm.

Does CONTENTdm allow data to be imported from an existing application?

Data can be imported from other systems using a tab-delimited format. This facilitates batch import of existing collection items and metadata from Microsoft Excel, Access or other common data programs that support export of their data in delimited format. Newspapers, monographs and e-books with metadata in METS/ALTO XML format can be imported using the CONTENTdm Flex Loader.
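As a rough illustration of preparing a tab-delimited file for import, the sketch below writes a few records with Python's standard `csv` module. The field names (`Title`, `Creator`, `File Name`) and the sample records are hypothetical, not a CONTENTdm schema, and the whitespace cleanup is an assumed precaution rather than a documented requirement.

```python
import csv
import io

# Hypothetical records; the field names are illustrative only.
records = [
    {"Title": "Main Street, 1910", "Creator": "Unknown", "File Name": "img001.jpg"},
    {"Title": "Town Hall\tfront view", "Creator": "J. Smith", "File Name": "img002.jpg"},
]

def to_tab_delimited(rows, fieldnames):
    """Return rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        # Collapse stray tabs and newlines so every value stays in its own
        # column and every record stays on one line.
        cleaned = {key: " ".join(str(value).split()) for key, value in row.items()}
        writer.writerow(cleaned)
    return buffer.getvalue()

text = to_tab_delimited(records, ["Title", "Creator", "File Name"])
print(text)
```

Collapsing embedded tabs before writing keeps the column count the same on every line, which is the property a delimited import generally depends on.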

Does CONTENTdm allow data to be exported into a nonproprietary format such as XML?

Yes. CONTENTdm supports the export of data to XML or tab-delimited text format. CONTENTdm provides custom XML export options so users can define specific fields to be exported and designate the format for each exported field, including repeating fields and customization of XML tag names. In addition, CONTENTdm collection metadata is accessible through OAI, if you choose to enable this function. This gives you a number of flexible options to link to other software, such as online catalog systems.

Can records be integrated into our local Web OPAC?

Yes. You can upload the metadata from your CONTENTdm server to WorldCat using the Digital Collection Gateway, a service included with your CONTENTdm license. Each time you upload metadata to WorldCat using the Gateway you are provided a WorldCat Sync report with the OCLC numbers of the records in WorldCat that correspond to the items in your CONTENTdm collection. You can use Connexion client and the list of OCLC numbers from the WorldCat Sync report to create a local save file of MARC records from WorldCat to load into your local system. If your ILS supports it, OAI can also be used to harvest from CONTENTdm into your ILS.

Does CONTENTdm support OAI?

CONTENTdm is fully compliant with OAI-PMH version 2.0. This includes collection-level selection to enable you to choose collections for harvesting on designated servers. Support is also provided for OAI flow control (resumption tokens), which permits large harvests of collections to be broken into smaller batches for more reliable network transmission.
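The flow control described above can be sketched from the harvester's side. The example below follows the OAI-PMH 2.0 specification rather than any CONTENTdm-specific behavior: it reads the `resumptionToken` element from a `ListRecords` response and builds the follow-up request. The base URL `https://example.org/oai`, the sample response, and the function name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

def next_request_url(base_url, response_xml):
    """Return the URL for the next batch, or None when the harvest is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(".//oai:resumptionToken", OAI_NS)
    # A missing or empty token means there are no further batches.
    if token is None or not (token.text or "").strip():
        return None
    # In OAI-PMH 2.0, resumptionToken is an exclusive argument:
    # only the verb is sent alongside it.
    query = urlencode({"verb": "ListRecords", "resumptionToken": token.text.strip()})
    return f"{base_url}?{query}"

# Hypothetical, heavily trimmed response used only to exercise the function.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:demo/1</identifier></header></record>
    <resumptionToken>demo-token-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

print(next_request_url("https://example.org/oai", SAMPLE))
```

A harvester would repeat this step, fetching each returned URL in turn, until the function returns `None`.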

Does CONTENTdm support the Metadata Encoding and Transmission Standard (METS) schema?

Yes. XML data in the METS/ALTO format can be imported using CONTENTdm Flex Loader. For export, CONTENTdm has an open and accessible data structure, and all items stored in CONTENTdm are described in XML, so it is easy to export data and use in many ways. For example, one of our users, the California Digital Library, developed a tool that transforms CONTENTdm XML exports into METS files conforming to a specific METS profile that they use for import into their state-wide system. The tool can be customized to produce METS files from any kind of standardized XML document.

How is the metadata stored and indexed? Have you developed your own search software?

All metadata in CONTENTdm is stored in XML. It is indexed using a text-based database developed by OCLC. The database uses an optimized search engine (indexing words and phrases) and has been designed to scale to handle billions of records. The CONTENTdm search engine is the same search engine that powers WorldCat and is fast, flexible and accurate.

Does CONTENTdm support controlled vocabularies or thesauri?

Yes, CONTENTdm offers controlled vocabulary for consistent, uniform metadata entry. The software includes ten (10) integrated thesauri from OCLC Terminologies Service:

  • Art & Architecture Thesaurus (AAT) ®
  • Canadian Subject Headings (CSH)
  • Dublin Core Metadata Initiative Type Vocabulary
  • Getty Thesaurus of Geographic Names (TGN) ®
  • Guidelines On Subject Access To Individual Works Of Fiction, Drama, Etc., 2nd ed., form and genre
  • Māori Subject Headings / Nga Ūpoko Tukutuku
  • Medical Subject Headings (MeSH®) 2010
  • Newspaper Genre List
  • Thesaurus for Graphic Materials
  • Union List of Artist Names (ULAN) ®

Additionally, you can import or develop your own, custom controlled vocabularies.

How are the digital files stored?

CONTENTdm stores digital items in file directories on the server. The individual items are accessible through a text-based index (database) that points to the items. This allows CONTENTdm to serve the items quickly, make them available with a unique URL, and allow extensive metadata records for each item.

Can databases be distributed?

Collections can be stored on distributed drives and can be administered remotely through a Web-based Administration interface. This enables multiple, distributed groups to collaborate on digital collection building. Users can submit new items and metadata descriptions through remote CONTENTdm Project Clients, Connexion client using Connexion digital import, through a Web browser using a simple Web form, through remote CONTENTdm Flex Loaders, or through the CONTENTdm Catcher Web service.

Are there any limitations on the number of collections, files, number of metadata fields or field lengths?

CONTENTdm can scale to handle billions of items. The maximum number of collections is 400. The maximum number of metadata fields a user can create for each collection is 125. The maximum number of characters supported in a single metadata field is 128,000. Other practical limitations may result from your hardware limitations, such as available disk space.
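The limits above lend themselves to a quick check before loading data. This is a minimal sketch, assuming metadata is held as a dictionary of field name to text and that the character limit is counted in Python string characters; how CONTENTdm itself measures field length is not stated here, and the function name is hypothetical.

```python
MAX_FIELDS_PER_COLLECTION = 125   # limit quoted above
MAX_CHARS_PER_FIELD = 128_000     # limit quoted above

def check_record(record):
    """Return a list of problems found in one metadata record (field -> text)."""
    problems = []
    if len(record) > MAX_FIELDS_PER_COLLECTION:
        problems.append(
            f"{len(record)} fields exceeds the limit of {MAX_FIELDS_PER_COLLECTION}"
        )
    for name, value in record.items():
        if len(value) > MAX_CHARS_PER_FIELD:
            problems.append(
                f"field '{name}' has {len(value)} characters "
                f"(limit {MAX_CHARS_PER_FIELD})"
            )
    return problems

print(check_record({"Title": "Harbor view", "Transcript": "x" * 128_001}))
```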

Search functions

What is the CONTENTdm searching method?

CONTENTdm provides text search capability across user-defined fields and multiple collections. CONTENTdm also has a browse capability that allows users to view all the items in a collection. Searches can be performed on a single field or multiple fields in a collection and across multiple collections on a CONTENTdm Server. Each collection item is identified by text-based metadata. Additionally, CONTENTdm offers Unicode searching, relevancy sorting, spelling suggestions and faceted searching. For more information about searching, see end user experience.

File formats

Does this software handle books, periodicals and other documents?

CONTENTdm allows you to create items consisting of multiple elements, such as books with multiple pages, postcards, newspapers and multiple views of an object, so users' search results will bring up entire entities rather than just individual elements. Full-text searching of documents is also supported.

What file types does CONTENTdm support?

CONTENTdm can store any file format. It can also display any file format that can be displayed in your browser either natively or via a plug-in. This includes all common formats such as JPEG, GIF, or TIFF images, WAV or MP3 audio files, AVI or MPEG video files, PDF files, EAD Finding Aids and even URLs. Large-format image collections also benefit from the JPEG2000 capability available with CONTENTdm. XML data in the METS/ALTO format can be imported using CONTENTdm Flex Loader. Audio and video files that are H.264 encoded and Flash compatible will play in-line when selected by an end user. Other audio and video file formats can be stored but will play only via a plug-in or a browser capability that supports the format.

Can CONTENTdm be customized to accommodate slides as well as digital images?

CONTENTdm can handle different media types within a single collection so you can combine slides and photos, or other media such as text, video, or audio. You also can create separate collections, if necessary. Additionally, CONTENTdm can search across collections if they reside on a single server.

Authentication

Does CONTENTdm have a provision for encrypting or protecting images from being copied without permission?

Images can be protected by restricting access to them using the CONTENTdm security features. Additionally, the Image Rights options enable you to band, brand, or watermark images with copyright information or a logo. There is no facility in CONTENTdm to prevent images from being saved by users viewing them in a Web browser. In general, any image that can be viewed in your browser can be captured and saved.

What kind of security or access controls does CONTENTdm offer?

CONTENTdm supports both collection- and item-level security. Access to items and collections can be restricted based on operating system user names or IP addresses. You also can choose to require permissions for items and metadata or to set permissions so that metadata is available to all users but permissions are required to view the associated file.

Does CONTENTdm support authentication via LDAP?

CONTENTdm relies on the underlying Web server for authentication services. The Apache LDAP authentication module enables authentication via LDAP. Consult the Apache/LDAP documentation for details.

Platforms

What platforms are supported?

You can run CONTENTdm either on your own hardware or via CONTENTdm Hosting Services. If you choose to run CONTENTdm on your own hardware, both Windows and Linux are supported. Note that regardless of whether you go with Hosting Services, Windows, or Linux, you will still need a Windows machine to run the Project Client.

Hosting Services

What does CONTENTdm Hosting Services include?

OCLC staff will install your software and configure your initial CONTENTdm Web site for you on OCLC servers. Your CONTENTdm collections are then accessible through Web browsers for administration and access, as well as additional collection creation.

The Hosting Services option is for CONTENTdm organizations that prefer not to allocate the personnel or hardware needed to run CONTENTdm on their own local servers. Hosting Services offer operational support and reliability at an affordable price.

Hosting Services includes:

  • A CONTENTdm Web interface for your end users, which you can customize to meet your needs using the website configuration tools
  • Upgrades to new releases of CONTENTdm
  • Daily backups of your hosted data and site customizations
  • Monitoring of all system hardware 24 hours a day, every day
  • Proactive, automated website availability checks

All of this is provided from OCLC's enterprise data centers, which operate in an ISO 27001 certified environment. More information about OCLC’s data management and operations is available on the OCLC security policies page.

Miscellaneous

How open is the API? Can we develop custom interfaces to the CONTENTdm Server?

CONTENTdm has a well-defined, web-based query API that allows for development of custom interfaces. If your interface is used simply for accessing your media collection through basic queries, customization is very straightforward. As of version 6, CONTENTdm provides a configuration tool that lets you brand and tailor your website without programming. The API is available to any user who wants to make advanced customizations to their website. To see examples of the thousands of collections and ways the CONTENTdm community uses CONTENTdm, see CONTENTdm in action.

If our images are already in a database, how difficult will it be to move them to CONTENTdm?

It’s simple to load an existing database into CONTENTdm. If you can export your existing text description information into a tab-delimited file (most databases have this capability) and can identify one of the fields as the file name of the corresponding image, you can easily load data and items into CONTENTdm using the data import tools.

What standards does CONTENTdm support?

OCLC's adherence to commonly accepted standards allows CONTENTdm to be open and extensible, as well as provide functionality that meets a wide range of needs. CONTENTdm supports numerous industry standards including Unicode, Z39.50, Qualified Dublin Core, VRA, XML, JPEG2000, OAI-PMH and METS/ALTO.

  • Unicode—CONTENTdm fully supports Unicode, an industry standard that allows computers to represent and manipulate text in most of the world's non-Western languages, including Chinese, Japanese, Korean, Greek and Hebrew, among others.
  • Z39.50—CONTENTdm is Z39.50 compatible through a free open source software application called ZContent developed by the University of Utah Marriott Library. ZContent provides access to digital collections on CONTENTdm servers from library portals and local catalogs and can be downloaded for free from the CONTENTdm User Support Center.
  • Dublin Core and VRA Core—Use of the Dublin Core and VRA (Visual Resources Association) Core within CONTENTdm allows for a common language when describing media and searching across collections. Collection Administrators also can apply their own field descriptions and map back to the Dublin Core standard to provide flexible searching.
  • OAI-PMH—CONTENTdm Servers support OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) by functioning as OAI repositories for those who wish to make their metadata available for harvesting.
  • XML—XML is used for all internal metadata and structure description. CONTENTdm also offers custom XML export of metadata that supports user-defined fields and formats for greater compatibility with local catalog systems and other applications.
  • METS/ALTO—The import of XML data in the METS/ALTO format is supported for NDNP newspapers, CCS newspapers, and CCS monographs and ebooks.

Do you support Z39.50?

The University of Utah Marriott Library has developed open source software that adds Z39.50 compatibility to any CONTENTdm Server. The purpose of this software is to provide access to digital collections on CONTENTdm Servers from library portals and local catalogs. ZContent is a Perl script and module that provides a Z39.50 target for the CONTENTdm Server. ZContent processes incoming Z39.50 requests, converts them to appropriate CONTENTdm requests, and returns the results in either USMARC or XML format.

Does CONTENTdm support languages other than English?

CONTENTdm fully supports Unicode and therefore supports entering, storing, displaying, and searching text in all Unicode languages.

You can easily localize CONTENTdm websites to support languages other than English. Currently, localizations are available in English, French, Spanish, German, Dutch, Chinese (simplified and traditional) and Thai. Users can localize to support other languages by editing an XML file in Translation Memory eXchange (TMX) format.

Does CONTENTdm support EAD?

Yes, EAD finding aids can be added to CONTENTdm collections. Metadata is automatically extracted from EAD files based on an organization's custom metadata map. End users have two viewing options for the EAD items, one with a navigable table of contents, or display of the full EAD record in a single view. This full EAD view can be customized with an XSL file. EAD records are fully text searchable and search terms are highlighted within the EAD content.

What is an item? What is a compound object? How are they counted against my license level?

An item is any digital file that has been added to a CONTENTdm collection, such as a photograph, a page in a book, a dissertation in PDF format, or one side of a postcard. The metadata describing a single item accompanies the single item, and together they are counted as one in the total number of items.

For example, if you have 500 photographs, each photograph (image with associated metadata) counts as one item. Therefore, 500 items are added to the collection and counted in the CONTENTdm license level total.

A compound object consists of two or more files bound together with an XML structure that enables the end user to retrieve them as a single object. Compound objects can be documents, books, the front and back of postcards, or picture cubes (six-sided views of three-dimensional objects). Each of the individual images or pages, as well as the resulting compound object itself, has associated metadata and is included in the item count.

For example:

  • If you have 10 two-sided postcards, that would be counted as 30 items in the CONTENTdm license level total (i.e., 2 images with metadata plus 1 compound object with metadata = 3 items per postcard).
  • If you have scanned 20 diaries of 100 pages each, that would be counted as 2,020 items in the CONTENTdm license level total (i.e., 100 images with metadata plus 1 compound object with metadata = 101 items per diary).

A special case is a multi-page PDF. Regardless of whether a multi-page PDF is added as-is, or converted to a compound object when added to CONTENTdm, it is counted as 1 item. Though the collection administrator may decide to convert a multi-page PDF into a compound object for better page-level discovery, it still counts as only 1 item.
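The counting rules above can be expressed as a small calculation. This sketch simply restates the rules from this section; the function and parameter names are hypothetical and it is not an official license calculator.

```python
def license_item_count(single_items=0, compound_objects=(), multipage_pdfs=0):
    """Estimate the license item count.

    single_items:     number of standalone items (each counts as 1)
    compound_objects: iterable of file counts, one entry per compound object
                      (each file counts as 1, plus 1 for the object itself)
    multipage_pdfs:   number of multi-page PDFs (each counts as 1, whether or
                      not it is converted to a compound object)
    """
    total = single_items
    for file_count in compound_objects:
        total += file_count + 1
    total += multipage_pdfs
    return total

print(license_item_count(single_items=500))              # 500 photographs
print(license_item_count(compound_objects=[2] * 10))     # 10 two-sided postcards
print(license_item_count(compound_objects=[100] * 20))   # 20 diaries of 100 pages
```

The three calls reproduce the worked examples in this section: 500, 30 and 2,020 items respectively.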

How can I tell how many items and objects are in a collection?

CONTENTdm provides reports that give collection administrators information about the item count, compound object count, file types, and build history.

What functionality does the OCR Extension include?

The CONTENTdm OCR Extension enables you to integrate OCR (Optical Character Recognition) with your digital collection building. The OCR process converts a text-based image file (either a TIFF or JPEG file) to a corresponding ASCII text file, which is then full-text searchable.

Use the OCR Extension to generate full-text transcripts from text-based image files. The OCR Extension can be added to any new or existing CONTENTdm license and is included with the purchase of some license levels.

It also includes support for 184 languages, including Chinese, Japanese, Korean, Greek, Russian and Hebrew, among others.

The OCR Extension uses ABBYY’s award-winning FineReader OCR software to capture text for addition to searchable metadata fields within CONTENTdm collections. With this feature, end users’ search words are highlighted in the image when viewed.

Additionally, if you want to make printable PDFs available to end users for easy printing, you can choose to generate a PDF of an entire compound object using the OCR Extension. Whether applied to select items in a collection, or extensive document archives, the integrated OCR capability makes collection building more efficient.


System requirements

Please note that you can also choose to have OCLC host your CONTENTdm collections.

CONTENTdm Project Client

The CONTENTdm Project Client requires the following:

  • Microsoft® .NET Framework 3.5 or higher (or Internet connection for downloading .NET 3.5).
  • Microsoft® Windows XP with SP2 or SP3 32-bit, Windows Vista 32-bit or 64-bit, or Windows 7 32-bit or 64-bit. For sites processing large volumes of files, a 64-bit operating system is recommended.
  • Windows XP: 512 MB RAM minimum; 1 GB RAM recommended. Windows Vista and Windows 7: 1 GB RAM minimum; 2 GB RAM recommended. For sites processing large volumes of files, 4 GB RAM is recommended.
  • 2 GB of available hard-disk space for installation (a portion of this disk space will be freed after installation if the original download package is removed from the hard drive).
  • Minimum display resolution of 1024 × 768.
  • 256 Kbps or faster connection.
  • Adobe® Reader 8 or later.

CONTENTdm Website

The CONTENTdm Website requires the following:

  • Dedicated Web server (IIS 7 or 7.5 with Windows® 2008 or 2008 R2, Apache with Linux).
  • PHP 5.3 with Linux.
    Note: PHP 5.3 is required for both Server and Website if installed on the same Linux machine.
  • 1 GB RAM minimum. 2+ GB RAM recommended.
  • 1 GB of available hard-disk space for installation.

CONTENTdm Server

The CONTENTdm Server requires the following:

  • Microsoft Windows Server® 2008 or 2008 R2; Linux (2.6 kernel). Operating system must be 64-bit. Dual-core processor is required but quad-core is recommended. CONTENTdm has been tested on Red Hat Enterprise Linux/CentOS 5 & 6, Ubuntu 10.04 LTS 64-bit, and SUSE Linux Enterprise Server 10. The CONTENTdm Server has been successfully installed on other Linux distributions based on the 2.6 kernel; however, not all distributions based on that kernel are supported.
  • Dedicated Web server (IIS 7 or 7.5 with Windows® 2008 or 2008 R2, Apache with Linux).
    Note: CONTENTdm can, and usually does, coexist on systems with other websites and applications. The CONTENTdm Server may be installed on the same or on a separate machine from the CONTENTdm Website. It should have its own Web server (IIS or Apache) instance if it is installed on the same machine.
  • PHP 5.3 with Linux.
    Note: PHP 5.3 is required for both Server and Website if installed on the same Linux machine.
  • 1 GB RAM minimum. 2+ GB RAM recommended. 4 GB RAM required for Level 3 licenses, but 8-12 GB recommended for large installations, especially those with full-text transcriptions.
  • 1 GB of available hard-disk space for installation.
  • Adequate disk space to hold your collection. (For example, if you have JPEGs with an average file size of 100 KB, 500 JPEG images would require about 50 MB of disk space. Larger image files or audio/video files require additional disk space.)
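The disk space example in the last requirement is simple arithmetic and can be repeated for other collection sizes. The sketch below assumes 1 MB = 1,000 KB, which matches the rounding in that example; the function name is hypothetical, and the result covers only the stored files, not indexes or derivatives.

```python
def estimate_disk_mb(file_count, average_kb):
    """Approximate storage in MB for file_count files of average_kb each."""
    return file_count * average_kb / 1000

print(estimate_disk_mb(500, 100))  # the example above: about 50 MB
```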

Need professional digitization services?

OCLC’s digitization partners, Backstage Library Works and Creekside Digital, can manage a variety of digitization projects, and both provide a wide range of additional services, from data entry to OCR (optical character recognition) processing.


OCLC Connexion digital import

If you’d like to enable catalogers to use OCLC’s Connexion cataloging service to add digital items to CONTENTdm collections during standard cataloging workflows, consider Connexion digital import.


Training

Training for CONTENTdm is available in both instructor-led and tutorial formats. As part of our commitment to controlling costs and providing value for our members, this training is offered for free to users of the service.


Documentation | Online Forum

Comprehensive documentation, including help files, tutorials, a knowledgebase and other tools, is available online through the CONTENTdm User Support Center (login required).

The CONTENTdm User Support Center also serves as an online forum for discussions among current users and as a place to learn about quarterly releases, updates, upcoming user group meetings and other news about CONTENTdm.


Learn more

Download the brochure
Learn more about how CONTENTdm can help make your digital collections available to everyone, everywhere.

Hear from library members using CONTENTdm

Request a free, 60-day evaluation