OCR Extension

Integrate OCR with collection building

The CONTENTdm OCR Extension enables you to integrate OCR (Optical Character Recognition) with your digital collection building. The OCR process converts a text-based image file (either a TIFF or JPEG file) to a corresponding ASCII text file, which is then full-text searchable.

Use the OCR Extension to generate full-text transcripts from text-based image files. The OCR Extension can be added to any new or existing CONTENTdm license and is included with the purchase of some license levels.

The OCR Extension supports 184 languages, including Chinese, Japanese, Korean, Greek, Russian and Hebrew.

Make your text-based images full-text searchable

The OCR Extension uses ABBYY’s award-winning FineReader OCR software to capture text and add it to searchable metadata fields within CONTENTdm collections. When end users view an image from their search results, their search terms are highlighted in the image.

[screen capture]

Highlighted search terms display in an image when metadata is prepared with the CONTENTdm OCR Extension.
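The highlighting idea can be illustrated with a minimal Python sketch. It is not CONTENTdm's or ABBYY's implementation: the `OcrWord` structure and the `find_highlights` function are hypothetical names, and the sketch simply assumes that OCR output provides each recognized word together with its pixel bounding box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrWord:
    """One recognized word and its pixel bounding box (hypothetical structure)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def find_highlights(words, query):
    """Return the bounding boxes of OCR words that match any term in the query."""
    terms = {term.casefold() for term in query.split()}
    hits = []
    for word in words:
        # Strip surrounding punctuation so "Columbus," still matches "columbus".
        normalized = word.text.strip(".,;:!?\"'()[]").casefold()
        if normalized in terms:
            hits.append((word.left, word.top, word.width, word.height))
    return hits


# Illustrative OCR output for one line of a scanned page (made-up coordinates).
page = [
    OcrWord("Columbus,", 40, 100, 120, 24),
    OcrWord("Ohio", 170, 100, 60, 24),
    OcrWord("1967", 240, 100, 58, 24),
]
boxes = find_highlights(page, "ohio columbus")
```

Each returned tuple describes a rectangle that a viewer could draw over the page image. Matching here is exact and case-insensitive; a production search engine would typically add stemming and phrase handling.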

Create printable PDFs

To give end users an easy way to print, you can use the OCR Extension to generate a PDF of an entire compound object. Whether applied to selected items in a collection or to extensive document archives, the integrated OCR capability makes collection building more efficient.

We are a worldwide library cooperative, owned, governed and sustained by members since 1967. Our public purpose is a statement of commitment to each other: that we will work together to improve access to the information held in libraries around the globe, and find ways to reduce costs for libraries through collaboration.