Integrated OCR: facilitating full-text searching

The new CONTENTdm OCR Extension lets users integrate optical character recognition (OCR) with collection building. The feature uses ABBYY’s award-winning FineReader OCR software to capture text and add it to searchable metadata fields within CONTENTdm collections. When viewed, items prepared with this feature display search terms highlighted within the image. The OCR Extension can also create a PDF file of an entire compound object for easy printing. Whether applied to selected items in a collection or to extensive document archives, the integrated OCR capability makes collection building more efficient.

screen capture of the CONTENTdm interface

Highlighted search terms display in an image when metadata is prepared with the CONTENTdm OCR Extension.

The OCR Extension can be added to any new or existing CONTENTdm 4 license and is included with the purchase of some CONTENTdm license levels.

OCR Extension system requirements

OCR Acquisition Station

The OCR Acquisition Station requires:

  • Microsoft Windows 2000 Professional or Windows XP.
  • 32-bit x86 processor (Intel® Pentium® 4 class compatible processor or higher).
  • Microsoft Internet Explorer 6.0 or later.
  • At least 256 MB of RAM.
  • 100 MB of available hard disk space for installation.
  • Minimum display resolution of 1024 × 768.
  • A connection of 128 Kbps or faster.
  • Acrobat Reader 7.0 or later.