Guides to Quality in Visual Resource Imaging
2. Selecting a Scanner
© 2000 Council on Library and Information Resources
- 1.0 Introduction
- 2.0 Source Material Characterization
- 3.0 Background and Definitions of Image Quality Features for Scanner Selection
- 3.1 Tone Reproduction
- 3.2 Resolution
- 3.3 Color Reproduction
- 3.4 Noise
- 3.5 Artifacts
- 4.0 Understanding Product Specifications
- 4.1 Resolution: DPI or Image/File Size?
- 4.2 Bit Depth: Gray Levels, Shades of Gray, Millions of Colors
- 4.3 Dynamic Range: Maximum Density, # f-stops
- 4.4 Examples: Interpreting Product Literature
- 5.0 Resources and Methods for Image Quality Verification
- 5.1 Tone Reproduction or Tonal Fidelity
- 5.2 Color Reproduction or Color Fidelity
- 5.3 Resolution or Modulation Transfer Function (MTF)
- 5.4 Noise
- 5.5 Artifacts
- 5.6 Relative Importance of Image Quality Features for Different Document Types
- 6.0 Scanner Review
What is a scanner? It is more than a beige desktop box or copy stand camera. It includes the related driver software and application programs that manage it. Some may consider this a technicality, but if the history of desktop printers is a harbinger for capture devices such as digital scanners and cameras, then one must treat the triad of hardware, driver software, and application as the scanner. When choosing a scanner, all of them need to be evaluated and treated as a unit. Table 1 presents common attributes associated with each scanner component. Some of these functions may migrate between categories or from device to device. Most of them affect image quality in some way.
Table 1. Attributes of scanning components
- Number of bits per pixel
- Optics and optical path
- Page format retention
- Auto-document feed (ADF)
A scanner must be selected in the context not only of the characteristics of the object to be scanned but also of the intended use of the scanned image. There is no sense in purchasing an expensive scanner when the resulting images will be used only for Web site postings. On the other hand, creating digital master files for unknown future uses requires strict attention to detail and an understanding of how image information manifests itself and can be properly captured.
Section 2 of this guide reviews the salient categories of the source materials; namely content, format, and optical characteristics. Section 3 contains definitions of image quality features. These definitions are used as a basis for a discussion of setting minimal scanning requirements to achieve suitable image quality according to source and intent. Resources and methods to measure or judge these image quality features are described at length. Because not everyone is willing to perform image quality measurements on their own, Section 4 presents information on the interpretation of manufacturers' scanner specifications. Examples of such specifications, along with explanations, are included. The guide concludes with a review of scanner types in terms of image quality and implementation features.
2.0 Source Material Characterization
Knowing your collection and understanding the priorities for digitizing it will help you determine the type of scanner to choose. There are four classes of scanners from which to select: film scanners, cameras, flatbed or sheet-fed scanners, and drum scanners. Except for film scanners, there can be considerable overlap in the content, format, and optical characteristics that each type of device can scan. Table 2 presents source material categories according to these three features.
Table 2. Characteristics of scannable source materials
- Content: spatial detail content; color dye/pigment gamut
- Format: film (roll or sheet); flexible (film)/inflexible (glass plates)
- Optical characteristics: surface (gloss, texture); condition (flat, scratched, fragile, torn, bent)
Some types of scanners are better at capturing certain of these features than others. Benchmarking a scanner with respect to image quality features will delineate these differences. Definitions of these features and techniques for evaluating their quality are covered in the remainder of this guide.
3.0 Background and Definitions of Image Quality Features for Scanner Selection
In its purest form, image quality can be judged by the signal and noise characteristics of the image or scanning device under consideration. The ratio of signal to noise is often used as a single measure for image quality; that is, the greater the signal-to-noise (S/N) ratio the better the image quality. However, because one person's signal is another person's noise, the use of SNR as an image quality metric is difficult to manage. The interpretation of signal and noise becomes too broad and, in turn, ambiguous. S/N can be a useful measure for characterizing scanner performance; however, translating this measure into absolute image quality is difficult.
Consequently, image quality features are dealt with by more tractable imaging performance categories. There are five such categories: tone reproduction, color reproduction, resolution, noise, and artifacts. All yield objective measures that contribute to overall image quality in complex ways. For instance, a viewer does not perceive tone reproduction or resolution but rather the psychophysics of lightness, contrast, and sharpness. He or she then creates a mental preference for the image. Although these categories cannot measure image quality directly, they do serve as a good high-level model for evaluating image quality. The remainder of this section is devoted to detailed definitions of these image quality features. It will serve as a basis for further discussions on specifications and tools for scanner selection.
3.1 Tone Reproduction
Tone reproduction is the rendering of original document densities into luminances on softcopy displays or into densities in hardcopy media. It is the foundation for the evaluation of all other image quality metrics. It determines whether a reproduced image is too dark or too light, of low contrast or of high contrast, and it implicitly assumes the evaluation of neutral gray tones over large areas of an image.
The seductive beauty of a photograph by Ansel Adams or Irving Penn is primarily due not to the image content, composition, sharpness, or low noise but rather to the remarkable reproduction of tones, from gleaming highlights to deep-shadow details, with all tones in between. Tone reproduction is the welcome mat to the evaluation of all other image quality metrics. Although on the surface, tone reproduction seems a simple job of tone management, the subtleties of the viewing environment and cultural and professional preferences often make it an art.
For scanned image files, tone reproduction is somewhat of a misnomer unless a final viewing device is assumed. This is because the capture process in a scanner is simply that: a capture step. It is not a display that reproduces light. Tone reproduction, by contrast, requires both capture and display. How then does one select a scanner to accommodate the best possible tone reproduction when the scanned data generally may be reproduced on any number of display types and for a number of viewing preferences?
Three objective image-quality attributes of a scanner ultimately and universally affect all tone reproduction: the opto-electronic conversion function (OECF), dynamic range, and flare. The scanner's driver software often controls the OECF; dynamic range and flare are inherent in the hardware.
The OECF is a term used to describe the relationship between the optical density of a document and the average digital count level associated with that density, as detected by the scanner. The OECF is the first genealogical link between an original object and its digital offspring and is usually controlled by the software driver. The extent to which the driver software allows the user to control the OECF and documentation on how the driver software accomplishes this are important features to consider when selecting a scanner.
Dynamic range is the capacity of a scanner to distinguish extreme variations in density. As a rule, the dynamic range of a scanner should meet or exceed the density extremes of the object being scanned. Because specifications for dynamic range are frequently overstated, the means to objectively verify these claims should be at hand. This will be covered in Section 4.3.
Flare is non-image-forming light with little to no spatial detail content. It manifests itself by reducing the dynamic range of a device and is generally attributed to stray light in an optical system. Documents in which low densities predominate and devices requiring large illuminated areas, such as full-frame digital cameras, generally suffer from high flare. These two conditions should be kept in mind when selecting a scanner. Whenever large amounts of light, even if outside the scanner's field of view, are involved in imaging, flare may become a problem.
See also Tone Reproduction, Guide 4.
3.2 Resolution
Resolution is the ability to capture spatial detail. It is considered a small-area signal characteristic. Before the advent of electronic capture technologies, resolution was measured by imaging increasingly finer spaced target features (that is, bars, block letters, circles) and by visually inspecting the captured image for the finest set of features that was detectable. The spatial frequency of this set of features was considered the resolution of that capture process. Measured in this way, resolution values depended on the target's feature set, image contrast, and inspector's experience. The units used for reporting resolution were typically line pairs per millimeter.
The resemblance of these units to the spatial sampling rate units of a digital capture device is unfortunate and continues to be a source of confusion about just what resolution is for a digital capture device. For digital capture devices, resolution is not the spatial sampling rate, which is characterized by the number of dots per inch (dpi).
The type of measurement described above is considered a threshold metric, because it characterizes the limiting spatial detail that is just resolvable. It reveals nothing about how the lower spatial frequencies are handled in the capture process; in other words, the extent to which they are resolvable. It is largely a pass/fail criterion. Because of this shortcoming, as well as feature, contrast, and inspector dependencies, resolution measurement done in this way is not robust. A supra-threshold metric is needed that removes not only the feature set and contrast dependencies but also the inspector's subjectivity.
The modulation transfer function (MTF) is a metric that allows one to measure resolution in a way that satisfies these criteria. The MTF is a mathematical transformation of the line-spread function (LSF). The LSF is a fundamental physical characterization of how light spreads in the imaging process, and for spatial resolution measurements, it is the Holy Grail. A detailed explanation of MTF, its value, and how it is used can be found in Image Science by J.C. Dainty and R. Shaw (1974).
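The relationship between the LSF and the MTF can be seen concretely: the MTF is the normalized magnitude of the Fourier transform of the sampled LSF. The following Python sketch is illustrative only and is not from the guide; the Gaussian LSF and the function name are assumptions for demonstration.

```python
import numpy as np

def mtf_from_lsf(lsf, sample_pitch_mm):
    """Compute the MTF as the normalized magnitude of the Fourier
    transform of a sampled line-spread function (LSF)."""
    lsf = np.asarray(lsf, dtype=float)
    lsf = lsf / lsf.sum()                      # normalize LSF area to 1
    mtf = np.abs(np.fft.rfft(lsf))             # magnitude spectrum
    mtf = mtf / mtf[0]                         # force MTF(0) = 1
    freqs = np.fft.rfftfreq(lsf.size, d=sample_pitch_mm)  # cycles/mm
    return freqs, mtf

# Illustrative Gaussian LSF sampled at a 10-micron pitch; a wider
# light spread produces a faster MTF roll-off.
x = np.arange(-32, 33) * 0.01
lsf = np.exp(-x**2 / (2 * 0.03**2))
freqs, mtf = mtf_from_lsf(lsf, 0.01)
```

A real measurement would derive the LSF from a scanned edge or slit target rather than a synthetic curve.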
3.3 Color Reproduction
Color reproduction, like tone reproduction, is a misnomer for scanners because colors are only being captured, not reproduced. A more accurate term has been coined for the potential color performance or fidelity of a digital capture: the metamerism index. Work is under way within the International Organization for Standardization (ISO) to propose a metamerism index that would quantify the color-capture performance of a device relative to that of a human observer. The goal would be for the scanner to "see" colors in the same way as humans do. A metamerism index of zero would indicate equivalence between the scanner's color performance and that of a human observer. Calculation of the metamerism index requires knowledge of the device's color channel sensitivities as well as the illumination type being used, two pieces of information not normally provided by scanner manufacturers. In the absence of such a measure, a suitable surrogate for color capture fidelity, called average Delta E* (ΔE*), is often used.
ΔE* makes use of a standardized perceptual color space called CIELAB. This color space, characterized by three variables (L*, a*, and b*), is one in which equal distances in the space represent approximately equal perceptual color differences. L*, a*, and b* can be measured for any color and specified illuminant. By knowing these values for color patches of a target and comparing them with their digitized values, a color fidelity index, ΔE*, can be measured.
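The average ΔE* calculation takes only a few lines. In this illustrative Python sketch, the reference and scanned L*a*b* values are fabricated; in practice the reference values come from measurements of the target patches and the scanned values from the digitized image after conversion to CIELAB.

```python
import numpy as np

def delta_e_cie76(lab_ref, lab_scan):
    """Average CIE76 color difference (Delta E*ab) between reference
    L*a*b* values for target patches and the corresponding values
    recovered from the scanner's digitized image."""
    lab_ref = np.asarray(lab_ref, dtype=float)
    lab_scan = np.asarray(lab_scan, dtype=float)
    per_patch = np.sqrt(((lab_ref - lab_scan) ** 2).sum(axis=1))
    return per_patch.mean()

# Hypothetical reference vs. scanned values for three patches.
ref = [[50.0, 0.0, 0.0], [70.0, 20.0, -10.0], [30.0, -15.0, 25.0]]
scan = [[51.0, 1.0, -1.0], [69.0, 22.0, -9.0], [31.0, -14.0, 24.0]]
print(round(delta_e_cie76(ref, scan), 2))   # → 1.97
```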
Finally, gray-scale uniformity may be considered a form of color fidelity. Gray-scale uniformity is a measure of how well neutral tones are detected equivalently by each color channel of a scanner. Although it can also be measured with the CIELAB metric, there are often occasions where the L*a*b* values are not available. In such cases, a first step in measuring color fidelity is to examine how well the average count value of different density neutral patches matches across color channels.
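That first step, comparing average count values across channels for neutral patches, can be sketched as follows; the 4 x 4 patch data and function names here are fabricated for demonstration.

```python
import numpy as np

def channel_means(patch_rgb):
    """Mean digital count per color channel for one neutral patch
    (patch_rgb is an H x W x 3 array of counts)."""
    return np.asarray(patch_rgb, dtype=float).reshape(-1, 3).mean(axis=0)

def max_channel_mismatch(patch_rgb):
    """Largest difference between channel means; for a truly neutral
    capture this should be close to zero counts."""
    means = channel_means(patch_rgb)
    return means.max() - means.min()

# Hypothetical neutral patch whose red channel reads slightly high.
patch = np.full((4, 4, 3), 128.0)
patch[..., 0] += 3.0
print(max_channel_mismatch(patch))   # → 3.0
```

Repeating the check over patches of several densities reveals whether the mismatch is density dependent.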
3.4 Noise
For photographic film, noise is often referred to as "grain" or "granularity," since its appearance is granular or random in nature. Like film, digital scanners and cameras have sources of noise related to signal detection and amplification. The nature of this noise is similar to that of film and can be defined as unwanted pixel-to-pixel count fluctuations of a random or near-random nature.
Digital capture devices, unlike film, may also be associated with non-random or fixed-pattern noise sources. For area-array sensors, these include pixel, line, and cluster defects from the detector. For better cameras, these defects are identified at manufacturing and digitally masked in the finished image file. For linear or line-array scanners, poorly corrected sensor defects manifest themselves as streaks in the image. While these are often classified as artifacts, their effects are ultimately integrated into the noise measurement.
Just as a scanner's resolution performance can be characterized via the MTF, noise measurements can be characterized according to spatial frequency content. The term for such a measurement is noise power spectrum (NPS). The photographic community implicitly uses NPS to calculate a singular granularity noise metric by requiring that noise measurements be done under conditions that weight the noise at spatial frequencies consistent with the human visual response ( Dainty and Shaw 1974).
See also Noise, Guide 4.
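As a rough illustration of these noise measures, the sketch below computes the RMS pixel-to-pixel fluctuation of a nominally uniform patch and a simple one-dimensional noise power spectrum. The synthetic patch and the function names are illustrative assumptions, not part of the guide.

```python
import numpy as np

def rms_noise(flat_patch):
    """RMS pixel-to-pixel fluctuation of a nominally uniform patch."""
    return np.asarray(flat_patch, dtype=float).std()

def nps_1d(flat_patch, pixel_pitch_mm):
    """One-dimensional noise power spectrum: average the squared FFT
    magnitude of each mean-subtracted row of the patch."""
    p = np.asarray(flat_patch, dtype=float)
    rows = p - p.mean(axis=1, keepdims=True)
    spectra = np.abs(np.fft.rfft(rows, axis=1)) ** 2
    freqs = np.fft.rfftfreq(p.shape[1], d=pixel_pitch_mm)
    return freqs, spectra.mean(axis=0) / p.shape[1]

# Synthetic flat field: mean count 128 with Gaussian noise of sigma 2.
rng = np.random.default_rng(0)
patch = 128.0 + rng.normal(0.0, 2.0, size=(64, 64))
print(round(rms_noise(patch), 1))
```

A granularity-style single number would further weight the spectrum by the human visual response, as described by Dainty and Shaw.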
3.5 Artifacts
Artifacts are best categorized as a form of noise (correlated noise, to be specific). Because artifacts do not appear as random fluctuations, they do not fit most observers' perceptions of noise and hence are given their own image quality category. Most artifacts are peculiar to digital imaging systems. The most common are nonuniformity, dust and scratches, streaks, color misregistration, aliasing, and contouring/quantization. At low levels, for short periods of viewing, artifacts are considered a nuisance. At moderate levels they can render a digital image defective, especially once the observer has become sensitized to them. The most common types of artifacts may be described as follows:
3.5.1 Nonuniformity
Nonuniformity is a large-area fluctuation in illumination caused by uneven lighting or in-camera light attenuation such as vignetting. Nonuniformity across an image is extremely hard to detect without image processing aids; the illumination can vary as much as 50 percent from center to corner before it can be detected without aids. Flatbed scanners, drum scanners, and film scanners using linear arrays tend not to suffer from nonuniformity problems, in part because their illumination source is often accounted for at scan time. Digital cameras, however, can suffer considerable nonuniformity because of lens performance or improper illumination set-up by the user.
3.5.2 Dust and scratches
While dust is a function of scanner, document, and environment hygiene, the extent to which scratches in film or on a flatbed platen are hidden is often overlooked as a scanner selection criterion. Scratch suppression in film scanners is dependent on proper illumination design. Scratches are increasingly being suppressed after capture through scratch-detection methods and then digitally corrected with interpolation algorithms.
3.5.3 Streaks
Streaks are localized line nonuniformities in a scanned image. Because of the rectangular grid format of digital images, streaks usually occur in horizontal or vertical directions and are often more dominant in scanners using linear-array detectors. Occasionally, repetitive streak patterns, called rastering, can occur across a scanned image.
3.5.4 Color misregistration
Color misregistration is the spatial misalignment of color planes. It can occur because of poor lens performance or the optical-mechanical methods used to capture the image. It is best recognized by color fringing at high-contrast sharp edges and color scans of halftone images. It is most often a problem with inexpensive linear-array scanners. Several years ago, this artifact was not worth considering because it rarely occurred at a significant level. With the advent of less expensive parts and manufacturing shortcuts, however, color registration has become more of a problem and should be monitored.
3.5.5 Aliasing
Aliasing occurs because the sampling rate is insufficient for the spatial frequency content of the object being scanned. It occurs only in digital images. For repetitive features such as halftones or bar patterns it manifests as a moiré pattern. It is also recognized in nonrepetitive features by jagged-edge transitions ("jaggies"). The potential for aliasing can be detected by slanted-edge MTF measurements that are described in Sections 5.3 and 5.5.
3.5.6 Contouring/quantization
Contouring is defined as the assignment of a single digital count value to a range of densities that vary by more than one just-noticeable difference in density. It occurs because of insufficient bit depth in a captured image. It is most noticeable in slowly varying portions of an image and manifests itself as an abrupt and unnatural change in density. Contouring is prevented in most digital capture devices with internal bit depths of 10 bits or greater.
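Contouring can be demonstrated by requantizing a smooth gradient at a reduced bit depth, as in this illustrative sketch (the helper name is an assumption, not from the guide):

```python
import numpy as np

def quantize(gradient, bits):
    """Requantize 8-bit counts to a lower bit depth, then map the
    result back to 8-bit counts; the coarse steps appear as contours."""
    levels = 2 ** bits
    step = 256 / levels
    return (np.floor(gradient / step) * step).astype(np.uint8)

ramp = np.arange(256, dtype=np.uint8)    # smooth 8-bit ramp: 256 levels
coarse = quantize(ramp, 4)               # only 2^4 = 16 output levels
print(len(np.unique(ramp)), len(np.unique(coarse)))   # → 256 16
```

Displayed side by side, the 16-level ramp shows the abrupt, unnatural density steps described above.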
4.0 Understanding Product Specifications
After the requirements for a scanner have been defined, it would seem a simple task to review several manufacturers' product specification sheets and choose the scanner that best meets those criteria. This is certainly true in the case of easily verifiable items, where there is no ambiguity about definition (for example, power requirements, physical dimensions, and sensor type). However, for most criteria related to image quality, this is not the case. Because there are few strict or unique standards for digital capture imaging performance criteria such as resolution, dynamic range, noise, or color fidelity, a manufacturer can choose how it markets a device's capabilities. In the absence of means by which to independently verify specification sheet claims, buyers should remember two rules:
1) Approach all imaging performance claims on specification sheets with caution. They are often confusing, inflated, or misleading.
2) Generally speaking, you get what you pay for. It is wise to consider a manufacturer's reputation when selecting a scanner, although there are always exceptions.
The remainder of this section describes how commonly cited imaging performance criteria are presented in specification sheets and how to interpret this information.
4.1 Resolution: DPI or Image/File Size?
Specification sheets can offer resolution for digital capture devices in terms of spatial sampling rate or of image or finished file size.
4.1.1 Spatial Sampling Rate
Where document imaging is the presumed application, resolution is given in terms of the spatial sampling rate, which may be defined as the spatial frequency of pixels on the document. This is the case for flatbed document scanners, drum scanners, copy stand cameras, and microfilm scanners. The rate is cited as dpi, ppi (pixels per inch), or, infrequently, spi (samples per inch). The sampling rate is a necessary, but not sufficient, condition for actual detail capture in sampled imaging systems. Knowing the extent to which light spreads in a capture device, by way of the LSF or MTF, provides this sufficiency. This is why sampling-rate specifications alone do not enable the user to draw any meaningful conclusions regarding resolution performance. The common terms for sampling rate in specification sheets are optical resolution and addressable resolution.
Occasionally, document scanner resolution is specified differently in the two different directions of the scan; for instance, 600 x 1200 dpi. Although both values are considered optical resolutions, the higher one is usually achieved through a higher sampling rate that outpaces the MTF performance. The lower of these two values, associated with the sensor pixel pitch, is probably a better indicator of true detail capture abilities. Most resolution claims greater than 600 dpi should be viewed with suspicion.
One of the ways of inflating true resolution is the use of interpolated resolution. Interpolation is a powerful and appropriate tool for many image-processing needs (e.g., isolated defect concealment or benign image scaling); however, using it as a "pixel-filling" utility to inflate resolution claims is misleading at best. This is because practical interpolation methods are imperfect predictors of missing pixel values. Resolutions of 1800-9600 dpi, sometimes touted by manufacturers, are possible only with the most expensive laboratory equipment or with customized devices such as drum scanners or microdensitometers.
Prudent and successful interpolation methods are found in color filter array (CFA) digital cameras. Unlike the interpolation technique cited above, which fills in pixels where none existed before, CFA interpolation schemes rely on correlated knowledge of the color that actually was sampled at that location. Because resolution between color channels often correlates well, these methods have been shown to be almost lossless for moderate image interpolation.
For digital cameras having no document reference, resolution is specified in terms of finished file size (e.g., 18 MB), intermediate file size (e.g., 6 MB), or image sensor size (e.g., 2048 lines x 3072 pixels). The path for relating one to another requires knowledge of the number of bits per pixel per color and the total number of colors, as well as some familiarity with the sensor technologies used. This method of resolution specification can be confusing to interpret. The calculation for a finished file is as follows:
|(# lines x # pixels) x (# bits/pixel) x (# colors) x (# bytes/bit) = finished file size
|(2048 x 3072) x (8) x (3) x (1/8) = 18,874,368 bytes ≈ 18 MB
File size determination is an imperfect discipline, largely because of the loose definition that the imaging community applies to the term megabyte. Technically, a megabyte is one million bytes. The imaging community, however, has taken the nearest integer power of two and used this as a basis of calculation. Under this system, a megabyte is 2^20, or 1,048,576 bytes. Using this number as the divisor in the above equation will yield exactly 18 MB.
Occasionally, cameras with CFA color sensors capture a small intermediate file that is later processed into a larger finished file on a computer. The smaller intermediate file is often specified for purposes of file storage advantages. For example, in the above calculation, there is effectively only one color channel in the intermediate file. Therefore, the intermediate file size is only 6 MB.
Sometimes, very large file sizes are specified that are not consistent with the calculation in the equation just presented. This often occurs when 12 bit/pixel files are created. Since 8-bit (i.e., 1 byte) file storage is standardized, an extra byte is required to store the remaining 4 bits. This leaves four remaining "empty" bits. Although there are ways to "pack" these bits efficiently, it is sometimes more convenient not to do so. Therefore, the extra 4 bits/pixel tag along. They have no useful image information associated with them, but they do inflate the finished file size.
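The file-size arithmetic discussed above can be verified with a short calculation; the helper function below simply encodes the guide's equation, and the 16-bit case models 12-bit data stored unpacked in 2 bytes per sample.

```python
MB = 2 ** 20   # the imaging community's megabyte: 2^20 = 1,048,576 bytes

def file_size_bytes(lines, pixels, bits_per_pixel, colors):
    """Finished file size from the guide's equation:
    (# lines x # pixels) x (# bits/pixel) x (# colors) x (1/8 bytes/bit)."""
    return lines * pixels * bits_per_pixel * colors // 8

finished = file_size_bytes(2048, 3072, 8, 3)       # three 8-bit channels
intermediate = file_size_bytes(2048, 3072, 8, 1)   # one CFA channel only
unpacked = file_size_bytes(2048, 3072, 16, 3)      # 12 bits stored in 2 bytes

print(finished // MB, intermediate // MB, unpacked // MB)   # → 18 6 36
```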
Finally, many digital cameras with resolutions lower than 1 Mpixel cite resolution in terms of equivalent monitor resolution. Common examples are as follows:
# pixels x # lines:
- 640 x 480
- 800 x 600
- 1024 x 768
- 1280 x 1024
4.2 Bit Depth: Gray Levels, Shades of Gray, Millions of Colors
Specification sheets commonly refer to several different bit depths or to the associated number of gray levels or colors. This can be confusing. The source of this confusion often lies in whether the manufacturer is citing
- the number of bits for one color channel,
- the number of bits for all color channels, or
- the number of internal bits versus the number of finished-file bits.
Although the first two are easily defined, the third needs to be tracked carefully. The primary relation between the number of potential gray levels per color channel and the number of bits per color channel (N) is
# gray levels = 2^N
For example, an 8 bit per color channel device would potentially yield a maximum of
256 gray levels = 2^8 (8 bits/color channel)
The relation between the number of potential colors and the number of bits per channel (N) and number of color channels (C) is
# of potential colors = 2^(C x N) (e.g., more than 16 million colors = 2^(8 bits/channel x 3 color channels) = 2^24)
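These two relations are easy to check numerically, as in the following sketch (the function names are illustrative):

```python
def gray_levels(bits_per_channel):
    """Potential gray levels for one channel: 2^N."""
    return 2 ** bits_per_channel

def potential_colors(bits_per_channel, channels):
    """Potential colors across all channels: 2^(C x N)."""
    return 2 ** (bits_per_channel * channels)

print(gray_levels(8))            # → 256
print(potential_colors(8, 3))    # → 16777216, i.e., more than 16 million
```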
For artifact-control purposes, almost all digital capture device manufacturers capture the initial raw data with more internal bits than will be reported to the user in the finished file. This is common engineering practice. As an example, internal capture (that is, A/D conversion) and processing at 10 bits/pixel/color channel are common. It is not until the end of the internal processing chain that the data are converted to 8 bits/pixel/color channel. For a three-color scanner, this means that 30 bits/pixel (10 bits/pixel x 3 color channels) are maintained initially and finally reported as 24 bits/pixel (8 bits/pixel x 3 colors).
Increasingly, manufacturers are citing internal bits as a means of distinguishing their product without revealing that the bits are inaccessible. This means, for instance, that billions (e.g., 2^30) of potential colors are claimed for some scanners even though users cannot realize them. This approach holds even for binary scanners (1 bit/pixel). The initial internal capture is done at 8 bits/pixel. This extra bit depth is then used to make intelligent thresholding decisions for optimal binary image quality. This has always been the practice, but only recently has it been cited in specification literature.
The greater number of bits accessed brings with it not only the obvious trade-off of increased storage requirements but also the less obvious trade-off of scan time. Some manufacturers that allow access to imagery at various bit depths cite fast scan times associated with the lowest bit depth. Access to higher bit-depth imagery will require longer scan times and will lower productivity.
Bit-depth specifications do not necessarily provide information about the quality of the signal being digitized. Are the bits being used to digitize image data or noise? In all scanners, portions of the bit capacity are used to correct for nonuniformities in the detector. Scanners using inexpensive parts often require a larger portion of the total bit depth for detector compensation. The bits used for this compensation are not usable for image data, but the user has no way to know this.
4.3 Dynamic Range: Maximum Density, # f-stops
Dynamic range is the density range over which a capture device is operational. Two device characteristics, flare light and detector noise, limit dynamic range; however, nearly all scanner manufacturers specify dynamic range as if neither existed in their product. They do so through the following equation, which relates the number of bits per color channel (N) to dynamic range:
Dynamic range (density) = -log10[1/(2^N - 1)]
This equation is a theoretical calculation that assumes no practical imaging effects such as flare or imager noise. Table 3 lists dynamic ranges and their corresponding number of bits per finished file according to the equation given above.
Table 3. Relationships between dynamic range and bits per finished file
# bits (# f-stops)    Dynamic range (density)
8                     2.4
10                    3.0
12                    3.6
14                    4.2
16                    4.8
One should be skeptical whenever these numbers are cited as "dynamic range values." They are probably unachievable given that they are based on theoretical calculations. Values slightly removed from these (e.g., 3.1 or 2.8) may be better indicators of performance because of their nonconformity.
Manufacturers sometimes list maximum density values alongside dynamic range specifications (e.g., "Dynamic Range = 3.0, Maximum Density 3.1"). These maximum density values are biased slightly higher than the stated dynamic range. The manufacturer takes advantage of minimum film and paper base densities and calibrates the scanner so that gray levels are not wasted on densities below them. In this way, the gray levels can be used to encode higher densities. Density biases typically range between 0.10 and 0.30.
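The theoretical dynamic-range equation above can be tabulated directly; the sketch below reproduces the flare- and noise-free values that specification sheets typically quote.

```python
import math

def theoretical_dynamic_range(bits):
    """Dynamic range (density) = -log10[1/(2^N - 1)]: the theoretical
    upper bound, assuming no flare light and no detector noise."""
    return -math.log10(1.0 / (2 ** bits - 1))

for n in (8, 10, 12, 14, 16):
    print(n, round(theoretical_dynamic_range(n), 1))
```

Comparing a quoted dynamic range against this bound for the stated bit depth is a quick plausibility check: a claim at or above the bound is almost certainly theoretical rather than measured.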
4.4 Examples: Interpreting Product Literature
What follows are actual product specifications selected from digital camera or scanner promotional literature. They are used as examples of how to interpret certain vendor claims. The ones cited are common in many performance specifications.
From a digital camera specification:
- 6-million-pixel (2036 x 3060) resolution
- 18-MB image file
- 12 bits/color
The product of 2036 x 3060 yields 6,230,160 pixels, or roughly 6 million pixels. This camera uses a CFA array, so the intermediate, unprocessed file is 6 Mpixels. Assuming 8 bits/pixel bit depth and three colors, the 18-MB file is confirmed. The confusing part is that 12 bits/pixel, not 8 bits/pixel, are specified. This is because 12 bits/pixel is meant to indicate the internal bit depth. The finished file bit depth is 8 bits/pixel.
From another camera specification:
12-bit ADCs for 8+ f-stops of dynamic range
This suggests that there are 12 bits of internal precision but only 8 bits (i.e., 8 f-stops) are available in the finished file.
From a 35-mm film scanner:
2592 x 3894 pixels (24.3 x 36.5 mm)
For an 8.5" x 11" document that is photographed to just fill the 35-mm frame, what is the equivalent sampling rate on the original document?
The maximum coverage for the document requires that the 8.5" dimension match the 24.3-mm dimension of the film. The quotient of (2592 / 8.5") calculates as 305 dpi.
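The same arithmetic can be scripted; note that the quoted 305 dpi corresponds to an 8.5-inch document dimension spanning the 2592-pixel side of the frame (2592 / 8.5 ≈ 305). The function name is illustrative.

```python
def equivalent_dpi(pixels, document_inches):
    """Sampling rate on the original when `pixels` span the document
    dimension of `document_inches` at maximum frame coverage."""
    return pixels / document_inches

# Letter-size original, 8.5" dimension mapped to the 2592-pixel side.
print(round(equivalent_dpi(2592, 8.5)))   # → 305
```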
From an inexpensive flatbed desktop scanner:
- 36-bit color quality
- (700 x 1400) dpi and (1400 x 2800) dpi optical resolution with dual-lens design.
- 3.3 dynamic range
Most flatbed scanners that quote different sampling rates in the horizontal and vertical dimensions give 600 dpi as the lower value. The lower figure is typically a good indicator of real resolution performance. The fact that 700 dpi is claimed as the lower value is cause for an MTF analysis. Because of the incremental increase relative to 600 dpi, it is likely the claim is true. When manufacturers overstate product performance, they usually do so strongly. The even higher claim of 1400 dpi is likely to have something to do with the dual-lens design. Sampling rates higher than 600 dpi are largely unnecessary for most reflection objects; the spatial frequency content is just not available. Only for transmissive objects, such as film, which can support higher spatial frequencies, would one need sampling rates above 600 dpi.
The 36-bit color quality indicates that 12 bits per color channel (i.e., 3 colors x 12 bits) are used for data encoding. No clues are given whether this is available in the finished file. However, the 3.3 dynamic range is consistent with 11 bits/pixel, not 12. It is unclear why the lower dynamic range is quoted, although it makes the data more believable. Only testing with gray patch densities can verify this claim.
Reflection densities greater than 2.5 are extremely hard to find. Only for reflection objects with gross ink laydowns, such as silk-screened graphics or ink-jet documents, do densities reach these levels.
5.0 Resources and Methods for Image Quality Verification
As pointed out in the previous section, users should generally read manufacturers' product specifications with a high degree of skepticism. Naturally, this leads one to ask, "What resources and methods are available to measure or monitor image quality features of digital scanners?" This section attempts to answer this question by suggesting target, software, standards, and literary resources to do so. Many tools for monitoring image quality features are incomplete, subjective, or nonexistent. In such cases, suggestions are made on targets that can be captured now and analyzed when the tools become available.
The resources needed to monitor image quality for digital capture devices are no different from those needed for conventional imaging methods: appropriate targets and a means of evaluating the images of those targets. In every case, the target's characteristics should meet or exceed the demands of the particular feature being tested. Gray-scale targets for characterizing tone reproduction and dynamic range should be neutral and have a wide density range. Resolution or MTF targets need to contain fine detail. Color-reproduction targets should have a wide gamut of colors. When using targets of any kind to characterize imaging performance, the user should keep in mind that any shortcomings in the target itself will be reflected in the final scanner measurement. To mitigate this, the supplier or user should characterize the target with respect to the image quality feature being measured. To the extent possible, targets should also be consistent with the characteristics of the originals that will be scanned since, for example, the color characteristics of photographic dyes are very different from those of printers' inks.
When capturing target images for image quality analysis, one must document all scanner and driver software conditions. Failing to do so makes the results ambiguous, because the driver software can manipulate data from digital capture devices in nearly infinite ways before it is available for use. (This is another reason to think of a scanner as a "hardware-driver-application" triad.) The engineering, scientific, and standards communities prefer to have all image quality measurement captures done with all driver settings in null states because the "raw" nature of the data is fundamental to the imaging process. Typically, these null conditions are as follows:
- gamma = 1.0
- no sharpening
- no compression
- no descreening or blurring
- no automatic scene adjustments
- no tone or color manipulations
These settings will probably not result in a visually pleasing image and are useful only for unfettered analysis. For more practical field situations, the user may prefer to standardize a set of default driver settings that are consistent with everyday use. Whatever choice is made, it is essential to document the settings.
The captured digital images of the targets should be evaluated both qualitatively and quantitatively. Qualitative evaluation on a high-quality, calibrated display is especially useful for quickly checking obvious scanning errors and for monitoring image quality features for which quantitative measures are nonexistent or unreliable. Quantitative evaluation is done with software that allows for unhidden, unaltered, and easy access to the data or that has been tested to yield reliable image quality metrics. Few specific or dedicated software tools are currently available to measure many of the image quality features discussed here; however, several are being planned. In the meantime, generalized image tools such as Adobe® Photoshop, NIH Image (http://rsb.info.nih.gov/nih-image/index.html), IP Lab Spectrum (http://scanalytics.com), and Scion Image (http://scioncorp.com) provide a great deal of flexibility for image evaluation, albeit with more tedium.
The following section provides suggestions on the target and software resources needed to properly evaluate specific features. A good reference for performing alternate or similar tests of these features is Desktop Scanners: Image Quality Evaluation (Gann 1999). When performing the evaluations, it is essential to keep track of the software driver settings used.
OECF, dynamic range, and flare can all be characterized by capturing and analyzing neutral gray-scale patches that vary from dark to light. The software tools required for each feature are the same; the differences lie in the target format.
A target for characterizing OECF may be found in a tool that is available at photographic supply stores. It is a simple row or matrix of gray patches like that in a Macbeth® Color Checker, a Kodak® neutral gray-scale, or an IT8 target that tracks the generalized gray level response of a scanner.
Figure 1: Kodak's version of IT8.7/2
By knowing the optical densities of each patch and interrogating the digital file for the average count value associated with it, the user can derive the OECF by plotting the patch density-count value data pairs. To avoid ambiguity, a minimum of 12 patches, spaced nearly equally in density, should be used. Generally, the OECF curve should be smooth, such as that pictured in fig. 2. If not, illumination nonuniformities may be the cause.
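With generalized image tools, deriving the OECF reduces to averaging the counts inside each patch and pairing the averages with the known densities. A minimal Python sketch using NumPy; the patch coordinates and the idealized response are hypothetical:

```python
import numpy as np

def patch_means(image, boxes):
    """Average count value inside each patch region (row0, row1, col0, col1)."""
    return np.array([image[r0:r1, c0:c1].mean() for r0, r1, c0, c1 in boxes])

def oecf_pairs(densities, mean_counts):
    """Sort (density, mean count) pairs by density and flag a non-monotonic
    response, which may indicate illumination nonuniformity."""
    order = np.argsort(densities)
    d = np.asarray(densities, float)[order]
    c = np.asarray(mean_counts, float)[order]
    monotonic = bool(np.all(np.diff(c) <= 0))  # counts fall as density rises
    return list(zip(d, c)), monotonic

# Idealized demonstration: a scanner whose counts track density exactly.
densities = np.arange(0.1, 2.31, 0.2)   # 12 patches, nearly equal spacing
counts = 255 * 10 ** (-densities)       # hypothetical mean counts per patch
pairs, smooth = oecf_pairs(densities, counts)
print(smooth)  # True for this idealized monotonic response
```

Plotting the returned pairs (density on one axis, mean count on the other) yields the OECF curve described above.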
With a little planning, a similar target can be used to extract dynamic range. Finely incremented gray patches in the high- and low-density portions of the scale need to be included, because these extreme densities determine the dynamic range. For reflection copy, the densities should range from 0.10 to 2.50. For transmission applications, they should range from D-min to 3.50. The high-density increments should be about 0.10, while the low-density increments should be about 0.02. Because of the extreme densities, targets of this nature are not readily available commercially and may have to be generated by the user. This can be done for reflection media by obtaining individual neutral Munsell patches or by generating these densities onto photographic paper, ink-jet media, or dye-sublimation media and pasting up one's own matrix of density patches. The densities of the patches must be characterized on a densitometer before being used.
Once the dynamic range target is made, the image is captured, and the data are examined, a plot of patch density-count value data pairs is done as described above. From this plot, one generally finds that no change in count value occurs at the extreme densities, although densities continue to increase or decrease. These are called "zero-slope" conditions. The difference between the extreme high and low densities where this zero-slope condition occurs is called the "dynamic range." A zero-slope condition can be seen at the high-density portion of fig. 2. No further decrease in average count level occurs above a density of 1.35. This is an indication of the effective dynamic range for that scanner at the settings used.
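The zero-slope test can be expressed programmatically. A sketch, assuming the patch data are available as density/count pairs; the slope tolerance is an assumption of ours, not a standard value:

```python
import numpy as np

def effective_dynamic_range(densities, counts, slope_tol=0.5):
    """Density span over which the scanner still responds: drop leading and
    trailing 'zero-slope' patches, where the count change per density step
    falls below slope_tol (an assumed tolerance)."""
    order = np.argsort(densities)
    d = np.asarray(densities, float)[order]
    c = np.asarray(counts, float)[order]
    active = np.abs(np.diff(c)) > slope_tol   # True where counts still change
    if not active.any():
        return 0.0
    idx = np.nonzero(active)[0]
    return float(d[idx[-1] + 1] - d[idx[0]])

# Hypothetical scanner that stops responding above a density of 1.35,
# like the zero-slope condition described for fig. 2.
d = np.arange(0.1, 2.51, 0.2)
c = 255 * 10 ** (-np.minimum(d, 1.35))
print(round(effective_dynamic_range(d, c), 2))
```

For this simulated response, the function reports a span of about 1.4 density units rather than the full 2.4 covered by the target.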
The best way to maintain maximum dynamic range is to have access to the raw internal data at the internal bit-depth level. Whether or not these data are available to the user depends on the vendor. Some vendors supply special firmware that makes the data available. The disadvantage of this proposition is that all processing of the data becomes the responsibility of the user. Nevertheless, its value as an archive file is substantial.
Flare can be measured with a single gray patch, but it needs to be captured in two image frames. The patch should have a density of about 2.0 and should cover no more than 2 percent of the capture device's area of interest. One frame is captured with the density patch in a white surround (high-flare condition), the other with the patch in a dark surround (low-flare condition). The difference in the patch's average count value between the two frames is an indicator of flare: the greater the difference, the greater the flare. Flare limits dynamic range. It is typically not a major problem for document scanners that illuminate small portions of the document at a time, but for scanners that illuminate the entire document, flare is a potential problem. To the author's knowledge, no suitable targets are commercially available to measure flare.
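Quantifying the flare indicator from the two frames is a one-line computation. The sketch below assumes the patch pixels have already been cropped from each capture; the count values are hypothetical:

```python
import numpy as np

def flare_indicator(patch_white_surround, patch_dark_surround):
    """Difference in the patch's mean count between the white-surround
    (high-flare) and dark-surround (low-flare) captures."""
    return float(np.mean(patch_white_surround) - np.mean(patch_dark_surround))

# Hypothetical crops of the same ~2.0-density patch from the two frames.
white_frame_patch = np.full((20, 20), 14.0)   # flare lifts the counts
dark_frame_patch = np.full((20, 20), 12.0)
print(flare_indicator(white_frame_patch, dark_frame_patch))  # 2.0
```

A result near zero suggests the device is well controlled for flare; larger values indicate a correspondingly reduced usable dynamic range.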
See also Flare, Guide 4.
The best way to measure color fidelity for capture devices is to use a metamerism index; this particular color metric, however, is still under development. The ΔE* metric cited in Section 3 is recommended as a substitute. It requires the capture of a color target with known L*a*b* values, such as an IT8 or Macbeth® Color Checker target. The methods and tools for evaluating ΔE* are outlined in Desktop Scanners: Image Quality Evaluation (Gann 1999). At a minimum, gray-scale balance should be ensured by generating channel-specific OECF curves from neutral gray patches and checking for equivalence between each color channel's curve (see fig. 2). In that example, the individual color channel OECFs do not align with one another; the red and blue channels cross. In many ways, crossed curves are worse than misaligned curves because the color changes at the crossover point, and it is this change in color that is most noticeable upon viewing.
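As an illustration of the substitute metric, the CIE 1976 color difference is simply the Euclidean distance between a patch's reference L*a*b* value and the value measured from the scan. A minimal sketch with hypothetical patch values:

```python
import math

def delta_e_1976(lab_ref, lab_meas):
    """CIE 1976 Delta E*: Euclidean distance in L*a*b* space."""
    return math.dist(lab_ref, lab_meas)

# Hypothetical target patch: reference value vs. the scanner's measurement.
reference = (50.0, 0.0, 0.0)
measured = (52.0, 1.0, -1.0)
print(round(delta_e_1976(reference, measured), 2))  # 2.45
```

In practice one would average (or take the maximum of) ΔE* across all patches of the target to summarize color fidelity.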
The spectral sensitivity of the imaging sensor and the spectral distribution of the illumination source drive color fidelity. These items are available from the manufacturer, but the user needs to ask for them. Once received, they should be recorded as image quality features of essential value for future use. Alternately, the vendor may have International Color Consortium (ICC) color profiles available. These profiles ease the job of color reproduction for any type of supported output device or display.
There are several techniques and associated targets for measuring MTF. Only the two supported by publicly available software are considered here. The first is the sine wave technique. Sine wave targets of varying spatial frequencies and formats, on both reflection and transmission media, can be purchased from Sine Patterns (http://www.sinepatterns.com). The software and documentation for analyzing the images of these targets can be found on the Web site of the Mitre Corporation (http://www.mitre.org).
The second technique, the slanted-edge technique, is an accepted ISO standard (ISO 12233, Photography - Electronic still picture cameras - Resolution measurements) for measuring electronic camera MTFs. Documentation on its benchmark testing has been published (Williams 1998). Targets for applying this technique can also be purchased through Sine Patterns. Analysis software in the form of an Adobe® Photoshop plug-in or Matlab® code can be found at the Photographic and Imaging Manufacturers Association's Web site (http://www.i3a.org/). A tutorial on the utility and purpose of MTFs can be found in RLG DigiNews, Volume 2, Issue 1 ("What is an MTF . . . and Why Should You Care?").
Measuring noise can be as simple as capturing a single digital image of a grayscale step tablet and calculating the standard deviation of the pixel count values contained within each gray patch. A plot of pixel standard deviation (Y-axis) versus mean count value (X-axis) within that patch, similar to the plot in fig. 3, is a good starting point.
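That calculation is straightforward with any of the generalized tools mentioned earlier. A Python sketch; the patch coordinates and count values are hypothetical:

```python
import numpy as np

def noise_points(image, boxes):
    """For each patch region (row0, row1, col0, col1), return the
    (mean count, standard deviation) pair behind a plot like fig. 3."""
    points = []
    for r0, r1, c0, c1 in boxes:
        patch = image[r0:r1, c0:c1].astype(float)
        points.append((patch.mean(), patch.std(ddof=1)))
    return points

# Tiny deterministic example: one 2x2 "patch" with counts 100 and 102.
image = np.array([[100, 102], [100, 102]])
(mean, std), = noise_points(image, [(0, 2, 0, 2)])
print(mean)           # 101.0
print(round(std, 3))  # 1.155
```

Plotting standard deviation against mean count for every patch of the step tablet produces the noise-versus-signal curve described above.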
Listed here are typical image-processing and application operations associated with scanner drivers and their tendency to either increase or decrease the measured noise. These are general trends; exceptions do apply. For example:
- Increases measured noise: aggressive color management; contrast or gamma increase
- Decreases measured noise: median or low-pass filtering; contrast or gamma decrease
Any target nonuniformities or textures will inflate the measured noise. This is especially true for document scanners, which are likely to resolve these textures, and less so for digital cameras, which are less likely to resolve them. Such texture can be considered fixed-pattern noise associated with the target and should not be attributed to the scanner. Specific ways to separate noise sources are outlined in the proposed ISO noise measurement standard for digital cameras (ISO 15739, PIMA/IT10). The principles can also be applied to any digital capture device.
The targets used for identifying scanner artifacts are rather simple. They are usually uniform gray patches of varying densities that cover the entire scan area or repetitive feature patterns found on various resolution charts or halftone patterns. To date, qualitative analysis by a trained observer using a good display, along with flexible software having features such as zoom, threshold, histogram, false-color, and movie options, is a suitable methodology for detecting artifacts. Though quantitative values can be placed on these artifacts, robust software with which to do so is unavailable. Perhaps more than anything, the lack of artifacts and the ability to handle those that do exist distinguish good scanners from excellent ones.
Illumination nonuniformity and sensor defects can be detected by examining a capture of a uniform gray patch that spans the entire scan area. Histograms allow one to analyze the data objectively. The wider the histogram, the greater the nonuniformity. Threshold or false-color tools give one a visualization of the rate of nonuniformity or the emergence of defects. Grayscale ramps or wedges are extremely useful for identifying streaking or contouring artifacts in conjunction with the threshold function, contrast adjustment and false-color tools.
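The histogram-width check lends itself to a quick numeric summary. A sketch using quantiles to ignore a few outlier pixels; the 99 percent coverage figure is an assumption of ours:

```python
import numpy as np

def histogram_width(flat_field, coverage=0.99):
    """Spread of the central `coverage` fraction of counts in a capture of a
    uniform gray patch; a wider spread means greater nonuniformity."""
    tail = (1.0 - coverage) / 2.0
    lo, hi = np.quantile(np.asarray(flat_field, float), [tail, 1.0 - tail])
    return float(hi - lo)

perfect = np.full((16, 16), 128.0)           # ideally uniform capture
shaded = perfect + np.linspace(0, 6, 16)     # hypothetical illumination falloff
print(histogram_width(perfect))              # 0.0
print(histogram_width(shaded) > 0.0)         # True
```

Tracking this width across the scan area over time also gives an early warning of emerging sensor defects.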
While analytical color misregistration tools are available, a visual assessment of misregistration at vertical and horizontal edge transitions can be made by flickering between color channels with movie modes or toggle switches (PIMA/IT10). By noticing how much edges move as the channels are changed, a quick assessment of color misregistration can be made. Misregistration should be measured at several sample rates.
Aliasing, which manifests itself as moiré patterns in repetitive image features, can be detected by scanning these features and noting the visibility of the moiré. The user must ensure that the resulting image is displayed on the monitor at 100 percent or greater enlargement; evaluation at other enlargement settings will lead to false impressions of moiré caused by the display, not the scanner. The potential for aliasing can also be determined analytically through the MTF: any significant MTF response beyond one-half the sampling frequency should be considered a potentially aliased condition.
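The analytic check in the last sentence can be coded directly from tabulated MTF data. A sketch; the 10 percent significance threshold is our assumption, not a standard value:

```python
import numpy as np

def aliasing_potential(freqs, mtf, sampling_freq, threshold=0.10):
    """True if any MTF response above `threshold` exists at spatial
    frequencies beyond half the sampling frequency (the Nyquist limit)."""
    freqs = np.asarray(freqs, float)
    mtf = np.asarray(mtf, float)
    beyond_nyquist = freqs > sampling_freq / 2.0
    return bool(np.any(mtf[beyond_nyquist] > threshold))

# Hypothetical 600 dpi scanner: MTF sampled at a few frequencies (cycles/inch).
freqs = [100, 200, 300, 400]
print(aliasing_potential(freqs, [0.9, 0.6, 0.3, 0.15], 600))  # True
print(aliasing_potential(freqs, [0.9, 0.6, 0.3, 0.05], 600))  # False
```

The MTF data would come from the sine wave or slanted-edge measurements described in the previous section.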
A document's content largely determines the relative importance of each image quality feature for any particular capture device. For instance, documents containing only bitonal black-and-white information demand little in the way of grayscale or color reproduction performance from a scanner. Table 4 rates the relative importance of the image quality features discussed earlier in this section for different document types. It is meant only as a guide; individual situations and environments may require some tuning of these ratings.
Table 4. Relative Importance of Image Quality Features in Various Document Types
*Rated on a scale of 1 to 5, where 5 is "very important," 3 is "moderately important," and 1 is "not important."
There are five types of digital capture devices from which to choose: flatbed scanners, sheet-fed scanners, drum scanners, cameras, and film scanners. Each has been mentioned in this guide. This section is meant as a review of pros and cons of each device with respect to image quality as well as practical issues such as productivity, cost, and skill level.
Table 5. Pros and cons of digital capture devices
Flatbed scanners
- Pros: highly addressable; many units can handle both transmission and reflection materials; flexible software drivers; most are good up to 600 dpi of real resolution; low learning curve
- Cons: low productivity and frequent document handling; tendency toward streaking and color misregistration; prone to inflated marketing claims

Sheet-fed scanners
- Pros: high productivity; image quality as good as or better than flatbed scanners; many automatic features
- Cons: unsuitable for fragile, bound, wrinkled, 3-D, or inflexible objects; more expensive than flatbed scanners; may not handle all sizes of documents

Drum scanners
- Pros: very high image quality (high dynamic range, good tone/color fidelity); very flexible software drivers; variable sampling rate
- Cons: high operator skill level required; handles limited document types (must be mountable on the drum)

Digital cameras
- Pros: can handle a variety of document/object types (3-D, bound, glass plates, non-flat, oversized); unlimited field size; rapid capture for area arrays; may have interchangeable lenses; generally good image quality
- Cons: good models expensive; limited sensor size; low productivity for linear-array types; nonuniformity artifacts common; area-array devices prone to low dynamic range due to flare; moderate skill level required

Film scanners
- Pros: highly productive for roll film; low flare and good dynamic range for linear arrays
- Cons: low productivity for sheet film or slides; potential for high flare in area-array devices; dust/scratch artifacts common; image quality characterization difficult owing to lack of targets