Success stories: case studies
Library of Congress, George Washington Papers
The Papers of George Washington at the Library of Congress are among the most important and valuable collections held by the Library. This one-of-a-kind collection was reformatted from the existing microfilm by Preservation Resources, using techniques that improved legibility and access relative to both the microfilm and the original documents. The paper originals are often fragile, with bleedthrough, faded text, and annotations, all problems that the digital technology has mitigated. The digital images now appear on the web, where the Library can provide increased public access to the collection while further ensuring the safety and preservation of this national historical collection.
Background
Since 1990, the Library of Congress, through the American Memory Pilot and the National Digital Library Program (NDLP), has been researching and producing digital collections of Americana from primary sources. In 1996, funds became available to digitize the papers of George Washington in their entirety from microfilm produced in the 1960s as a part of the Presidential Papers project. The collection had been organized numerous times by editors before it was deposited at the Library. It contains over 65,000 items, including letters, diaries, ledgers, and account books; at the end of scanning it numbered over 147,000 pages.
Before the collection was digitized, researchers and scholars could view it only on microfilm, either at the Library of Congress in Washington, D.C., or from a purchased microfilm copy at their own institution. To find items of interest, researchers had to browse a printed index of all material on microfilm, arranged by writer or recipient. After identifying an item, they had to locate the film and search the reel from the beginning.
Goals
Although increased access to the collection was a primary goal of the project, protection of the treasured original was also a concern. Since all the papers had been organized and microfilmed in the 1960s, the decision was made to scan the microfilm and not risk damage to originals which had been placed and bound by the Library’s Conservation Division into “Conservator’s Volumes.” This strategy to digitize from film would protect the original and also allow for off-site conversion.
Since the collection was already organized and some transcriptions of the manuscript papers existed, enhanced searching could be achieved by converting existing data to electronic text and linking it to the scanned image. Given the goal of the NDLP to provide Web-based access, images had to be small enough in file size to facilitate viewing by a wide range of users.
Solution
After issuing a request for proposals and evaluating responses, the Library of Congress chose Preservation Resources to digitize the collection. Preservation Resources offered five areas of expertise necessary for this project:
- Experience in scanning microfilm of retrospective research collections
- Ten years of experience preparing and microfilming primary source material, including manuscript collections
- A history of managing long-term projects, including those for the Library of Congress
- A complete in-house darkroom and film lab with experience in handling print master film
- A secure and environmentally-controlled storage vault for storing duplicate print master negatives
The process of capturing images from microfilm was automated, which left room for the more costly individual attention required during image editing. Grayscale scanning was required for the entire collection to capture all of the information present in the film, including difficult handwritten material, faded text, and archivists’ markings; it also helped keep text affected by bleedthrough as legible as possible. Where image quality was compromised, Preservation Resources reduplicated the microfilm reels to achieve densities that would lead to improved image quality.
At Preservation Resources, each image was examined individually as part of a quality assurance step. Custom cropping was applied to show only the item itself, excluding the conservator’s volume in which it was collected. When necessary, technicians rotated images, split multiple items on a page, and recombined pages originally filmed in segments. Preservation Resources also performed special image enhancement on some parts of the collection to increase legibility beyond that captured on the microfilm.
The 8-bit TIFF images were named according to the series and microfilm reel. Preservation Resources created a majority of the access images for this collection, seen on the Library of Congress’ American Memory web site. These images included reduced bit-depth GIFs and the JPEGs, which were the primary delivery format.
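As a rough illustration of naming master images by series and reel and pairing each with its access derivatives, the Python sketch below uses a made-up naming pattern. The pattern itself, the frame counter, and the function names `master_name` and `access_names` are assumptions for illustration only; the source does not describe the actual naming scheme used for this collection.

```python
from pathlib import PurePosixPath


def master_name(series: str, reel: int, frame: int) -> str:
    """Build a hypothetical master TIFF file name from series, reel, and frame.

    The "gw<series>_r<reel>_<frame>" pattern is an assumption, not the
    project's documented convention.
    """
    return f"gw{series.lower()}_r{reel:03d}_{frame:04d}.tif"


def access_names(master: str) -> dict:
    """Derive the names of the GIF and JPEG access copies for one master."""
    stem = PurePosixPath(master).stem  # file name without the .tif extension
    return {"gif": f"{stem}.gif", "jpeg": f"{stem}.jpg"}


if __name__ == "__main__":
    name = master_name("4", 33, 12)
    print(name)
    print(access_names(name))
```

Deriving the access names from the master name keeps the three files for one page traceable to the same series and reel, whatever pattern is actually chosen.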
Benefits
The Library of Congress realized many benefits by outsourcing the scanning of this manuscript collection from microfilm:
- Protection of the original: By scanning from pre-existing microfilm, the valuable original was not exposed to additional handling or strong light, and was not moved from the secure storage of the Manuscript Division in the Library of Congress
- Increased access: With the images available on the web, access to the collection is no longer limited by geography or building hours. The American Memory Web site receives more than 2 million hits a month
- Combination of expertise: Preservation Resources delivered the image files in the file structure required by the Library of Congress. The Library linked the manuscript text to the image files by inserting a unique identifier from the encoded text into the bibliographic database record for the document images
- Flexibility: Preservation Resources worked with the Library by investigating options in scanning the material and reporting situations that might affect other areas of the project. Sample scans, small pilots, and communication between project teams were common throughout the project
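The linking step mentioned under "Combination of expertise" can be pictured as copying an identifier from one set of records into another. The following is a minimal Python sketch under stated assumptions: the record shapes, the field names `item_key`, `text_id`, and `images`, and the sample values are all hypothetical, and the Library's actual database structure is not described in the source.

```python
def insert_text_ids(bib_records, encoded_texts):
    """Return copies of the bibliographic records with text identifiers added.

    Both inputs are lists of dicts. Matching on a shared "item_key" field is
    an assumption made for this sketch.
    """
    # Index the encoded texts by the key they share with the image records.
    ids_by_item = {t["item_key"]: t["text_id"] for t in encoded_texts}

    linked = []
    for record in bib_records:
        updated = dict(record)  # leave the caller's record unmodified
        text_id = ids_by_item.get(record["item_key"])
        if text_id is not None:
            updated["text_id"] = text_id  # the inserted unique identifier
        linked.append(updated)
    return linked


if __name__ == "__main__":
    # Hypothetical sample data: one item with a transcription, one without.
    bib = [
        {"item_key": "letter-001", "images": ["0001.jpg", "0002.jpg"]},
        {"item_key": "ledger-007", "images": ["0150.jpg"]},
    ]
    texts = [{"item_key": "letter-001", "text_id": "txt-42"}]
    for row in insert_text_ids(bib, texts):
        print(row)
```

Records without a matching transcription are passed through unchanged, which reflects the source's note that only some transcriptions of the manuscript papers existed.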
Learn more
View the images at: http://memory.loc.gov/ammem/gwhtml/gwhome.html
Find out about digitizing the collection at: http://memory.loc.gov/ammem/gwhtml/gwdigit.html