Scanning Specifications


The NAL bases its scanning specifications on guidelines from the National Archives and the Federal Agencies Digitization Guidelines Initiative. A brief overview is provided below.

Scanning must create a single Tagged Image File Format (TIFF) file per page, capturing all pages including those intentionally left blank. Scans should not be cropped or de-skewed. TIFF image files must conform to the TIFF 6.0 specification. The conversion of document pages must be as follows:

| Pages in Original Document ... | Resolution (pixels per inch/samples per inch)* | Bit Depth | Compression |
|---|---|---|---|
| are blank or contain only text, charts, or line art | 600 | 1-bit bitonal | ITU-T Group 4 |
| contain color | 400 | 24-bit Red, Green, Blue Mode (RGB) | none |
| contain half-tone prints in shades of gray or black and white photographs | 400 | 8-bit grayscale | none |
| are larger than 11 by 16 inches and contain color | 300 | 24-bit RGB | none |
| are larger than 11 by 16 inches and contain half-tone prints in shades of gray or black and white photographs | 300 | 8-bit grayscale | none |

*Note: In selected cases, the resolution for color and grayscale images may be increased beyond the standards listed in the above table.
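
The conversion rules in the table can be read as a simple lookup. The following is a minimal sketch in Python, not an NAL tool: the names `ScanSettings` and `scan_settings` are hypothetical, and two cases the table does not address are handled by assumption (a page containing both color and grayscale content is treated as color, and an oversized page with only text or line art falls back to the 600 ppi/spi bitonal row).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanSettings:
    """Baseline settings for one page (hypothetical container type)."""
    resolution_ppi: int
    bit_depth: str
    compression: str


def scan_settings(has_color: bool, has_grayscale: bool, oversized: bool) -> ScanSettings:
    """Return the baseline table values for a page.

    oversized means the page is larger than 11 by 16 inches.
    Assumption: color takes precedence when a page has both color
    and grayscale content; the table does not cover that case.
    """
    if has_color:
        return ScanSettings(300 if oversized else 400, "24-bit RGB", "none")
    if has_grayscale:
        return ScanSettings(300 if oversized else 400, "8-bit grayscale", "none")
    # Blank pages, text, charts, and line art.
    # Assumption: applied to oversized pages too, which the table does not list.
    return ScanSettings(600, "1-bit bitonal", "ITU-T Group 4")


text_page = scan_settings(has_color=False, has_grayscale=False, oversized=False)
large_map = scan_settings(has_color=True, has_grayscale=False, oversized=True)
```

These are baseline values only; as the note above states, color and grayscale resolutions may be increased in selected cases.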



Machine-readable Text


When the TIFF files are processed for display in the NAL Digital Collections (NALDC), a machine-readable text file is created through optical character recognition (OCR). The OCR text is stored in an Extensible Markup Language (XML) file together with the metadata for the image. The presence of the OCR text enables full-text searching of the images in the NALDC. The NALDC also gives users the option to view the images as a PDF. This PDF file is created on the fly and is not text searchable.
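
To illustrate how OCR text stored alongside image metadata can support full-text searching, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The element names (`page`, `image`, `title`, `ocrText`), the sample values, and both helper functions are hypothetical; the actual NALDC XML schema is not described here.

```python
import xml.etree.ElementTree as ET


def build_page_record(image_name: str, title: str, ocr_text: str) -> ET.Element:
    """Build one XML record holding metadata and OCR text (hypothetical layout)."""
    page = ET.Element("page")
    ET.SubElement(page, "image").text = image_name
    ET.SubElement(page, "title").text = title
    ET.SubElement(page, "ocrText").text = ocr_text
    return page


def page_matches(page: ET.Element, query: str) -> bool:
    """Case-insensitive substring search over the record's OCR text."""
    text = page.findtext("ocrText", default="")
    return query.lower() in text.lower()


# Illustrative values only, not taken from the NALDC.
record = build_page_record(
    "page_0001.tif",
    "Example bulletin",
    "Soil conservation practices for small farms",
)
xml_string = ET.tostring(record, encoding="unicode")
found = page_matches(record, "soil conservation")
```

Because the search runs against the XML record rather than the image, an on-the-fly PDF built only from the images would carry no searchable text, which is consistent with the behavior described above.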


About the Resolution of Images Available in the NALDC


Images are scanned according to the specifications outlined above and stored on a server. When images are copied to the NALDC, the resolution for grayscale images is changed to 200 ppi/spi and for color images to 100 ppi/spi in order to reduce the size of the files stored in the repository. If you require a higher resolution image than the one available in the NALDC, please call 301-504-6503 or e-mail Special Collections.
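
The effect of the resolution change on pixel dimensions can be worked out with simple arithmetic. The sketch below assumes the physical page size is preserved and the pixel dimensions are scaled in proportion to the resolution; how the NALDC actually resamples images is not stated, and the function names are hypothetical.

```python
def naldc_target_ppi(mode: str) -> int:
    """Display resolution stated above for each image mode."""
    targets = {"grayscale": 200, "color": 100}
    return targets[mode]


def resampled_pixels(width_px: int, height_px: int,
                     scan_ppi: int, target_ppi: int) -> tuple[int, int]:
    """Scale pixel dimensions by target_ppi / scan_ppi (assumed proportional resampling)."""
    scale = target_ppi / scan_ppi
    return round(width_px * scale), round(height_px * scale)


# An 8.5 by 11 inch page scanned at 400 ppi is 3400 by 4400 pixels.
gray = resampled_pixels(3400, 4400, 400, naldc_target_ppi("grayscale"))  # (1700, 2200)
color = resampled_pixels(3400, 4400, 400, naldc_target_ppi("color"))     # (850, 1100)
```

Under these assumptions, a 400 ppi grayscale scan keeps one quarter of its pixels and a 400 ppi color scan keeps one sixteenth, which is why the copies in the repository are much smaller than the stored originals.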


