Historical digitization services have been a part of preservation plans for decades, but the technology has improved dramatically. Optical character recognition (OCR) makes the text of a scanned image searchable, which gives your collection the potential to become a functional research tool for anyone viewing your digital library.
Converting historical scans can be a complicated process. The Federal Agencies Digital Guidelines Initiative (FADGI), in its 2016 guidelines, notes that “without staff with a good technical foundation, achieving the appropriate level of quality . . . is problematic. Cultural heritage digitization is a specialization within the imaging field that requires specific skills and experience.” Depending on the condition of your digital collection, it may be more cost-effective to outsource OCR services than to handle them in-house.
Bringing Out the Text from Scanned Images
Before starting any digitization plan, it is vital to know the quality of your existing images. Older image files, or those created without quality equipment, may no longer be suitable for preserving a collection into the future. FADGI publishes specific quality guidelines for determining whether digital scans are suitable for OCR or other information processing techniques.
Many early scanning efforts do not offer the resolution or clear detail that OCR software needs to read text. In that case, any attempt to use image-to-text conversion services requires new scans made with updated equipment. That can get expensive if your organization doesn’t already have scanners or digital cameras capable of creating images of sufficient quality, or the staff to perform the scans. Your organization may find it more economical to hire a company that offers both scanning and image-to-text conversion services than to buy expensive equipment.
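As a rough illustration of the resolution check described above, the following Python sketch estimates the pixels per inch (ppi) of an existing scan from its pixel dimensions and the physical size of the original page. The function names and the 300 ppi default are assumptions chosen for illustration, not figures taken from FADGI. Consult the guidelines for the threshold that applies to your material.

```python
def effective_ppi(pixel_width, pixel_height, width_in, height_in):
    """Return the lower of the horizontal and vertical pixels-per-inch values."""
    if width_in <= 0 or height_in <= 0:
        raise ValueError("physical dimensions must be positive")
    return min(pixel_width / width_in, pixel_height / height_in)


def needs_rescan(pixel_width, pixel_height, width_in, height_in, min_ppi=300):
    """Report whether a scan falls below a chosen resolution threshold.

    min_ppi is a hypothetical placeholder; substitute the value your
    quality guidelines require.
    """
    return effective_ppi(pixel_width, pixel_height, width_in, height_in) < min_ppi


# An 8.5 x 11 inch page stored as a 1275 x 1650 pixel image
low_res = needs_rescan(1275, 1650, 8.5, 11)

# The same page stored as a 2550 x 3300 pixel image
high_res = needs_rescan(2550, 3300, 8.5, 11)

print(low_res, high_res)
```

Resolution is only one measure of quality. A scan that passes this check can still be unsuitable for OCR because of poor focus, skew, or low contrast.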
Why Should You Outsource OCR Services?
If your scans are of suitable quality to proceed without problems, or need only minimal adjustments, then you can immediately begin converting your historical scans to readable text. OCR software is available for purchase, and a single employee can digitize your collection. Before you commit that employee to days of conversion work, though, consider this warning from FADGI: “avoid the trap of assuming doing the work in-house will cost less. Insourcing may cost more than outsourcing.”
Even if you don’t need to purchase new scanners or digital cameras for your digitization project, it can still be beneficial to outsource OCR services. For all that OCR software is capable of, it still reads text like a computer, and that can mean numerous errors in the conversion process. If your project requires a reasonably accurate rendering of the text, an employee must check each potential error the software flags. If your project requires a high level of accuracy, another pass may be needed to review the text against the scanned image manually, word for word. All of this increases the employee time you must devote to the project.
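To make the review effort described above more concrete, here is a minimal Python sketch that estimates staff time for both kinds of review pass. It assumes your OCR software reports a confidence score between 0 and 1 for each word. The `OcrWord` type, the 0.80 threshold, and the per-word timings are all hypothetical values for illustration, not measurements.

```python
from dataclasses import dataclass


@dataclass
class OcrWord:
    text: str
    confidence: float  # 0.0 to 1.0, as reported by a hypothetical OCR engine


def flagged_words(words, threshold=0.80):
    """Return the words whose confidence falls below the threshold."""
    return [w for w in words if w.confidence < threshold]


def review_minutes(words, threshold=0.80, seconds_per_flag=15.0,
                   full_pass_seconds_per_word=0.0):
    """Estimate review time in minutes.

    seconds_per_flag covers checking each flagged word against the scan.
    full_pass_seconds_per_word covers an optional word-for-word pass over
    the whole text; leave it at 0.0 to skip that pass. Both timings are
    assumptions to be replaced with your own measurements.
    """
    flagged = len(flagged_words(words, threshold))
    total_seconds = (flagged * seconds_per_flag
                     + len(words) * full_pass_seconds_per_word)
    return total_seconds / 60


# Made-up sample output for a four-word line of text
sample = [
    OcrWord("Received", 0.97),
    OcrWord("frorn", 0.41),   # likely misread of "from"
    OcrWord("the", 0.99),
    OcrWord("Sccretary", 0.63),
]

flags_only = review_minutes(sample)
with_full_pass = review_minutes(sample, full_pass_seconds_per_word=7.5)
print(flags_only, with_full_pass)
```

Multiplying an estimate like this across every page in a collection, and then by an hourly staff cost, gives a figure you can compare against an outsourcing quote.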
Anderson Archival’s historical digitization services provide you with staff already proficient in this process. Our employees can perform the same tasks with better resources and without the downtime of learning new software or which errors to watch for. This can save your organization money and resources in the long run.