Diving into File Types

By Digital Archivist Shana Scott

Whenever we start a project, we ask our clients what format they want the final images in. Unless they are an organization that must follow specific standards, the answer is generally, “Whatever you think is best.” That’s because most people don’t know what to ask for. While you’ve no doubt seen these types of files, you might not understand the differences between them. So, let’s get a little technical and talk about file types.

File Type Terminology

Before we start, let’s define some of the concepts we’ll be discussing below.

  1. Archival Master: This is the high-quality file that represents the item. It is used when making official reproductions. This will be a large file that is kept safe from digital alterations and not used often.
  2. Access Copy: This is a lower-quality copy made from the Archival Master, and there can be any number of them. These can be cropped, digitally altered, and compressed into small files that are easy to send over email or load on websites. Most images you see are what we would call access copies.
  3. File Type: This is the format of a file, shown by the extension that comes after the file’s name. You have probably seen .pdf, .docx, .jpeg, etc. These letters tell the computer how to translate the bits and bytes into a picture or document. It’s why a .pdf will open in Adobe Acrobat or a .docx opens in Microsoft Word. Without an extension, the computer may not know which program should open the file.
  4. Compression: Compression is a way to reduce the file size and make it easier to use by getting rid of some file data or restructuring it. If you’ve ever had a .zip file that asked you to extract the files, that’s compression. Here, we’ll be referring to image compression, or compression within a file type rather than multiple files being compressed together.
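
The File Type definition above can be illustrated with a short Python sketch. Besides the extension, many formats also begin with a recognizable byte signature. The helper name `sniff_format` is hypothetical, and the sketch only knows the three formats covered in this article, so treat it as an illustration rather than a full format detector.

```python
# Leading byte signatures for the three formats discussed in this article.
SIGNATURES = [
    (b"II*\x00", "TIFF"),        # TIFF, little-endian byte order
    (b"MM\x00*", "TIFF"),        # TIFF, big-endian byte order
    (b"\xff\xd8\xff", "JPEG"),   # JPEG start-of-image marker
    (b"%PDF-", "PDF"),           # PDF header
]

def sniff_format(data: bytes) -> str:
    """Return the format suggested by the first bytes, or "unknown"."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return "unknown"

# Made-up file contents that begin with a TIFF signature. Even if this
# file were named "scan.jpg", the bytes would still identify it as TIFF.
mislabeled = b"II*\x00" + b"\x00" * 16
print(sniff_format(mislabeled))
```

The extension is a label that tells the computer which program to try, while the signature is part of the file's actual contents.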

Image Files

When we digitize a book or a collection of photos, we turn the physical item into a digital image file, just like your phone does when it takes a picture. There are several file types an image can be, each with its own pros and cons, but our standard delivery includes only two: TIFF and JPEG. You may see these written in different ways (e.g., .tif, .tiff, .jpg, .jpeg), but for now we’ll simplify everything to TIFF and JPEG.
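
As a small illustration of that simplification, the Python sketch below maps the different spellings of each extension to a single format name. The function name `format_from_name` and the sample filenames are made up for this example.

```python
from pathlib import Path
from typing import Optional

# The spellings mentioned above, mapped to the two delivery formats.
EXTENSION_TO_FORMAT = {
    ".tif": "TIFF",
    ".tiff": "TIFF",
    ".jpg": "JPEG",
    ".jpeg": "JPEG",
}

def format_from_name(filename: str) -> Optional[str]:
    """Return "TIFF" or "JPEG" based on the extension, or None for anything else."""
    extension = Path(filename).suffix.lower()  # lowercased so ".TIF" matches ".tif"
    return EXTENSION_TO_FORMAT.get(extension)

for name in ["page_001.TIF", "page_001.tiff", "cover.jpg", "cover.jpeg", "notes.docx"]:
    print(name, format_from_name(name))
```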

We have two goals in choosing these file types:

  1. Provide high-quality images that can be used for any purpose, including reproduction of the originals, and will be useable into the future—the archival master
  2. Provide easy-to-use images that can be shared online or over email—the access copy

TIFF

TIFF files achieve goal one because TIFF is an openly documented format for still images. Its specification is published by Adobe, and standards from the International Organization for Standardization (ISO), such as TIFF/IT (ISO 12639), are built on it. Because the format is so widely documented and supported, TIFF images are likely to remain readable well into the future, which is very important given the speed at which hardware, software, and proprietary formats can become obsolete. A file is useless if in ten or twenty years no one can open it. Many standards, such as the guidelines from the Federal Agencies Digital Guidelines Initiative (FADGI), which are widely used by archives in the US, call for a stable file type like TIFF for archival master copies of a digital collection.

TIFF is also a good format for high-quality printing, as it is typically stored uncompressed or with lossless compression. That means that no image data is lost when saving the file. Unfortunately, that gives TIFF images a large file size, which makes them hard to share online or over email.
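
Lossless compression can be sketched in a few lines of Python. The example uses the standard library's `zlib` module (Deflate compression) as a stand-in for the lossless options a TIFF file can use, and the "image" is a made-up 64 x 64 grayscale gradient rather than a real scan.

```python
import zlib

# Stand-in image data: a 64 x 64 grayscale gradient, one byte per pixel.
pixels = bytes((x + y) % 256 for y in range(64) for x in range(64))

compressed = zlib.compress(pixels, 9)   # 9 = strongest compression level
restored = zlib.decompress(compressed)

print("original bytes:  ", len(pixels))
print("compressed bytes:", len(compressed))
print("identical after round trip:", restored == pixels)
```

The compressed data is smaller, yet decompressing it gives back every byte of the original. That round trip is what "lossless" means.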

JPEG

JPEG solves TIFF’s size problem and achieves goal two. The small file size of JPEG files makes them ideal for sending over email or loading quickly on a website. They get that small file size by using lossy compression. That means data is removed from the image in order to reduce file size, and once the data is gone, it can never come back. That makes JPEG files ideal access copies for everyday use, while the archival master files stay safely stored where they won’t be altered.
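
The idea behind lossy compression can be sketched with a deliberately simplified Python example. This is not the JPEG algorithm, which is considerably more sophisticated. The hypothetical `quantize` function simply rounds each pixel value down to a multiple of 16, which throws away fine detail so the data compresses better.

```python
import zlib

def quantize(pixels: bytes, step: int = 16) -> bytes:
    """Round each pixel value down to a multiple of `step`, discarding detail."""
    return bytes((p // step) * step for p in pixels)

# Stand-in image data: one pixel of every possible gray value, 0 through 255.
original = bytes(range(256))
lossy = quantize(original)

shades_before = len(set(original))
shades_after = len(set(lossy))
size_before = len(zlib.compress(original))
size_after = len(zlib.compress(lossy))

print("distinct shades:", shades_before, "->", shades_after)
print("compressed size:", size_before, "->", size_after)
print("original recoverable:", lossy == original)
```

Many different original values end up as the same rounded value, so nothing in the lossy copy says which original each one came from. That is why the removed data can never come back, and why the access copy should always be made from the archival master rather than the other way around.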

PDF/A

There’s one more file type to discuss: PDF/A. You’re probably familiar with the extension .pdf. It’s a versatile file type that can handle images, documents, and searchable text. We tend to use it for materials like books, magazines, and letters, but not for photos or photo albums.

While PDFs are very common, archives use PDF/A, though both end with the .pdf extension. This file type has one distinct feature that a regular PDF doesn’t: it will always display exactly the same way no matter what software is used. This is because PDF/A contains all the information needed to show the file as it was intended. Fonts, images, layout, everything is self-contained within the file, making it the ideal format for archival work. It’s also an ISO standard, so it will be usable in the future.
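
Because both kinds of file share the .pdf extension, the name alone cannot tell them apart. A PDF/A file declares itself in its embedded XMP metadata using a `pdfaid:part` entry. The Python sketch below looks for that entry, assuming the metadata is stored as plain text inside the file. The function name `declared_pdfa_part` and the sample bytes are made up for illustration. Note that this only reads what the file claims. Confirming that a file really meets the PDF/A rules takes a dedicated validator.

```python
import re
from typing import Optional

# Matches both XMP spellings of the declaration:
#   <pdfaid:part>1</pdfaid:part>   and   pdfaid:part="1"
# \x22 and \x27 are the double and single quote characters.
PDFA_PART = re.compile(rb"pdfaid:part\s*(?:>|=\s*[\x22\x27])\s*(\d)")

def declared_pdfa_part(data: bytes) -> Optional[int]:
    """Return the PDF/A part number a file declares, or None if none is found."""
    if not data.startswith(b"%PDF-"):
        return None  # not a PDF at all
    match = PDFA_PART.search(data)
    return int(match.group(1)) if match else None

# Made-up, heavily shortened file contents for demonstration only.
regular_pdf = b"%PDF-1.7\n(no PDF/A declaration here)"
archival_pdf = b"%PDF-1.4\n<pdfaid:part>1</pdfaid:part>"

print(declared_pdfa_part(regular_pdf))
print(declared_pdfa_part(archival_pdf))
```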

 

If you’re looking to keep your own digital files safe for years to come, now you know what file types will help you do just that. If you haven’t yet started your digital journey, Anderson Archival is here to bring your project to life.
