Transcription from the Crowd: Three Success Stories

The costs of expert text recognition, proofreading, and quality assurance do not need to be prohibitive barriers to the completion of a digital library, recent stories show.

Projects ranging from preserving digital records of the stars to documenting the civil rights and suffrage movements have employed volunteers to transcribe historical documents. Whether they are dedicated researchers with a vested interest in the final product or strangers offering a moment of time and effort, volunteer transcribers provide an essential service that ensures collections are accessible and searchable without a huge price tag.

Crowdsourcing volunteers is not new, not even for transcription work. Since 2012, the National Archives’ Citizen Archivist program has utilized a nearly unlimited workforce of internet users to transcribe historical documents, add metadata tags, and review changes made by other volunteers.

Part of what Anderson Archival offers is the integrity of treating your collection as our own, and for some collections, crowdsourcing the text recognition process can be an incredible answer to a lack of funding or a desire to involve the public.

Three such crowdsourced digitization and transcription projects already prove the capabilities of a large group of volunteers.

Digital Access to a Sky Century @ Harvard (DASCH)

Not long ago, Anderson Archival wrote about the digitization of this spectacular collection. Since 2014, the Harvard-Smithsonian Center for Astrophysics has asked digital volunteers for help transcribing material related to Harvard Observatory’s glass plates. This collection, known as Project PHaEDRA, includes logbooks and findings related to the DASCH glass plates analyzed by notable women in science history, such as Williamina Fleming, Annie Jump Cannon, Henrietta Swan Leavitt, Cecilia Payne-Gaposchkin, and Antonia Maury.

Transcription will allow the collection to be fully text-searchable and will provide crucial contextual data about the 500,000 glass plate photographs of the night sky taken from 1885 to 1992. Without transcription of these data sets, accurate digital modeling of the scanned plates is not possible.

The collection of glass plates along with notebooks of findings will provide an incredible look at the history of our sky and of astronomy.

#TranscribeBond with University of Virginia

In August of 2018, and again in 2019, Charlottesville residents, members of the University of Virginia community, and digital volunteers met for a “transcribe-a-thon” of civil rights activist Julian Bond’s writings. Previously, the papers were available only to students in person at the university library, but digitization efforts aim to make this collection available online.

Transcription efforts for this collection are ongoing. Digital volunteers can join the work at FromThePage. At the time of this posting, more than three thousand of the 7,797 pages of Bond’s writings have been transcribed.

The Library of Congress: By the People

The Library of Congress’ By the People project proves that a few minutes of your time can impact future historical research and reference. By the People was launched in 2018, and collections relating to Abraham Lincoln, women’s suffrage, and the Civil War continue to be featured.

The digitization and transcription of these collections mean that countless pages will be available digitally and fully searchable for the first time, ensuring that the history they contain is accessible to all.

For volunteers fascinated with history, By the People has enough variety to keep anyone entertained and learning while they transcribe.

Crowdsourcing Saves Costs and Enriches the World

A solution like one of these might be possible for your collection. The cost of expert handwriting transcription, proofreading, and quality assurance may feel too high for some collection owners. A volunteer workforce can save money when this is the case!

Involving volunteers, whether specially trained contributors or strangers from around the world, also directly ties the public to the welfare of the collection.

Crowdsourcing does have its downsides. As the three examples above illustrate, even with millions of potential volunteers, processing documents takes far longer than it would with a dedicated team. In addition, a digital transcription framework may need to be purchased or developed. If a digitization project has a strict deadline, or if leaving the accuracy of your collection to the public is a concern, crowdsourcing may not be right for you.

As with any project involving the public, your success may vary, and this is something to consider in the planning process, but crowdsourcing transcription provides an interesting alternative to the traditional model.

Anderson Archival provides all aspects of historical document digitization, including expert transcription, proofreading, and quality assurance. Are you looking for a digitization company that will treat your collection as their own? Contact Anderson Archival today.
