Buchanan Library Fellows Project, Spring 2020
Cursive & Recursive: Generating Transcriptions of Archival Documents Using Machine Learning
*This program is open to both undergraduate and graduate students.
Vanderbilt's Special Collections holds a wealth of handwritten or early modern material that is difficult for computers to read. Optical Character Recognition (OCR) has come a long way but still struggles with these texts. We will digitize select manuscripts (or you can bring your own from your research) and learn to produce transcriptions by using machine learning techniques to teach the computer to recognize handwriting. We will then build a simple web exhibit that displays each digitized manuscript and its transcription side by side. You will learn project management, collaboration, and version control with GitHub; how machine learning works and when it doesn't; and best practices for data management and project documentation.