Modalities of Text Mining: Exploring British Periodicals at Scale



About the Project

How can you identify and explore patterns across millions of documents? In this fellowship, Buchanan Library Fellows learned state-of-the-art techniques for text mining at scale. Fellows joined an ongoing research project to analyze constellations of information in ProQuest's British Periodicals Collections. Depending on interest, Fellows learned to use Apache Spark, a framework for querying distributed data sets; BaseX, a native XML database; or NetsBlox, a block-based programming language. The Fellows learned to extract information from big data sets in the humanities, social sciences, or other fields with relative ease and confidence.

The Fellows

TingYan (Nicholas) Deng, Mark Grujic, Desiree Sagayno Hagg, Farouk Haroun, Ali A Hussain, Jiayi (Sunny) Li, Helen Qian and Cassandra Yermack

The Instructors

Mark Schoenfield, professor of English, interim director of undergraduate studies, English

Cliff Anderson, associate university librarian for research and digital strategy, interim director