Modalities of Text Mining
About the Project
How can you identify and explore patterns across millions of documents? In this fellowship, Library Buchanan Fellows learned state-of-the-art techniques for text mining at scale. Fellows joined an ongoing research project to analyze constellations of information in ProQuest’s British Periodicals Collections. Depending on their interests, fellows learned to use Apache Spark, a framework for querying distributed data sets; BaseX, a native XML database; or NetsBlox, a block-based programming language. They learned how to extract information from big data sets in the humanities, social sciences, or other fields with relative ease and confidence.
Emma Boldwyn, Shwe Khin, Rohit Khurana, Yuzhe Lu, Erskine Nyoike
Mark Schoenfield, professor of English, interim director of undergraduate studies, English
Cliff Anderson, associate university librarian for research and digital strategy, interim director