Large-Scale Text Analysis of Japanese and Chinese Literature: An Introduction to Text Mining for Humanists

Stanford University Center for Spatial and Textual Analysis hosts a workshop on automated text analysis using Chinese and Japanese literature.

March 3, 2016 4:30pm to 6:00pm

Richard Jean So, Assistant Professor, Department of English, University of Chicago
Hoyt Long, Associate Professor, East Asian Languages and Civilizations, University of Chicago

In this “hands-on” workshop, we introduce colleagues in the humanities with little or no experience in computer science or programming to the rudiments of automated text analysis (colloquially, “text mining”) for literary studies. The workshop will teach colleagues how to pursue this work from the very first steps: how to identify or build a corpus of texts; how to transform these texts into a format that a computer can interpret; and how to load these texts onto one’s computer and prepare them for computational and statistical analysis. After covering these fundamental tasks, we will offer some lessons in introductory-level automated text analysis methods, such as document comparison and clustering analysis. Throughout, we will provide easy-to-use computer code, so no previous programming experience is necessary. Moreover, our workshop will address the particularities of dealing with Japanese and Chinese texts in text mining, and the code we provide works specifically with this type of material. In sum, we expect participants who complete this workshop to leave with enough practical skills to begin their own text mining projects immediately.
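For readers curious what “document comparison” can look like in practice, the following is a minimal sketch in Python and is not the workshop's own code. It assumes one common workaround for Chinese and Japanese, which are written without spaces between words: counting overlapping character bigrams instead of words, then comparing two texts by cosine similarity. The function names (`char_ngrams`, `cosine_similarity`, `compare`) and the sample strings are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Split a text into overlapping character n-grams, ignoring whitespace.

    Assumption: character n-grams stand in for words, because Chinese and
    Japanese text has no spaces to split on.
    """
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def cosine_similarity(counts_a, counts_b):
    """Cosine similarity between two n-gram frequency tables (0.0 to 1.0)."""
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[k] * counts_b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares nothing with any other text
    return dot / (norm_a * norm_b)


def compare(text_a, text_b, n=2):
    """Compare two texts by the cosine similarity of their n-gram counts."""
    return cosine_similarity(Counter(char_ngrams(text_a, n)),
                             Counter(char_ngrams(text_b, n)))


if __name__ == "__main__":
    # Placeholder strings, not drawn from any workshop corpus.
    doc_a = "春の夜の夢のごとし"
    doc_b = "春の夜の月を見る"
    doc_c = "床前明月光"
    print(compare(doc_a, doc_b))  # shares bigrams such as "春の", so above 0
    print(compare(doc_a, doc_c))  # no shared bigrams, so 0.0
```

A real project would more likely use a dedicated word segmenter and a larger corpus; the bigram approach is shown here only because it needs nothing beyond the Python standard library.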