A Gentle Introduction to Topic Modeling Using Python

Micah D. Saxton


Topic modeling is a data mining method that can be used to understand and categorize large corpora of data; as such, it is a tool that theological librarians can use in their professional workflows and scholarly practices. In this article I provide a gentle introduction to topic modeling for those who have no prior knowledge of the subject. I begin with a conceptual overview of topic modeling that does not rely on the complicated mathematics behind the process. Then I illustrate topic modeling by narrating the construction of a topic model, using the entirety of Theological Librarianship as my example corpus. This narrative ends with an analysis of the model's success and suggestions for improvement. Finally, I recommend a few resources for those who would like to pursue topic modeling further.

Peer-Reviewed Articles