Large-Scale Hierarchical Topic Models

Title: Large-Scale Hierarchical Topic Models
Publication Type: Conference Paper
Year of Publication: 2012
Authors: Pujara, J, Skomoroch, P
Conference Name: NIPS Workshop on BigLearn

In the past decade, a number of advances in topic modeling have produced sophisticated models that are capable of generating hierarchies of topics. One challenge for these models is scalability: they are incapable of working at the massive scale of millions of documents and hundreds of thousands of terms. We address this challenge with a technique that learns a hierarchy of topics by iteratively applying topic models and processing subtrees of the hierarchy in parallel. This approach has a number of scalability advantages compared to existing techniques, and shows promising results in experiments assessing runtime and human evaluations of quality. We detail extensions to this approach that may further improve hierarchical topic modeling for large-scale applications.
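
The general idea of applying a flat model repeatedly and handling subtrees in parallel can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the topic model, the splitting rule, or the parallel framework. Everything named here (`toy_flat_model`, `build_hierarchy`, and the parameters `k`, `max_depth`, `min_docs`, `workers`) is hypothetical, and the "topic model" is a deliberately trivial stand-in that treats the most frequent terms as topics. A real system would substitute an actual topic model such as LDA and a distributed executor.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def toy_flat_model(docs, k):
    """Hypothetical stand-in for a flat topic model (NOT LDA).

    Treats the k most frequent terms as "topics" and assigns each
    document to the topic term it contains most often (ties go to the
    more frequent term). Returns {topic_term: [documents]}.
    """
    counts = Counter(term for doc in docs for term in doc)
    topics = [term for term, _ in counts.most_common(k)]
    if not topics:
        return {}
    groups = {t: [] for t in topics}
    for doc in docs:
        best = max(topics, key=doc.count)
        groups[best].append(doc)
    return {t: g for t, g in groups.items() if g}


def build_hierarchy(docs, k=2, max_depth=2, min_docs=2, workers=4):
    """Grow a topic tree level by level.

    At each level, every node on the frontier is split independently,
    so the splits (one per subtree) are submitted to a pool together.
    """
    root = {"topic": "ROOT", "size": len(docs), "children": []}
    frontier = [(root, docs)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(max_depth):
            # Only nodes with enough documents are split further.
            frontier = [(n, d) for n, d in frontier if len(d) >= min_docs]
            if not frontier:
                break
            # Subtrees are independent, so their splits can run concurrently.
            splits = list(pool.map(lambda item: toy_flat_model(item[1], k), frontier))
            next_frontier = []
            for (node, _), groups in zip(frontier, splits):
                for term, group in groups.items():
                    child = {"topic": term, "size": len(group), "children": []}
                    node["children"].append(child)
                    # Remove the parent's topic term so the next level
                    # finds finer-grained structure instead of repeating it.
                    rest = [[w for w in doc if w != term] for doc in group]
                    next_frontier.append((child, [d for d in rest if d]))
            frontier = next_frontier
    return root


docs = [
    ["pet", "cat", "pet"],
    ["pet", "dog"],
    ["stock", "bond"],
    ["stock", "stock", "trade"],
]
tree = build_hierarchy(docs, k=2, max_depth=2)
for child in tree["children"]:
    print(child["topic"], child["size"], [c["topic"] for c in child["children"]])
```

The level-by-level loop is a design choice for the sketch: it avoids submitting nested tasks from inside worker threads, which can deadlock a fixed-size pool. Python threads also do not give CPU parallelism for this kind of work, so at the scale the abstract describes one would assume a process pool or a cluster framework instead.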