TOM: Topic modeling and browsing



TOM (TOpic Modeling) is a Python library for topic modeling and browsing. It aims to support the efficient analysis of a text corpus from start to finish through the discovery of latent topics. To this end, TOM provides functions for preparing and vectorizing a text corpus. It also offers a unified interface for two topic models: LDA, fitted with either variational inference or Gibbs sampling, and NMF, fitted with alternating least squares using a projected gradient method. It implements three state-of-the-art methods for estimating the optimal number of topics for a corpus. In addition, TOM builds an interactive Web-based browser that makes it easy to explore a topic model and the underlying corpus.
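To make the vectorize-then-factorize pipeline described above concrete, here is a minimal sketch written with the Python standard library only. It does not use TOM's API: the function names (tfidf, nmf, top_words) and the toy corpus are hypothetical, and the factorization uses simple multiplicative updates rather than the alternating least squares with projected gradient that TOM is described as using.

```python
import math
import random
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, matrix) with one TF-IDF row per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    # Document frequency: number of documents containing each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    n_docs = len(docs)
    rows = []
    for tokens in tokenized:
        tf = Counter(tokens)
        rows.append([tf[w] * math.log(1 + n_docs / df[w]) for w in vocab])
    return vocab, rows


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor V (docs x words) into non-negative W (docs x k) and H (k x words).

    Assumption: multiplicative updates, chosen for brevity; this is not
    the solver the library itself is described as using.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def top_words(h, vocab, n=3):
    """List the n highest-weighted words of each topic (each row of H)."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]


# Toy corpus (hypothetical), with two loosely separated themes.
docs = [
    "topic models discover latent topics in text",
    "latent topics summarize text corpus",
    "matrix factorization decomposes matrix",
    "nonnegative matrix factorization uses nonnegative factors",
]
vocab, v = tfidf(docs)
w, h = nmf(v, k=2)
for topic_id, words in enumerate(top_words(h, vocab)):
    print(topic_id, words)
```

Each row of H gives the word weights of one topic and each row of W gives the topic weights of one document, which is the information a topic browser needs to display. The number of topics k is fixed by hand here; choosing it automatically is what the estimation methods mentioned above address.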


Adrien Guille, Pavel Soriano (2016). TOM: A library for topic modeling and browsing.
Actes de la conférence française sur l'Extraction et la Gestion des Connaissances (EGC), pp. 451-456.


TOM is distributed via GitHub under the terms of the MIT licence.


Check out the EGC anthology browser that was automatically generated with TOM.

Screenshots of the browser: topic cloud, topic details, document details.