Topic modeling is a frequently used text-mining tool for discovering hidden semantic structures in a body of text.
In the age of information, the amount of written material we encounter each day simply exceeds our capacity to process it.
Topic models can help us organize and understand large collections of unstructured text.
Originally developed as a text-mining tool, topic models have been used to detect instructive structures in data such as genetic information, images, and networks.[4]
Latent Dirichlet allocation (LDA), perhaps the most common topic model currently in use, is a generalization of probabilistic latent semantic analysis (PLSA).
Developed by David Blei, Andrew Ng, and Michael I. Jordan in 2003, LDA introduces sparse Dirichlet prior distributions over the document-topic and topic-word distributions, encoding the intuition that each document covers a small number of topics and each topic uses a small number of words.
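As a minimal sketch of this idea, the following example fits LDA with scikit-learn (the library choice and the tiny toy corpus are assumptions, not part of the original text). Setting the two Dirichlet prior parameters below 1 expresses the sparsity intuition described above: small `doc_topic_prior` (often called alpha) favors documents that mix few topics, and small `topic_word_prior` (often called eta) favors topics that concentrate on few words.

```python
# Hedged sketch: fitting LDA with sparse Dirichlet priors via scikit-learn.
# The corpus and hyperparameter values are illustrative assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "genes dna sequence genome",
    "neural network training layers",
    "dna genome mutation gene",
    "network layers deep learning",
]

# Bag-of-words count matrix (documents x vocabulary)
X = CountVectorizer().fit_transform(docs)

lda = LatentDirichletAllocation(
    n_components=2,        # number of latent topics
    doc_topic_prior=0.1,   # sparse prior over document-topic mixtures
    topic_word_prior=0.1,  # sparse prior over topic-word distributions
    random_state=0,
)

# Each row is a document's topic mixture; rows sum to 1.
doc_topics = lda.fit_transform(X)
```

With such small priors, each row of `doc_topics` tends to put most of its probability mass on a single topic, which is exactly the "few topics per document" behavior the Dirichlet priors encode.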
Hierarchical latent tree analysis (HLTA) was applied to a collection of recent research papers published at major AI and machine learning venues.[17]
Several groups of researchers, starting with Papadimitriou et al.,[3] have attempted to design algorithms with provable guarantees.
[18] In 2017, neural networks were leveraged in topic modeling to make inference faster,[19] an approach that has since been extended to a weakly supervised version.