Topic Modeling with Latent Dirichlet Allocation (LDA) for NIPS papers

Neural Information Processing Systems (NIPS) is one of the top machine learning conferences in the world, where groundbreaking work is published. Since 1987, a lot of exciting work has been published at this conference, but what are the trends in the machine learning research published there? The objective of this project is to analyze a large collection of NIPS research papers from the past decades to discover the latest trends in machine learning.

Data

The data was obtained from Kaggle and includes the title, authors, abstracts, and extracted text for all NIPS papers up to 2017 (ranging from the first 1987 conference to the current 2016 conference).
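As a rough illustration of how such a dataset might be loaded, here is a minimal sketch using only the Python standard library. The helper names `load_papers` and `papers_per_year`, the file name `papers.csv`, and the `year` column are assumptions for illustration; check them against the actual Kaggle export.

```python
import csv
from collections import Counter


def load_papers(path):
    """Read a papers CSV into a list of dicts, one dict per paper."""
    # Full paper texts can exceed the csv module's default field size limit,
    # so raise it to a large value that still fits in a 32-bit C long.
    csv.field_size_limit(10**9)
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def papers_per_year(papers, year_field="year"):
    """Count papers per publication year (year_field is an assumed column name)."""
    return Counter(int(p[year_field]) for p in papers)
```

Calling `papers_per_year(load_papers("papers.csv"))` would then give a quick view of how many papers each conference year contributes, assuming the file and column exist under those names.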

Method

After processing the data, I followed a structured workflow to build an insightful topic model based on the Latent Dirichlet Allocation (LDA) algorithm. The topic model was built using gensim’s native LdaModel, and the results were visualized using matplotlib plots.
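The workflow described above can be sketched roughly as follows. This is not the project's actual code: the helpers `tokenize` and `fit_lda`, the tiny stop-word list, and the values chosen for `num_topics`, `passes`, and `seed` are all assumptions made for illustration. Only `Dictionary`, `doc2bow`, and `LdaModel` are gensim's own.

```python
import re

# Tiny illustrative stop-word list (an assumption); a real run would use a fuller one.
STOP_WORDS = {"the", "and", "for", "with", "that", "this", "are"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens of 3+ letters, drop stop words."""
    tokens = re.findall(r"[a-z]{3,}", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def fit_lda(texts, num_topics=10, passes=5, seed=0):
    """Fit gensim's LdaModel on raw document strings (hypothetical helper)."""
    # Imported here so tokenize() stays usable without gensim installed.
    from gensim.corpora import Dictionary
    from gensim.models import LdaModel

    docs = [tokenize(t) for t in texts]
    dictionary = Dictionary(docs)                    # maps each token to an integer id
    corpus = [dictionary.doc2bow(d) for d in docs]   # bag-of-words vector per document
    lda = LdaModel(corpus=corpus, id2word=dictionary,
                   num_topics=num_topics, passes=passes, random_state=seed)
    return lda, dictionary, corpus
```

Given a list of paper texts, `lda, dictionary, corpus = fit_lda(paper_texts)` returns the fitted model, and `lda.show_topics(num_topics=5, num_words=8, formatted=False)` lists the top words per topic, which could then be plotted with matplotlib.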

Next steps:

I answered the following questions graphically:

Finally, I visualized:

You can view the source code here

Results