Topic Modeling in NLP

This course covers intermediate concepts in natural language processing, equipping learners to clean and process large amounts of text data, group text into topics, and find similarities between documents. Because natural language can be vague and subjective, the course also presents ways to evaluate and interpret these models.

DataSociety

6 hours of instruction

OBJECTIVES

Understand and implement bag of words and term frequency inverse document frequency (TF-IDF)
Process, clean, and format text data for analysis
Extract key summary metrics and words from a corpus of documents
Perform latent Dirichlet allocation (LDA) for topic modeling

PREREQUISITES

Introduction to NLP

SYLLABUS & TOPICS COVERED

TF-IDF
- The ‘bag-of-words’ approach and when it is used
- Weighting terms in a corpus
- Implementation of TF-IDF weighting
Topic Modeling
- Topic modeling
- Latent Dirichlet Allocation as a topic modeling technique
- Implementation of LDA
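
To illustrate the TF-IDF weighting listed in the syllabus above, here is a minimal sketch in plain Python. It is not taken from the course materials: the function name `tf_idf`, the whitespace tokenizer, the sample documents, and the unsmoothed formula tf × log(N / df) are all assumptions chosen for brevity. Libraries often use smoothed or normalized variants, so their numbers will differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Assumption: lowercase + whitespace split stands in for real text cleaning.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)  # the bag-of-words representation
        total = len(tokens)
        weights.append({
            # Term frequency times unsmoothed inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks fell as markets closed",
]
weights = tf_idf(docs)

# Show the three highest-weighted terms of the first document.
top_terms = sorted(weights[0].items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top_terms)
```

In this sketch a term that occurs in every document gets weight zero, while a term confined to one document is weighted up, which is the intuition behind using TF-IDF to extract key words from a corpus.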

SOFTWARE REQUIREMENTS

You will have access to a Python-based JupyterHub environment for this course. No additional download or installation is required.

About Instructor

DataSociety

148 Courses