4 hours of instruction
This course covers clustering in natural language processing. Learners will be able to group text documents into clusters and topics by measuring the similarity between documents.
OBJECTIVES
- Understand measures of similarity and distance
- Learn and implement cosine similarity on text documents
- Understand how similar documents can be clustered into topics
PREREQUISITES
Topic Modeling in NLP
SYLLABUS & TOPICS COVERED
- Cosine Similarity
  - Measures of similarity and distance
  - Theory and implementation of cosine similarity to find the most similar documents
- Clustering Documents
  - Clustering as an unsupervised method in text analysis
  - Hierarchical clustering algorithm in a nutshell
  - How to implement clustering on a corpus of documents
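As a rough preview of the two syllabus topics, the sketch below computes cosine similarity between bag-of-words vectors and then groups documents with a simple agglomerative (hierarchical) procedure. It is an illustration only, not the course's own material: the sample documents, the helper names (`vectorize`, `cosine_similarity`, `agglomerative_clusters`), the whitespace tokenizer, and the choice of average linkage are all assumptions, and the course may well use a different text representation or a library implementation.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a document into a bag-of-words term-count vector (hypothetical helper)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def agglomerative_clusters(vectors, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Distance between clusters is the average cosine distance
    (1 - cosine similarity) over all cross-cluster document pairs,
    i.e. average linkage (an assumption made for this sketch).
    """
    clusters = [[i] for i in range(len(vectors))]

    def distance(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        total = sum(1 - cosine_similarity(vectors[i], vectors[j]) for i, j in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        _, x, y = min(
            (distance(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Made-up sample corpus for illustration.
docs = [
    "the cat sat on the mat",
    "a cat sat on a mat",
    "stock prices fell sharply today",
    "stock markets fell today",
]
vectors = [vectorize(d) for d in docs]
clusters = agglomerative_clusters(vectors, n_clusters=2)
print(clusters)
```

On this toy corpus the two cat sentences end up in one cluster and the two stock sentences in the other, because documents in different groups share no words and so have a cosine similarity of zero.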
SOFTWARE REQUIREMENTS
You will have access to a Python-based JupyterHub environment for this course. No additional download or installation is required.