4 hours of instruction
This course covers intermediate concepts in natural language processing. Learners practice cleaning and processing large amounts of text data, grouping text into categories and topics, and finding similarities between documents.
OBJECTIVES
- Compute cosine similarity for corpus documents
- Demonstrate weighting with TF-IDF
- Implement cosine similarity to compare documents
- Visualize similar documents using an interactive network graph
PREREQUISITES
Foundational understanding of NLP concepts.
SYLLABUS & TOPICS COVERED
- Cosine Similarity
- Compute term similarity matrix
- Create corpus term similarity heatmap
- TF-IDF
- Implementation of TF-IDF weighting
- Build network graphs to compare the documents
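As a rough illustration of how the first two topics fit together, the sketch below weights terms with TF-IDF and then compares documents by cosine similarity. It is not taken from the course materials: the course itself runs in R, while this example uses plain Python with only the standard library. The function names, the toy documents, and the specific IDF formula (log of the document count over the document frequency, one of several common variants) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return (vocab, vectors) with one TF-IDF vector per document.

    Assumption: tokens are lowercase whitespace-separated words and
    idf(t) = log(N / df(t)); real toolkits often use smoothed variants.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n_docs / df[t]) for t in vocab}

    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical toy corpus, not from the course.
docs = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stock prices fell sharply today",
]
vocab, vectors = tf_idf_vectors(docs)

# Pairwise document similarity matrix.
similarity = [[cosine_similarity(u, v) for v in vectors] for u in vectors]
```

In this toy corpus the first two documents share several terms, so their similarity is positive, while the third shares none with the others and scores zero against both. A matrix like `similarity` is the kind of input a heatmap or a network graph of similar documents could be drawn from.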
SOFTWARE REQUIREMENTS
You will have access to an R-based Posit Cloud environment for this course. No additional download or installation is required.