Text Mining In R

This course covers intermediate concepts in natural language processing (NLP). Learners practice cleaning and processing large amounts of text data, grouping text into categories and topics, and measuring the similarity between documents.

4 hours of instruction


  1. Compute cosine similarity for corpus documents
  2. Demonstrate weighting with TF-IDF
  3. Implement cosine similarity to compare documents
  4. Visualize similar documents using an interactive network graph


Foundational understanding of NLP concepts.


  1. Cosine Similarity
    • Compute term similarity matrix
    • Create corpus term similarity heatmap
  2. TF-IDF
    • Implementation of TF-IDF weighting
    • Build network graphs to compare the documents
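
As a rough illustration of the two outline topics, the sketch below applies TF-IDF weighting to a toy corpus and then computes pairwise cosine similarity between documents. It uses base R only. The three documents, the whitespace tokenizer, and the variable names are assumptions made for this example and are not taken from the course materials, which may use different packages and a different TF-IDF variant.

```r
# Toy corpus (hypothetical documents, for illustration only)
docs <- c(
  d1 = "the cat sat on the mat",
  d2 = "the cat lay on the rug",
  d3 = "stock prices fell sharply today"
)

# Tokenize: lower-case and split on whitespace
tokens <- lapply(docs, function(d) strsplit(tolower(d), "\\s+")[[1]])
vocab  <- sort(unique(unlist(tokens)))

# Document-term matrix of raw counts (rows = documents, columns = terms)
dtm <- t(vapply(
  tokens,
  function(tk) tabulate(match(tk, vocab), nbins = length(vocab)),
  integer(length(vocab))
))
colnames(dtm) <- vocab

# TF-IDF: relative term frequency times log(N / document frequency)
tf    <- dtm / rowSums(dtm)
idf   <- log(nrow(dtm) / colSums(dtm > 0))
tfidf <- sweep(tf, 2, idf, `*`)

# Cosine similarity: dot product of each document pair divided by
# the product of their vector lengths
norms  <- sqrt(rowSums(tfidf^2))
cosine <- (tfidf %*% t(tfidf)) / (norms %o% norms)

round(cosine, 3)
```

In this toy corpus, d1 and d2 share several terms and so receive a similarity above zero, while d3 shares no terms with either and scores zero against both. A matrix like `cosine` could then be passed to base R's `heatmap()` or converted into an edge list for a network graph.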


You will have access to an R-based Posit Cloud environment for this course. No additional download or installation is required.
