Data Science (R)

Text Mining In R

This course covers intermediate concepts in natural language processing, equipping learners to clean and process large amounts of text data, segregate text into different groups and topics, and find similarities between different documents.

DataSociety

4 hours of instruction

OBJECTIVES

Compute cosine similarity for corpus documents
Demonstrate weighting with TF-IDF
Implement cosine similarity to compare documents
Visualize similar documents using an interactive network graph

PREREQUISITES

Foundational understanding of NLP concepts.

SYLLABUS & TOPICS COVERED

Cosine Similarity
- Compute term similarity matrix
- Create corpus term similarity heatmap
TF-IDF
- Implementation of TF-IDF weighting
- Build network graphs to compare the documents
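
To give a rough sense of the two techniques named in the syllabus, here is a minimal sketch in base R. It is not taken from the course materials: the toy documents, the variable names, and the choice of base R over a text-mining package are all assumptions made for illustration. It weights term counts with one common TF-IDF variant (term frequency times log of inverse document frequency) and then compares documents by cosine similarity.

```r
# Toy corpus (hypothetical documents, for illustration only)
docs <- c(
  d1 = "the cat sat on the mat",
  d2 = "the cat lay on the rug",
  d3 = "stocks fell as markets closed"
)

# Tokenize on whitespace and build a document-term count matrix
tokens <- strsplit(tolower(docs), "\\s+")
vocab  <- sort(unique(unlist(tokens)))
dtm <- t(vapply(
  tokens,
  function(tk) as.numeric(table(factor(tk, levels = vocab))),
  numeric(length(vocab))
))
rownames(dtm) <- names(docs)
colnames(dtm) <- vocab

# TF-IDF (assumed variant): relative term frequency times log(N / document frequency)
tf    <- dtm / rowSums(dtm)
idf   <- log(nrow(dtm) / colSums(dtm > 0))
tfidf <- sweep(tf, 2, idf, `*`)

# Cosine similarity: dot products of document rows divided by the product of their norms
norms <- sqrt(rowSums(tfidf^2))
sim   <- tcrossprod(tfidf) / outer(norms, norms)

print(round(sim, 3))
```

In this sketch, d1 and d2 share several terms and so score above zero, while d3 shares no terms with either and scores zero against both. A similarity matrix of this kind is the usual input for a heatmap or for a network graph in which documents are linked when their similarity exceeds a chosen threshold.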

SOFTWARE REQUIREMENTS

You will have access to an R-based Posit Cloud environment for this course. No additional download or installation is required.

About Instructor

DataSociety

148 Courses
