Courses

  • 0 Lessons

    Data Wrangling in R

    Data is often messy, requiring cleaning and restructuring before it can be reliably used in a program or project. In this course, learners will augment their understanding of base R using the tidyverse, an open-source set of packages intended for data cleaning and wrangling. After installing these packages, learners will practice working with functions that allow data to be selected, filtered, summarized, rearranged, and otherwise transformed according to analyst-vetted best practices.

  • 0 Lessons

    Decision Trees in R

    Decision tree models are classification algorithms that sort novel data into categories based on iterative splitting, like the branches of a tree, according to input parameters. In this course, learners will identify use cases for decision trees in R. They will wrangle data and implement a decision tree model before attempting to evaluate its effectiveness. Finally, learners will use their knowledge of the mathematics behind decision trees to tune the model and improve its classificatory function.

  • 0 Lessons

    Ensemble Methods in R

    This course provides an overview of ensemble learning methods such as random forests and boosting. By the end of this course, students will be able to implement and compare random forest and boosting algorithms.

  • 0 Lessons

    Interactive Visualization in R

    In this course, learners will use R packages to create charts and maps with interactive elements. Tooltips, hover states, and other dynamic elements allow for the encoding of additional layers of data to enrich your data visualizations. Learners will build basic interactive visualizations using the Highcharter package before moving on to more advanced chart types. Learners will also work with more complex data to create maps and network graphs, exportable as HTML widgets.
  • 0 Lessons

    Intermediate Clustering in R

    In this course, learners will encounter more sophisticated methods for generating clusters within unlabeled data using R. The first method, hierarchical clustering, creates tree branch-based clusters in order of increasing specificity. The second, density-based clustering, creates groups based on the concentration of data points within a region. By the end of this course, learners will prepare data for, implement, and optimize these models, and compare their relative advantages.
  • 0 Lessons

    Intermediate R

    This course familiarizes learners with key concepts in programming essential for writing code in base R. After loading a dataset into their environment, learners will create variables to represent values that change according to specific conditions. Learners will also construct conditional statements, loops, and functions to practice iterating over these variables with the basic building blocks of a simple program.
  • 0 Lessons

    Intermediate Statistics in R

    This course is designed for learners who would like to learn about statistics and apply it to decision-making. It is a comprehensive review of intermediate statistics topics, such as the t-value, t-distribution, chi-square distribution, F-statistic, and F-distribution, that enable us to compare observed and expected frequencies objectively.
  • 0 Lessons

    Introduction to Classification in R

    Classification is a machine learning technique that can be used to sort novel data into labeled categories. In this course, learners will identify use cases for classification algorithms and become familiar with the theoretical underpinnings of supervised machine learning (working with labeled data). In particular, learners will build, evaluate, and interpret a k-nearest neighbors model in R, based on one of the most commonly used classification algorithms.
  • 0 Lessons

    Introduction to Clustering in R

    Clustering is a machine learning technique that can be used to group unlabeled data based on shared features. In this course, learners will identify use cases for clustering algorithms and become familiar with the theoretical underpinnings of unsupervised machine learning (working with unlabeled data). In particular, learners will build, evaluate, and interpret a k-means model in R, based on one of the most commonly used clustering algorithms.
  • 0 Lessons

    Introduction to NLP in R

    This course covers the basics of natural language processing, equipping learners with the ability to clean and process large amounts of text data required for text analysis.
  • 0 Lessons

    Introduction to R

    This course introduces learners to the fundamentals of the R programming language. Favored by statisticians, data miners, and data analysts, R is a powerful language for developing statistical software applications and automating data processing. By the end of this course, learners will identify how data scientists use R, recognize basic data types and data structures, and perform basic calculations using foundational base R.
  • 0 Lessons

    Introduction to Statistics in R

    This course is designed for learners who would like to learn about statistics and apply it to decision-making. It is a comprehensive review of statistical terms ranging from foundational (mean, median, mode, standard deviation, variance) to more complex concepts such as normality in data, confidence intervals, and p-values. Additional topics include how to calculate summary statistics and how to carry out hypothesis testing to inform decisions.
  • 0 Lessons

    Introduction to Visualization in R

    Creating visualizations is a critical means of exploring data and revealing insights. In this course, learners will take their R skills to the next level by preparing data for exploratory analysis and creating basic, static visualizations. Using both base R and tidyverse packages, learners will generate bar charts, scatter plots, histograms, and other common visualizations to better understand the shape, structure, and features of a sample dataset.
  • 0 Lessons

    Logistic Regression in R

    Logistic regression is a classification algorithm useful for sorting data into two classes: either this, or that. In this course, learners will identify use cases for logistic regression in R. They will wrangle data and implement a logistic regression model before attempting to evaluate its effectiveness. Finally, learners will use their knowledge of the mathematics behind logistic regression to tune the model and improve its classificatory function.
  • 0 Lessons

    Multiple Linear Regression in R

    Multiple linear regression is a supervised learning technique that allows analysts to model the relationship between a certain number of labeled features and a single continuous numerical target variable. In this course, learners will encounter the mathematical underpinnings of regression models in general before building, optimizing, and evaluating a multiple linear regression model in R. Learners will also discuss concepts such as statistical significance to clearly present their findings.
  • 0 Lessons

    Principal Component Analysis

    This course covers the basics of Principal Component Analysis (PCA) and why it is needed: PCA makes large datasets easier to interpret by reducing their dimensionality while minimizing information loss.
  • 0 Lessons

    RShiny Apps

    The RShiny package offers data scientists and analysts tremendous control over the end user's experience of interactive visualizations. In this course, learners will use RShiny as a comprehensive tool to build, structure, and publish an entire data dashboard. After being introduced to RShiny's structure, learners will generate interactive visualizations. Learners will also be introduced to core principles of front-end, user-facing interface design in order to optimize their dashboard's usability.
  • 0 Lessons

    Simple Linear Regression in R

    Regression is a machine learning technique that can be used to model and predict the relationship between one or more features and a continuous numerical target. In this course, learners will identify use cases for simple linear regression, focusing on the relationship between two variables only. Students will build, evaluate, and interpret a simple linear regression model in R, with an emphasis on checking the model for explanatory and predictive power.
  • 0 Lessons

    Text Mining in R

    This course covers intermediate concepts in natural language processing, equipping learners with the ability to clean and process large amounts of text data, segregate text into different groups and topics, and find similarities between different documents.