Exploratory Data Analysis with R

In this course, you will learn to set up your data and code to avoid mistakes and ensure reproducibility. You will thoroughly understand the structure and content of your data, build clear plots to evaluate the distribution of your data with ggplot2, construct summaries of your variables with dplyr, and implement data cleaning and validation tasks to get your data ready for data mining activities. You will also learn to test a hypothesis or check assumptions related to a specific model, and to estimate parameters and calculate the margins of error.

Features:
  • Self-paced with lifetime access
  • Certificate on completion
  • Access on Android and iOS apps

Description

Harness the skills to analyze your data effectively with EDA and R.

Many mistakes and failures in data analysis come from not performing adequate Exploratory Data Analysis (EDA). Without a solid grasp of EDA, you risk drawing incorrect, and potentially harmful, conclusions from your data.

In this course, you will learn how EDA helps you make better sense of your data, draw sound conclusions, and apply the correct techniques. We'll begin with a brief introduction to EDA, its importance, and its advantages over BI tools. Using R libraries like dplyr and ggplot2, we will generate insights, formulate relevant questions for investigation, and communicate the results effectively using visualizations. You will learn how to spot missing data and errors, validate assumptions, and identify the patterns that help you understand the problem. Based on this, you'll be able to select an appropriate ML model for your data.

By the end of the course, you will be able to quickly get to know and interpret the various kinds of data sets you are presented with, and understand how to handle and prepare them for further modeling activities.

Here's the link to the GitHub repo for this course: https://github.com/PacktPublishing/Exploratory-Data-Analysis-with-R

Please note that basic knowledge of R and RStudio, together with some knowledge of descriptive statistics, is key to getting the best out of this course.

About the Author

  • Andrea Cirillo is a Senior Audit Quantitative Analyst at Intesa Sanpaolo Banking Group. He works daily with copious volumes of "messy" data for the purpose of auditing credit risk models. This has prompted him to develop the key skills needed to succeed in Exploratory Data Analysis (EDA). Andrea is also an active contributor to the R community with well-received packages like updateR and paletteR. He has recently focused on resolving some of his R-related pain points by helping R users get the most out of their data through effective data visualization tools like the dataviz bot Vizscorer.

Basic knowledge
  • Basic knowledge of R and RStudio, together with some knowledge of descriptive statistics, is key to getting the best out of this course

What you will learn
  • Set up your data and code to avoid mistakes and ensure reproducibility
  • Thoroughly understand the structure and content of your data
  • Build clear plots to evaluate the distribution of your data with ggplot2
  • Construct summaries of your variables with dplyr
  • Implement data cleaning and validation tasks to get your data ready for data mining activities
  • Test a hypothesis or check assumptions related to a specific model
  • Estimate parameters and calculate the margins of error

Course Curriculum
  • Number of lectures: 37
  • Total duration: 04:42:58