R Programming for Data Science and Analytics

Venue: Upper Hill, Nairobi, Kenya
Duration: 10 days, Mon-Fri, 8:30am-4:30pm
Charges (Nairobi): $1750 + 16% VAT
Charges (Online): $1250 + 16% VAT
Contacts: training@stemxresearch.com
Call / WhatsApp: +254 721 462 424

Course description

R Programming for Data Science and Analytics is a comprehensive course designed to introduce participants to the R programming language and its application in data analysis. The course covers everything from the installation of R and RStudio to advanced data manipulation and statistical analysis. Through hands-on practice, learners will explore R's data types, structures, user-defined functions, and error handling, as well as techniques for data wrangling, cleaning, and exploratory data analysis. Emphasis is placed on practical skills with real-world datasets, enabling participants to perform sophisticated data analyses and build predictive models using R.

Course objectives

By the end of this course, you will be able to:

  • Install R and the RStudio IDE and understand the RStudio user interface.
  • Identify and work with different data types and data structures in R, including vectors, factors, matrices, data frames, and tibbles.
  • Import data from and export data to various sources, and save and explore datasets.
  • Recognize and handle errors in R effectively using control statements and error-handling functions.
  • Utilize conditional and repetitive control structures, such as if, if ... else, if ... else if ... else, for, and while loops, to automate tasks and make scripts more efficient.
  • Perform essential data wrangling tasks, such as renaming and generating variables and recoding data for analysis.
  • Select and manipulate specific variables and observations to create focused subsets of data for detailed analysis.
  • Identify and resolve common data quality issues such as missing values, inconsistent data types, duplicates, and outliers.
  • Merge and concatenate datasets effectively, using various join techniques to combine data from multiple sources.
  • Conduct exploratory data analysis (EDA), including frequency tabulation and descriptive statistics, to uncover insights from data.

Target groups

The course is designed for:

  • Data Analysts: Professionals looking to enhance their data analysis skills using R for more efficient and accurate data manipulation and visualization.
  • Researchers: Individuals who need to process and analyze large datasets as part of their research projects.
  • Statisticians: Practitioners who want to deepen their understanding of statistical methods and apply them using R.
  • Students: Those studying data science, statistics, or related fields who require a solid foundation in R programming for their coursework and projects.

Course requirements

To get the best out of the course, the following will be required:

  • Dedication: The course demands a significant commitment to learning and practice, and sustained concentration throughout the workshop sessions.
  • Problem-Solving Skills: Be prepared to tackle complex data challenges.
  • Basic Programming Knowledge: While not strictly necessary, prior programming experience will be beneficial.

Course outline

  1. Fundamentals of the R Programming Language
    1. Installation and RStudio interface
      1. Download and install R
      2. Download and install RStudio
      3. The RStudio interface
    2. R Data types
      1. Introduction
      2. Double
      3. Integer
      4. Character
      5. Logical
      6. Date
      7. Complex
      8. Additional data types
      9. Data type conversion
    3. R Data structures
      1. Introduction
      2. Vectors
      3. Factors
      4. Lists
      5. Matrices
      6. Arrays
      7. Data frames
      8. Tibble
      9. Time series
      10. Data structure conversion
    4. User defined functions
      1. Introduction
      2. Defining functions
      3. Scope and environment
      4. Nested functions
      5. Function documentation
    5. Basic error handling
      1. What is error handling
      2. Importance of error handling
    6. Repetitive structures
      1. Introduction
      2. For loops
      3. While loops
      4. Loop control statements
    7. Control statements
      1. Introduction
      2. Conditional statements
      3. switch statement
  2. Data Wrangling and Cleaning using dplyr Part I
    1. Creating and Importing DataFrames
      1. Introduction
      2. R DataFrames
      3. Import datasets
      4. Export datasets
    2. Working with variables
      1. Common DataFrame functions
      2. Rename variables
      3. Assign variable labels
      4. Generate new variables
      5. Specify where to insert the generated variable
      6. Generating multiple variables
      7. Replace the values of a variable
      8. Recode categorical variables
      9. Combine categories
      10. Transform continuous variables into categorical variables
      11. Assign ranks to a variable
    3. Extended generation of variables
      1. Generate variables by row-wise operations
      2. Generate variables with conditions and custom functions
      3. Generate dummy variables
      4. Split strings into separate variables
      5. Multiple response questions
      6. Concatenate multiple variables into one
    4. Creating subsets of DataFrames
      1. Introduction
      2. Select variables
      3. Select observations
      4. Select both observations and variables
      5. Drop variables
      6. Drop observations
      7. Randomly sample a DataFrame
      8. Split DataFrame on categorical variables
  3. Data Wrangling and Cleaning using dplyr Part II
    1. Changing the appearance of DataFrames
      1. Sort DataFrames
      2. Color DataFrame values (optional)
      3. Relocate variables
      4. Transpose DataFrames
      5. Stack and unstack DataFrame variables
      6. Reshape DataFrames
    2. Detecting data quality issues - Part 1
      1. Dataset overview
      2. Inconsistent data types
      3. Missing values
      4. Apply custom functions to DataFrame
      5. Detect and replace outliers
      6. Handling duplicates
      7. Scale numerical variables
    3. Detecting data quality issues - Part 2
      1. Extract substrings
      2. Transform the case of DataFrame values
      3. Remove white space
      4. Correct misspelled words
      5. Replace occurrences of substrings
      6. Replace strings using a named vector
      7. Remove specified characters
      8. Drop rows or columns that contain a specified substring
    4. Combining DataFrames (Concatenate and Merge)
      1. Concatenating Datasets
        1. Bind rows (append rows)
        2. Bind columns (append columns)
      2. One-to-one merging (joins)
        1. Inner join
        2. Outer (full) join
        3. Left join
        4. Right join
        5. Merge more than 2 datasets
      3. Many-to-one merging
      4. Many-to-many merging
  4. Exploratory Data Analysis
    1. Tabulation of Frequencies
      1. Introduction
      2. One Way Frequency Tables
      3. Two Way Frequency Tables
        1. Calculating Frequencies
        2. Adding Row Percentage to Two Way Contingency Tables
        3. Adding Column Percentage to Two Way Contingency Tables
    2. Descriptive Statistics
      1. Introduction
      2. Tables of Statistics
        1. Minimum, Maximum, Mean and Standard Deviation
        2. Minimum, Maximum, Median and Inter-Quartile Range
        3. Multiple Categorical Variables
        4. Multiple Quantitative Variables
      3. Calculating p-values
    3. Analysis of Multiple Response Questions
      1. One Way Frequencies
      2. Calculate Associations
  5. Way forward

Start Date    End Date      Charges
VENUE: Nairobi, Kenya (Upper Hill)
02 Jun 2025   13 Jun 2025   $1750
04 Aug 2025   15 Aug 2025   $1750
06 Oct 2025   17 Oct 2025   $1750
01 Dec 2025   12 Dec 2025   $1750
VENUE: Online (Zoom)
05 May 2025   16 May 2025   $1250
07 Jul 2025   18 Jul 2025   $1250
01 Sep 2025   12 Sep 2025   $1250
03 Nov 2025   14 Nov 2025   $1250

Start Date    End Date      Charges
VENUE: Nairobi, Kenya (Upper Hill)
16 Jun 2025   27 Jun 2025   $1750
18 Aug 2025   29 Aug 2025   $1750
20 Oct 2025   31 Oct 2025   $1750
15 Dec 2025   26 Dec 2025   $1750
VENUE: Online (Zoom)
19 May 2025   30 May 2025   $1250
21 Jul 2025   01 Aug 2025   $1250
15 Sep 2025   26 Sep 2025   $1250
17 Nov 2025   28 Nov 2025   $1250
