Here you will find the exercise sheet for Chapter 4: “Data Visualization”.
Create an R project for solving this Exercise Sheet.
Download the CSV file SSRC_data.csv and the R script SSRC_C4_template.R and put them in the R project folder you created in Task 1.
Open the R script SSRC_C4_template.R.
Load the tidyverse package.
Use the read.csv() command to load the SSRC data into R and call the respective data object SSRC_data.
Get a first impression of the dataset by inspecting its structure with the str() command.
Transform the three character variables in the dataset into factor variables. Make sure that the levels of the physical_activity_level and education_level variables are ordered in a reasonable way. (You learned how to do that in Chapter 3.)
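Hint: the loading and factor steps above could look roughly like the following sketch. It does not use the real SSRC_data.csv. Instead it writes a small made-up table to a temporary file, so the column name sex and all level labels ("low", "medium", "high", "primary", ...) are assumptions that you need to replace with what str() shows for the real data. Base R's factor() is used here as one possible way to order levels; it is not necessarily the approach from Chapter 3.

```r
# Toy stand-in for SSRC_data.csv: values, the `sex` column name and all
# level labels are assumptions made for illustration only.
toy <- data.frame(
  age = c(23, 35, 47, 52, 61, 29),
  bmi = c(21.4, 27.9, 24.3, 31.2, 26.0, 19.8),
  sex = c("female", "male", "female", "male", "male", "female"),
  physical_activity_level = c("low", "high", "medium", "low", "medium", "high"),
  education_level = c("primary", "tertiary", "secondary",
                      "secondary", "primary", "tertiary")
)
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# In the exercise itself this would be read.csv("SSRC_data.csv").
SSRC_data <- read.csv(csv_path)

# Compact overview: one line per variable with its type and first values.
str(SSRC_data)

# Unordered category: the default (alphabetical) level order is fine.
SSRC_data$sex <- factor(SSRC_data$sex)

# Ordered categories: state the level order explicitly.
SSRC_data$physical_activity_level <- factor(
  SSRC_data$physical_activity_level,
  levels = c("low", "medium", "high")
)
SSRC_data$education_level <- factor(
  SSRC_data$education_level,
  levels = c("primary", "secondary", "tertiary")
)

# Check the resulting level order.
levels(SSRC_data$physical_activity_level)
levels(SSRC_data$education_level)
```

Without the explicit levels argument, factor() sorts the levels alphabetically, which would put "high" before "low" in this toy example.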
What kind of plot would be useful to analyze the …
Use a bar chart to analyze the distribution of the physical_activity_level variable.
Create the same bar chart as in Task 9, but with colored bars and a reduced bar width of 0.5.
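Hint: a minimal sketch of the two bar charts, using ggplot2 (attached automatically when you load the tidyverse). The data frame toy and its level labels are made up for illustration. "Colored bars" is interpreted here as a single fill color; "steelblue" is an arbitrary choice.

```r
library(ggplot2)

# Made-up data; the level labels are assumptions.
toy <- data.frame(
  physical_activity_level = factor(
    c("low", "medium", "high", "medium", "low", "medium"),
    levels = c("low", "medium", "high")
  )
)

# Plain bar chart: geom_bar() counts the rows per level.
p_bar_plain <- ggplot(toy, aes(x = physical_activity_level)) +
  geom_bar()

# Same chart with a fill color and narrower bars.
p_bar <- ggplot(toy, aes(x = physical_activity_level)) +
  geom_bar(fill = "steelblue", width = 0.5)

print(p_bar)
```

If each level should get its own color instead, mapping fill = physical_activity_level inside aes() is an alternative.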
Create a histogram to analyze the distribution of the variable age.
Create the same histogram as in Task 11 but change the binwidth to 1.
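Hint: the effect of binwidth can be sketched as follows. The ages are simulated and only stand in for the real age variable. With binwidth = 1, every bar covers one unit of age (one year, if age is recorded in years).

```r
library(ggplot2)

# Simulated ages, for illustration only.
set.seed(1)
toy <- data.frame(age = round(runif(200, min = 18, max = 80)))

# Default histogram: ggplot2 picks 30 bins and prints a message about it.
p_hist_default <- ggplot(toy, aes(x = age)) +
  geom_histogram()

# Histogram with an explicit bin width of 1.
p_hist <- ggplot(toy, aes(x = age)) +
  geom_histogram(binwidth = 1)

print(p_hist)
```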
Create a density plot to analyze the distribution of the variable bmi.
Create a plot that depicts the distributions of bmi for males and females in a single plot.
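Hint: one possible way to draw a density plot, and to overlay the densities of two groups, is sketched below. The data are simulated, and the grouping column name sex with the labels "female" and "male" is an assumption; check the actual name and labels in SSRC_data.

```r
library(ggplot2)

# Simulated BMI values for two groups; column name `sex` is an assumption.
set.seed(2)
toy <- data.frame(
  bmi = c(rnorm(100, mean = 24, sd = 3), rnorm(100, mean = 27, sd = 4)),
  sex = factor(rep(c("female", "male"), each = 100))
)

# Density of bmi for the whole sample.
p_density <- ggplot(toy, aes(x = bmi)) +
  geom_density()

# One density curve per group in the same panel; alpha makes the
# overlapping areas visible.
p_density_by_sex <- ggplot(toy, aes(x = bmi, fill = sex)) +
  geom_density(alpha = 0.4)

print(p_density_by_sex)
```

Overlaid densities are only one option for comparing the two distributions; facets or side-by-side boxplots would also answer the task.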
Create a set of parallel boxplots to describe the relationship between education_level and bmi.
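Hint: parallel boxplots put the categorical variable on one axis and the numeric variable on the other. A minimal sketch with simulated data follows; the education labels are assumptions.

```r
library(ggplot2)

# Simulated data; the three education labels are assumptions.
set.seed(3)
toy <- data.frame(
  education_level = factor(
    rep(c("primary", "secondary", "tertiary"), each = 20),
    levels = c("primary", "secondary", "tertiary")
  ),
  bmi = rnorm(60, mean = 26, sd = 4)
)

# One box per education level, ordered by the factor levels.
p_box <- ggplot(toy, aes(x = education_level, y = bmi)) +
  geom_boxplot()

print(p_box)
```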
Create a scatterplot to analyze the relationship between age and bmi.
Create the same scatterplot as in Task 16 and add a line that approximates the relationship between age and bmi. (Use method = "lm".)
Create the same scatterplot as in Task 17 and add three horizontal lines that indicate bmi levels of 18.5, 25 and 30.
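Hint: the three scatterplot tasks build on each other, so they can be sketched as layers added to one plot. The data below are simulated, and the dashed line type is an arbitrary styling choice.

```r
library(ggplot2)

# Simulated age and bmi values, for illustration only.
set.seed(4)
toy <- data.frame(age = runif(150, min = 18, max = 80))
toy$bmi <- 20 + 0.1 * toy$age + rnorm(150, sd = 3)

p_scatter <- ggplot(toy, aes(x = age, y = bmi)) +
  # Layer 1: one point per observation.
  geom_point() +
  # Layer 2: fitted straight line from a linear model (bmi ~ age),
  # drawn with a confidence band by default.
  geom_smooth(method = "lm") +
  # Layer 3: horizontal reference lines at the three bmi levels.
  geom_hline(yintercept = c(18.5, 25, 30), linetype = "dashed")

print(p_scatter)
```

Note that the quotation marks around lm must be straight quotes; typographic quotes copied from a document cause a syntax error in R.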