Here you will find the sample solution for the Chapter 2 exercise sheet.
1. Create an R project for solving this exercise sheet.
2. Download the CSV file SSRC_data.csv and the R script SSRC_C2_template.R and put them in the R project folder you created in Task 1.
3. Open the R script SSRC_C2_template.R.
4. Use the read.csv() command to load the SSRC data into R and call the resulting data object SSRC_data.
# Load the dataset
SSRC_data <- read.csv("SSRC_data.csv")
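read.csv() resolves a relative file name against the current working directory, which is the project folder when you work inside an R project. The following is a minimal sketch of checking that a file exists before loading it and of inspecting the result. The file SSRC_demo.csv and its three rows are hypothetical stand-ins written to a temporary folder, not the real SSRC data.

```r
# Hypothetical stand-in for SSRC_data.csv: three made-up rows in a temp file
csv_path <- file.path(tempdir(), "SSRC_demo.csv")
demo_rows <- data.frame(
  age = c(64, 59, 39),
  gender = c("male", "female", "female"),
  bmi = c(27.9, 27.5, 27.4)
)
write.csv(demo_rows, csv_path, row.names = FALSE)

# Stop early with a readable message if the file is not where we expect it
if (!file.exists(csv_path)) {
  stop("File not found: ", csv_path, " (working directory: ", getwd(), ")")
}

demo_data <- read.csv(csv_path)

# Inspect column names and types, then the number of rows and columns
str(demo_data)
dim(demo_data)
```

If the real file cannot be found, comparing getwd() with the location of SSRC_data.csv is usually the quickest way to see why.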
5. Get a first impression of the dataset by checking out its first 6 rows and by viewing the data in spreadsheet mode.
# Check out the first 6 rows
head(SSRC_data)
  age gender education_level physical_activity_level  bmi
1  64   male          medium                     low 27.9
2  59 female             low                  medium 27.5
3  39 female            high                     low 27.4
4  30 female            high                     low 24.2
5  49   male          medium                     low 23.9
6  37   male          medium                  medium 30.7
# Open the dataset in spreadsheet mode
View(SSRC_data)
6. Install and load the tidyverse package. (If you have already installed the package, loading it is sufficient.)
# Install tidyverse package (if you have not done it yet)
# install.packages("tidyverse")
# Load tidyverse package
library("tidyverse")
7. Create a dataset that only contains the variables age and bmi and call this dataset SSRC_data_C2_task_7. Check out the first six rows of this dataset.
# Create the dataset
SSRC_data_C2_task_7 <- select(SSRC_data, age, bmi)
# Check out dataset
head(SSRC_data_C2_task_7)
  age  bmi
1  64 27.9
2  59 27.5
3  39 27.4
4  30 24.2
5  49 23.9
6  37 30.7
8. Create a dataset that only contains subjects with a bmi below 18.5 and call this dataset SSRC_data_C2_task_8. Check out the first six rows of this dataset.
# Create the dataset
SSRC_data_C2_task_8 <- filter(SSRC_data, bmi < 18.5)
# Check out dataset
head(SSRC_data_C2_task_8)
  age gender education_level physical_activity_level  bmi
1  28   male          medium                     low 15.8
2  62 female            high                     low 18.3
3  31   male            high                     low 17.4
4  31   male            high                  medium 17.3
5  53   male             low                     low 16.8
6  27 female          medium                  medium 17.3
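One detail of filter() is worth keeping in mind here: it keeps only rows for which the condition is TRUE, so rows with a missing bmi are dropped silently. The sketch below illustrates this on a hypothetical four-row data frame (the values are made up for illustration and are not taken from the SSRC data).

```r
library(dplyr)

# Hypothetical mini dataset; the NA stands for a missing bmi measurement
demo <- data.frame(
  age = c(28, 62, 45, 31),
  bmi = c(15.8, 18.3, NA, 27.0)
)

# filter() keeps rows where the condition is TRUE; FALSE and NA rows are dropped
underweight <- filter(demo, bmi < 18.5)

# Number of rows that satisfy the condition
nrow(underweight)
```

Calling nrow() on the filtered dataset is also a quick way to see how many subjects fall into the group, which head() alone does not tell you.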
9. Create a dataset that only contains individuals who have a low level of education and a bmi above 25 and call this dataset SSRC_data_C2_task_9. Check out the first six rows of this dataset.
# Create the dataset
SSRC_data_C2_task_9 <- filter(SSRC_data, education_level == "low" & bmi > 25)
# Check out dataset
head(SSRC_data_C2_task_9)
  age gender education_level physical_activity_level  bmi
1  59 female             low                  medium 27.5
2  57 female             low                     low 35.3
3  40 female             low                     low 26.3
4  56 female             low                  medium 33.7
5  67   male             low                     low 31.5
6  71 female             low                     low 29.6
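As an aside, filter() also accepts several conditions separated by commas and combines them with a logical AND, so the comma form is an alternative to writing &. A minimal sketch on a hypothetical data frame (made-up values) that compares both spellings:

```r
library(dplyr)

# Hypothetical mini dataset for illustration only
demo <- data.frame(
  education_level = c("low", "low", "high", "medium"),
  bmi = c(27.5, 22.0, 31.2, 26.3)
)

# Conditions joined explicitly with &
with_ampersand <- filter(demo, education_level == "low" & bmi > 25)

# Conditions passed as separate arguments; filter() combines them with AND
with_comma <- filter(demo, education_level == "low", bmi > 25)

# Both spellings select the same rows
all.equal(with_ampersand, with_comma)
```

Which form to use is a matter of taste; the & form makes the logic explicit, which helps once OR conditions (|) are mixed in.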
10. Create a dataset that only contains individuals with a bmi between 18.5 and 25 and that is restricted to the variables bmi and gender. Use the pipe operator to do so and call the dataset SSRC_data_C2_task_10. Check out the first six rows of this dataset.
# Create the dataset
SSRC_data_C2_task_10 <- SSRC_data %>%
  filter(bmi >= 18.5 & bmi <= 25) %>%
  select(bmi, gender)
# Check out dataset
head(SSRC_data_C2_task_10)
   bmi gender
1 24.2 female
2 23.9   male
3 18.5 female
4 24.1   male
5 23.1 female
6 23.1   male
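The solution treats both bounds as inclusive (bmi >= 18.5 and bmi <= 25), which is why a bmi of exactly 18.5 appears in the output. dplyr's between() helper expresses the same inclusive range check in one call. The following sketch uses a hypothetical data frame whose values were chosen to sit on and just outside the boundaries.

```r
library(dplyr)

# Hypothetical values placed on and around the two boundaries
demo <- data.frame(
  bmi = c(18.4, 18.5, 23.1, 25.0, 25.1),
  gender = c("male", "female", "male", "female", "male")
)

# between(x, left, right) is TRUE when left <= x <= right (both ends included)
normal_range <- demo %>%
  filter(between(bmi, 18.5, 25)) %>%
  select(bmi, gender)

normal_range
```

If the task were meant to exclude the boundaries, the explicit comparison with < and > would be the form to use, since between() always includes both ends.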
11. Use the summarize() command in combination with the filter() command to calculate the mean, maximum, and minimum bmi of males with a low level of physical activity.
# Calculate mean, maximum and minimum bmi
SSRC_data %>%
  filter(gender == "male" & physical_activity_level == "low") %>%
  summarize(mean_bmi = mean(bmi), maximum_bmi = max(bmi), minimum_bmi = min(bmi))
  mean_bmi maximum_bmi minimum_bmi
1 27.46175          55        15.8
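The summary above returns numbers because the selected bmi values contain no missing entries. In a dataset with missing values, mean(), max(), and min() return NA unless na.rm = TRUE is set. The sketch below shows this on a hypothetical data frame with one missing bmi (made-up values, not the SSRC data); the column n_missing is an illustrative addition.

```r
library(dplyr)

# Hypothetical mini dataset with one missing bmi among the selected males
demo <- data.frame(
  gender = c("male", "male", "male", "female"),
  physical_activity_level = c("low", "low", "low", "low"),
  bmi = c(27.9, NA, 23.9, 27.5)
)

result <- demo %>%
  filter(gender == "male" & physical_activity_level == "low") %>%
  summarize(
    mean_bmi = mean(bmi, na.rm = TRUE),    # ignore missing values
    maximum_bmi = max(bmi, na.rm = TRUE),
    minimum_bmi = min(bmi, na.rm = TRUE),
    n_missing = sum(is.na(bmi))            # report how many were ignored
  )

result
```

Reporting the number of ignored values alongside the summary makes it visible how much data the statistics are actually based on.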
12. Use the summarize() command in combination with the group_by() command to compare males and females with respect to their mean age and bmi.
# Compare males and females with respect to age and bmi
SSRC_data %>%
  group_by(gender) %>%
  summarize(mean_bmi = mean(bmi), mean_age = mean(age))
# A tibble: 2 × 3
  gender mean_bmi mean_age
  <chr>     <dbl>    <dbl>
1 female     26.4     47.1
2 male       27.2     48.4
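When comparing group means, it often helps to also report how many observations each group contains. A minimal sketch, assuming a hypothetical five-row data frame (made-up values), that adds the group size with dplyr's n():

```r
library(dplyr)

# Hypothetical mini dataset for illustration only
demo <- data.frame(
  gender = c("female", "female", "male", "male", "male"),
  age = c(59, 39, 64, 49, 37),
  bmi = c(27.5, 27.4, 27.9, 23.9, 30.7)
)

by_gender <- demo %>%
  group_by(gender) %>%
  summarize(
    n = n(),                # number of rows in each group
    mean_bmi = mean(bmi),
    mean_age = mean(age)
  )

by_gender
```

The result contains one row per gender, sorted by the grouping variable, with the group size next to the two means.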