Load Data and Packages

Load Packages

# Load tidyverse package
library("tidyverse")

# Load the knitr package, which provides kable()
library("knitr")

Load Dataset

# Load the dataset
SSRC_data <- read.csv("SSRC_data.csv")

Data Preparation

Preview the first six rows to get a first impression of the dataset.

kable(head(SSRC_data))
age  gender  education_level  physical_activity_level  bmi
64   male    medium           low                      27.9
59   female  low              medium                   27.5
39   female  high             low                      27.4
30   female  high             low                      24.2
49   male    medium           low                      23.9
37   male    medium           medium                   30.7
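
Beyond the first rows, it can help to check column types and missing values before transforming anything. The following is a minimal sketch in base R. The data frame `preview_rows` is a hypothetical stand-in that re-enters the six rows shown above so the code runs on its own; with the full dataset, `SSRC_data` would take its place.

```r
# Hypothetical stand-in: the six preview rows, typed in by hand
preview_rows <- data.frame(
  age = c(64, 59, 39, 30, 49, 37),
  gender = c("male", "female", "female", "female", "male", "male"),
  education_level = c("medium", "low", "high", "high", "medium", "medium"),
  physical_activity_level = c("low", "medium", "low", "low", "low", "medium"),
  bmi = c(27.9, 27.5, 27.4, 24.2, 23.9, 30.7)
)

# Dimensions and column types
str(preview_rows)

# Number of missing values per column
missing_per_column <- colSums(is.na(preview_rows))
print(missing_per_column)

# Minimum, quartiles, mean and maximum of the numeric columns
summary(preview_rows[, c("age", "bmi")])
```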

Transform the three categorical variables in the dataset into factor variables.

# Transform into factor variables 
SSRC_data <- mutate(SSRC_data, gender = as.factor(gender),
                               education_level = as.factor(education_level),
                               physical_activity_level = as.factor(physical_activity_level))
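
Note that as.factor() orders the levels alphabetically, which may not match the natural order of categories such as low, medium, high. The sketch below shows one way to set the order explicitly with factor(). It assumes the only categories are the three visible in the preview; the vector `education` is a hypothetical stand-in for the `education_level` column, and any value not listed in `levels` would become NA.

```r
# Hypothetical stand-in for the education_level column (values from the preview)
education <- c("medium", "low", "high", "high", "medium", "medium")

# as.factor() sorts the levels alphabetically
levels(as.factor(education))
# "high" "low" "medium"

# factor() with an explicit levels argument keeps the intended order
education_ordered <- factor(education, levels = c("low", "medium", "high"))
levels(education_ordered)
# "low" "medium" "high"
```

The level order determines the order of categories in tables and on plot axes, so setting it once at this stage avoids reordering later.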

BMI Distribution

We examine the distribution of BMI graphically using:

  1. A histogram
  2. A density plot

Histogram

# Create histogram
ggplot(data = SSRC_data, mapping = aes(x = bmi)) +
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
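
The message above appears because no bin width was specified, so ggplot2 falls back to 30 bins. A sketch of how the width could be set explicitly follows. The value of 1 BMI unit is an assumption chosen for illustration, not a setting from the original analysis, and `preview_rows` is a hypothetical stand-in holding the six BMI values from the preview; with the full dataset, `SSRC_data` would be used instead.

```r
library(ggplot2)

# Hypothetical stand-in: BMI values of the six preview rows
preview_rows <- data.frame(bmi = c(27.9, 27.5, 27.4, 24.2, 23.9, 30.7))

# binwidth = 1 makes each bar one BMI unit wide (assumed value);
# boundary = 0 aligns the bin edges with whole numbers
bmi_histogram <- ggplot(data = preview_rows, mapping = aes(x = bmi)) +
  geom_histogram(binwidth = 1, boundary = 0)

print(bmi_histogram)
```

Trying a few widths is usually worthwhile, since very narrow bins produce a noisy picture and very wide bins hide the shape of the distribution.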

Density Plot

# Create density plot
ggplot(data = SSRC_data, mapping = aes(x = bmi)) +
  geom_density()
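
To compare the two views directly, the histogram can be rescaled to the density scale and drawn in the same panel as the density curve. This is a sketch under the same assumptions as before: `preview_rows` is a hypothetical stand-in for `SSRC_data`, and the bin width of 1 is an illustrative choice.

```r
library(ggplot2)

# Hypothetical stand-in: BMI values of the six preview rows
preview_rows <- data.frame(bmi = c(27.9, 27.5, 27.4, 24.2, 23.9, 30.7))

# after_stat(density) puts the bars on the density scale,
# so bars and curve share the same y-axis
bmi_overlay <- ggplot(data = preview_rows, mapping = aes(x = bmi)) +
  geom_histogram(mapping = aes(y = after_stat(density)),
                 binwidth = 1, boundary = 0) +
  geom_density()

print(bmi_overlay)
```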

Age Distribution

We examine the distribution of age graphically using:

  1. A histogram
  2. A density plot

Histogram
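
The code for this plot is not shown here. A sketch that mirrors the BMI histogram might look as follows; `preview_rows` is a hypothetical stand-in holding the six ages from the preview (with the full dataset, `SSRC_data` would be used), and the bin width of 5 years is an assumed value.

```r
library(ggplot2)

# Hypothetical stand-in: ages of the six preview rows
preview_rows <- data.frame(age = c(64, 59, 39, 30, 49, 37))

# binwidth = 5 groups respondents into five-year age bands (assumed value)
age_histogram <- ggplot(data = preview_rows, mapping = aes(x = age)) +
  geom_histogram(binwidth = 5, boundary = 0)

print(age_histogram)
```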

Density Plot
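
As with the histogram, the code for this plot is not shown. A sketch that mirrors the BMI density plot, again using the hypothetical stand-in `preview_rows` in place of `SSRC_data`:

```r
library(ggplot2)

# Hypothetical stand-in: ages of the six preview rows
preview_rows <- data.frame(age = c(64, 59, 39, 30, 49, 37))

# Kernel density estimate of age with ggplot2's default bandwidth
age_density <- ggplot(data = preview_rows, mapping = aes(x = age)) +
  geom_density()

print(age_density)
```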