Chapter 12 Descriptive Analysis
12.1 Single Column
12.1.1 Find minimum value in a numeric column
Description |
---|
Method to find the minimum value of a numeric column in a dataframe |

Ingredients | |
---|---|
Package | Data |
readr | sample.csv |
dplyr | |
Preparation
df <- readr::read_csv("C:/data/sample.csv")
Sample Instructions
package::function(data, new_column_name = function(column_name, na.rm = TRUE))
Actual Instructions
dplyr::summarize(df, minimum_year_built = min(year_built, na.rm = TRUE))
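The effect of `na.rm = TRUE` is easiest to see on a small hand-made data frame. The following is a minimal sketch: `df_demo` and its `year_built` values are invented for illustration and are not taken from sample.csv.

```r
# Hypothetical data standing in for sample.csv; all values are invented.
df_demo <- data.frame(year_built = c(1952, NA, 1987, 2004))

# With na.rm = TRUE the missing value is dropped before min() runs.
with_na_rm <- dplyr::summarize(df_demo, minimum_year_built = min(year_built, na.rm = TRUE))

# Without na.rm = TRUE, a single NA makes the result NA.
without_na_rm <- dplyr::summarize(df_demo, minimum_year_built = min(year_built))

print(with_na_rm)
print(without_na_rm)
```

The same behaviour applies to `max()`, `mean()`, `median()`, `sd()`, and `sum()` in the recipes that follow.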
12.1.2 Find maximum value in a numeric column
Description |
---|
Method to find the maximum value of a numeric column in a dataframe |

Ingredients | |
---|---|
Package | Data |
readr | sample.csv |
dplyr | |
Preparation
df <- readr::read_csv("C:/data/sample.csv")
Sample Instructions
package::function(data, new_column_name = function(column_name, na.rm = TRUE))
Actual Instructions
dplyr::summarize(df, maximum_year_built = max(year_built, na.rm = TRUE))
12.1.3 Calculate mean value in a numeric column
Description |
---|
Method to calculate the mean value of a numeric column in a dataframe |

Ingredients | |
---|---|
Package | Data |
readr | sample.csv |
dplyr | |
Preparation
df <- readr::read_csv("C:/data/sample.csv")
Sample Instructions
package::function(data, new_column_name = function(column_name, na.rm = TRUE))
Actual Instructions
dplyr::summarize(df, mean_year_built = mean(year_built, na.rm = TRUE))
12.1.4 Calculate median value in a numeric column
Description |
---|
Method to calculate the median value of a numeric column in a dataframe |

Ingredients | |
---|---|
Package | Data |
readr | sample.csv |
dplyr | |
Preparation
df <- readr::read_csv("C:/data/sample.csv")
Sample Instructions
package::function(data, new_column_name = function(column_name, na.rm = TRUE))
Actual Instructions
dplyr::summarize(df, median_year_built = median(year_built, na.rm = TRUE))
12.1.5 Calculate standard deviation in a numeric column
Description |
---|
Method to calculate the standard deviation of a numeric column in a dataframe |

Ingredients | |
---|---|
Package | Data |
readr | sample.csv |
dplyr | |
Preparation
df <- readr::read_csv("C:/data/sample.csv")
Sample Instructions
package::function(data, new_column_name = function(column_name, na.rm = TRUE))
Actual Instructions
dplyr::summarize(df, stdev_year_built = sd(year_built, na.rm = TRUE))
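Note that R's `sd()` returns the sample standard deviation, which divides by n - 1. The sketch below uses invented values rather than sample.csv and compares it with a population version derived by rescaling. The names `df_demo`, `n_non_missing`, and `pop_sd_year_built` are hypothetical and introduced only for this illustration.

```r
# Hypothetical data; all values are invented.
df_demo <- data.frame(year_built = c(1950, 1960, 1970, 1980, NA))

result <- dplyr::summarize(
  df_demo,
  # Sample standard deviation (denominator n - 1), ignoring the NA.
  stdev_year_built = sd(year_built, na.rm = TRUE),
  # Number of non-missing values used in the calculation.
  n_non_missing = sum(!is.na(year_built)),
  # Population standard deviation (denominator n), obtained by rescaling.
  pop_sd_year_built = stdev_year_built * sqrt((n_non_missing - 1) / n_non_missing)
)

print(result)
```

For a column with many rows the two values are nearly identical; the difference matters mainly for small samples.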
12.1.6 Calculate sum of a numeric column
Description |
---|
Method to calculate the total sum of a numeric column in a dataframe |

Ingredients | |
---|---|
Package | Data |
readr | sample.csv |
dplyr | |
Preparation
df <- readr::read_csv("C:/data/sample.csv")
Sample Instructions
package::function(data, new_column_name = function(column_name, na.rm = TRUE))
Actual Instructions
dplyr::summarize(df, sum_living_units = sum(living_units, na.rm = TRUE))
12.1.7 Round calculated mean value in a numeric column
Description |
---|
Method to calculate the mean value of a numeric column in a dataframe and round it to one decimal place |

Ingredients | |
---|---|
Package | Data |
readr | sample.csv |
dplyr | |
Preparation
df <- readr::read_csv("C:/data/sample.csv")
Sample Instructions
package::function(data, new_column_name = function(function(column_name, na.rm = TRUE), number))
Actual Instructions
dplyr::summarize(df, mean_year_built = round(mean(year_built, na.rm = TRUE), 1))
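The second argument of `round()` is the number of decimal places to keep. A minimal sketch with invented values (not from sample.csv) puts the unrounded and rounded means side by side; `df_demo` and `mean_year_built_raw` are hypothetical names used only here.

```r
# Hypothetical data; all values are invented.
df_demo <- data.frame(year_built = c(1950, 1975, 1981, NA))

result <- dplyr::summarize(
  df_demo,
  # Unrounded mean of the three non-missing values.
  mean_year_built_raw = mean(year_built, na.rm = TRUE),
  # Same mean rounded to one decimal place.
  mean_year_built = round(mean(year_built, na.rm = TRUE), 1)
)

print(result)
```

Changing the `1` to `0` or `2` would round to whole numbers or to two decimal places, respectively.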
12.1.8 Calculate multiple descriptive statistics of a numeric column
Description |
---|
Method to calculate multiple descriptive statistics from a numeric column in a dataframe |

Ingredients | |
---|---|
Package | Data |
readr | sample.csv |
dplyr | |
Preparation
df <- readr::read_csv("C:/data/sample.csv")
Sample Instructions
package::function(data, new_column_name = function(column_name, na.rm = TRUE),
                  new_column_name = function(column_name, na.rm = TRUE),
                  new_column_name = function(column_name, na.rm = TRUE),
                  new_column_name = function(column_name, na.rm = TRUE),
                  new_column_name = function(column_name, na.rm = TRUE))
Actual Instructions
dplyr::summarize(df, min_year_built = min(year_built, na.rm = TRUE),
                 mean_year_built = mean(year_built, na.rm = TRUE),
                 median_year_built = median(year_built, na.rm = TRUE),
                 max_year_built = max(year_built, na.rm = TRUE),
                 stdev_year_built = sd(year_built, na.rm = TRUE))
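To show the shape of the result, the sketch below runs the same call on a small invented data frame (`df_demo` is hypothetical and its values are not from sample.csv). The output is a single row with one column per statistic.

```r
# Hypothetical data; all values are invented.
df_demo <- data.frame(year_built = c(1950, 1960, 1970, 1980, NA))

stats <- dplyr::summarize(
  df_demo,
  min_year_built = min(year_built, na.rm = TRUE),
  mean_year_built = mean(year_built, na.rm = TRUE),
  median_year_built = median(year_built, na.rm = TRUE),
  max_year_built = max(year_built, na.rm = TRUE),
  stdev_year_built = sd(year_built, na.rm = TRUE)
)

# One row, five columns: one column for each requested statistic.
print(dim(stats))
print(stats)
```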
12.2 Multiple Columns
Topics: column-based and row-wise summaries; summarize_all and summarize_if; min, mean, max, median, sd
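As a rough sketch of the column-based case, `dplyr::summarize_if()` and `dplyr::summarize_all()` apply the same functions to several columns at once. The data frame below is invented for illustration: `df_demo`, `numeric_only`, the `address` column, and all values are assumptions, not contents of sample.csv. In dplyr 1.0.0 and later these scoped verbs are superseded by `across()`, but they remain available.

```r
# Hypothetical data; the address column and all values are invented.
df_demo <- data.frame(
  address = c("1 Main St", "2 Oak Ave", "3 Elm Rd"),
  year_built = c(1950, 1980, NA),
  living_units = c(1, 4, 2)
)

# summarize_if: apply mean() to every column for which is.numeric() is TRUE.
# Extra arguments such as na.rm = TRUE are passed on to the function.
means <- dplyr::summarize_if(df_demo, is.numeric, mean, na.rm = TRUE)

# summarize_all: apply several functions to every column.
# The non-numeric column is dropped first because min() and max() are
# only meaningful here for the numeric columns.
numeric_only <- df_demo[, c("year_built", "living_units")]
stats <- dplyr::summarize_all(numeric_only, list(min = min, max = max), na.rm = TRUE)

print(means)
print(stats)
```

With a named list of functions, the output columns are named by combining the column name and the function name, for example `year_built_min`.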