Chapter 10 Modifying Data

When working with data it is common that values in columns need to be modified, or creating new columns of data. The following recipes outline how to change data with a column, and how to create data in a new column using different techniques.

10.1 Static Value

Changing or creating data values in a column so that all rows have the same value; with examples for character, numeric, and date data types.

10.1.1 Change or create a single character column with a static character value

Description
Method to change or create a character column with a single specific character value
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, column_name = "text")


Actual Instructions

dplyr::mutate(df, country = "Canada")

10.1.2 Change or create a single numeric column with a static numeric value

Description
Method to change or create a numeric column with a single specific numeric value
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, column_name = number)


Actual Instructions

dplyr::mutate(df, population = 100000)

10.1.3 Change a single date column with a static date value

Description
Method to change a date column with a single specific date value
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, column_name = "yyyy-mm-dd")


Actual Instructions

dplyr::mutate(df, date_added = "2021-11-19")

10.1.4 Create a new date column with a static date value

Description
Method to create a new date column with a single specific date value
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name = as.Date("yyyy-mm-dd"))


Actual Instructions

dplyr::mutate(df, date_added = as.Date("2021-11-19"))

10.1.5 Change or create multiple character columns with a static character value

Description
Method to change or create multiple character columns with a single specific character value
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, column_name1 = "text",
                  column_name2 = "text")


Actual Instructions

dplyr::mutate(df, country = "Canada",
              country_code = "CA")

10.1.6 Change or create multiple numeric columns with a static numeric value

Description
Method to change or create multiple numeric columns with a single specific numeric value
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, column_name1 = number,
                  column_name2 = number)


Actual Instructions

dplyr::mutate(df, population = 100000,
              growth_rate = 1.75)

10.1.7 Change multiple date columns with a static date value

Description
Method to change multiple date columns with a single specific date value
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, column_name1 = "yyyy-mm-dd",
                  column_name2 = "yyyy-mm-dd")


Actual Instructions

dplyr::mutate(df, date_added = "2021-11-19",
              date_changed = "2021-11-20")

10.1.8 Create multiple new date columns with a static date value

Description
Method to create multiple new date columns with a single specific date value
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name1 = as.Date("yyyy-mm-dd"),
                  new_column_name2 = as.Date("yyyy-mm-dd"))


Actual Instructions

dplyr::mutate(df, date_added = as.Date("2021-11-19"),
              date_changed = as.Date("2021-11-20"))

10.2 Using Existing Column(s)

Changing or creating data values in a column with values from an existing column of data; with examples for character, numeric, and date data types.

10.2.1 Create new character column from static text and a character column

Description
Method to create a new character column by combining values from a character column with static text
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name, paste0("text", column_name))


Actual Instructions

dplyr::mutate(df, label = paste0("My name is ", first_name))

10.2.2 Create new character column from multiple character columns

Description
Method to create a new character column by combining values from multiple character columns
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name, paste0(column_name1, column_name2))


Actual Instructions

dplyr::mutate(df, username = paste0(first_name, last_name))

10.2.3 Create new character column from multiple character columns and static text

Description
Method to create a new character column by combining values from multiple character columns separated by a static text value
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name, paste0(column_name1, " ", column_name2))


Actual Instructions

dplyr::mutate(df, legal_name = paste0(first_name, " ", last_name))

10.2.4 Create new numeric column from a numeric column and a mathematical operation

Description
Method to create a new numeric column by combining values from a numeric column and using a mathematical operation
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name = column_name operator number)


Actual Instructions

dplyr::mutate(df, cost_per_month = cost / 12)

10.2.5 Create new numeric column from multiple numeric columns and mathematical operations

Description
Method to create a new numeric column by combining values from multiple numeric column and using a mathematical operation
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name = column_name1 operator column_name2)


Actual Instructions

dplyr::mutate(df, population_density = population / area_sq_km)

10.2.6 Create new date column from a date column and a mathematical operation

Description
Method to create a new date column by combining values from a date column and using a mathematical operation
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name = column_name operator number)


Actual Instructions

dplyr::mutate(df, closing_date = date_created + 7)

10.2.7 Create new date column from multiple date columns and mathematical operations

Description
Method to create a new date column by combining values from multiple date column and using a mathematical operation
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name = column_name1 operator column_name2)


Actual Instructions

dplyr::mutate(df, duration = end_date - start_date)

10.2.8 Create new date column from a date column and a function

Description
Method to create a new date column by combining values from a date column and using a function
Ingredients
Package Data

readr
dplyr
lubridate

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name = package::function(column_name))


Actual Instructions

dplyr::mutate(df, year_created = lubridate::year(date_created))

10.3 Conditionally from Existing Column

Creating data values in a new column using conditional statements from a existing column of data.

10.3.1 Create new character column conditionally with default static value

Description
Method to create a new character column by using a conditional statement adding a static default value
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name = package::function(
  column_name operator number ~ "text",
  TRUE ~ "text"))


Actual Instructions

dplyr::mutate(df, result = dplyr::case_when(
  score >= 50 ~ "pass",
  TRUE ~ "fail"))

10.3.2 Create new character column conditionally with character column as default value

Description
Method to create a new character column by using a conditional statement adding default values from a character column
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name = package::function(
  column_name operator number ~ "text",
  TRUE ~ column_name))


Actual Instructions

dplyr::mutate(df, result = dplyr::case_when(
  grade == "F" ~ "Fail",
  TRUE ~ result))

10.3.3 Create new character column conditionally from existing numeric column with default static value

Description
Method to create a new character column by using a conditional statement from an existing numeric column adding a static default value
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name = package::function(
  column_name operator number ~ "text",
  column_name operator number ~ "text",
  TRUE ~ "text"))


Actual Instructions

dplyr::mutate(data, result = dplyr::case_when(
  score < 50 ~ "fail",
  score >= 50 ~ "pass",
  TRUE ~ "error"))

10.3.4 Create new character column conditionally from range of values in existing numeric column with default static value

Description
Method to create a new character column by using a conditional statement from a range of values in an existing numeric column adding a static default value
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name = package::function(
  package::function(column_name, low_range, high_range) ~ "text",
  package::function(column_name, low_range, high_range) ~ "text",
  package::function(column_name, low_range, high_range) ~ "text",
  package::function(column_name, low_range, high_range) ~ "text",
  TRUE ~ "text"))


Actual Instructions

dplyr::mutate(data, grade = dplyr::case_when(
  dplyr::between(score, 80, 100) ~ "A",
  dplyr::between(score, 70, 79) ~ "B",
  dplyr::between(score, 60, 69) ~ "C",
  dplyr::between(score, 50, 59) ~ "D",
  TRUE ~ "F"))

10.3.5 Create new character column conditionally from specific values in a character column

Description
Method to create a new character column by using a conditional statement from specific values from a character column
Ingredients
Package Data

readr
dplyr

sample.csv


Preparation

df <- readr::read_csv("C:/data/sample.csv")


Sample Instructions

package::function(data, new_column_name = package::function(
  column_name %in% c("text1", "text2", "text3", "text4") ~ "new value1",
  column_name == "text5" ~ "new value 2",
  column_name == "text6" ~ "new value 3",
  column_name %in% c("text7", "text8") ~ "new value4",
  column_name %in% c("text9", "text10") ~ "new value5",
  column_name %in% c("text11", "text12", "text13") ~ "new value6"))


Actual Instructions

dplyr::mutate(data, region = dplyr::case_when(
  province_code %in% c("NL", "NS", "PE", "NB") ~ "Atlantic",
  province_code == "QC" ~ "Quebec",
  province_code == "ON" ~ "Ontario",
  province_code %in% c("MB", "SK") ~ "Praries",
  province_code %in% c("AB", "BC") ~ "Western",
  province_code %in% c("NU", "NT", "YK") ~ "Northern"))