Skip to content

Instantly share code, notes, and snippets.

@Lextuga007
Created June 2, 2022 16:38
Show Gist options
  • Select an option

  • Save Lextuga007/51162fbed742369d67a3dc3b7341491e to your computer and use it in GitHub Desktop.

Select an option

Save Lextuga007/51162fbed742369d67a3dc3b7341491e to your computer and use it in GitHub Desktop.
Preparing the clinicalcommissioninggroupmidyearpopulationestimates data for plotting
library(tidyverse)
library(httr) # to load from the web
library(readxl) # to read in the file
library(janitor) # to clean the spreadsheet
# load data directly from the webpage ---------------------------
tmp <- tempfile(fileext = ".xlsx")
url <- GET(url = "https://www.ons.gov.uk/file?uri=%2fpeoplepopulationandcommunity%2fpopulationandmigration%2fpopulationestimates%2fdatasets%2fclinicalcommissioninggroupmidyearpopulationestimates%2fmid2020sape23dt6a/sape23dt6amid2020ccg2021estimatesunformatted.xlsx",
write_disk(tmp))
# tidy data ---------------------------
# uses janitor package for removing empty rows, columns and making columns headers snake_case
clean_data <- read_xlsx(tmp, sheet = 4, skip = 5) %>%
remove_empty(c("rows", "cols")) %>%
clean_names() %>%
select(ccg_code:ccg_name,
x5:x10) %>%
pivot_longer(names_to = "population",
cols = x5:x10)
# To sum age groups 5 to 10 before plotting
summed_ages <- clean_data %>%
group_by(ccg_code,
ccg_name) %>%
summarise(total = sum(value))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment