Zoë Turner Lextuga007

#' Check for integers
#'
#' @param df data frame to check
#' @param select_int data frame returns just the columns of data that are
#' integer64. The default returns all columns.
#' @param tidy logical, set to TRUE to change any integer or integer64 to
#' integer and give a message. FALSE will return the original data with no changes
#' and no message.
#'
#' @returns data frame and log where integer64 found
library(dplyr)
library(tidyr)
# Create the data - the date ranges are shortened so the code is quicker to run and the results are easier to see
data <- tibble::tribble(
~Doctor_name, ~Start_date, ~End_date, ~Weekly_hours,
"Tom", "2023-07-01", "2023-07-06", 40,
"Dick", "2023-06-01", "2023-08-30", 20,
"Harry", "2023-05-01", "2023-07-14", 8
Lextuga007 / CCG population
Created June 2, 2022 16:38
Preparing the clinicalcommissioninggroupmidyearpopulationestimates data for plotting
library(tidyverse)
library(httr) # to load from the web
library(readxl) # to read in the file
library(janitor) # to clean the spreadsheet
# load data directly from the webpage ---------------------------
tmp <- tempfile(fileext = ".xlsx")
url <- GET(url = "https://www.ons.gov.uk/file?uri=%2fpeoplepopulationandcommunity%2fpopulationandmigration%2fpopulationestimates%2fdatasets%2fclinicalcommissioninggroupmidyearpopulationestimates%2fmid2020sape23dt6a/sape23dt6amid2020ccg2021estimatesunformatted.xlsx",
write_disk(tmp))
library(tidyverse)
library(lubridate)
# Fix the random records according to this 'seed' so every time the sequence is
# generated it is always the same
set.seed(123)
# create data and repeat 34 x 3 = 102 times
data <- data.frame(
date = sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by = "day"), 102),
# Suicides #
# code based mostly on https://www.trafforddatalab.io/charticles/2020-11-16-suicides/
# Source: Office for National Statistics
# URL: https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/suicidesbylocalauthority
# Licence: OGL 3.0
# load libraries ---------------------------
library(tidyverse) ; library(httr) ; library(readxl) ; library(ggtext) ; library(stringr) ; library(janitor)
Lextuga007 / caseload code
Created May 1, 2021 17:19
Finding open referrals in a given period
library(lubridate)
library(tidyverse)
# create random start and end dates and ids which can be repeated
data <- data.frame(
start_date = sample(seq(as.Date('2019/01/01'), as.Date('2021/01/01'), by = "day"), 300),
end_date = sample(seq(as.Date('2019/01/01'), as.Date('2021/01/01'), by = "day"), 300),
patient_id = floor(runif(300, min = 1, max = 300))
)
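The preview above stops after the random data is created. As a minimal sketch of the idea in the gist description (finding referrals open in a given period), the base R snippet below uses a small hand-written data frame instead of the random one. The names `referrals`, `period_start` and `period_end`, and the dates themselves, are assumptions made for illustration and are not taken from the gist.

```r
# Hypothetical referrals: one row per referral with a start and an end date
referrals <- data.frame(
  patient_id = c(1, 2, 3, 4),
  start_date = as.Date(c("2019-01-10", "2019-03-01", "2019-06-15", "2020-02-01")),
  end_date   = as.Date(c("2019-02-20", "2019-07-31", "2019-06-30", "2020-03-01"))
)

# Reporting period (assumed for illustration)
period_start <- as.Date("2019-06-01")
period_end   <- as.Date("2019-06-30")

# A referral counts as open in the period if it started on or before the
# period end and ended on or after the period start
is_open <- referrals$start_date <= period_end &
  referrals$end_date >= period_start

open_referrals <- referrals[is_open, ]
open_referrals
```

The overlap test needs both comparisons: checking only the start date would miss referrals that began before the period and were still open during it.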
Lextuga007 / Function
Last active April 1, 2021 09:43
Function
tidy_imd <- function(data, tidy_imd_vars = NULL){
# First rename all variables
data <- data %>%
dplyr::select(lsoa_code_2011 = LSOAcode2011,
lsoa_name_2011 = LSOAname2011,
everything()) %>%
janitor::clean_names() %>%
dplyr::select(lsoa_code_2011,
lsoa_name_2011,
# create a tibble of month names (generated by copying over from Excel using {datapasta} :)
birthmonth <- tibble::tribble(
~month_full, ~month_short,
"January", "Jan",
"February", "Feb",
"March", "Mar",
"April", "Apr",
"May", "May",
"June", "Jun",
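The preview of the month table is cut off here. As an alternative sketch (not the author's {datapasta} approach), a table with the same two columns can be built from the constants `month.name` and `month.abb` that ship with base R. The object name `birthmonth_base` is hypothetical and chosen so it does not overwrite the tibble above.

```r
# Build a full and abbreviated month-name table from base R constants
birthmonth_base <- data.frame(
  month_full  = month.name,  # "January", "February", ...
  month_short = month.abb,   # "Jan", "Feb", ...
  stringsAsFactors = FALSE
)

head(birthmonth_base)
```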
Lextuga007 / gist:d271076245e1d2944a64893d62a3919b
Created December 2, 2020 09:47
How-to-remove-similar-kind-of-data-from-a-csv-file
Post was removed from RStudio Community: https://community.rstudio.com/t/how-to-remove-similar-kind-of-data-from-a-csv-file/89706
I would create a lookup table listing the categories, add a filter flag to it, and then use that flag to filter the data. Keeping the unique categories in a separate table lets you check your assumptions, and it may highlight data quality issues around these names in the original data.
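A minimal base R sketch of that lookup-and-flag idea follows. It is separate from the author's own reprex, and the data, the column names `category`, `value` and `remove`, and the "apple" pattern are all assumptions made for illustration.

```r
# Hypothetical data with some similar-looking categories
df <- data.frame(
  category = c("Apple", "apple pie", "Banana", "Cherry", "banana split"),
  value = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# Lookup table: one row per unique category, so the names can be inspected
lookup <- data.frame(
  category = unique(df$category),
  stringsAsFactors = FALSE
)

# Filter flag: mark every category that matches the unwanted pattern
lookup$remove <- grepl("apple", lookup$category, ignore.case = TRUE)

# Bring the flag back onto the data and keep only the unflagged rows
df$remove <- lookup$remove[match(df$category, lookup$category)]
kept <- df[!df$remove, c("category", "value")]
kept
```

Printing `lookup` before filtering is the step that lets you check which categories the pattern actually caught.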
```{r reprex}
library(dplyr)
library(stringr)
# Create reprex
df <- tibble::tribble(
~start, ~end, ~weight,
Lextuga007 / intro-nhsr.md
Last active April 14, 2021 17:55
Checklist for NHS-R Introduction to R and R Studio

Trainer checks

  • Restart R with Ctrl+Shift+F10 or Session/Restart R, as the session doesn't reset automatically and all previously loaded packages will still be loaded

Reset R Studio Cloud Global Options

  • Save workspace to .RData on exit = Ask
  • Tick Restore .RData into workspace at startup
  • Appearance RStudio theme = Modern
  • Ambience = Clouds
  • Editor font size = 10