@marcusrussi
Last active August 29, 2019 15:25
requireNamespace("tibble")
requireNamespace("dplyr")
requireNamespace("purrr")
# Tallies the count of every distinct value of every variable in
# a tibble, returning the output as a list indexed by variable name.
#
# tbl -> list of {varname: {val, n}}
#   tbl:  an ungrouped tibble
#   list: each element is a 2-element list. The first element ('val')
#         is the vector of distinct values, and the second element ('n')
#         is a tally of how many times each value shows up in the tibble.
freqs <- function(tbl) {
  count_vars <- names(tbl)  # character vector of variable names

  # 'l' is the list being produced; 'count_var' is the next variable name.
  reducer <- function(l, count_var) {
    # Using 'group_by_at' avoids quotation issues involved in using
    # 'count'. 'dplyr::count' is implemented using 'group_by' and 'tally',
    # anyway. The calls are nested rather than piped because
    # 'requireNamespace' loads the packages without attaching '%>%'.
    c_result <- dplyr::tally(dplyr::group_by_at(tbl, count_var))

    # The output of 'tally' names the value column after the variable
    # being counted. We rename this to 'val' for consistency across
    # all variables.
    reformat <- list(val = c_result[[count_var]],
                     n = c_result$n)

    # Build up the list, adding this variable's counts.
    purrr::list_modify(l, !!count_var := reformat)
  }

  # 'reduce' must be namespace-qualified, since purrr is not attached.
  purrr::reduce(count_vars, reducer, .init = list())
}
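
# tbl -> tbl
# Same as above, but produces a tibble as output instead of a list.
# The "variable" column holds the variable name.
# Note: 'bind_rows' stacks the 'val' vectors of all variables into one
# column, so they need to share a compatible type (e.g. all character).
freqs_df <-
  purrr::compose(purrr::partial(dplyr::bind_rows, .id = "variable"),
                 freqs)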
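
# Example usage: a minimal sketch on a small, made-up tibble. 'toy' and its
# columns are hypothetical and not part of the original code. It assumes
# 'freqs' and 'freqs_df' are defined as above and that dplyr, purrr and
# tibble are installed. Both columns are character so that 'freqs_df' can
# stack their 'val' vectors into a single column.
toy <- tibble::tibble(color = c("red", "blue", "red"),
                      size  = c("S", "S", "M"))

toy_freqs <- freqs(toy)
toy_freqs$color$val  # distinct values of 'color'
toy_freqs$color$n    # how many times each of those values occurs

toy_df <- freqs_df(toy)  # one row per (variable, val) pair, with count 'n'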