Skip to content

Instantly share code, notes, and snippets.

@skurscheid
Created May 7, 2020 03:50
Show Gist options
  • Select an option

  • Save skurscheid/927963a68774f683f4c9ac01db58bbff to your computer and use it in GitHub Desktop.

Select an option

Save skurscheid/927963a68774f683f4c9ac01db58bbff to your computer and use it in GitHub Desktop.
Function to estimate scaling factors based on ERCC spike ins when using sleuth for differential expression analysis
# this is a simple illustration of the principle
# it does not implement e.g. loess fitting to ERCC data as sugessted by Loven et al
# but rather uses the geometric mean/scaling approach as implemented in sleuth/DESeq2
# on the reduced matrix containing only ERCC spike in quantification
norm_factors_ercc <- function(mat) {
mat <- mat[grep('ERCC', rownames(mat)),]
nz <- apply(mat, 1, function(row) !any(round(row) == 0))
mat_nz <- mat[nz, , drop = FALSE]
p <- ncol(mat)
geo_means <- exp(apply(mat_nz, 1, function(row) mean(log(row))))
s <- sweep(mat_nz, 1, geo_means, `/`)
sf <- apply(s, 2, median)
scaling <- exp( (-1 / p) * sum(log(sf)))
sf * scaling
}
# this is the original function
norm_factors <- function(mat) {
nz <- apply(mat, 1, function(row) !any(round(row) == 0))
mat_nz <- mat[nz, , drop = FALSE]
p <- ncol(mat)
geo_means <- exp(apply(mat_nz, 1, function(row) mean(log(row))))
s <- sweep(mat_nz, 1, geo_means, `/`)
sf <- apply(s, 2, median)
scaling <- exp( (-1 / p) * sum(log(sf)))
sf * scaling
}
# for demonstration purposes you can use e.g. tximport to load the abundance estimation
kallistoDir <- 'kallisto_results/' #
files <- file.path(s2c$path, 'abundance.h5')
txi <- tximport(files, type = 'kallisto', txOut = T)
mat <- txi$abundance
norm_factors(mat)
norm_factors_ercc(mat)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment