Bradley Boehmke bradleyboehmke

@bradleyboehmke
bradleyboehmke / save_colab_as_html.py
Created December 3, 2025 19:48
Save Colab notebook as an HTML file
from IPython.display import display, Javascript
from google.colab import drive, files
import os
import glob
import time
import nbformat
from nbconvert import HTMLExporter
import uuid
import subprocess
@bradleyboehmke
bradleyboehmke / resampling_differences.R
Last active March 24, 2021 13:13
Observing slight differences in resampling schemes between caret and rsample k-fold implementations
library(AmesHousing)
library(caret)
library(rsample)
ames <- make_ames()
ames <- tibble::rownames_to_column(ames)
# create folds
set.seed(123)
folds_caret <- createFolds(ames$Sale_Price, k = 10, returnTrain = TRUE)
library(parsnip)
library(vip)
m <- linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression") %>%
  fit(mpg ~ ., data = mtcars)
vi_firm(m$fit, feature_names = setdiff(names(mtcars), "mpg"), train = mtcars)
vip(m$fit, method = "firm", feature_names = setdiff(names(mtcars), "mpg"), train = mtcars)
MS_SubClass,MS_Zoning,Lot_Frontage,Lot_Area,Street,Alley,Lot_Shape,Land_Contour,Utilities,Lot_Config,Land_Slope,Neighborhood,Condition_1,Condition_2,Bldg_Type,House_Style,Overall_Qual,Overall_Cond,Year_Built,Year_Remod_Add,Roof_Style,Roof_Matl,Exterior_1st,Exterior_2nd,Mas_Vnr_Type,Mas_Vnr_Area,Exter_Qual,Exter_Cond,Foundation,Bsmt_Qual,Bsmt_Cond,Bsmt_Exposure,BsmtFin_Type_1,BsmtFin_SF_1,BsmtFin_Type_2,BsmtFin_SF_2,Bsmt_Unf_SF,Total_Bsmt_SF,Heating,Heating_QC,Central_Air,Electrical,First_Flr_SF,Second_Flr_SF,Low_Qual_Fin_SF,Gr_Liv_Area,Bsmt_Full_Bath,Bsmt_Half_Bath,Full_Bath,Half_Bath,Bedroom_AbvGr,Kitchen_AbvGr,Kitchen_Qual,TotRms_AbvGrd,Functional,Fireplaces,Fireplace_Qu,Garage_Type,Garage_Finish,Garage_Cars,Garage_Area,Garage_Qual,Garage_Cond,Paved_Drive,Wood_Deck_SF,Open_Porch_SF,Enclosed_Porch,Three_season_porch,Screen_Porch,Pool_Area,Pool_QC,Fence,Misc_Feature,Misc_Val,Mo_Sold,Year_Sold,Sale_Type,Sale_Condition,Sale_Price,Longitude,Latitude
One_Story_1946_and_Newer_All_Styles,Residential_Low_Density,141
@bradleyboehmke
bradleyboehmke / help_functions.R
Last active July 20, 2025 16:57
Simple application of word embeddings with GloVe algorithm via text2vec
# See http://text2vec.org/glove.html for more details about text2vec
get_embeddings <- function(text) {
  # Create iterator over tokens
  tokens <- text2vec::space_tokenizer(text)
  # Create vocabulary. Terms will be unigrams (simple words).
  message("Creating vocabulary...")
  it <- text2vec::itoken(tokens, progressbar = FALSE)
  vocab <- text2vec::create_vocabulary(it)
@bradleyboehmke
bradleyboehmke / gbm-learning-rate-trees-interaction.R
Created December 22, 2018 15:58
Illustrates how smaller learning rates require more trees to converge to optimal solutions
library(dplyr)
library(tidyr)
library(ggplot2)
library(gganimate)
set.seed(1112)
df <- tibble::tibble(
  # spell out length.out rather than relying on partial argument matching
  x = seq(from = 0, to = 2 * pi, length.out = 500),
  y = sin(x) + rnorm(length(x), sd = 0.5),
  truth = sin(x)
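The preview above stops before any model is fit. As a rough, self-contained illustration of the idea in the gist's description, the sketch below boosts one-split regression stumps in base R on the same simulated sine data. It is not the gist's code: it avoids the gbm and gganimate packages entirely, and the helper names `fit_stump`, `predict_stump`, and `boost_rmse`, the three learning rates, and the 300-tree budget are all assumptions made for illustration.

```r
# Minimal sketch, base R only. All helper names here are hypothetical.
set.seed(1112)
x <- seq(from = 0, to = 2 * pi, length.out = 500)
y <- sin(x) + rnorm(length(x), sd = 0.5)
truth <- sin(x)

# Fit a one-split regression stump to residuals r by least squares.
fit_stump <- function(x, r) {
  ord <- order(x)
  xs <- x[ord]
  rs <- r[ord]
  n <- length(rs)
  left_sum <- cumsum(rs)[-n]
  left_n <- seq_len(n - 1)
  right_sum <- sum(rs) - left_sum
  right_n <- n - left_n
  # Maximising this quantity is equivalent to minimising squared error.
  gain <- left_sum^2 / left_n + right_sum^2 / right_n
  k <- which.max(gain)
  list(
    split = (xs[k] + xs[k + 1]) / 2,
    left = left_sum[k] / left_n[k],
    right = right_sum[k] / right_n[k]
  )
}

predict_stump <- function(stump, x) {
  ifelse(x <= stump$split, stump$left, stump$right)
}

# Boost stumps with a given learning rate and return the RMSE against
# the true sine curve after each added tree.
boost_rmse <- function(learning_rate, n_trees) {
  pred <- rep(mean(y), length(y))
  rmse <- numeric(n_trees)
  for (i in seq_len(n_trees)) {
    stump <- fit_stump(x, y - pred)
    pred <- pred + learning_rate * predict_stump(stump, x)
    rmse[i] <- sqrt(mean((pred - truth)^2))
  }
  rmse
}

rates <- c(0.5, 0.1, 0.01)
curves <- lapply(rates, boost_rmse, n_trees = 300)
names(curves) <- rates

# Number of trees at which each learning rate is closest to the truth
best_n_trees <- sapply(curves, which.min)
print(best_n_trees)
```

Plotting each element of `curves` against the tree index (for example with `matplot(do.call(cbind, curves), type = "l")`) gives a static version of the comparison: the larger rates bottom out within a few trees and then drift upward as they start fitting noise, while the smallest rate is still improving late in the 300-tree budget.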
@bradleyboehmke
bradleyboehmke / analytics-connect-intermediate-r-student-script.R
Created October 21, 2018 19:39
Student script for 2018 Analytics Connect Intermediate R workshop
TBD
@bradleyboehmke
bradleyboehmke / analytics-connect-intro-r-student-script.R
Last active October 21, 2018 15:57
Student script for 2018 Analytics Connect Intro to R workshop
###############################################################################
# FUNDAMENTALS #
###############################################################################
# slide 13 ----------------------------------------------------------------
# assess where the outputs from these lines of code appear
mtcars
?sum
hist(mtcars$mpg)