A Visual Guide…

Over the winter, I read Targeted Learning by Mark van der Laan and Sherri Rose. This “visual guide” I made for Chapter 3: Superlearning by Rose, van der Laan, and Eric Polley is a condensed version of the following tutorial. It is available as an 8.5x11” pdf on Github, should you wish to print it out for reference (or desk decor).

Superlearning is a technique for prediction that involves combining many individual statistical algorithms (commonly called “data-adaptive” or “machine learning” algorithms) to create a new, single prediction algorithm that is expected to perform at least as well as any of the individual algorithms.

The superlearner algorithm “decides” how to combine, or weight, the individual algorithms based upon how well each one minimizes a specified loss function, for example, the mean squared error (MSE). This is done using cross-validation to avoid overfitting.

The motivation for this type of “ensembling” is that a mix of multiple algorithms may predict better for a given data set than any single algorithm. For example, a tree-based model averaged with a linear model (e.g. random forests and LASSO) could smooth some of the tree-based model’s edges to improve predictive performance.

Superlearning is also called stacking, stacked generalization, and weighted ensembling by different specializations within the realms of statistics and data science.

First I’ll go through the algorithm one step at a time using a simulated data set.

Initial set-up: Load libraries, set seed, simulate data

For simplicity I’ll show the concept of superlearning using only four variables (AKA features or predictors) to predict a continuous outcome. Let’s first simulate a continuous outcome, y, and four potential predictors, x1, x2, x3, and x4.

I am only using linear regressions so that code for running more complicated regressions does not take away from understanding the general superlearning algorithm.

    # make a data set that contains all observations except those in k=1
    # (assumes obs has a fold-index column named k)
    cv_train_1 <- obs[obs$k != 1, ]
    fit_1a <- glm(y ~ x2 + x4, data = cv_train_1)                        # fit the first linear regression on that training data
    fit_1b <- glm(y ~ x1 + x2 + x1 * x3 + sin(x4), data = cv_train_1)    # second LR fit on the training data
    fit_1c <- glm(y ~ x1 * x2 * x3, data = cv_train_1)                   # and the third LR
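To make the whole idea concrete, here is a minimal base R sketch of one way the superlearning steps described in this post could fit together: cross-validated predictions from the three linear regressions, the cross-validated MSE of each, and a simple “metalearner” regression that weights them. The data-generating process, the sample size `n`, the number of folds `K`, and the helper names (`formulas`, `cv_preds`, `meta_fit`, `sl_preds`) are all my assumptions for illustration, and the unconstrained regression used for the weights is just one simple choice, not necessarily the one a dedicated superlearning package would use.

```r
set.seed(7)

# Hypothetical simulated data: this data-generating process is an assumption
# made for illustration only.
n <- 500
obs <- data.frame(
  x1 = rnorm(n),
  x2 = rbinom(n, 1, 0.5),
  x3 = rnorm(n),
  x4 = rnorm(n)
)
obs$y <- with(obs, x1 + 2 * x2 + x1 * x3 + sin(x4) + rnorm(n))

# Randomly assign each observation to one of K folds (K = 10 is an assumption).
K <- 10
obs$k <- sample(rep(1:K, length.out = n))

# The three candidate linear regressions.
formulas <- list(
  a = y ~ x2 + x4,
  b = y ~ x1 + x2 + x1 * x3 + sin(x4),
  c = y ~ x1 * x2 * x3
)

# Cross-validated predictions: every row is predicted by models that were
# fit on the other K - 1 folds.
cv_preds <- matrix(NA_real_, nrow = n, ncol = length(formulas),
                   dimnames = list(NULL, names(formulas)))
for (fold in 1:K) {
  train <- obs[obs$k != fold, ]
  valid <- obs$k == fold
  for (m in names(formulas)) {
    fit <- glm(formulas[[m]], data = train)
    cv_preds[valid, m] <- predict(fit, newdata = obs[valid, ])
  }
}

# Cross-validated MSE of each candidate regression.
cv_mse <- colMeans((obs$y - cv_preds)^2)

# Metalearner: regress the outcome on the cross-validated predictions.
# Its coefficients act as the weights for combining the candidates.
meta_data <- data.frame(y = obs$y, cv_preds)
meta_fit <- glm(y ~ a + b + c, data = meta_data)

# Refit each candidate on all of the data, then combine their predictions
# using the metalearner weights.
full_preds <- sapply(formulas, function(f) predict(glm(f, data = obs)))
sl_preds <- predict(meta_fit, newdata = data.frame(full_preds))
```

Printing `cv_mse` shows how each regression does on held-out folds, and `coef(meta_fit)` shows how much weight the ensemble ends up giving each one.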