Posts Tagged ‘R’

Join Dependency Sorting

July 1, 2017

In our latest installment of “R and big data” let’s again discuss the task of left joining many tables from a data warehouse using R and a system called "a join controller" (last discussed here). One of the great advantages to specifying complicated sequences of operations in data (rather than in code) is: it is … Continue reading Join Dependency Sorting

Stan Weekly Roundup, 30 June 2017

June 30, 2017

Here are some things that have been going on with Stan since last week’s roundup: Stan® and the logo were granted U.S. Trademark Registration No. 5,222,891 and U.S. Serial Number 87,237,369, respectively. Hard to feel special when there were millions of products ahead of you. Trademarked names are case insensitive and they required […] The post Stan Weekly Roundup, 30 June 2017 appeared first on Statistical Modeling, Causal…

Using wrapr::let() with tidyeval

June 28, 2017

While going over some of the discussion related to my last post I came up with a really neat way to use wrapr::let() and rlang/tidyeval together. Please read on to see the situation and example. Suppose we want to parameterize over a couple of names, one denoting a variable coming from the current environment and one … Continue reading Using wrapr::let() with tidyeval

Please Consider Using wrapr::let() for Replacement Tasks

June 26, 2017

From dplyr issue 2916. The following appears to work.

    suppressPackageStartupMessages(library("dplyr"))
    COL <- "homeworld"
    starwars %>% group_by(.data[[COL]]) %>% head(n=1)
    ## # A tibble: 1 x 14
    ## # Groups: COL [1]
    ## name height mass hair_color skin_color eye_color birth_year
    ## <chr> <int> <dbl> <chr> <chr> <chr> <dbl>
    ## 1 Luke Skywalker 172 77 blond fair blue …

Continue reading Please Consider Using wrapr::let() for Replacement Tasks

Visualizing Time Series Data in R

June 26, 2017

I’m very pleased to announce my DataCamp course on Visualizing Time Series Data in R. This course is also part of the Time Series with R skills track. Feel free to have a look; the first chapter is free! Course Description: As the saying goes, “A chart is worth a thousand words”. This is why visualization […]

wrapr Implementation Update

June 19, 2017

Introduction: The CRAN version (previously the development version) of our R helper function wrapr::let() has switched from string-based substitution to abstract syntax tree based substitution (AST based substitution, or language based substitution). I am looking for some feedback from wrapr::let() users already doing substantial work with wrapr::let(). If you are already using wrapr::let() please test if the … Continue reading wrapr Implementation Update
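
To make the distinction between string-based and AST-based substitution concrete, here is a minimal base-R sketch. It is not the wrapr implementation; it only contrasts the two styles using `gsub()` on deparsed text versus `substitute()` on the parsed expression. The names `expr`, `d`, and the placeholder `COL` are hypothetical and chosen for illustration.

```r
# Hypothetical template expression: COL appears once as a symbol
# and once inside a string literal.
expr <- quote(d$COL + nchar("COL"))
d <- data.frame(x = c(1, 2))

# String-based substitution: deparse to text, replace, and re-parse.
# This rewrites every textual occurrence, including the string literal.
txt <- paste(deparse(expr), collapse = " ")
string_based <- parse(text = gsub("COL", "x", txt, fixed = TRUE))[[1]]

# AST-based substitution: replace only the symbol COL in the parse tree.
# The string literal "COL" is left alone.
ast_based <- do.call(substitute, list(expr, list(COL = as.name("x"))))

print(string_based)  # d$x + nchar("x")
print(ast_based)     # d$x + nchar("COL")

res_string <- eval(string_based)
res_ast <- eval(ast_based)
```

The two results differ only because the string literal was rewritten in the string-based version. Avoiding that kind of accidental capture is one reason to prefer substitution on the language object.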

Non-Standard Evaluation and Function Composition in R

June 16, 2017

In this article we will discuss composing standard-evaluation interfaces (SE) and composing non-standard-evaluation interfaces (NSE) in R. In R the package tidyeval/rlang is a tool for building domain specific languages intended to allow easier composition of NSE interfaces. To use it you must know some of its structure and notation. Here are some details paraphrased … Continue reading Non-Standard Evaluation and Function Composition in R

An easy way to accidentally inflate reported R-squared in linear regression models

June 15, 2017

Here is an absolutely horrible way to confuse yourself and get an inflated reported R-squared on a simple linear regression model in R. We have written about this before, but we found a new twist on the problem (interactions with categorical variable encoding) which we would like to call out here. First let’s set up … Continue reading An easy way to accidentally inflate reported R-squared in linear regression models
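
The excerpt is cut off before its own setup, so the following is not the post's example. It is a minimal sketch of one well-known mechanism that fits the description: when a model is fit without an intercept, `summary.lm()` measures R-squared against zero rather than against the mean of the outcome, and a fully indicator-encoded categorical variable can stand in for the intercept. The data frame `d` and the variables `g` and `y` are made up for illustration.

```r
set.seed(2017)

# Hypothetical data: y is pure noise around 10 and does not depend on g.
d <- data.frame(g = factor(rep(c("a", "b"), each = 50)))
d$y <- 10 + rnorm(nrow(d))

# Same categorical predictor, two encodings.
m_int   <- lm(y ~ g, data = d)      # intercept plus one contrast column
m_noint <- lm(y ~ 0 + g, data = d)  # no intercept, one indicator per level

# The fitted values agree, so the two models predict identically.
same_fit <- isTRUE(all.equal(unname(fitted(m_int)), unname(fitted(m_noint))))

# The reported R-squared values do not agree: without an intercept,
# the total sum of squares is taken around zero instead of mean(y).
r2_int   <- summary(m_int)$r.squared
r2_noint <- summary(m_noint)$r.squared

print(c(with_intercept = r2_int, without_intercept = r2_noint))
```

Because the predictions are identical, the gap in reported R-squared reflects only the change of baseline and not any improvement in fit.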

Use a Join Controller to Document Your Work

June 13, 2017

This note describes a useful replyr tool we call a "join controller" (and is part of our "R and Big Data" series, please see here for the introduction, and here for one of our big data courses). When working on real world predictive modeling tasks in production, the ability to join data and document how you … Continue reading Use a Join Controller to Document Your Work

thinning a Markov chain, statistically

June 12, 2017

Art Owen has arXived a new version of his thinning MCMC paper, where he studies how thinning or subsampling can improve computing time in MCMC chains. I remember quite well the message sent by Mark Berliner and Steve MacEachern in an early 1990s paper that subsampling always increases the variance of the resulting estimators. […]
