Author: John Mount

How cdata Control Table Data Transforms Work

With all of the excitement surrounding cdata style control table based data transforms (the cdata ideas being named as the “replacements” for tidyr‘s current methodology, by the tidyr authors themselves!) I thought I would take a moment to describe how they work. cdata defines two primary data manipulation operators: rowrecs_to_blocks() and blocks_to_rowrecs(). These are the … Continue reading How cdata Control Table Data Transforms Work

Why we Did Not Name the cdata Transforms wide/tall/long/short

We recently saw this UX (user experience) question from the tidyr author as he adapts tidyr to cdata techniques. The terminology that he is not adopting from cdata is “unpivot_to_blocks()” and “pivot_to_rowrecs()”. One of the research ideas in the cdata package is that the important thing to call out is record structure. The important point … Continue reading Why we Did Not Name the cdata Transforms wide/tall/long/short

Tidyverse users: gather/spread are on the way out

From https://twitter.com/sharon000/status/1107771331012108288: From https://tidyr.tidyverse.org/dev/articles/pivot.html: There are two important new features inspired by other R packages that have been advancing of reshaping in R: The reshaping operation can be specified with a data frame that describes precisely how metadata stored in column names becomes data variables (and vice versa). This is inspired by the cdata package … Continue reading Tidyverse users: gather/spread are on the way out

wrapr::let()

I would like to once again recommend our readers to our note on wrapr::let(), an R function that can help you eliminate many problematic NSE (non-standard evaluation) interfaces (and their associate problems) from your R programming tasks. The idea is to imitate the following lambda-calculus idea: let x be y in z := ( λ … Continue reading wrapr::let()

Unit Tests in R

I am collecting here some notes on testing in R. There seems to be a general (false) impression among non R-core developers that to run tests, R package developers need a test management system such as RUnit or testthat. And a further false impression that testthat is the only R test management system. This is … Continue reading Unit Tests in R

Starting With Data Science: A Rigorous Hands-On Introduction to Data Science for Engineers

Starting With Data Science A rigorous hands-on introduction to data science for engineers. Win Vector LLC is now offering a 4 day on-site intensive data science course. The course targets engineers familiar with Python and introduces them to the basics of current data science practice. This is designed as an interactive in-person (not remote or … Continue reading Starting With Data Science: A Rigorous Hands-On Introduction to Data Science for Engineers

rquery Substitution

The rquery R package has several places where the user can ask for what they have typed in to be substituted for a name or value stored in a variable. This becomes important as many of the rquery commands capture column names from un-executed code. So knowing if something is treated as a symbol/name (which … Continue reading rquery Substitution

“If You Were an R Function, What Function Would You Be?”

We’ve been getting some good uptake on our piping in R article announcement. The article is necessarily a bit technical. But one of its key points comes from the observation that piping into names is a special opportunity to give general objects the following personality quiz: “If you were an R function, what function would … Continue reading “If You Were an R Function, What Function Would You Be?”

More on Macros in R

Recently ran into something interesting in the R macros/quasi-quotation/substitution/syntax front:

It appears !! is no longer the last word in substitution (it certainly wasn’t the first).

The described effect is actually already pretty easy t…

Getting Started With rquery

To make getting started with rquery (an advanced query generator for R) easier we have re-worked the package README for various data-sources (including SparkR!). Here are our current examples: rquery and MonetDBLite rquery and RPostgreSQL rquery and RSQLite rquery and SparkR rquery and sparklyr For the MonetDBLite the query diagrammer shows a repeated calculation that … Continue reading Getting Started With rquery

Query Generation in R

R users have been enjoying the benefits of SQL query generators for quite some time, most notably using the dbplyr package. I would like to talk about some features of our own rquery query generator, concentrating on derived result re-use. Introduction SQL represents value use by nesting. To use a query result within another query … Continue reading Query Generation in R