R users have been enjoying the benefits of SQL query generators for quite some time, most notably using the dbplyr package. I would like to talk about some features of our own rquery query generator, concentrating on derived result re-use. Introduction SQL represents value use by nesting. To use a query result within another query … Continue reading Query Generation in R
Manning has a new discount code and a free excerpt of our book Practical Data Science with R, 2nd Edition: here.
This section is elementary, but things really pick up speed as later on (also available in a paid preview).
In our cdata R package and training materials we emphasize the record-oriented thinking and how to design a transform control table. We now have an additional exciting new feature: control table keys. The user can now control which columns of a cdata control table are the keys, including now using composite keys (that is keys … Continue reading cdata Control Table Keys
We have two new chapters of Practical Data Science with R, Second Edition online and available for review! The newly available chapters cover: Data Engineering And Data Shaping – Explores how to use R to organize or wrangle data into a shape useful for analysis. The chapter covers applying data transforms, data manipulation packages, and … Continue reading PDSwR2: New Chapters!
Composing functions and sequencing operations are core programming concepts. Some notable realizations of sequencing or pipelining operations include: Unix’s |-pipe CMS Pipelines. F#‘s forward pipe operator |>. Haskel’s Data.Function & operator. The R magrittr forward pipe. Scikit-learn‘s sklearn.pipeline.Pipeline. The idea is: many important calculations can be considered as a sequence of transforms applied to a … Continue reading Function Objects and Pipelines in R
One of the design goals of the cdata R package is that very powerful and arbitrary record transforms should be convenient and take only one or two steps. In fact it is the goal to take just about any record shape to any other in two steps: first convert to row-records, then re-block the data … Continue reading Fully General Record Transforms with cdata
Please help share our news and this discount. The second edition of our best-selling book Practical Data Science with R2, Zumel, Mount is featured as deal of the day at Manning. The second edition isn’t finished yet, but chapters 1 through 4 are available in the Manning Early Access Program (MEAP), and we have finished … Continue reading Practical Data Science with R, 2nd Edition discount!
Last February I reviewed the jamovi menu-based front end to R. I’ve reviewed five more user interfaces since then, and have developed a more comprehensive template to make it easier to compare them all. Now I’m cycling back to jamovi, … Continue reading →
vtreat‘s purpose is to produce pure numeric R data.frames that are ready for supervised predictive modeling (predicting a value from other values). By ready we mean: a purely numeric data frame with no missing values and a reasonable number of columns (missing-values re-encoded with indicators, and high-degree categorical re-encode by effects codes or impact codes). … Continue reading vtreat Variable Importance
Reusable modeling pipelines are a practical idea that gets re-developed many times in many contexts. wrapr supplies a particularly powerful pipeline notation, and a pipe-stage re-use system (notes here). We will demonstrate this with the vtreat data preparation system. Our example task is to fit a model on some arbitrary data. Our model will try … Continue reading Sharing Modeling Pipelines in R
It has been just a few months since I reviewed five free and open-source point-and-click graphical user interfaces (GUIs) to the R language. I plan to keep those reviews up to date as new features are added. BlueSky’s interface would … Continue reading →
Introduction The R Commander is a free and open source user interface for the R software, one that focuses on helping users learn R commands by point-and-clicking their way through analyses. The R Commander is available on Windows, Mac, and Linux; there is … Continue reading →
Introduction Rattle is a popular free and open source Graphical User Interface (GUI) for the R software, one that focuses on beginners looking to point-and-click their way through data mining tasks. Such tasks are also referred to as machine learning or predictive analytics. … Continue reading →
Introduction BlueSky Statistics’ desktop version is a free and open source graphical user interface for the R software that focuses on beginners looking to point-and-click their way through analyses. A commercial version is also available which includes technical support and a … Continue reading →
Point-and-click graphical user interfaces (GUIs) for R allow people to analyze data using the R software, without having to learn how to program in the R language. This is a brief look at how popular each one is. Knowing … Continue reading →
Introduction RKWard is a free and open source Graphical User Interface for the R software, one that supports beginners looking to point-and-click their way through analyses, as well as advanced programmers. You can think of it as a blend of … Continue reading →
This article shows you how to enter data so that you can easily open in statistics packages such as R, SAS, SPSS, or jamovi (code or GUI steps below). Excel has some statistical analysis capabilities, but they often provide incorrect answers. For … Continue reading →