Statistics Blogs

scala-glm: Regression modelling in Scala

June 21, 2017

As discussed in the previous post, I’ve recently constructed and delivered a short course on statistical computing with Scala. Much of the course is concerned with writing statistical algorithms in Scala, typically making use of the scientific and numerical computing library, Breeze. Breeze has all of the essential tools necessary for building statistical algorithms, …

Picky people

June 20, 2017

Our book on Bayesian cost-effectiveness analysis using BCEA is out (I think as of last week). This has been a long process (I've talked about this here, here and here). Today I came back to the office and opened the package with my copies. T...

Even more incompetence in reporting survey results: Washington Post on the hotseat

June 20, 2017

Previously, we looked at the claim that 7 percent of American adults believe that chocolate milk comes from brown cows. The media, including mainstream outlets such as the Washington Post, demonstrated incompetence in reading survey results. That was before I investigated the other survey covered by the Washington Post in that article about brown cows (link). The Post should immediately apologize for their horrible reporting. For this study, the data…

Extreme beta distributions

June 20, 2017

A beta probability distribution has two parameters, a and b. You can think of these as the number of successes and failures out of a+b trials. The PDF of a beta distribution is approximately normal if a and b are approximately equal and a + b is large. If a and b are close, they don’t have to be very large for the beta […]
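As a rough check of the normal approximation described in this excerpt, the following Python sketch compares a beta density with a normal density that has the same mean and variance. It is an illustration under stated assumptions, not code from the original post: the function names (beta_pdf, normal_pdf, beta_moments, max_relative_gap), the evenly spaced grid, and the choice to measure the gap relative to the beta peak are all hypothetical choices made here.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1), using log-gamma for numerical stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def beta_moments(a, b):
    """Mean and standard deviation of Beta(a, b)."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)


def max_relative_gap(a, b, n_grid=999):
    """Largest absolute difference between the beta density and its
    moment-matched normal density on an interior grid, divided by the
    largest beta density value on that grid (an illustrative error measure)."""
    mean, sd = beta_moments(a, b)
    xs = [(i + 1) / (n_grid + 1) for i in range(n_grid)]
    peak = max(beta_pdf(x, a, b) for x in xs)
    gap = max(abs(beta_pdf(x, a, b) - normal_pdf(x, mean, sd)) for x in xs)
    return gap / peak


if __name__ == "__main__":
    for a, b in [(2, 2), (10, 10), (50, 50), (2, 50)]:
        print(f"a={a}, b={b}: relative gap = {max_relative_gap(a, b):.4f}")
```

With equal parameters, the gap should shrink as a + b grows; the unbalanced pair (2, 50) is included only so the two situations can be compared side by side.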

Data Science Tool Market Share Leading Indicator: Scholarly Articles

June 19, 2017

Below is the latest update to The Popularity of Data Science Software. It contains an analysis of the tools used in the most recent complete year of scholarly articles. The section is also integrated into the main paper itself. New …

How to read survey results: chocolate milk edition

June 19, 2017

Apparently, the Washington Post decided to assist the dairy industry in its latest advertising campaign by publishing a weird survey result, which claims that some adults believe that chocolate milk comes from brown cows. (link) Just start by thinking about the survey design. In order for people to express this opinion, the survey had to contain a choice of "brown cows." If they had done an open-ended survey, the result…

Homecoming (of sort…)

June 19, 2017

I spent last week in Florence for our Summer School. Of course, it was a homecoming for me and I really enjoyed being back in Florence, although it was really hot. I would say I'm not used to that level of heat anymore, if it wasn't for the fact that...

wrapr Implementation Update

June 19, 2017

The development version of our R helper function wrapr::let() has switched from string-based substitution to abstract syntax tree based substitution (AST-based substitution, or language-based substitution). I am looking for some feedback from wrapr::let() users already doing substantial work with wrapr::let(). If you are already using wrapr::let() please test if the current development …
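The difference between the two substitution strategies can be illustrated outside R. The following Python sketch is an analogy only: it is not wrapr's implementation, it does not claim to reproduce how wrapr::let() performed string substitution, and the function names (substitute_by_string, substitute_by_ast) are hypothetical. It shows why renaming parsed identifier nodes can be safer than naive text replacement. It assumes Python 3.9 or later, where ast.unparse is available.

```python
import ast


def substitute_by_string(source, mapping):
    """Naive text replacement: rewrites every matching character sequence,
    including ones that are only part of a longer identifier."""
    for old, new in mapping.items():
        source = source.replace(old, new)
    return source


class _RenameNames(ast.NodeTransformer):
    """Rename whole identifier nodes according to a mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        if node.id in self.mapping:
            renamed = ast.Name(id=self.mapping[node.id], ctx=node.ctx)
            return ast.copy_location(renamed, node)
        return node


def substitute_by_ast(source, mapping):
    """Parse the source, rename matching identifiers, and turn the tree back into code."""
    tree = _RenameNames(mapping).visit(ast.parse(source))
    return ast.unparse(tree)


if __name__ == "__main__":
    code = "total = x + max_x"
    mapping = {"x": "height"}
    print(substitute_by_string(code, mapping))  # also alters the unrelated name max_x
    print(substitute_by_ast(code, mapping))     # renames only the identifier x
```

In this toy case the string version corrupts max_x because it contains the letter x, while the tree-based version touches only nodes whose full name is x.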

Non-Standard Evaluation and Function Composition in R

June 16, 2017

In this article we will discuss composing standard-evaluation interfaces (SE) and composing non-standard-evaluation interfaces (NSE) in R. In R the package tidyeval/rlang is a tool for building domain specific languages intended to allow easier composition of NSE interfaces. To use it you must know some of its structure and notation. Here are some details paraphrased …

An easy way to accidentally inflate reported R-squared in linear regression models

June 15, 2017

Here is an absolutely horrible way to confuse yourself and get an inflated reported R-squared on a simple linear regression model in R. We have written about this before, but we found a new twist on the problem (interactions with categorical variable encoding) which we would like to call out here. First let’s set up …
