Answering the question, What predictors are more important?, going beyond p-value thresholding and ranking

June 21, 2018

Daniel Kapitan writes: We are in the process of writing a paper on the outcome of cataract surgery. A (very rough!) draft can be found here, to provide you with some context:  https://www.overleaf.com/read/wvnwzjmrffmw. Using standard classification methods (Python sklearn, with synthetic oversampling to address the class imbalance), we are able to predict a poor outcome […] The post Answering the question, What predictors are more important?, going beyond p-value thresholding…

Read more »

A Comparative Review of the BlueSky Statistics GUI for R

June 21, 2018

Introduction BlueSky Statistics’ desktop version is a free and open source graphical user interface for the R software that focuses on beginners looking to point-and-click their way through analyses.  A commercial version is also available which includes technical support and a … Continue reading →

Read more »

Le Monde puzzle [#1053]

June 20, 2018

An easy arithmetic Le Monde mathematical puzzle again: If coins come in units of 1, x, and y, what is the optimal value of (x,y) that minimises the number of coins representing an arbitrary price between 1 and 149?  If the number of units is now four, what is the optimal choice? The first question […]
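
A brute-force search is one way to explore the first question. The sketch below is not taken from the post; it assumes that "minimises the number of coins" means the total (equivalently, the average) number of coins over all prices from 1 to 149, and that coins are only paid, never given back as change. The function names `coin_counts` and `best_pair` are illustrative.

```python
def coin_counts(denoms, max_price):
    """Minimal number of coins needed for each price 0..max_price.

    Assumes the denominations include 1, so every price is reachable.
    """
    best = [0] + [float("inf")] * max_price
    for price in range(1, max_price + 1):
        # Try each usable coin as the last coin paid
        best[price] = 1 + min(best[price - d] for d in denoms if d <= price)
    return best


def best_pair(max_price=149):
    """Search all (x, y) with 1 < x < y <= max_price for the smallest total coin count."""
    best_total, best_xy = None, None
    for x in range(2, max_price):
        for y in range(x + 1, max_price + 1):
            total = sum(coin_counts((1, x, y), max_price)[1:])
            # Ties are broken in favour of the first pair found
            if best_total is None or total < best_total:
                best_total, best_xy = total, (x, y)
    return best_xy, best_total


if __name__ == "__main__":
    xy, total = best_pair(149)  # takes a few seconds in pure Python
    print(xy, total / 149)      # best pair and its average coins per price
```

Swapping `sum(...)` for `max(...)` would optimise the worst case instead of the average, which is another plausible reading of the puzzle.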

Read more »

Big News: vtreat 1.2.0 is Available on CRAN, and it is now Big Data Capable

June 20, 2018

We here at Win-Vector LLC have some really big news we would please like the R-community’s help sharing. vtreat version 1.2.0 is now available on CRAN, and this version of vtreat can now implement its data cleaning and preparation steps on databases and big data systems such as Apache Spark. vtreat is a very complete … Continue reading Big News:…

Read more »

When does the quest for beauty lead science astray?

June 20, 2018

Under the heading, “please blog about this,” Shravan Vasishth writes: This book by a theoretical physicist [Sabine Hossenfelder] is awesome. The book trailer is here. Some quotes from her blog: “theorists in the foundations of physics have been spectacularly unsuccessful with their predictions for more than 30 years now.” “Everyone is happily producing papers in […] The post When does…

Read more »

Two thousand five hundred ways to say the same thing

June 20, 2018

Kaiser Fung, founder of Junk Charts and Principal Analytics Prep, discusses a map of credit card debt paydown that is deceptively complex.

Read more »

The bootstrap method in SAS: A t test example

June 20, 2018

A previous article provides an example of using the BOOTSTRAP statement in PROC TTEST to compute bootstrap estimates of statistics in a two-sample t test. The BOOTSTRAP statement is new in SAS/STAT 14.3 (SAS 9.4M5). However, you can perform the same bootstrap analysis in earlier releases of SAS by using [...] The post The bootstrap method in SAS: A t…

Read more »

Shout-Out for Marc Bellemare

June 20, 2018

If you don't follow Marc Bellemare's blog (shame on you - you should!), then you may not have caught up with his recent posts relating to his series of lectures on "Advanced Econometrics - Causal Inference With Observational Data" at the University of ...

Read more »

Data science teaching position in London

June 19, 2018

Seth Flaxman sends this along: The Department of Mathematics at Imperial College London wishes to appoint a Senior Strategic Teaching Fellow in Data Science, to be in post by September 2018 or as soon as possible thereafter. The role will involve developing and delivering a suite of new data science modules, initially for the MSc […] The post Data science…

Read more »

a chain of collapses

June 19, 2018

A quick riddler resolution during a committee meeting (!) of a short riddle: 36 houses stand in a row and collapse at times t=1,2,..,36. In addition, once a house collapses, the neighbours if still standing collapse at the next time unit. What are the shortest and longest lifespans of this row? Since a house with […]
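
The lifespan of a given row can be computed directly. This sketch is not from the post and rests on one reading of the riddle: `schedule[j]` is the time at which house `j` would collapse by itself, and a collapse at time t brings down any standing neighbour at time t+1, which then propagates further. Under that reading, house `i` falls at the minimum over `j` of `schedule[j] + |i - j|`. The function names are illustrative.

```python
def collapse_times(schedule):
    """Actual collapse time of each house, given its own scheduled collapse time."""
    n = len(schedule)
    # A collapse at house j reaches house i after |i - j| further time units
    return [min(schedule[j] + abs(i - j) for j in range(n)) for i in range(n)]


def row_lifespan(schedule):
    """Time at which the last house in the row collapses."""
    return max(collapse_times(schedule))


if __name__ == "__main__":
    in_order = list(range(1, 37))  # houses scheduled left to right at t = 1..36
    print(row_lifespan(in_order))
```

Passing different permutations of 1..36 to `row_lifespan` is a quick way to compare candidate orderings for the shortest and longest lifespans.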

Read more »

Opportunity for Comment!

June 19, 2018

(This is Dan) Last September, Jonah, Aki, Michael, Andrew and I wrote a paper on the role of visualization in the Bayesian workflow.  This paper is going to be published as a discussion paper in the Journal of the Royal Statistical Society Series A and the associated read paper meeting (where we present the paper and […] The post Opportunity for…

Read more »

What is the role of qualitative methods in addressing issues of replicability, reproducibility, and rigor?

June 19, 2018

Kara Weisman writes: I’m a PhD student in psychology, and I attended your talk at the Stanford Graduate School of Business earlier this year. I’m writing to ask you about something I remember you discussing at that talk: The possible role of qualitative methods in addressing issues of replicability, reproducibility, and rigor. In particular, I […] The post What is…

Read more »

HTML Widgets for Non-HTML Output Formats

Not surprisingly, HTML widgets were designed for HTML output formats (e.g., rmarkdown::html_document and rmarkdown::ioslides_presentation). Their interactivity relies on JavaScript. In general, you should not expect JavaScript to work in LaTeX/PDF or Word or PowerPoint. In January 2015, I gave an introductory talk on HTML widgets in the LA R User Group. I joked in the talk that “If you ask…

Read more »

Books to Read While the Algae Grow in Your Fur, December 2016

June 18, 2018

Attention conservation notice: I have no taste. Laila Lalami, The Moor's Account Historical fiction, in which the Narvaez expedition across what's now the American South and Southwest in the early 1500s is told from the view-point of the Moorish ...

Read more »

Power analysis and NIH-style statistical practice: What’s the implicit model?

June 18, 2018

So. Following up on our discussion of “the 80% power lie,” I was thinking about the implicit model underlying NIH’s 80% power rule. Several commenters pointed out that, to have your study design approved by NIH, it’s not required that you demonstrate that you have 80% power for real; what’s needed is to show 80% […] The post Power analysis…

Read more »

The BOOTSTRAP statement for t tests in SAS

June 18, 2018

Bootstrap resampling is a powerful way to estimate the standard error for a statistic without making any parametric assumptions about its sampling distribution. The bootstrap method is often implemented by using a sequence of calls to resample from the data, compute a statistic on each sample, and analyze the bootstrap [...] The post The BOOTSTRAP statement for t tests in…
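
The resample, compute, and analyze sequence can be sketched in a few lines. This is a generic illustration in Python, not the SAS BOOTSTRAP statement or the post's code; the two samples are made-up numbers, and `bootstrap_mean_diff` is a hypothetical helper name.

```python
import random
import statistics


def bootstrap_mean_diff(x, y, n_boot=2000, seed=1):
    """Bootstrap the difference in means between two independent samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping the group sizes
        boot_x = rng.choices(x, k=len(x))
        boot_y = rng.choices(y, k=len(y))
        # Compute the statistic on this bootstrap sample
        diffs.append(statistics.fmean(boot_x) - statistics.fmean(boot_y))
    # Analyze the bootstrap distribution: standard error and percentile interval
    std_err = statistics.stdev(diffs)
    cuts = statistics.quantiles(diffs, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    return std_err, (cuts[0], cuts[-1])


# Made-up data, for illustration only
group_a = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1]
group_b = [4.2, 4.8, 5.0, 4.4, 4.9, 5.2, 4.1, 4.6]
std_err, (lower, upper) = bootstrap_mean_diff(group_a, group_b)
print(f"SE = {std_err:.3f}, 95% percentile interval = ({lower:.3f}, {upper:.3f})")
```

The percentile interval is only one of several bootstrap intervals; it is used here because it needs nothing beyond the sorted bootstrap statistics.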

Read more »

10th ECB Workshop on Forecasting Techniques, Frankfurt

June 18, 2018

Starts now, program here. Looks like a great lineup. Most of the papers are posted, and the organizers also plan to post presentation slides following the conference. Presumably in future weeks I'll blog on some of the presentations.

Read more »

The Role of Resources in Data Analysis

June 18, 2018

When learning about data analysis in school, you don’t hear much about the role that resources—time, money, and technology—play in the development of analysis. This is a conversation that is often had “in the hallway” when talking t...

Read more »

One Little Thing: knitr::combine_words()

When you want to output a character vector for humans to read, you probably don’t want something like [1] a b c, which is the normal way to print a vector in R. Instead, you may want a character string "a, b, and c" (Oxford comma FTW!). In 2014, I gave a guest lecture in a course at Iowa State.…
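
For readers outside R, the same idea is easy to mimic. The function below is a Python analogue written for illustration, not knitr's implementation; its name and parameters (`sep`, `last`) are assumptions chosen to resemble the R function.

```python
def combine_words(words, sep=", ", last="and"):
    """Join words into a human-readable list with an Oxford comma."""
    words = list(words)
    if len(words) <= 1:
        return "".join(words)  # empty string, or the single word unchanged
    if len(words) == 2:
        return f"{words[0]} {last} {words[1]}"  # no comma for a pair
    return sep.join(words[:-1]) + f"{sep}{last} {words[-1]}"


print(combine_words(["a", "b", "c"]))  # a, b, and c
```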

Read more »

Bayesians are frequentists

June 17, 2018

Bayesians are frequentists. What I mean is, the Bayesian prior distribution corresponds to the frequentist sample space: it’s the set of problems for which a particular statistical model or procedure will be applied. I was thinking about this in the context of this question from Vlad Malik: I noticed this comment on Twitter in reference […] The post Bayesians are…

Read more »

Chasing the noise in industrial A/B testing: what to do when all the low-hanging fruit have been picked?

June 16, 2018

Commenting on this post on the “80% power” lie, Roger Bohn writes: The low power problem bugged me so much in the semiconductor industry that I wrote 2 papers about it around 1995. Variability estimates come naturally from routine manufacturing statistics, which in semicon were tracked carefully because they are economically important. The sample size is […] The post Chasing the…

Read more »

One Little Thing: Touch a Source File in a blogdown Website

Motivated by a blogdown issue raised by Liang Zhang, I added an RStudio addin named “Touch File” in blogdown last month to update the modification time of the current file in RStudio. Most blogdown users should already know the LiveReload feature, which means if you edit a source file and save it, your website will be automatically rebuilt and refreshed…

Read more »
