The case for index-free data manipulation

December 10, 2016

Statisticians and data scientists want a neat world where data is arranged in a table such that every row is an observation or instance, and every column is a variable or measurement. Getting to this state of “ready to model format” (often called a denormalized form by relational algebra types) often requires quite a bit … Continue reading The case for index-free data manipulation

fMRI clusterf******

December 10, 2016

Several people pointed me to this paper by Anders Eklund, Thomas Nichols, and Hans Knutsson, which begins: Functional MRI (fMRI) is 25 years old, yet surprisingly its most common statistical methods have not been validated using real data. Here, we used resting-state fMRI data from 499 healthy controls to conduct 3 million task group analyses. […] The post fMRI clusterf******…

Create a Koch snowflake with SAS

December 10, 2016

I have a fondness for fractals. In previous articles, I've used SAS to create some of my favorite fractals, including a fractal Christmas tree and the "devil's staircase" (Cantor) function. Because winter is almost here, I think it is time to construct the Koch snowflake fractal in SAS. A […] The post Create a Koch snowflake with SAS appeared…

5 more things I learned from the 2016 election

December 10, 2016

After posting the 19 Things We Learned from the 2016 Election, I received a bunch of helpful feedback in comments and email. Here are some of the key points that I missed or presented unclearly: Non-presidential elections Nadia Hassan points out that my article is “so focused on the Presidential race than it misses some […] The post 5 more…

“The Fundamental Incompatibility of Scalable Hamiltonian Monte Carlo and Naive Data Subsampling”

December 10, 2016

Here’s Michael Betancourt writing in 2015: Leveraging the coherent exploration of Hamiltonian flow, Hamiltonian Monte Carlo produces computationally efficient Monte Carlo estimators, even with respect to complex and high-dimensional target distributions. When confronted with data-intensive applications, however, the algorithm may be too expensive to implement, leaving us to consider the utility of approximations such as […] The post “The Fundamental…

Temple Grandin

December 9, 2016

She also belongs in the “objects of class Pauline Kael” category. Most autistic people are male, but Temple Grandin is the most famous and accomplished autistic person ever. The post Temple Grandin appeared first on Statistical Modeling, C...

Nomen omen

December 9, 2016

After resisting this for way too long, I've finally decided it was time to release more widely a couple of the R packages I've been working on. I've put them on GitHub, hence the mug... In both cases, while I think the packages do work nicely, I am s...

Round things, square things

December 9, 2016

The following chart traces the flow of funds into AI (artificial intelligence) startups. I found it on this webpage and it is attributed to Financial Times. Here, I apply the self-sufficiency test to show that the semicircles are playing no...

19 Things We Learned from the 2016 Election

December 8, 2016

OK, we can all agree that the November election result was a shocker. According to news reports, even the Trump campaign team was stunned to come up a winner. So now seemed like a good time to go over various theories floating around in political science and political reporting and see where they stand, now […] The post 19 Things…

Which graph to use?

December 8, 2016

A student asked me on our Facebook page to help with an assignment. It got me thinking again about the nature of answers in statistics, and the challenge of communicating through graphs. The student gave no explanation, but rather a … Continue reading →

“So such markets were, and perhaps are, subject to bias from deep pocketed people who may be expressing preference more than actual expectation”

December 8, 2016

Geoff Buchan writes in with another theory about how prediction markets can go wrong: I did want to mention one fascinating datum on Brexit: one UK bookmaker said they received about twice as many bets on leave as on remain, but the average bet on remain was *five* times what was bet on leave, meaning […] The post “So such…

flea circus

December 7, 2016

An old riddle found on X validated, asking for Monte Carlo resolution but originally given on Project Euler: A 30×30 grid of squares contains 30² fleas, initially one flea per square. When a bell is rung, each flea jumps to an adjacent square at random. What is the expected number of unoccupied squares after 50 […]
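
A Monte Carlo resolution of this kind of riddle might look like the sketch below. It is only an illustration, not the code from the original post: it assumes "adjacent" means the horizontally or vertically neighbouring squares, and since the excerpt above is cut off, the number of bell rings is left as a parameter. The function names are hypothetical.

```python
import random


def empty_squares_after(size, rings, rng):
    """Simulate one run and return the number of unoccupied squares.

    Assumption: a flea may only jump to an orthogonally adjacent square
    (up, down, left, right) that lies inside the grid. Requires size >= 2.
    """
    # One flea per square at the start.
    fleas = [(r, c) for r in range(size) for c in range(size)]
    for _ in range(rings):
        moved = []
        for r, c in fleas:
            neighbours = [
                (r + dr, c + dc)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= r + dr < size and 0 <= c + dc < size
            ]
            moved.append(rng.choice(neighbours))
        fleas = moved
    # Squares that hold no flea at the end of the run.
    return size * size - len(set(fleas))


def estimate_expected_empty(size, rings, runs, seed=0):
    """Average the number of empty squares over independent simulated runs."""
    rng = random.Random(seed)
    total = sum(empty_squares_after(size, rings, rng) for _ in range(runs))
    return total / runs
```

Calling `estimate_expected_empty(30, rings, runs)` with the riddle's grid size gives a noisy estimate whose precision improves with the number of runs.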

“Dear Major Textbook Publisher”: A Rant

December 7, 2016

Dear Major Academic Publisher, You just sent me, unsolicited, an introductory statistics textbook that is 800 pages and weighs about 5 pounds. It’s the 3rd edition of a book by someone I’ve never heard of. That’s fine—a newcomer can write a good book. The real problem is that the book is crap. It’s just the […] The post “Dear Major…

Simultaneous confidence intervals for a multivariate mean

December 7, 2016

Many SAS procedures compute statistics and also compute confidence intervals for the associated parameters. For example, PROC MEANS can compute the estimate of a univariate mean, and you can use the CLM option to get a confidence interval for the population mean. Many parametric regression procedures (such as PROC GLM) […] The post Simultaneous confidence intervals for a multivariate mean…

The EagerEyes Holiday Shopping Guide

December 7, 2016

Are you looking for the perfect gift for the data or visualization geek in your life? Did that crazy self-driving water bottle Kickstarter still not deliver, leaving you hunting for an overpriced Nintendo Classic? The EagerEyes Holiday Shopping Guide has all the geeky, uncool gifts you could possibly want. To be clear, none of the […]

Hey, I forgot to include a cat picture in my previous post!

December 7, 2016

Josh Miller fixes it for me: The post Hey, I forgot to include a cat picture in my previous post! appeared first on Statistical Modeling, Causal Inference, and Social Science.

Using replyr::let to Parameterize dplyr Expressions

December 7, 2016

Imagine that in the course of your analysis, you regularly require summaries of numerical values. For some applications you want the mean of that quantity, plus/minus a standard deviation; for other applications you want the median, and perhaps an interval around the median based on the interquartile range (IQR). In either case, you may want … Continue reading Using replyr::let…

the incredible accuracy of Stirling’s approximation

December 6, 2016

The last riddle from the Riddler [last before The Election] summed up to find the probability of a Binomial B(2N,½) draw ending up at the very middle, N. Which is ℘ = (2N)!/(N!)² × 2^(–2N). If one uses the standard Stirling approximation to the factorial function, log(N!) ≈ N log(N) – N + ½ log(2πN), the approximation to ℘ is 1/√(πN), which is not […]
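
To get a feel for how close the Stirling-based value 1/√(πN) comes to the exact middle probability of a B(2N,½) draw, here is a minimal sketch. It is not taken from the original post, and the function names are made up for illustration. It computes the exact probability from the binomial coefficient and compares it with the approximation.

```python
import math


def exact_middle_probability(n):
    """Exact P(X = n) for X ~ Binomial(2n, 1/2): C(2n, n) / 2^(2n)."""
    return math.comb(2 * n, n) / 4 ** n


def stirling_middle_probability(n):
    """Approximation 1 / sqrt(pi * n) obtained from Stirling's formula."""
    return 1.0 / math.sqrt(math.pi * n)


def relative_error(n):
    """Relative error of the Stirling-based approximation against the exact value."""
    exact = exact_middle_probability(n)
    return (stirling_middle_probability(n) - exact) / exact


if __name__ == "__main__":
    # Print exact value, approximation and relative error for a few N.
    for n in (1, 10, 100, 1000):
        print(n, exact_middle_probability(n),
              stirling_middle_probability(n), relative_error(n))
```

The relative error shrinks roughly like 1/(8N), so the approximation slightly overstates the probability for every N.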

Hot hand 1, WSJ 0

December 6, 2016

In a generally good book review on “uncertainty and the limits of human reason,” William Easterly writes: Failing to process uncertainty correctly, we attach too much importance to too small a number of observations. Basketball teams believe that players suddenly have a “hot hand” after they have made a string of baskets, so you should […] The post Hot hand…

10 hints to make the most of teaching and academic conferences

December 6, 2016

Hints for conference benefit maximisation I am writing this post in a spartan bedroom in Glenn Hall at La Trobe University in Bundoora (Melbourne, Australia.) Some outrageously loud crows are doing what crows do best outside my window, and I … Continue reading →

Data 1, NPR 0

December 6, 2016

Jay “should replace the Brooks brothers on the NYT op-ed page” Livingston writes: There it was again, the panic about the narcissism of millennials as evidenced by selfies. This time it was NPR’s podcast Hidden Brain. The show’s host Shankar Vedantam chose to speak with only one researcher on the topic – psychologist Jean Twenge, […] The post Data 1,…

best algorithm EVER !!!!!!!!

December 6, 2016

Someone writes: On the website https://odajournal.com/ you find a lot of material for Optimal (or “optimizing”) Data Analysis (ODA) which is described as: In the Optimal (or “optimizing”) Data Analysis (ODA) statistical paradigm, an optimization algorithm is first utilized to identify the model that explicitly maximizes predictive accuracy for the sample, and then the resulting […] The post best algorithm…

Optimization matchup: R’s glpkAPI vs Julia’s JuMP

December 6, 2016

tl;dr: although I use R every day and love it, doing mathematical programming using Julia is much simpler and more flexible than anything I know that is currently available in R. Recently I have learned that Iain Dunning and Joey Huchette and Miles Lubi...


