## VIS 2016 – Tuesday

October 26, 2016

The official opening of the main conference was today, Tuesday. The conference is now in full swing until Friday. Opening: Attendance at the conference is flat – Terry Yoo gave no precise numbers, but at least it's not shrinking. I figure they don't want to release precise numbers in the hope that there's just a […]

## Colorless green ideas tweet furiously

October 26, 2016

Nadia Hassan writes: Justin Wolfers and Nate Silver got into a colorful fight on Twitter. Nate has two forecasts: a polls-only forecast and a “traditional” one that discounts poll leads and builds in fundamentals. Wolfers noted that the 538 polls-only model had Clinton at a higher chance of winning on […]

## Should researchers be correcting for multiple tests, even when they themselves did not run the tests, but all of the tests were run on the same data?

October 25, 2016

A graduate student in my frequentist statistics class, Caitlin Ducate, asks: In Criminal Justice, it's common to use large data sets like the Uniform Crime Report (UCR) or versions of the National Longitudinal Survey (NLS) because the nature of ...

## Socks, skeets, space aliens

October 25, 2016

In my Bayesian statistics class this semester, I asked students to invent new Bayes's theorem problems, with the following criteria: 1) A good Bayes's theorem problem should pose an interesting question that seems hard to solve directly, but 2) It should b...

## How not to analyze noisy data: A case study

October 25, 2016

I was reading Jenny Davidson’s blog and came upon this note on an autobiography of the eccentric (but aren’t we all?) biologist Robert Trivers. This motivated me, not to read Trivers’s book, but to do some googling, which led me to this paper from PLOS ONE, “Revisiting a sample of U.S. billionaires: How sample selection and […]

## VIS 2016 – Sunday, Monday: BELIV and Being Contrarian

October 25, 2016

The early part of IEEE VIS 2016 is already behind us. This includes many workshops and tutorials, as well as the Doctoral Colloquium. It has been an interesting three days (counting Saturday as well). This posting is less a report than a number of observations from several discussions and talks. Doctoral Colloquium: The reason I’m including […]

## Why Journalists need to understand statistics – Sensational Listener article about midwifery risks

October 25, 2016

The recent article in the Listener highlights again the need for all citizens to be statistically literate. In particular, I believe statistical literacy should be a compulsory part of all journalists’ training. I have written before about this. I was happy …

## Ptolemaic inference

October 24, 2016

OK, we’ve been seeing this a lot recently. A psychology study gets published, with a key idea that at first seems wacky but, upon closer reflection, could very well be true! Examples: – That “dentist named Dennis” paper suggesting that people pick where they live and what job to take based on their names. – […]

## And Yet It Moves: Gravitational Waves

October 24, 2016

"The moment he was set at liberty, he looked up to the sky and down to the ground, and, stamping with his foot, in a contemplative mood, said, Eppur si muove [And yet it moves], meaning the earth." (Giuseppe Baretti, on Galileo Galilei) Galileo ...

## Machine Learning vs. Econometrics, IV

October 24, 2016

Some of my recent posts on this topic emphasized that (1) machine learning (ML) tends to focus on non-causal prediction, whereas econometrics and statistics (E/S) has both non-causal and causal parts, and (2) E/S tends to be more concerned with probabi...

## Denver outspends everyone on this

October 24, 2016

Someone at the Wall Street Journal noticed that Denver's transit agency has outspent other top transit agencies, after accounting for number of rides -- and by a huge margin. But the accompanying graphic conspires against the journalist. For one thing,...

## Q&A: predictive analytics

October 24, 2016

A major news outlet interviewed me on predictive analytics. Here are my responses. Data mining is not just for tech companies; in fact, it can be especially useful for industries that are not typically thought of as ‘innovative’, such as agriculture. What are some of the main industries that you think benefit from predictive […]

## Ahh, that’s smooth! Anti-aliasing in SAS statistical graphics

October 24, 2016

I've written several articles about scatter plot smoothers: nonparametric regression curves that reveal small- and large-scale features of a response variable as a function of an explanatory variable. However, there is another kind of "smoothness" that you might care about, and that is the apparent smoothness of curves and markers […]

## Spin

October 24, 2016

Yesterday all the past. The language of effect size / Spreading to Psychology along the sub-fields; the diffusion / Of the counting-frame and the quincunx; / Yesterday the shadow-reckoning in the ivy climates. / Yesterday the assessment of hypotheses by tests, / The divination of water; yesterday the invention / Of cartwheels and clocks, the power-pose of / Horses. Yesterday the bustling […]

## Common Speaking Mistakes To Avoid

October 24, 2016

Whenever I go to academic conferences, I have to sit through some terrible talks. It continues to amaze me that so many people make mistakes that are so easy to avoid. Here are a few I noticed just in the last two days. Spend first two minutes apologizing: I understand the impulse to apologize. I really do. But […]

## ratio-of-uniforms

October 23, 2016

One approach to random number generation that had always intrigued me is Kinderman and Monahan’s (1977) ratio-of-uniforms method. The method is based on the result that the uniform distribution on the set A of (u,v)’s in ℝ⁺×X such that 0≤u²≤ƒ(v/u) induces the distribution with density proportional to ƒ on V/U. Hence the name. The proof […]
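
As a rough illustration of the result quoted in this excerpt, here is a minimal Python sketch of ratio-of-uniforms rejection sampling. The target ƒ(x) = exp(−x²/2) (an unnormalised standard normal), the bounding box, and the function name `sample_normal_rou` are assumptions chosen for the example and do not come from the original post.

```python
import math
import random


def sample_normal_rou(rng):
    """Draw one standard normal variate by ratio-of-uniforms rejection.

    Assumed target: f(x) = exp(-x**2 / 2), known only up to a constant.
    The set A = {(u, v): 0 <= u**2 <= f(v / u)} fits inside the box
    0 < u <= 1, |v| <= sqrt(2 / e).
    """
    v_max = math.sqrt(2.0 / math.e)  # sup of |x| * sqrt(f(x)), reached at x**2 = 2
    while True:
        u = 1.0 - rng.random()          # uniform on (0, 1], avoids log(0)
        v = rng.uniform(-v_max, v_max)  # uniform on the box's v-range
        # u**2 <= exp(-(v/u)**2 / 2) is equivalent to v**2 <= -4 * u**2 * log(u)
        if v * v <= -4.0 * u * u * math.log(u):
            return v / u                # the ratio V/U has density proportional to f


rng = random.Random(2016)
draws = [sample_normal_rou(rng) for _ in range(5)]
```

Points are drawn uniformly on a rectangle enclosing A and kept only when they fall inside A, so the accepted (u, v) pairs are uniform on A and their ratio follows the target density.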

October 23, 2016

Jeff points to this excellently skeptical news article by Caroline Weinberg, who writes: A recent study published in the American Journal of Human Biology suggests that people with previous tattoo experience may have a better immune response to new tattoos than those being inked for the first time. That’s the finding if you read the […] The post “How One…

## Q&A time

October 23, 2016

Someone sent me some questions by email, and I decided to answer some of them here. How important is it that I know and understand the underlying mathematical framework to forecasting methods? I understand conceptually how most of them work, but I feel as if I may benefit from truly understanding the math. The main benefit […]

## A quick look at RStudio’s R notebooks

October 22, 2016

A quick demo of RStudio’s R Notebooks shown by John Mount (of Win-Vector LLC, a statistics, data science, and algorithms consulting and training firm). It looks like some of the new in-line display behavior is back-ported to R Markdown and some of the difference is the delayed running and different level of interactivity in …

## Data science for executives and managers

October 22, 2016

Nina Zumel recently announced upcoming speaking appearances. I want to promote the upcoming sessions at ODSC West 2016 (11:15am-1:00pm on Friday November 4th, or 3:00pm-4:30pm on Saturday November 5th) and invite executives, managers, and other data science consumers to attend. We assume most of the Win-Vector blog audience is made up of practitioners (who we hope …

## Posterior predictive distribution for multiple linear regression

October 22, 2016

Suppose you've done a (robust) Bayesian multiple linear regression, and now you want the posterior distribution on the predicted value of $$y$$ for some probe value of $$\langle x_1,x_2,x_3, ... \rangle$$. That is, not the posterior distribution on t...
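
The idea in this excerpt can be sketched in Python under some loud assumptions: the posterior is available as MCMC draws of an intercept, slopes, and a noise scale, and the noise is taken to be normal for simplicity (a robust model would more likely use a t distribution). The function name `posterior_predictive` and the numbers below are hypothetical placeholders, not taken from the original post.

```python
import random


def posterior_predictive(beta0, betas, sigma, x_probe, rng):
    """Return one simulated y per posterior draw at the probe point.

    beta0:   list of intercept draws
    betas:   list of slope vectors, one vector per draw
    sigma:   list of noise-scale draws
    x_probe: predictor values <x_1, x_2, ...> at which to predict
    Assumes normal noise; this is a simplification of a robust model.
    """
    preds = []
    for b0, b, s in zip(beta0, betas, sigma):
        # Mean of y at the probe point for this draw of the parameters
        mu = b0 + sum(bj * xj for bj, xj in zip(b, x_probe))
        # Add observation noise so the result reflects predicted data, not just the mean
        preds.append(rng.gauss(mu, s))
    return preds


# Hypothetical posterior draws, for illustration only
beta0_draws = [1.0, 1.2, 0.9]
beta_draws = [[0.5, -1.0], [0.4, -1.1], [0.6, -0.9]]
sigma_draws = [0.3, 0.25, 0.35]
y_pred = posterior_predictive(beta0_draws, beta_draws, sigma_draws,
                              [2.0, 3.0], random.Random(1))
```

Each element of `y_pred` combines parameter uncertainty (a different draw per step) with observation noise, which is what separates a posterior predictive distribution from the posterior on the regression mean.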

## LOD MOOC

October 22, 2016

Massive Open Online Courses (MOOCs) are available worldwide and cover many topics, including Linked Open Data (LOD). They are an easy way to enter the semantic web. Two examples: HPI: The Hasso Plattner Institute, Potsdam, has for some years offered a course in Linked Data Engineering with a certificate. I did it some years ago and …