Improving on Chebyshev’s inequality

February 12, 2016

Chebyshev’s inequality says that the probability of a random variable being more than k standard deviations away from its mean is at most $1/k^2$. In symbols, $P(|X - \mu| \geq k\sigma) \leq 1/k^2$, where $\mu$ and $\sigma$ are the mean and standard deviation of X. This inequality is very general, but also very weak. It assumes very little about the random variable X but it also gives a loose bound. If we assume slightly more, […]
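To give a feel for how loose the bound can be, here is a minimal Python sketch that compares the Chebyshev bound with the observed tail fraction of a simulated sample. The exponential distribution, the sample size, and the helper names `chebyshev_bound` and `empirical_tail` are illustrative assumptions, not taken from the post.

```python
import random
import statistics


def chebyshev_bound(k: float) -> float:
    """Chebyshev upper bound on P(|X - mu| >= k * sigma), capped at 1."""
    return min(1.0, 1.0 / k**2)


def empirical_tail(samples, k: float) -> float:
    """Fraction of samples at least k population standard deviations from the sample mean."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)
    far = sum(1 for x in samples if abs(x - mu) >= k * sigma)
    return far / len(samples)


# Assumed test distribution: exponential with rate 1 (any distribution would do).
rng = random.Random(0)
samples = [rng.expovariate(1.0) for _ in range(100_000)]

for k in (1.5, 2, 3):
    print(f"k={k}: bound={chebyshev_bound(k):.4f}, observed={empirical_tail(samples, k):.4f}")
```

The observed tail fraction can never exceed the bound here, because the inequality also holds for the empirical distribution of the sample itself; the size of the gap between the two numbers is the looseness the excerpt describes.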

Read more »

Not So Standard Deviations Episode 9 – Spreadsheet Drama

February 12, 2016

For this episode, special guest Jenny Bryan (@jennybryan) joins us from the University of British Columbia! Jenny, Hilary, and I talk about spreadsheets and why some people love them and some people despise them. We also discuss blogging as part of sci...

Read more »

Arrow’s Theorem in the news: Sleazy-ass political scientists cut-and-paste their way to 3 publications from the same material

February 12, 2016

I’m posting this one in the evening because I know some people just hate when I write about plagiarism. But this one is so ridiculous I had to share it with you. John Smith (or maybe I should say “John Smith”?) writes: Today on a political science forum I saw this information about plagiarism by […] The post Arrow’s Theorem in the news: Sleazy-ass political scientists cut-and-paste their way to…

Read more »

Everything Ends on Wednesday

February 12, 2016

The Brazilian Carnival just ended this week, but for some people it is time to start worrying about the crazy things that may have happened during the days of the flesh festival. On the news, the spokesperson of the Test and Prevention Center (CTA) in Brasilia estimated that the number of people seeking counseling and test kits increases by 40% on average the day after the carnival (Wednesday). He also disclosed that…

Read more »

The answer is e, what was the question?!

February 11, 2016

A rather exotic question on X validated: since π can be approximated by random sampling over a unit square, is there an equivalent for approximating e? This is an interesting question, as, indeed, why not focus on e rather than π after all?! But very quickly the very artificiality of the problem comes back to […]
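As a rough illustration of the question, the sketch below first estimates π from uniform points in the unit square, then estimates e with one well-known Monte Carlo identity: the expected number of Uniform(0, 1) draws needed for their running sum to exceed 1 equals e. This estimator is an assumption chosen for illustration and is not necessarily the one discussed in the post; all function names are hypothetical.

```python
import math
import random


def estimate_pi(n_points: int, seed: int = 0) -> float:
    """Estimate pi as 4 times the share of uniform points in the unit square inside the quarter disc."""
    rng = random.Random(seed)
    inside = sum(1 for _ in range(n_points) if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4 * inside / n_points


def draws_until_sum_exceeds_one(rng: random.Random) -> int:
    """Count the Uniform(0, 1) draws needed for the running sum to exceed 1."""
    total, count = 0.0, 0
    while total <= 1.0:
        total += rng.random()
        count += 1
    return count


def estimate_e(n_trials: int, seed: int = 0) -> float:
    """Estimate e as the average draw count over n_trials independent runs."""
    rng = random.Random(seed)
    return sum(draws_until_sum_exceeds_one(rng) for _ in range(n_trials)) / n_trials


print("pi estimate:", estimate_pi(200_000), "error:", abs(estimate_pi(200_000) - math.pi))
print("e estimate: ", estimate_e(200_000), "error:", abs(estimate_e(200_000) - math.e))
```

Both estimators converge at the usual Monte Carlo rate, so each extra digit of accuracy costs roughly a hundred times more samples.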

Read more »

The aviator

February 11, 2016

To continue with the weirdest week in terms of emails, I have received one today that says: Dear Dr. Baio, I represent XXX (an imprint of YYY). We are looking to publish books in aviation that are scientific, academic or professional in natur...

Read more »

New R Code for High-Frequency Financial Data Analysis

February 11, 2016

I looked through the manual (below). Looks well done. From the email: Package features estimators for working with high frequency market data. Microstructure Noise: - Autocovariance Noise Variance - Realized Noise Variance - Unbiased Realized Noise Variance...

Read more »

Trends and Opportunities in Data Analysis

February 11, 2016

Andy Warhol said “In the future, everyone will be world-famous for 15 minutes.” Here’s my 15 seconds of fame, a soundbite from the IBM Insight conference last year. My comments start at 1:30. In a nutshell, I predict that data analyt...

Read more »

In general, hypothesis testing is overrated and hypothesis generation is underrated, so it’s fine for these data to be collected with exploration in mind.

February 11, 2016

In preparation for writing this news article, Kelly Servick asked me what I thought about the Kavli HUMAN Project (see here and here). Here’s what I wrote: The general idea of gathering comprehensive data seems reasonable to me. I’ve often made the point that careful data collection and measurement are important. Data analysis is the […] The post In general, hypothesis testing is overrated and hypothesis generation is underrated, so…

Read more »

A follow-up to Crowdsourcing Research

February 11, 2016

Last month I published some thoughts on crowdsourcing research, inspired by Anthony Goldbloom’s talk at Statistical Programming DC on the Kaggle experience. Today, I found a rather similar discussion on crowdsourcing research (on the online version of the magazine Good) as a potential way to increase the accuracy of scientific research and reduce bias. I think more consideration needs to […]

Read more »

Data handcuffs

February 10, 2016

A few years ago, if you had asked me which skills I was most often asked about for students going into industry, I'd definitely have said things like data cleaning, data transformation, database pulls, and other non-traditional statistical tasks. But as companies have progressed from the point of storing data to actually wanting to do something with

Read more »

Star Wars: distinguishing the signal from the noise

February 10, 2016

The great difficulty in modeling and in building predictive models is managing to distinguish the signal from the noise (to borrow the title of Nate Silver's classic). The statistical answer is the notion of significance, and the search for 'stars' in regression output. With the explosion in the amount of data, it has become crucial to make this distinction, to know which interactions…

Read more »

Scientific explanation of Panther defeat!

February 10, 2016

Roy’s comment on our recent post inspires me to reveal the true explanation underlying the Carolina team’s shocking Super Bowl loss. The Panthers were primed during the previous week with elderly-themed words such as “bingo” and “Manning.” As well-established research has demonstrated, this caused Cam and the gang to move more slowly, hence all the […] The post Scientific explanation of Panther defeat! appeared first on Statistical Modeling, Causal Inference,…

Read more »

More on Policy Uncertainty

February 10, 2016

Speaking of policy uncertainty (earlier blog post here), read here about the exciting ongoing project at the Becker-Friedman Institute of the University of Chicago.

Read more »

Who do you think I am?

February 10, 2016

Today I've received an email inviting me to submit a paper to a scientific journal (it doesn't really matter what journal it is, or whether or not my own work is actually relevant for them). I looove the way they've addressed me, though. I think tha...

Read more »

Sample with replacement and unequal probability in SAS

February 10, 2016

How do you sample with replacement in SAS when the probability of choosing each observation varies? I was asked this question recently. The programmer thought he could use PROC SURVEYSELECT to generate the samples, but he wasn't sure which sampling technique he should use to sample with unequal probability. This […] The post Sample with replacement and unequal probability in SAS appeared first on The DO Loop.
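For readers who want to see the idea outside SAS, here is a minimal Python sketch of sampling with replacement when each item has its own selection probability. It uses the standard library's `random.choices`; it is not PROC SURVEYSELECT and not the method from the post, and the items, weights, and the helper name `sample_with_replacement` are made up for illustration.

```python
import random
from collections import Counter


def sample_with_replacement(items, weights, n, seed=None):
    """Draw n items with replacement; each draw picks items[i] with probability proportional to weights[i]."""
    rng = random.Random(seed)
    return rng.choices(items, weights=weights, k=n)


# Hypothetical population and selection probabilities.
items = ["a", "b", "c"]
weights = [0.5, 0.3, 0.2]

draws = sample_with_replacement(items, weights, 100_000, seed=1)
counts = Counter(draws)
for item in items:
    print(item, counts[item] / len(draws))
```

With a large number of draws, the observed share of each item should settle close to its weight, which makes for a quick sanity check on any unequal-probability sampler.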

Read more »

Books to Read While the Algae Grow in Your Fur, January 2016

February 10, 2016

Attention conservation notice: I have no taste. Mark Thompson, The White War: Life and Death on the Italian Front, 1915--1919 A well-told narrative history of the war, mostly from the Italian side. He covers all aspects, from the back-and-forth of ...

Read more »

"Robust Bayesian inference via coarsening" (Next Week at the Statistics Seminar)

February 10, 2016

Attention conservation notice: Only of interest if you (1) care about allocating precise fractions of a whole belief over a set of mathematical models when you know none of them is actually believable, and (2) will be in Pittsburgh on Monday. As someone wh...

Read more »

Worst Practices Conference

February 10, 2016

This ad just arrived in the email. What a title. Presumably the conference is about improving worst-case outcomes in order to improve expected minimax loss. But still, that title... 2016 Foresight Practitioner Conference: Worst Practice...

Read more »

Finding the K in K-means by Parametric Bootstrap

February 9, 2016

One of the trickier tasks in clustering is determining the appropriate number of clusters. Domain-specific knowledge is always best, when you have it, but there are a number of heuristics for getting at the likely number of clusters in your data. We cover a few of them in Chapter 8 (available as a free sample …

Read more »

Posterior Update of Bayes@Lund 2016

February 9, 2016

For the third year running, Ullrika Sahlin and I arranged Bayes@Lund, a mini-conference bringing together researchers interested in or working with Bayesian methods in and around Sweden. This year we were thrilled to have over 70 attendees, both from ne...

Read more »

Analysis: Clinton backed by Big Money: Sanders by Small

February 9, 2016

This article examines FEC data in depth and finds what most people already know. Hillary Clinton's presidential bid is financed largely by a relatively small number of big donors, while Bernie Sanders' presidential bid is funded by numerous small donors. For our analysis, we look at four hundred thousand individualized contributions reported to the FEC at the end of 2015. These contributions are only reported for…

Read more »

