Statistics Blogs

Variable Selection using Cross-Validation (and Other Techniques)

July 1, 2015

A natural technique to select variables in the context of generalized linear models is to use a stepwise procedure. It is natural, but controversial, as discussed by Frank Harrell in a great post, clearly worth reading. Frank listed about 10 points against stepwise procedures. Such a procedure yields R-squared values that are badly biased to be high. The F and chi-squared tests quoted next to each variable on the printout do not have the…

July Reading

July 1, 2015

Now that the (Northern) summer is here, you should have plenty of time for reading. Here are some recommendations: Ahelegbey, D. F., 2015. The econometrics of networks: A review. Working Paper 13/WP/2015, Department of Economics, University of Venice. Ca...

Stapel’s Fix for Science? Admit the story you want to tell and how you “fixed” the statistics to support it!

July 1, 2015

Stapel’s “fix” for science is to admit it’s all “fixed!” The recent case of Michael LaCour, suspected of using faked data in a (retracted) paper on how to promote support for gay marriage, is directing a bit of limelight onto our star fraudster Diederik Stapel (50+ retractions). The Chronicle of Higher Education just published an article by […]

NBER IFM Data Session and Site

June 30, 2015

The NBER's International Finance and Macroeconomics (IFM) Program is sponsoring a 2015 Summer Institute "Data Session" and a corresponding web site ("Catalog of Data Sources") where the various datasets are archived. Great idea. Hats off to the org...

The Econometrics of Temporal Aggregation – VI – Tests of Linear Restrictions

June 29, 2015

This post is one of several related posts. The previous ones can be found here, here, here, here and here. These posts are based on Giles (2014). Many of the statistical tests that we perform routinely in econometrics can be affected by the level o...

Our first column: what’s so fun about fake data

June 29, 2015

Gelman sums up the reasons why there is a crisis in experimental research in our time. The journal publication process fails to catch fake research (let alone bad research), and the new media prefer sensationalist headlines over good science. Many res...

Generalizing from Marketing Research: The Right Question and the Correct Analysis

June 28, 2015

The marketing researcher asks some version of the following question in every study, "Tell me what you want?" The rest is a summary of the notes taken during the ensuing conversation. Steve Jobs' quote suggests that we might do better getting a reaction...

A bit about Win-Vector LLC

June 26, 2015

Win-Vector LLC is a consultancy founded in 2007 that specializes in research, algorithms, data-science, and training. (The name is an attempt at a mathematical pun.) Win-Vector LLC can complete your high value project quickly (some examples), and train...

Gelman and I on the Daily Beast

June 26, 2015

If it's somethin' weird an' it don't look good, who ya gonna call? Statbusters! That is the name of our weekly column for the Daily Beast, starting today. Andrew Gelman and I will alternate weeks. As I write this, nothing is up yet. Try going to the Daily Beast to find Andrew's first column.

An Attempt to Understand Boosting Algorithm(s)

June 26, 2015

Last Tuesday, at the annual meeting of the French Economic Association, I was having lunch with Alfred, and while we were chatting about modeling issues (econometric models against machine learning prediction), he asked me what boosting was. Since I could not be very specific, we looked at the Wikipedia page. Boosting is a machine learning ensemble meta-algorithm for reducing bias primarily and also variance in supervised learning, and a family of machine learning algorithms…
