I will survive!

January 25, 2016

Here's a very long post, to make up for the recent silence on the blog... Lately, I've been working on a new project involving the use of survival analysis data and results, specifically for health economic evaluation (cue Cake's rendition below). I hav...

Read more »

What is a moving average?

January 25, 2016

A moving average (also called a rolling average) is a statistical technique that is used to smooth a time series. Moving averages are used in finance, economics, and quality control. You can overlay a moving average curve on a time series to visualize how each value compares to a rolling […] The post What is a moving average? appeared first on The DO Loop.
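As a rough illustration of the smoothing idea in the excerpt above, here is a minimal Python sketch of a trailing simple moving average. The function name `simple_moving_average`, the `window` parameter, and the sample series are all made up for this example; the original post works in SAS and may define the window differently (for instance centered or weighted).

```python
def simple_moving_average(values, window):
    """Return the trailing simple moving average of `values`.

    Positions that do not yet have a full window are reported as None.
    """
    if window < 1:
        raise ValueError("window must be a positive integer")
    result = []
    running_sum = 0.0
    for i, v in enumerate(values):
        running_sum += v
        if i >= window:
            # Drop the value that just slid out of the window.
            running_sum -= values[i - window]
        if i >= window - 1:
            result.append(running_sum / window)
        else:
            result.append(None)
    return result


# Hypothetical series, smoothed with a 3-point window.
series = [3, 5, 7, 9, 11, 13]
smoothed = simple_moving_average(series, window=3)
print(smoothed)  # [None, None, 5.0, 7.0, 9.0, 11.0]
```

Plotting `smoothed` over `series` gives the kind of overlay the excerpt describes: each raw value can be compared with the average of the most recent `window` observations.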

Read more »

Not So Standard Deviations Episode 8 – Snow Day

January 25, 2016

Hilary and I were snowed in over the weekend, so we recorded Episode 8 of Not So Standard Deviations. In this episode, Hilary and I talk about how to get your foot in the door with data science, the New England Journal's view on data sharing, Google's "Cohort Analysis", and trying to predict a movie's

Read more »

(Legally) Free Books!

January 24, 2016

(An earlier version of this post inadvertently included links to "pirated" material. This has now been rectified, and the post has been completely re-written.) There are several Econometrics books, and comprehensive sets of lecture notes, that can be ac...

Read more »

2 new reasons not to trust published p-values: You won’t believe what this rogue economist has to say.

January 24, 2016

Political scientist Anselm Rink points me to this paper by economist Alwyn Young which is entitled, “Channelling Fisher: Randomization Tests and the Statistical Insignificance of Seemingly Significant Experimental Results,” and begins, I [Young] follow R.A. Fisher’s The Design of Experiments, using randomization statistical inference to test the null hypothesis of no treatment effect in a […] The post 2 new reasons not to trust published p-values: You won’t believe what…
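For readers unfamiliar with randomization inference, the following Python sketch shows a generic two-group randomization (permutation) test of the null hypothesis of no treatment effect. It is not Young's procedure, which the excerpt does not describe in detail. The function name `randomization_test`, the difference-in-means statistic, and the toy data are assumptions chosen for illustration.

```python
import random


def randomization_test(treated, control, n_permutations=10000, seed=0):
    """Two-sided randomization test on the difference in group means.

    Under the sharp null of no treatment effect, group labels are
    exchangeable, so we re-assign them at random and count how often
    the shuffled difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    pooled = list(treated) + list(control)
    n_treated = len(treated)

    def mean_diff(values):
        t, c = values[:n_treated], values[n_treated:]
        return sum(t) / len(t) - sum(c) / len(c)

    observed = mean_diff(pooled)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Small tolerance guards against floating-point ties.
        if abs(mean_diff(pooled)) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical outcomes for a small experiment.
diff, p = randomization_test([10, 11, 12, 13, 14], [0, 1, 2, 3, 4])
print(diff, p)
```

The p-value here comes from random re-assignments rather than from an asymptotic approximation, which is the general appeal of Fisher-style tests in small or unbalanced experiments.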

Read more »

Strippers, JFK, Stalin, and the Oxford Comma

January 24, 2016

Maybe everyone already knows about the Oxford comma and the crazy stripper thing. I just learned about them. Anyway, here goes. Consider (1) "x, y and z" vs. (2) "x, y, and z". The difference is that (2) has an extra comma before "and". I always th...

Read more »

This graph is so ugly—and you’ll never guess where it appeared

January 23, 2016

Raghu Parthasarathy writes: I know you’re sick of seeing / being pointed to awful figures, but this one is an abomination of a sort I’ve never seen before: It’s a pie chart *and* a word cloud. In an actual research paper! Messy, illegible, and generally pointless. It’s Figure 1 of this paper (in Cell — […] The post This graph is so ugly—and you’ll never guess where it appeared appeared…

Read more »

Introduction to MCMC

January 23, 2016

Markov Chain Monte Carlo (MCMC) is a technique for getting your work done when Monte Carlo won’t work. The problem is finding the expected value of f(X) where X is some random variable. If you can draw independent samples xi from X, the solution is simple: When it’s possible to draw these independent samples, the sum above is well […]
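The contrast in the excerpt can be sketched in a few lines of Python. With independent draws, the expected value of f(X) is estimated by the sample average of f(x_i). When independent draws are not available, a Markov chain such as random-walk Metropolis can supply dependent draws instead. The target (a standard normal), the function names, the step size, and the sample counts below are assumptions for illustration, not taken from the original post.

```python
import math
import random


def monte_carlo_mean(f, draw, n):
    """Estimate E[f(X)] by averaging f over n independent draws of X."""
    return sum(f(draw()) for _ in range(n)) / n


def metropolis_samples(log_density, start, step, n, rng):
    """Random-walk Metropolis: dependent draws whose long-run distribution
    is proportional to exp(log_density)."""
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(n):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


rng = random.Random(0)


def f(x):
    return x * x


# Independent draws from a standard normal, for which E[X^2] = 1.
iid_estimate = monte_carlo_mean(f, lambda: rng.gauss(0.0, 1.0), 20000)

# Same target via MCMC, using only the unnormalized log density -x^2 / 2.
chain = metropolis_samples(lambda x: -0.5 * x * x, 0.0, 1.0, 20000, rng)
mcmc_estimate = sum(f(x) for x in chain) / len(chain)

print(iid_estimate, mcmc_estimate)
```

Both estimates should land near 1. The Metropolis chain needs only density ratios, which is why it still works when the normalizing constant is unknown and direct sampling is impractical.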

Read more »

One quick tip for building trust in missing-data imputations?

January 22, 2016

Peter Liberman writes: I’m working on a paper that, in the absence of a single survey that measured the required combination of variables, analyzes data collected by separate, uncoordinated Knowledge Networks surveys in 2003. My co-author (a social psychologist who commissioned one of the surveys) and I obtained from KN unique id numbers for all […] The post One quick tip for building trust in missing-data imputations? appeared first on…

Read more »

Modelling With the Generalized Hermite Distribution

January 22, 2016

"Count" data occur frequently in economics. These are simply data where the observations are integer-valued, usually 0, 1, 2, ... . However, the range of values may be truncated (e.g., 1, 2, 3, ...). To model data of this form we typically resort ...

Read more »

Kéry and Schaub’s Bayesian Population Analysis Translated to Stan

January 21, 2016

Hiroki ITÔ (pictured) has done everyone a service in translating to Stan the example models [update: only chapters 3–9 so far, not the whole book; the rest are in the works] from Marc Kéry and Michael Schaub (2012) Bayesian Population Analysis using WinBUGS: A Hierarchical Perspective. Academic Press. You can find the code in our […] The post Kéry and Schaub’s Bayesian Population Analysis Translated to Stan appeared first on…

Read more »

Parallel BLAS in R

January 21, 2016

I'm working on a new chapter for my R Programming book and the topic is parallel computation. So, I was happy to see this tweet from David Robinson (@drob) yesterday: How fast is this #rstats code? x <- replicate(5e3, rnorm(5e3)) x %*% t(x) For me, w/Microsoft R Open, 2.5sec. Wow. https://t.co/0SbijNxxVa — David Robinson (@drob)

Read more »

Optimism with Data

January 21, 2016

What will our future be like? Is there any hope that things will evolve in a good direction? Will we make progress? Data play a crucial role in answering these questions. Steven Pinker (Harvard University, Department of Psychology) in his answer to the EDGE question of 2016 considers that Quantifying Human Progress is the most interesting recent (scientific) news: ‘But … Continue reading Optimism with Data

Read more »

If you’re using Stata and you want to do Bayes, you should be using StataStan

January 21, 2016

Robert Grant, Daniel Furr, Bob Carpenter, and I write: Stata users have access to two easy-to-use implementations of Bayesian inference: Stata’s native bayesmh function and StataStan, which calls the general Bayesian engine Stan. We compare these on two models that are important for education research: the Rasch model and the hierarchical Rasch model. Stan (as […] The post If you’re using Stata and you want to do Bayes, you should…

Read more »

Irritating pseudo-populism, backed up by false statistics and implausible speculations

January 21, 2016

I was preparing my lecture for tomorrow and happened to come across this post from five years ago. And now I’m irritated by Matt Ridley all over again! I wonder if he’s still bashing “rich whites” and driving that 1975 Gremlin...

Read more »

Rogue sociologist can’t stop roguin’

January 20, 2016

Mark Palko points me to two posts by Paul Campos (here and here) on this fascinating train wreck of a story. What happens next? It was ok that George Orwell and A. J. Liebling and David Sedaris made stuff up because they’re such good writers. And journalists make up quotes all the time. But who’s […] The post Rogue sociologist can’t stop roguin’ appeared first on Statistical Modeling, Causal Inference,…

Read more »

Win-Vector data science mailing list (and a give-away!)

January 20, 2016

Win-Vector LLC is starting a data science mailing list that we would like you to sign up for. It is going to be a (deliberately infrequent) set of updates including Win-Vector LLC notices, upcoming speaking events, and data science products. To kick this off we will be awarding 5 free permanent subscriptions to our video … Continue reading Win-Vector data science mailing list (and a give-away!)

Read more »

Prepping Data for Analysis using R

January 20, 2016

Nina and I are proud to share our lecture: “Prepping Data for Analysis using R” from ODSC West 2015. Nina Zumel and John Mount ODSC WEST 2015 It is about 90 minutes, and covers a lot of the theory behind the vtreat data preparation library. We also have a Github repository including all the lecture … Continue reading Prepping Data for Analysis using R

Read more »

Jim Albert’s Baseball Blog

January 20, 2016

Jim Albert has a baseball blog: Baseball with R I sent a link internally to people I knew were into baseball, to which Andrew replied, “I agree that it’s cool that he doesn’t just talk, he has code.” (No kidding—the latest post as of writing this was on an R package to compute value above […] The post Jim Albert’s Baseball Blog appeared first on Statistical Modeling, Causal Inference, and…

Read more »

Time-Varying Dynamic Factor Loadings

January 20, 2016

Check out Mikkelsen et al. (2015). I've always wanted to try high-dimensional dynamic factor models (DFMs) with time-varying loadings as an approach to network connectedness measurement (e.g., increasing connectedness would correspond to increas...

Read more »

My talk Fri 1pm at the University of Chicago

January 20, 2016

It’s the Data Science and Public Policy colloquium, and they asked me to give my talk, Little Data: How Traditional Statistical Ideas Remain Relevant in a Big-Data World. Here’s the abstract: “Big Data” is more than a slogan; it is our modern world in which we learn by combining information from diverse sources of varying […] The post My talk Fri 1pm at the University of Chicago appeared first on…

Read more »

Banking to 45 degrees: Aspect ratios for time series plots

January 20, 2016

In SAS, the aspect ratio of a graph is the physical height of the graph divided by the physical width. Recently I demonstrated how to set the aspect ratio of graphs in SAS by using the ASPECT= option in PROC SGPLOT or by using the OVERLAYEQUATED statement in the Graph […] The post Banking to 45 degrees: Aspect ratios for time series plots appeared first on The DO Loop.
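The excerpt defines the aspect ratio as height divided by width but stops before saying how to choose it. One common rule for "banking to 45 degrees" is Cleveland's median-absolute-slope method: pick the ratio so that the median line segment is drawn at 45 degrees on the page. The Python sketch below assumes that rule. The original post uses SAS and may use a different banking criterion, and the name `bank_to_45` is hypothetical.

```python
from statistics import median


def bank_to_45(x, y):
    """Aspect ratio (height / width) from the median-absolute-slope rule.

    A segment with data slope s is drawn with physical slope
    s * aspect * (x_range / y_range); setting the median of the absolute
    physical slopes to 1 gives the ratio returned here.
    """
    slopes = [
        abs((y[i + 1] - y[i]) / (x[i + 1] - x[i]))
        for i in range(len(x) - 1)
        if x[i + 1] != x[i]
    ]
    if not slopes or median(slopes) == 0:
        raise ValueError("need at least one non-flat segment to bank the plot")
    x_range = max(x) - min(x)
    y_range = max(y) - min(y)
    return y_range / (x_range * median(slopes))


# Hypothetical series: a single peak spanning the full vertical range.
aspect = bank_to_45([0, 1, 2], [0, 10, 0])
print(aspect)  # 0.5
```

In this example each segment spans half the plot width and the full plot height, so a plot half as tall as it is wide draws both segments at 45 degrees.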

Read more »

