Easily generate correlated variables from any distribution

February 27, 2014

In this post I will demonstrate in R how to draw correlated random variables from any distribution. The idea is simple. 1. Draw any number of variables from a joint normal distribution. 2. Apply the univariate normal CDF of variables to derive pro...
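
The two steps visible in the excerpt can be sketched as follows. The post works in R; this sketch uses Python's standard library instead. The excerpt is cut off after step 2, so the final step here (feeding the probabilities into the inverse CDF of each target distribution) is an assumption based on the title, and the function names, the choice of exponential and uniform targets, and the value rho = 0.8 are all hypothetical.

```python
import math
import random
from statistics import NormalDist


def correlated_normals(n, rho, rng):
    """Step 1: draw n pairs from a standard bivariate normal with correlation rho."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * noise
        pairs.append((z1, z2))
    return pairs


def exponential_quantile(u, rate=1.0):
    """Inverse CDF of an exponential distribution (assumed target, not from the post)."""
    return -math.log(1.0 - u) / rate


def uniform_quantile(u, low=0.0, high=10.0):
    """Inverse CDF of a uniform distribution on [low, high] (assumed target)."""
    return low + (high - low) * u


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def draw_correlated(n, rho, seed=0):
    rng = random.Random(seed)
    std_normal = NormalDist()
    xs, ys = [], []
    for z1, z2 in correlated_normals(n, rho, rng):
        # Step 2: the normal CDF turns each draw into a probability in (0, 1).
        # The clamp guards against a probability that rounds to exactly 1.0.
        u1 = min(std_normal.cdf(z1), 1.0 - 1e-12)
        u2 = min(std_normal.cdf(z2), 1.0 - 1e-12)
        # Assumed step 3: map the probabilities through the target inverse CDFs.
        xs.append(exponential_quantile(u1))
        ys.append(uniform_quantile(u2))
    return xs, ys


xs, ys = draw_correlated(5000, rho=0.8, seed=42)
print(round(pearson(xs, ys), 2))
```

The printed Pearson correlation of the transformed variables will generally differ somewhat from the rho used for the normals, because the transformations are nonlinear; rank correlation is what this construction preserves more closely.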

More time series data online

February 27, 2014

Earlier this week I had coffee with Ben Fulcher who told me about his online collection comprising about 30,000 time series, mostly medical series such as ECG measurements, meteorological series, birdsong, etc. There are some finance series, but not ma...

Phil6334: Feb 24, 2014: Induction, Popper and pseudoscience (Day #4)

February 27, 2014

Phil 6334* Day #4: Mayo slides follow the comments below. (Make-up for Feb 13 snow day.) Popper reading is from Conjectures and Refutations. As is typical in rereading any deep philosopher, I discover (or rediscover) different morsels of clues to understanding, whether fully intended by the philosopher or a byproduct of their other insights, and a more contemporary reading. […]

Drowning in insignificance

February 26, 2014

Some researchers (in both science and marketing) abuse a slavish view of p-values to try and falsely claim credibility. The incantation is: “we achieved p = x (with x ≤ 0.05) so you should trust our work.” This might be true if the published result had been performed as a single project (and not as […]

Taking a Random Sample on Amazon Redshift

February 26, 2014

Recently, I was approached by Vicky, whom I'm working with at a client, to help with a particular problem. She wanted to calculate page view summaries for a random sample of visitors from a table containing about a billion page views. This i...

A good comment on one of my papers

February 26, 2014

An anonymous reviewer wrote: I appreciate informal writing styles as a means of increasing accessibility. However, the informality here seems to decrease accessibility – partly because of the assumed knowledge of the reader for concepts and terms, and also for its wandering style. Many concepts are introduced without explanation and are not clearly and decisively […] The post A good comment on one of my papers appeared first on Statistical Modeling,…

Econometrics, political science, epidemiology, etc.: Don’t model the probability of a discrete outcome, model the underlying continuous variable

February 26, 2014

This is an echo of yesterday’s post, Basketball Stats: Don’t model the probability of win, model the expected score differential. As with basketball, so with baseball: as the great Bill James wrote, if you want to predict a pitcher’s win-loss record, it’s better to use last year’s ERA than last year’s W-L. As with basketball […] The post Econometrics, political science, epidemiology, etc.: Don’t model the probability of a discrete outcome,…

Data Science is Hard, But So is Talking

February 26, 2014

Jeff, Brian, and I had to record nine separate introductory videos for our Data Science Specialization and, well, some of us were better at it than others. It takes a bit of practice to read effectively from a teleprompter, something …

Good guys in sports need a dose of reality

February 26, 2014

I will be speaking at the Agilone Data Driven Marketing Summit (link) in San Francisco on Thursday. I will be talking about hiring for numbersense. Drop by if you are in the area. Future events are listed on the right column of the blog >>> *** I feel bad piling on the "good guys" in the sports doping spectacle but sometimes, you need someone to point you to the mirror.…

How to automatically select a smooth curve for a scatter plot in SAS

February 26, 2014

My last blog post described three ways to add a smoothing spline to a scatter plot in SAS. I ended the post with a cautionary note: From a statistical point of view, the smoothing spline is less than ideal because the smoothing parameter must be chosen manually by the user. [...]

Winner of the February 2014 palindrome contest (rejected post)

February 26, 2014

Winner of February 2014 Palindrome Contest Samuel Dickson Palindrome: Rot, Cadet A, I’ve droned! Elba, revile deviant, naïve, deliverable den or deviated actor. The requirement was: A palindrome with Elba plus deviate with an optional second word: deviant. A palindrome that uses both deviate and deviant tops an acceptable palindrome that only uses deviate. Bio: Sam Dickson is […]

Improved evolution of correlations

February 26, 2014

Update June 2013: A systematic analysis of the topic has been published: Schönbrodt, F. D., & Perugini, M. (2013). At what sample size do correlations stabilize? Journal of Research in Personality, 47, 609-612. doi:10.1016/j.jrp.2013.05.009 Check ...

The first CREDAM Award for creative data management goes to … the German government!

February 26, 2014

“If you torture the data long enough, it will confess.” This aphorism, attributed to Ronald Coase, has sometimes been used in a disrespectful manner, as if it were wrong to do creative data analysis. This view obviously is misleading. In contra...

Further thoughts on post-publication peer review (PPPR)

February 26, 2014

Sanjay Srivastava blogged some interesting thoughts about the process of post-publication peer review (PPPR), reflecting on his own comment on a PLOS ONE publication. I agree that open peer commentaries after publication are one important part of th...

Installation of WRS package (Wilcox’ Robust Statistics)

February 26, 2014

Update Feb 17, 2014: WRS moved to Github – This installation procedure has been updated and is still valid. Some users had trouble installing the WRS package from R-Forge. Here’s a method that should work automatically and fail-safe: [cc lan...

At what sample size do correlations stabilize?

February 26, 2014

Maybe you have encountered this situation: you run a large-scale study over the internet, and out of curiosity, you frequently check the correlation between two variables. My experience with this practice is usually frustrating, as in small sample sizes (a...
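
The situation described here can be reproduced with a small simulation. This is only an illustrative sketch in Python, not the analysis from the post: the true correlation of 0.3, the checkpoint sample sizes, and the function name are all assumptions chosen for the example.

```python
import math
import random


def running_correlations(n_max, rho, checkpoints, seed=1):
    """Simulate bivariate normal data with true correlation rho and
    report the sample correlation after each checkpoint sample size."""
    rng = random.Random(seed)
    wanted = set(checkpoints)
    sx = sy = sxx = syy = sxy = 0.0
    estimates = {}
    for n in range(1, n_max + 1):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Running sums let us compute the correlation without storing the data.
        sx += x
        sy += y
        sxx += x * x
        syy += y * y
        sxy += x * y
        if n in wanted and n >= 3:
            cov = sxy - sx * sy / n
            var_x = sxx - sx * sx / n
            var_y = syy - sy * sy / n
            estimates[n] = cov / math.sqrt(var_x * var_y)
    return estimates


checkpoints = [10, 20, 50, 100, 250, 500, 1000, 2000]
estimates = running_correlations(2000, rho=0.3, checkpoints=checkpoints)
for n in checkpoints:
    print(f"n = {n:5d}  r = {estimates[n]:+.3f}")
```

With different seeds, the estimates at small n typically swing widely around the true value and settle only as n grows.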

Finally! Tracking CRAN packages downloads

February 26, 2014

[Update June 12: Data.tables functions have been improved (thanks to a comment by Matthew Dowle); for a similar approach see also Tal Galili's post] The guys from RStudio now provide CRAN download logs (see also this blog post). Great work! I always as...

Exploring the robustness of Bayes Factors: A convenient plotting function

February 26, 2014

One critique frequently heard about Bayesian statistics is the subjectivity of the assumed prior distribution. If one is cherry-picking a prior, of course the posterior can be tweaked, especially when only a few data points are at hand. For example, see ...

New robust statistical functions in WRS package – Guest post by Rand Wilcox

February 26, 2014

Today a new version (0.23.1) of the WRS package (Wilcox’ Robust Statistics) has been released. This package is the companion to his rather exhaustive book on robust statistics, “Introduction to Robust Estimation and Hypothesis Testing”...

Interactive exploration of a prior’s impact

February 26, 2014

Probably the most frequent criticism of Bayesian statistics sounds something like “It’s all subjective – with the ‘right’ prior, you can get any result you want.” In order to approach this criticism it has been sugg...

Applied Statistics Lesson of the Day – The Matched-Pair (or Paired) t-Test

My last lesson introduced the matched pairs experimental design, which is a special type of the randomized block design. Let’s now talk about how to analyze the data from such a design. Since the experimental units are organized in pairs, the units between pairs (blocks) are not independently assigned. (The units within each pair are […]
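
As a rough illustration of the analysis this lesson introduces, the standard paired t statistic is computed from the within-pair differences: the mean difference divided by its standard error, with n - 1 degrees of freedom. The sketch below is in Python; the function name and the two small data vectors are hypothetical and do not come from the lesson.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Return the paired t statistic and its degrees of freedom."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # The test works on the within-pair differences, not the raw values.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical measurements on five matched pairs (illustration only).
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 16]

t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The statistic would then be compared against a t distribution with the reported degrees of freedom to obtain a p-value.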

Nonlinear Time Series just appeared

February 25, 2014

My friends Randal Douc and Éric Moulines just published this new time series book with David Stoffer. (David also wrote Time Series Analysis and its Applications with Robert Shumway a year ago.) The book reflects well on the research of Randal and Éric over the past decade, namely convergence results on Markov chains for validating […]

Useful for referring—2-25-2014

February 25, 2014

Interview with Nick Chamandy, statistician at Google; You and Your Research + video; Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained; A Survival Guide to Starting and Finishing a PhD; Six Rules For Wearing Suits For Beginners; Why I Created C++; More advice to scientists on blogging; Software engineering practices for graduate students; Statistics Matter […]

