I love massive open online courses such as those provided on Coursera and edX, so I enrolled in the Data Analysis for Genomics course on edX. I am not alone there, as seen from this posting on FreshBiostats. I was shocked when I took the Pre-Course R self-asse...

The previous post claimed it’s reasonable to expect frequencies in binary experiments to be near .5 simply because that’s what most possible outcomes lead to. Reasonable or not, there’s no guarantee it’ll happen however. If 1% o...
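The counting argument behind that claim can be checked directly. The sketch below is an illustration only, not code from the post: it treats all 2^n binary sequences as equally weighted (an assumption) and computes the exact share whose frequency of 1s falls within a chosen tolerance of .5. The function name `share_near_half` is hypothetical.

```python
from fractions import Fraction
from math import comb


def share_near_half(n, tolerance):
    """Exact share of the 2**n binary sequences of length n whose
    frequency of 1s is within `tolerance` of 1/2.

    `tolerance` should be a Fraction so the boundary comparison is exact.
    """
    half = Fraction(1, 2)
    # comb(n, k) counts the sequences that contain exactly k ones.
    favourable = sum(
        comb(n, k)
        for k in range(n + 1)
        if abs(Fraction(k, n) - half) <= tolerance
    )
    return Fraction(favourable, 2 ** n)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        share = share_near_half(n, Fraction(1, 10))
        print(f"n={n}: share within 0.1 of 0.5 = {float(share):.4f}")
```

The share grows with n, which is the sense in which "most possible outcomes" have frequencies near .5; it says nothing about whether a particular real-world process will actually produce such an outcome.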

A reader writes in: This op-ed made me think of one of your recent posts. Money quote: If you are primarily motivated to make money, you just need to get as much information as you need to do your job. You don’t have time for deep dives into abstract matters. You certainly don’t want to let […] The post “If you are primarily motivated to make money, you . . .…

“There was a vain and ambitious hospital director. A bad statistician. ... There were good medics and bad medics, good nurses and bad nurses, good cops and bad cops … Apparently, even some people in the Public Prosecution service found the witch hunt deeply disturbing.” This is how Richard Gill, statistician at Leiden University, describes a […]

This bit is perhaps worth saying again, especially given the occasional trolling on the internet by people who disparage their ideological opponents by calling them “religious” . . . So here it is: Sometimes the choice of statistical philosophy is decided by convention or convenience. . . . In many settings, however, we have freedom […] The post “Schools of statistical thoughts are sometimes jokingly likened to religions. This analogy…

A linguist sent me an email with the above title and a link to a paper, “The Effect of Language on Economic Behavior: Evidence from Savings Rates, Health Behaviors, and Retirement Assets,” by M. Keith Chen, which begins: Languages differ widely in the ways they encode time. I test the hypothesis that languages that grammatically […] The post “More research from the lunatic fringe” appeared first on Statistical Modeling, Causal…

It strikes me that the media loves to talk about probability, a subject about which journalists are ill-trained to write. The latest example of this is Forbes' attempt to draw a lesson out of Warren Buffett's gimmicky $1 billion NCAA pool. As we all learned, by the time the 25th match drew to a close, all 8.7 million entrants had gotten at least one winner wrong, thus there would…

Introduction A while ago, one of my co-workers asked me to group box plots by plotting them side-by-side within each group, and he wanted to use patterns rather than colours to distinguish between the box plots within a group; the publication that will display his plots prints in black-and-white only. I gladly investigated how to […]

Kaiser Fung shares this graph from Ritchie King: Kaiser writes: What they did right: - Did not put the data on a map - Ordered the countries by the most recent data point rather than alphabetically - Scale labels are found only on outer edge of the chart area, rather than one set per panel […] The post Small multiples of lineplots > maps (ok, not always, but yes in…

Editor's note: This is a guest post by Alyssa Frazee, a graduate student in the Biostatistics department at Johns Hopkins and a participant in the recent rOpenSci hackathon. Last week, I took a break from my normal PhD student schedule …

For those who weren't able to attend my recent talks, a few have surfaced online. *** JMP put up the video of the webcast from last Friday with Alberto Cairo, a data visualization expert and author of The Functional Art. You can access it from here. This event is part of their Analytically Speaking series with recent guests such as David Hand and Michael Schrage. I also appear on this…

Yesterday I blogged about the Hilbert matrix. The (i,j)th element of the Hilbert matrix has the value 1 / (i+j-1), which is the reciprocal of an integer. However, the printed Hilbert matrix did not look exactly like the formula because the elements print as finite-precision decimals. For example, the last […]
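To make the contrast between the formula and the printed decimals concrete, here is a minimal sketch that builds the matrix from 1 / (i+j-1) and shows each element both as an exact fraction and as a rounded decimal. It uses Python's `fractions` module purely for illustration; this is an assumption of the example, not the software used in the post, and the function name `hilbert` is hypothetical.

```python
from fractions import Fraction


def hilbert(n):
    """Return the n x n Hilbert matrix as nested lists of exact fractions.

    Rows and columns are indexed from 1, so element (i, j) is 1 / (i + j - 1).
    """
    return [
        [Fraction(1, i + j - 1) for j in range(1, n + 1)]
        for i in range(1, n + 1)
    ]


if __name__ == "__main__":
    for row in hilbert(4):
        exact = "  ".join(f"{str(x):>4}" for x in row)
        decimal = "  ".join(f"{float(x):.6f}" for x in row)
        print(f"{exact}    |    {decimal}")
```

The left column of output keeps each element as the reciprocal of an integer, while the right column shows the finite-precision view that a default numeric print would give.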

I recently introduced the use of linear basis function models for supervised learning problems that involve non-linear relationships between the predictors and the target. A common type of basis function for such models is the Gaussian basis function. This type of model uses the kernel of the normal (or Gaussian) probability density function (PDF) as […]
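As a rough illustration of the idea, the sketch below evaluates the kernel of the normal PDF, exp(-(x - c)^2 / (2 s^2)), as a basis function and uses it to turn one predictor value into a row of features. The names `gaussian_basis` and `design_row`, the choice of centres, the shared width, and the leading intercept term are all assumptions made for this example rather than details taken from the post.

```python
import math


def gaussian_basis(x, center, width):
    """Kernel of the normal PDF: the density without its normalizing constant."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def design_row(x, centers, width):
    """Map one predictor value to [1, phi_1(x), ..., phi_M(x)].

    The leading 1.0 is an intercept term (an assumption of this sketch).
    """
    return [1.0] + [gaussian_basis(x, c, width) for c in centers]


if __name__ == "__main__":
    centers = [0.0, 1.0, 2.0]  # hypothetical basis-function centres
    for x in (0.0, 0.5, 2.0):
        row = design_row(x, centers, width=0.5)
        print(x, [round(v, 4) for v in row])
```

Each basis function equals 1 at its own centre and decays toward 0 as x moves away, so a linear model in these features can follow a non-linear relationship in x.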

Consider again an experiment that seeks to determine the causal relationships between factors and the response, where . Ideally, the sample size is large enough for a full factorial design to be used. However, if the sample size is small and the number of possible treatments is large, then a fractional factorial design can be used instead. Such a […]
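The difference in run counts can be sketched for the simplest case. The example below assumes two-level factors coded as -1 and +1 and builds a half fraction by setting the last factor equal to the product of the others; both the coding and that particular generator are assumptions of this illustration, and the post may use a different design. The function names are hypothetical.

```python
from itertools import product


def full_factorial(k):
    """All 2**k treatment combinations of k two-level factors (-1 / +1)."""
    return list(product((-1, 1), repeat=k))


def half_fraction(k):
    """A 2**(k-1) design: the first k-1 factors form a full factorial,
    and the last factor is set to the product of the others."""
    runs = []
    for base in product((-1, 1), repeat=k - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs


if __name__ == "__main__":
    print("full 2^3 design:", len(full_factorial(3)), "runs")
    print("half fraction:  ", len(half_fraction(3)), "runs")
    for run in half_fraction(3):
        print(run)
```

The half fraction needs only half as many runs, at the cost of confounding the last factor's main effect with the interaction of the remaining factors.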