There are (at least) two big challenges official statistics will be faced with in the next few years and which will possibly

Joe Blitzstein sent around the following graph: (The x-axis goes from 2000 to 2012 and the y-axis goes from 0 to 120.) 100 statistics majors (this combines sophomores, juniors, and seniors, but still, that's a lot more than the 1 or 2 or 3 a year we're used to seeing). At first I was like,

I recently made three posts regarding analysis of ordinal data. A post looking at all methods I could find in R, a post with an additional method and a post using JAGS. Common in all three was using the cheese data, a data set where...

In fact, I’m going there with my family and some friends, including two probabilists (I mean professionals; I am merely an amateur), with this incredible challenge: will I be able to convince probabilists to go and play at the casino? Actually, I also want to study them carefully, to understand how we should play optimally. For example, I hope I can get them to play roulette. Roulette is simple. With…

Having established that survey weighting is a mess, I should also acknowledge that, by this standard, regression modeling is also a mess, involving many arbitrary choices of variable selection, transformations and modeling of interaction. Nonetheless, regression modeling is a mess with which I am comfortable and, perhaps more relevant to the discussion, can be extended

David Williams writes: I am completing my doctoral dissertation dealing with modeling adverse birth outcomes. The models are complex with 9 risk factors, 5 area level variables and 4 individual level variables. I used hierarchical logistic regression (SAS glimmix) to analyze the data. I am now faced with reporting the results. Can you please recommend

R can perform the usual mathematical operations. Below are the functions. Arithmetic: + (addition), - (subtraction), * (multiplication), / (division). Trigonometry: sin  ...

(guest post) When Relevance is Irrelevant, by Stephen Senn, Head of Competence Center for Methodology and Statistics (CCMS). Applied statisticians tend to perform analyses on additive scales, and additivity is an important aspect of an analysis to try to check. Consider survival analysis. The most important model used, the default in many cases, is the […]

Joan Garfield, a leading researcher in statistics education, is conducting a survey of graduate students who teach or assist with the teaching of statistics. She writes: We want to invite them to take a short survey that will enable us to collect some baseline data that we may use in a grant proposal we are

Jeff and I talk about the recent Reinhart-Rogoff reproducibility kerfuffle and how it turns out that data analysis is really hard no matter how big the dataset.

A fair complaint when seeing yet another "data science" article is to say: "this is just medical statistics" or "this is already part of bioinformatics." We certainly label many articles as "data science" on this blog. Probably the complaint is slightly cleaner if phrased as "this is already known statistics." But the essence of the

Noam Chomsky elicits a lot of emotional reactions. I've talked with some linguists who think Chomsky's been a real roadblock to research in recent decades. Other linguists love Chomsky, but I think they're the kind of linguists I wouldn't spend much time talking with. Many people admire Chomsky's political activism, but sociologist blogger Fabio Rojas

That's not really what the FDA said, but based on the shameful actions unearthed by ProPublica and reported by Scientific American here, one would think one can get away with this scam. Three years ago, the FDA busted a Houston lab (based on a whistleblower report) that fabricated loads of research studies that were used by the FDA to approve about 100 drugs. Eighty percent of those drugs were generic drugs…

The weekly Le Monde puzzle is (again) a permutation problem that can be rephrased as follows: Find where denotes the set of permutations on {0,…,10} and is defined modulo 11 [to turn {0,...,10} into a torus]. Same question for and for This is rather straightforward to code if one adopts a brute-force approach: (where I […]

It is now increasingly common for experimental psychologists (among others) to use multilevel models (also known as linear mixed models) to analyze data that used to be shoe-horned into a repeated measures ANOVA design. Chapter 18 of Serious Stats introduces multilevel models by considering them as an extension of repeated measures ANOVA models that can