Big Data, Open Data and Official Statistics

April 21, 2013

There are (at least) two big challenges that official statistics will face in the next few years and which will possibly …

Read more »

Exponential increase in the number of stat majors

April 21, 2013

Joe Blitzstein sent around the following graph: (The x-axis goes from 2000 to 2012 and the y-axis goes from 0 to 120.) 100 statistics majors (this combines sophomores, juniors, and seniors, but still, that’s a lot more than the 1 or 2 or 3 a year we’re used to seeing). At first I was like, [...]

Read more »

Ordinal data, models with observers

April 21, 2013

I recently made three posts on the analysis of ordinal data: one looking at all the methods I could find in R, one with an additional method, and one using JAGS. Common to all three was the cheese data, a data set where...

Read more »

In three months, I’ll be in Vegas (trying to win against the house)

April 21, 2013

In fact, I’m going there with my family and some friends, including two probabilists (I mean professionals; I am merely an amateur), with this incredible challenge: will I be able to convince probabilists to go and play at the casino? Actually, I also want to study them carefully, to understand how we should play optimally. For example, I hope I can make them play roulette. Roulette is simple. With…

Read more »

My new forecasting book is finally finished

April 21, 2013

My new online forecasting book (written with George Athanasopoulos) is now completed. I previously described it on this blog nearly a year ago. In reality, an online book is never complete, and we plan to continually update it. But it is now at the poi...

Read more »

Metric driven Agile for Big Data

April 20, 2013

Working in Bing Local Search brings together a number of interesting challenges. Firstly, we are in a moderately sized organization, which means that our org chart has some rough similarities to our high level system architecture. This means that we...

Read more »

A mess with which I am comfortable

April 20, 2013

Having established that survey weighting is a mess, I should also acknowledge that, by this standard, regression modeling is also a mess, involving many arbitrary choices of variable selection, transformations and modeling of interaction. Nonetheless, regression modeling is a mess with which I am comfortable and, perhaps more relevant to the discussion, can be extended [...]

Read more »

Displaying inferences from complex models

April 20, 2013

David Williams writes: I am completing my doctoral dissertation dealing with modeling adverse birth outcomes. The models are complex with 9 risk factors, 5 area level variables and 4 individual level variables. I used hierarchical logistic regression (SAS glimmix) to analyze the data. I am now faced with reporting the results. Can you please recommend [...]

Read more »

R: Basic Mathematical Functions

April 20, 2013

R can perform the usual mathematical operations; below are the functions. Arithmetic: + (addition), - (subtraction), * (multiplication), / (division). Trigonometry: sin ...

Read more »

Stephen Senn: When relevance is irrelevant

April 20, 2013

(guest post) When Relevance is Irrelevant, by Stephen Senn, Head of Competence Center for Methodology and Statistics (CCMS). Applied statisticians tend to perform analyses on additive scales, and additivity is an important aspect of an analysis to try to check. Consider survival analysis. The most important model used, the default in many cases, is the […]

Read more »

Grad students: Participate in an online survey on statistics education

April 19, 2013

Joan Garfield, a leading researcher in statistics education, is conducting a survey of graduate students who teach or assist with the teaching of statistics. She writes: We want to invite them to take a short survey that will enable us to collect some baseline data that we may use in a grant proposal we are [...]

Read more »

Most squares

April 19, 2013

Sometimes you try really hard to ask interesting questions, so that people think that you're clever and interesting too, that you actually don't realise how dumb you're being just for asking those very same questions. For example, a few years ago, Marta...

Read more »

Podcast #7: Reinhart, Rogoff, Reproducibility

April 19, 2013

Jeff and I talk about the recent Reinhart-Rogoff reproducibility kerfuffle and how it turns out that data analysis is really hard no matter how big the dataset.

Read more »

Data Science, Machine Learning, and Statistics: what is in a name?

April 19, 2013

A fair complaint when seeing yet another “data science” article is to say: “this is just medical statistics” or “this is already part of bioinformatics.” We certainly label many articles as “data science” on this blog. Probably the complaint is slightly cleaner if phrased as “this is already known statistics.” But the essence of the […]

Read more »

Chomsky chomsky chomsky chomsky furiously

April 19, 2013

Noam Chomsky elicits a lot of emotional reactions. I’ve talked with some linguists who think Chomsky’s been a real roadblock to research in recent decades. Other linguists love Chomsky, but I think they’re the kind of linguists I wouldn’t spend much time talking with. Many people admire Chomsky’s political activism, but sociologist blogger Fabio Rojas [...]

Read more »

FDA endorses masking placebos as proven drugs as a get-rich-quick scheme

April 19, 2013

That's not really what the FDA said, but based on the shameful actions unearthed by ProPublica and reported by Scientific American here, one would think one can get away with this scam. Three years ago, the FDA busted a Houston lab (based on a whistleblower report) that fabricated loads of research studies that were used by the FDA to approve about 100 drugs. Eighty percent of those drugs were generic drugs…

Read more »

Amazon AWS Summit 2013

April 19, 2013

I was fortunate enough to attend the Amazon AWS Summit in NYC and to listen to Werner Vogels give the keynote. I will share a few of my thoughts on the AWS 2013 Summit and some of my take-aways. I attended sessions that focused on two products in particular: Redshift and [...]

Read more »

Le Monde puzzle [#817]

April 18, 2013

The weekly Le Monde puzzle is (again) a permutation problem that can be rephrased as follows: Find where denotes the set of permutations on {0,…,10} and is defined modulo 11 [to turn {0,...,10} into a torus]. Same question for and for This is rather straightforward to code if one adopts a brute-force approach: (where I […]

Read more »

Gender Balance in Conferences on Data Visualization

April 18, 2013

Gender Balance [stefaner.eu] by Moritz Stefaner provides an eye-opening look at how well women are represented as speakers at conferences on data visualization, creative code and information graphics. Based on Andy Kirk's data visualization censu...

Read more »

Using ggplot2 to recreate 2012 Best Cities Results

April 18, 2013

Data from: http://images.businessweek.com/slideshows/2012-09-26/americas-50-best-cities This time, I used ggplot2 to recreate the graphs created previously using Tableau. After all, R and ggplot2 are open source and free. Yes, I could leave a space...

Read more »

Moments of mixtures

April 18, 2013

I needed to compute the higher moments of a mixture distribution for a project I’m working on. I’m writing up the code here in case anyone else finds this useful. (And in case I’ll find it useful in the future.)

Read more »

The Art of R programming review – part 9

April 18, 2013

It's finally come to this: the last installment of the groundbreaking, epic and superlative book review. Let's jump right into it! The last three chapters of the book, which I will briefly touch upon, deal with performance enhancement, which is an im...

Read more »

Using multilevel models to get accurate inferences for repeated measures ANOVA designs

April 18, 2013

It is now increasingly common for experimental psychologists (among others) to use multilevel models (also known as linear mixed models) to analyze data that used to be shoe-horned into a repeated measures ANOVA design. Chapter 18 of Serious Stats introduces multilevel models by considering them as an extension of repeated measures ANOVA models that can […]

Read more »

