Beware of questionable front page articles warning you to beware of questionable front page articles (iii)

November 10, 2013

In this time of government cut-backs and sequester, scientists are under increased pressure to dream up ever new strategies to publish attention-getting articles with eye-catching, but inadequately scrutinized, conjectures. Science writers are under similar pressures, and to this end they have found a way to deliver up at least one fire-breathing, front page article a […]

Keynote speaker

November 9, 2013

Earlier today, I was trying to finish preparing the poster for the Clinical Trials Methodology Conference. I'll have both the poster presentation (on the Expected Value of Information under mixed strategies) and my talk on the Stepped Wedge des...

Typo in Ghitza and Gelman MRP paper

November 9, 2013

Devin Caughey points out a typo in the second column of page 765 of our AJPS paper. Here’s what we have: The typo is in the third line of the second paragraph above. Where it says $y^*_j = \bar{y}^*_j n_j$, it should be $y^*_j = \bar{y}^*_j n^*_j$. One frustrating aspect of the current system of […] The post Typo in Ghitza and Gelman MRP paper appeared first on Statistical Modeling, Causal…

Multicollinearity and collinearity (in multiple regression) – a tutorial

November 9, 2013

This blog post was written for undergraduate research methods teaching. I have therefore tried to keep everything relatively simple and equation-free. The content is based loosely on more detailed material in my book Serious stats. What are collineari...

Null Effects and Replication

November 9, 2013

Filed under: Comedy, Error Statistics, Statistics

Maximum Likelihood versus Goodness of Fit

November 9, 2013

Thursday, I got an interesting question from a colleague of mine (JP). I mean, the way I understood the question turned out to be a nice puzzle (but I have to confess I might have misunderstood). The question is the following: consider an i.i.d. sample $\{X_1,\cdots,X_n\}$ of continuous variables. We would like to choose between two (parametric) families for the distribution. If we use maximum likelihood techniques, we…
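To make the setup above concrete, here is a minimal sketch of choosing between two parametric families by comparing their maximised log-likelihoods. The two families in the excerpt are not recoverable, so the exponential and log-normal families used here are assumptions chosen only because their maximum likelihood estimates have closed forms; the simulated sample, its size, and the function names are likewise hypothetical.

```python
import math
import random


def exponential_loglik(xs):
    """Maximised log-likelihood of an exponential model (MLE rate = 1 / mean)."""
    n = len(xs)
    total = sum(xs)
    rate = n / total
    return n * math.log(rate) - rate * total


def lognormal_loglik(xs):
    """Maximised log-likelihood of a log-normal model (MLE on the log scale)."""
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / n
    sigma2 = sum((v - mu) ** 2 for v in logs) / n
    # Sum of log densities evaluated at the MLE (mu, sigma2).
    return -sum(logs) - 0.5 * n * math.log(2 * math.pi * sigma2) - 0.5 * n


# Hypothetical data: an i.i.d. sample simulated from an exponential distribution.
random.seed(1)
sample = [random.expovariate(2.0) for _ in range(2000)]

ll_exp = exponential_loglik(sample)
ll_lnorm = lognormal_loglik(sample)
preferred = "exponential" if ll_exp > ll_lnorm else "log-normal"
print(f"exponential: {ll_exp:.2f}, log-normal: {ll_lnorm:.2f}, preferred: {preferred}")
```

Both candidate families here have a different number of parameters (one versus two), so in practice a penalised criterion such as AIC might be compared instead of the raw log-likelihoods.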

Key Driver vs. Network Analysis in R

November 8, 2013

When marketing researchers speak of driver analysis, they are referring to an input-output model with overall satisfaction as the output and performance ratings of specific product and service components as the inputs. The causal model is straightforwa...

Generating functions

November 8, 2013
$F(x)=1-e^{-x}/3$

Today, I wanted to publish a post on generating functions, based on discussions I had with Jean-Francois while having our coffee after lunch a couple of times already. The other reason is that I am publishing this post just as my students have finished their Probability exam (and there were a few questions on generating functions). A short introduction (back on a specific exercise) In the Probability exam, I included an exercise we’ve…

Translating between R and SQL: the basics

November 8, 2013

An introductory comparison of using the two languages. Background R was made especially for data analysis and graphics. SQL was made especially for databases. They are allies. The data structure in R that most closely matches a SQL table is a data frame. The terms rows and columns are used in both. A mashup There […] The post Translating between R and SQL: the basics appeared first on Burns Statistics.

A day with the news!

November 8, 2013

One great thing about working in statistics and political science is, between them, these two subjects are connected to just about everything. From the day’s news (sort of): Pat Robertson Thinks Low-Carb Diets Violate God’s Principles: I wonder what Art De Vany will think of this. I had the impression that lo-carb is vaguely connected […] The post A day with the news! appeared first on Statistical Modeling, Causal Inference, and…

Financial Data Accessible from R – part III

November 8, 2013

I came across a new source of data which I think is really worth sharing: ThinkNum. It gathers around 2,000 sources of data but more importantly it allows the user to manipulate this data via functions and graphics and there is an R package available on CRAN. Interested readers can find a very good post […]

Rescue remedy

November 7, 2013

Interesting day, today. I woke up really early (3.45am) to catch my flight to Amsterdam to give my talk at the Chemometrics Workshop. The cab got me to the airport early enough so that I could clear security, have a coffee and slowly make my way to the...

Replication of few graphs/charts in base R, ggplot2, and rCharts

November 7, 2013

UPDATE: THE BLOG/SITE HAS MOVED TO GITHUB. THE NEW LINK FOR THE BLOG/SITE IS patilv.github.io and THE LINK TO THIS POST IS: http://bit.ly/1jJ6f7v. PLEASE UPDATE ANY BOOKMARKS YOU MAY HAVE. In this post, I use a simulated dataset (7 variables -3 factor a...

The e-Writing Jungle Part 1: LaTeX to pdf to the Web

November 7, 2013

LaTeX and MathML and MathJax and Python and Sphinx and IPython and R and Knitter and Firefox and Chrome and ... My head is spinning with all this stuff. Maybe yours is too. One thing is clear: The traditional academic book publishing paradigm (broadly de...

I’m negative on the expression “false positives”

November 7, 2013

After seeing a document sent to me and others regarding the crisis of spurious, statistically-significant research findings in psychology research, I had the following reaction: I am unhappy with the use in the document of the phrase “false positives.” I feel that this expression is unhelpful as it frames science in terms of “true” and […] The post I’m negative on the expression “false positives” appeared first on Statistical Modeling, Causal…

Nix on the expression “false positives”

November 7, 2013

After seeing a document sent to me and others regarding the crisis of spurious, statistically-significant research findings in psychology research, I had the following reaction: I am unhappy with the use in the document of the phrase “false positives.” I feel that this expression is unhelpful as it frames science in terms of “true” and […] The post Nix on the expression “false positives” appeared first on Statistical Modeling, Causal Inference,…

Data visualizations gone beautifully wrong

November 7, 2013

Jeremy Fox points us to this compilation of data visualizations in R that went wrong, in a way that ended up making them look like art. They are indeed wonderful. The post Data visualizations gone beautifully wrong appeared first on Statistical Modelin...

Light entertainment: when up is down (double bill)

November 7, 2013

Abhinav takes us to this chart: Only in politics. *** A co-worker reminds me of this gem by Fox News a few years back:

Tapestry 2014 Announced

November 7, 2013

After a very successful Tapestry conference in February this year, we have been getting a steady stream of questions from people about another event next year. Now we're finally able to announce next year's event. And it will be awesome, again.

“Marginally significant”

November 6, 2013

Jeremy Fox writes: You’ve probably seen this [by Matthew Hankins]. . . . Everyone else on Twitter already has. It’s a graph of the frequency with which the phrase “marginally significant” occurs in association with different P values. Apparently it’s real data, from a Google Scholar search, though I haven’t tried to replicate the search […] The post “Marginally significant” appeared first on Statistical Modeling, Causal Inference, and Social Science.

Correspondence Analysis in R

November 6, 2013

Correspondence analysis (from a layman’s perspective) is like principal components analysis for categorical data. It can be useful to discover structure in this type of data. My friend Gianmarco Alberti, an archaeologist, has put together an in-depth web site …

A Mitochondrial Manhattan Plot

November 6, 2013

Manhattan plots have become the standard way to visualize results for genetic association studies, allowing the viewer to instantly see significant results in the rough context of their genomic position. Manhattan plots are typically shown on a l...

If we observed p_hat = .46, why do we use p=.5?

November 6, 2013

I aim to commit statistical sin. I’m going to accept the null hypothesis for no other reason than that I am “failing to reject it”. Having tarnished my reputation with that, I’ll finish by ignoring the only data available and ba...
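As a rough illustration of the question in the title, the sketch below compares the standard error of a proportion computed under the null value p = .5 with the one computed from the observed p_hat = .46. The excerpt does not give a sample size, so n = 100 is a hypothetical value, and the function names are illustrative only; this is a sketch of the usual one-sample z-test for a proportion, not necessarily the calculation in the post.

```python
import math


def standard_error(p, n):
    """Standard error of a sample proportion when the true proportion is p."""
    return math.sqrt(p * (1.0 - p) / n)


def z_statistic(p_hat, p0, n):
    """z statistic for H0: p = p0, using the standard error implied by the null."""
    return (p_hat - p0) / standard_error(p0, n)


n = 100        # hypothetical sample size (not stated in the excerpt)
p_hat = 0.46   # observed proportion, from the post title
p0 = 0.5       # null value, from the post title

se_null = standard_error(p0, n)       # standard error assuming the null is true
se_plugin = standard_error(p_hat, n)  # standard error using the observed proportion
z = z_statistic(p_hat, p0, n)

print(f"SE under p0: {se_null:.4f}, SE under p_hat: {se_plugin:.4f}, z: {z:.2f}")
```

Because p(1 - p) is largest at p = .5, the null-based standard error is never smaller than the plug-in one, which is one common reason p = .5 is used as a conservative choice.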