Yesterday on Twitter, @JF_Godbout shared a nice graph about the Quebec elections, showing the number of votes obtained (here as a percentage of total votes) and the percentage of seats that this yields. It must be said that yesterday, it...

District Data Labs is a new endeavor by members of the local data community (myself included) to increase educational outreach on data-related topics through workshops and other media. We want District Data Labs to be an efficient learning resource for people who want to enhance and expand their analytical and […]

Recently, I was asked: "Why do you not recommend Access to use? Just curious. Read on page xi of your intro in Data Analysis Using SQL and Excel. Just beginning a class in SQL and bought your text. Thanks, Mort" This is a very fair question and o...

Joshua Vogelstein pointed me to this post by Michael Nielsen on how to teach Simpson’s paradox. I don’t know if Nielsen (and others) are aware that people have developed some snappy graphical methods for displaying Simpson’s paradox (and, more generally, aggregation issues). We do some of this in our Red State Blue State book, but before […] The post Understanding Simpson’s paradox using a graph appeared first on Statistical Modeling, Causal…
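
As a small numerical companion to the excerpt above, here is a minimal sketch of the aggregation reversal behind Simpson's paradox. The counts are illustrative (they follow the widely cited kidney stone treatment example) and are not taken from the post; the names `counts`, `subgroup_rates` and `pooled_rates` are hypothetical.

```python
# Illustrative (successes, trials) counts per treatment and subgroup.
# Assumption: these numbers are for demonstration only, not from the post.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def subgroup_rates(table):
    """Success rate of each treatment within each subgroup."""
    return {
        treatment: {group: s / n for group, (s, n) in groups.items()}
        for treatment, groups in table.items()
    }


def pooled_rates(table):
    """Success rate of each treatment after pooling all subgroups."""
    pooled = {}
    for treatment, groups in table.items():
        successes = sum(s for s, _ in groups.values())
        trials = sum(n for _, n in groups.values())
        pooled[treatment] = successes / trials
    return pooled


by_group = subgroup_rates(counts)
overall = pooled_rates(counts)

for group in ("small", "large"):
    print(group, {t: round(by_group[t][group], 3) for t in counts})
print("pooled", {t: round(r, 3) for t, r in overall.items()})
```

In this sketch treatment A has the higher rate within each subgroup, yet B has the higher pooled rate, because the two treatments were applied to very different mixes of subgroups.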

I enjoy reading the Graphically Speaking blog because it teaches me a lot about ODS statistical graphics, especially features of the SGPLOT procedure and the Graph Template Language (GTL). Yesterday Sanjay blogged about how to construct a stacked bar chart of percentages so that each bar represents 100%. His chart […]

I recently looked at the strategy that invests in the components of the S&P/TSX 60 index, and discovered that there are some abnormal jumps/drops in the historical data that I could not explain. To help me spot these points and remove them, I created a helper function, data.clean(), in data.r on GitHub. Following is an example […]
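
The excerpt does not show how data.clean() works, so the following is only a rough sketch of one possible way to spot abnormal jumps in a price series, not the author's method. The function name `flag_jumps`, the threshold `k`, and the rule (a one-period return larger than `k` times the median absolute return) are all assumptions made for illustration.

```python
from statistics import median


def flag_jumps(prices, k=5.0):
    """Return indices of prices reached through an unusually large move.

    Hypothetical rule: a simple return is 'abnormal' when its absolute
    value exceeds k times the median absolute return of the series.
    """
    returns = [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]
    if not returns:
        return []
    typical = median(abs(r) for r in returns)
    # Index i + 1 is the price at which the flagged return ends.
    return [i + 1 for i, r in enumerate(returns) if abs(r) > k * typical]


# Made-up series with a single bad print at index 3.
prices = [100.0, 101.0, 100.5, 150.0, 101.5, 102.0]
print(flag_jumps(prices))
```

A single bad print produces two flagged points, the jump into it and the drop out of it, which is one reason to inspect flagged observations before removing them.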

Every once in a while we see computational papers published in science journals with high impact factors. Genomics-related methods appear quite often in these journals. Several of my junior colleagues express frustration that all their papers get rejected from these journals. …

In “Story: A Definition,” visual analysis researcher Robert Kosara writes: A story ties facts together. There is a reason why this particular collection of facts is in this story, and the story gives you that reason. It provides a narrative path through those facts. In other words, it guides the viewer/reader through the world, rather than just throwing […] The post How literature is like statistical reasoning: Kosara on stories. Gelman and Basbøll…

Cathy O'Neil may need no introduction to blog readers. She's the author of the hard-hitting MathBabe blog, and she shares my passion for explaining how data analysis really works. She is co-author of the recent book Doing Data Science, with Rachel Schutt. Cathy has a varied career spanning academia and industry, as she explains below. *** KF: How did you pick up your impressive statistical reasoning skills? CO: Thanks…

Here's a new one for your reading pleasure. Interesting history. Minchul and I went in trying to escape the expected loss minimization paradigm. We came out realizing that we hadn't escaped, but simultaneously, that not all loss functions are created e...

In the context of AR(1) processes, we spent some time explaining what happens when the autoregressive coefficient φ is close to 1: if |φ| < 1 the process is stationary, if φ = 1 the process is a random walk, and if |φ| > 1 the process will explode. Again, random walks are extremely interesting processes,...
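
To make the three regimes concrete, here is a minimal simulation sketch, assuming the usual AR(1) recursion X_t = φ·X_{t-1} + ε_t with Gaussian noise. The function name `simulate_ar1` and its parameters are hypothetical and not taken from the post.

```python
import random


def simulate_ar1(phi, n=200, sigma=1.0, x0=0.0, seed=0):
    """Simulate n steps of X_t = phi * X_{t-1} + eps_t, eps_t ~ N(0, sigma^2).

    Returns a list of n + 1 values, starting with x0.
    """
    rng = random.Random(seed)  # local generator, so runs are reproducible
    path = [x0]
    for _ in range(n):
        path.append(phi * path[-1] + rng.gauss(0.0, sigma))
    return path


# One path per regime: stationary, random walk, explosive.
for phi in (0.5, 1.0, 1.05):
    path = simulate_ar1(phi)
    print(f"phi={phi}: largest |X_t| over the path = {max(abs(v) for v in path):.2f}")
```

Setting `sigma=0.0` and `x0=1.0` removes the noise and leaves the deterministic part φ^t, which decays, stays constant, or grows without bound in the three cases.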