Unsuck your writing

April 8, 2014

I recently found this little gem of a web app that analyzes the clarity of your writing. Hemingway highlights long, complex, and hard-to-read sentences. It also highlights complex words where a simple one would do, and highlights adverbs, suggesting yo...

Read more »

Understanding Simpson’s paradox using a graph

April 8, 2014

Joshua Vogelstein pointed me to this post by Michael Nielsen on how to teach Simpson’s paradox. I don’t know if Nielsen (and others) are aware that people have developed some snappy graphical methods for displaying Simpson’s paradox (and, more generally, aggregation issues). We do some of this in our Red State Blue State book, but before […] The post Understanding Simpson’s paradox using a graph appeared first on Statistical Modeling, Causal…

Read more »

Getting Social Sciences Out of the Black Box: The Open Access Revolution

April 8, 2014

Trading Ethos for Logos. Up until very recently (the last 10 years), it has been uncommon for social science researchers to share their data even when the sharing would compromise neither the private information of the subjects nor the validity of the stu...

Read more »

Construct a stacked bar chart in SAS where each bar equals 100%

April 8, 2014

I enjoy reading the Graphically Speaking blog because it teaches me a lot about ODS statistical graphics, especially features of the SGPLOT procedure and the Graph Template Language (GTL). Yesterday Sanjay blogged about how to construct a stacked bar chart of percentages so that each bar represents 100%. His chart […]

Read more »

JMBayes R package (webinar)

April 8, 2014

A free webinar will provide an introduction to the “JMBayes” R package which provides methods for Joint Modeling of Longitudinal and Time-to-Event Data under a Bayesian Approach. Webinar Format: - Introduction to Joint Models and the JMBayes R package – … Continue reading →

Read more »

Annotation charts and histograms with googleVis

April 8, 2014

After my posts on timeline, Sankey and calendar charts, this will be the last to introduce new chart types of the developer version of googleVis. Today I will give examples for the new annotation charts and histograms. Annotation charts: Annotation charts...

Read more »

Quality of Historical Stock Prices from Yahoo Finance

April 8, 2014

I recently looked at the strategy that invests in the components of the S&P/TSX 60 index, and discovered that there are some abnormal jumps/drops in historical data that I could not explain. To help me spot these points and remove them, I created a helper function, data.clean(), in data.r on GitHub. Following is an example […]

Read more »

Job at Center for Open Science

April 7, 2014

This looks like an interesting job. Dear Dr. Hyndman, I write from the Center for Open Science, a non-profit organization based in Charlottesville, Virginia in the United States, which is dedicated to improving the alignment between scientific values and scientific practices. We are dedicated to open source and open science. We are reaching out to you to find out if you know anyone who might be interested in our Statistical…

Read more »

data scientist position

April 7, 2014

Our newly created Chaire “Economie et gestion des nouvelles données” in Paris-Dauphine, ENS Ulm, École Polytechnique and ENSAE is recruiting a data scientist starting as early as May 1, the call remaining open till the position is filled. The location is in one of the above labs in Paris, the duration for at least one […]

Read more »

Writing good software can have more impact than publishing in high impact journals for genomic statisticians

April 7, 2014

Every once in a while we see computational papers published in science journals with high impact factors. Genomics-related methods appear quite often in these journals. Several of my junior colleagues express frustration that all their papers get rejected from these journals. … Continue reading →

Read more »

The Internet and religious affiliation

April 7, 2014

A few weeks ago I published this paper on arXiv: "Religious affiliation, education and Internet use". Regular readers of this blog will recognize this as the article I was writing about in July 2012, including this article. A few days ago, MIT Tec...

Read more »

How literature is like statistical reasoning: Kosara on stories. Gelman and Basbøll on stories.

April 7, 2014

In “Story: A Definition,” visual analysis researcher Robert Kosara writes: A story ties facts together. There is a reason why this particular collection of facts is in this story, and the story gives you that reason. It provides a narrative path through those facts. In other words, it guides the viewer/reader through the world, rather than just throwing […] The post How literature is like statistical reasoning: Kosara on stories. Gelman and Basbøll…

Read more »

R Continues Its Rapid Growth

April 7, 2014

I’ve just updated the section below from The Popularity of Data Analysis Software. Note that the overall article is still under construction and all the figure numbers have changed from previous versions. Growth in Capability The capability of analytics software … Continue reading →

Read more »

On deck this week

April 7, 2014

Mon: How literature is like statistical reasoning: Kosara on stories. Gelman and Basbøll on stories. Tues: Understanding Simpson’s paradox using a graph Wed: Advice: positive-sum, zero-sum, or negative-sum Thurs: Small multiples of lineplots > maps (ok, not always, but yes in this case) Fri: “More research from the lunatic fringe” Sat: “Schools of statistical thoughts […] The post On deck this week appeared first on Statistical Modeling, Causal Inference, and…

Read more »

Numbersense Pros: Cathy O’Neil talks about trust in data analysis

April 7, 2014

Cathy O'Neil may need no introduction to blog readers. She's the author of the hard-hitting MathBabe blog, and she shares my passion for explaining how data analysis really works. She is co-author of the recent book Doing Data Science (link), with Rachel Schutt. Cathy has a varied career spanning academia and industry, as she explains below. *** KF: How did you pick up your impressive statistical reasoning skills? CO: Thanks…

Read more »

Point Forecast Accuracy Evaluation

April 7, 2014

Here's a new one for your reading pleasure. Interesting history. Minchul and I went in trying to escape the expected loss minimization paradigm. We came out realizing that we hadn't escaped, but simultaneously, that not all loss functions are created e...

Read more »

Vector and matrix norms in SAS

April 7, 2014

Did you know that SAS/IML 12.1 provides built-in functions that compute the norm of a vector or matrix? A vector norm enables you to compute the length of a vector or the distance between two vectors in SAS. Matrix norms are used in numerical linear algebra to estimate the condition […]

Read more »
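For readers who want the arithmetic behind that teaser, here is a minimal Python sketch of vector p-norms and the distance between two vectors. It is not the SAS/IML function the post describes; the names `vector_norm` and `distance` are made up for this illustration.

```python
import math


def vector_norm(v, p=2):
    """Return the p-norm of a vector (hypothetical helper, not SAS/IML).

    p=1 gives the sum of absolute values, p=2 the Euclidean length,
    and p=math.inf the largest absolute component.
    """
    if p == math.inf:
        return float(max(abs(x) for x in v))
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def distance(u, v, p=2):
    """Distance between two vectors: the norm of their difference."""
    return vector_norm([a - b for a, b in zip(u, v)], p)


if __name__ == "__main__":
    v = [3, -4]
    print(vector_norm(v, 1), vector_norm(v, 2), vector_norm(v, math.inf))
    print(distance([1, 1], [4, 5]))
```

The distance is defined as the norm of the difference, so any norm you choose gives you a matching notion of distance for free.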

Story: A Definition

April 7, 2014

What makes a story? What does a story do? In part one of this little series, I argued that stories and worlds are not opposites, but complements. In this part, I try to explain the differences between worlds and stories, and present a definition.

Read more »

Stationarity of ARCH processes

April 7, 2014

In the context of AR(1) processes, we spent some time explaining what happens when $\phi$ is close to 1: if $|\phi| < 1$, the process is stationary; if $\phi = 1$, the process is a random walk; if $|\phi| > 1$, the process will explode. Again, random walks are extremely interesting processes,...

Read more »
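The three AR(1) regimes in that teaser are easy to see in a quick simulation. Below is a rough Python sketch, assuming the usual form x_t = phi * x_{t-1} + eps_t with Gaussian noise. The symbol phi and the function names `ar1_regime` and `simulate_ar1` are assumptions for this illustration and are not taken from the post.

```python
import random


def ar1_regime(phi):
    """Classify the AR(1) coefficient phi (assumed notation)."""
    if abs(phi) < 1:
        return "stationary"
    if phi == 1:
        return "random walk"
    if abs(phi) > 1:
        return "explosive"
    return "non-stationary (phi = -1)"


def simulate_ar1(phi, n, sigma=1.0, x0=0.0, seed=0):
    """Simulate x_t = phi * x_{t-1} + eps_t, with eps_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    path = [x0]
    for _ in range(n):
        path.append(phi * path[-1] + rng.gauss(0.0, sigma))
    return path


if __name__ == "__main__":
    # Compare how far each path wanders from zero over 200 steps.
    for phi in (0.5, 1.0, 1.05):
        path = simulate_ar1(phi, n=200)
        print(phi, ar1_regime(phi), max(abs(x) for x in path))
```

Setting `sigma=0.0` turns the recursion into the deterministic sequence x0 * phi^t, which is a handy sanity check on the code.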

Jaynes’s Bayesian view of Frequencies

April 6, 2014

The last post, together with Christian Hennig’s comment, naturally reminded me of Jaynes and his view of frequencies. After a discussion similar to my previous post, but approached in a different way and in more depth, Jaynes states (PTLOS p. 292)...

Read more »

Looking at Measles Data in Project Tycho, part II

April 6, 2014

Continuing from last week, I will now look at incidence rates of measles in the US. To recap, Project Tycho contains data from all weekly notifiable disease reports for the United States dating back to 1888. These data are freely available to any...

Read more »

Seven (a-day)

April 6, 2014

This week (among other things, including my Vespa breaking down twice in three days) I was busy taking part in an interview panel for a research associate position, together with colleagues in the Medical School at UCL. One of the questions we...

Read more »

An old discussion of food deserts

April 6, 2014

I happened to be reading an old comment thread from 2012 (follow the link from here) and came across this amusing exchange: Perhaps this is the paper Jonathan was talking about? Here’s more from the thread: Anyway, I don’t have anything ...

Read more »

