Le Monde puzzle [#825]

June 18, 2013

Yet another puzzle whose first part does not require R programming, even though it is a programming question in essence: Given five real numbers x1,…,x5, what is the minimal number of pairwise comparisons needed to rank them? Given 33 real numbers, what is the minimal number of pairwise comparisons required to find the three largest ones? […]

Read more »
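
The first question in the puzzle excerpt above can be explored with a short sketch. It is written in Python and is not taken from the linked post. Since 5! = 120 orderings must be told apart and each comparison has two outcomes, at least ceil(log2(120)) = 7 comparisons are needed; the hypothetical `sort5` below follows the classical merge-insertion idea and never uses more than 7, assuming the five values are distinct. The function names are illustrative only, and the 33-number question is not addressed.

```python
from itertools import permutations
from math import factorial


def comparison_lower_bound(n):
    """Smallest k with 2**k >= n!, i.e. ceil(log2(n!)), computed in integers."""
    return (factorial(n) - 1).bit_length()


def sort5(values):
    """Sort five distinct numbers; return (sorted_list, comparisons_used)."""
    used = 0

    def less(x, y):
        # Every pairwise comparison goes through here so it can be counted.
        nonlocal used
        used += 1
        return x < y

    a, b, c, d, e = values
    if less(b, a):
        a, b = b, a
    if less(d, c):
        c, d = d, c
    if less(d, b):
        a, b, c, d = c, d, a, b
    # Now a < b < d and c < d (3 comparisons so far).

    # Binary insertion of e into the chain a < b < d (2 comparisons).
    if less(e, b):
        chain = [e, a, b, d] if less(e, a) else [a, e, b, d]
    else:
        chain = [a, b, e, d] if less(e, d) else [a, b, d, e]

    # c < d is already known, so c only needs placing among the entries
    # ahead of d (2 or 3 of them): at most 2 more comparisons.
    d_at = chain.index(d)  # position lookup, not a counted comparison
    if less(c, chain[1]):
        pos = 0 if less(c, chain[0]) else 1
    elif d_at == 3:
        pos = 2 if less(c, chain[2]) else 3
    else:
        pos = 2
    chain.insert(pos, c)
    return chain, used


if __name__ == "__main__":
    worst = max(sort5(p)[1] for p in permutations(range(5)))
    print(comparison_lower_bound(5), worst)
```

Running the script checks all 120 orderings of five distinct values and prints the lower bound next to the worst-case count observed.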

re:log: Tracking the Movements of Conference Attendees via WiFi

June 18, 2013

re:log [opendatacity.de] by German data designers OpenDataCity reveals the movements of about 6,700 different electronic devices during re:publica 2013, a prestigious European conference on the topic of Digital Society. A dynamic map of the conferenc...

Read more »

Map Stack: Designing a Map in Easy and Fun Ways

June 18, 2013

Map Stack [stamen.com] by Stamen Design aims to make it radically simpler for lay people to design completely unique, personalized maps. The online visual map design service provides easy access to the color, opacity and brightness of any map backgro...

Read more »

There are no fat sprinters

June 18, 2013

This post is by Phil. A little over three years ago I wrote a post about exercise and weight loss in which I described losing a fair amount of weight due to (I believe) an exercise regime, with no effort to change my diet; this contradicted the prediction of studies that had recently been released. [...]

Read more »

BCEA 1.3.0

June 18, 2013

After months of work (although to be fair, we haven't worked 100% full time on this), Andrea and I are nearly ready to publish the next release of BCEA. Andrea has done a brilliant job and is responsible for most of the good new features (NB: see ...

Read more »

Job opening! Come work with us!

June 18, 2013

Postdoctoral position in statistical modeling of social networks: A full-time postdoctoral position is available beginning Fall 2014 in the research group of Tian Zheng and Andrew Gelman, working on statistical analysis and modeling of social network data, in close cooperation with our experimental collaborators. Four key papers of this project so far are: http://www.stat.columbia.edu/~gelman/research/published/overdisp_final.pdf http://nersp.osg.ufl.edu/~ufruss/documents/mccormick_salganik_zheng10.pdf [...]

Read more »

Surveys on sensitive topics

June 18, 2013

Andrew Sullivan has a few questions about a new Pew survey focusing on the LGBT subpopulation. He wonders, for instance, about the high proportion of self-identified bisexuals in this poll. What interests a statistician here is the fact that the poll deals with sensitive matters, which typically present a challenge in terms of survey response and nonresponse bias. The…

Read more »

googleVis 0.4.3 released with improved Geocharts

June 18, 2013

The Google Charts Tools provide two kinds of heat map charts for geographical data, the Flash-based Geomap and the HTML5/SVG-based Geochart. I prefer the Geochart as it doesn't require Flash, but so far there have been two shortcomings with it: I...

Read more »

Zombie Apocalypse Survival Test – R-Powered (using Concerto)

June 17, 2013

This test is the first attempt to seriously assess the ability of individuals to survive a zombie apocalypse. It is administered using Concerto, the R-powered open-source testing platform developed at the University of Cambridge. The t...

Read more »

Bayesian computational tools

June 17, 2013

I just updated my short review of Bayesian computational tools, which I first wrote in April for the Annual Review of Statistics and Its Applications. The coverage is quite restricted, as I took advantage of two phantom papers I had started a while ago, one with Jean-Michel Marin, on hierarchical Bayes methods and on ABC. (As […]

Read more »

Weak identification provides partial information

June 17, 2013

Matt Selove writes: My question is about Bayesian analysis of the linear regression model. It seems to me that in some cases this approach throws out useful information. As an example, imagine you have two basketball players randomly drawn from the pool of NBA players (which provides the prior). You’d like to estimate how many [...]

Read more »

Job opening at new “big data” consulting firm!

June 17, 2013

David Shor sends along a job announcement for Civis Analytics, which he describes as “basically Obama’s Analytics team reconstituted as a company”: Data Scientist. Position Overview: Data Scientists are responsible for providing the fundamental data science that powers our work – including predictive analytics, data mining, experimental design and ad-hoc statistical analysis. As a Data [...]

Read more »

Back to basics

June 17, 2013

Today, we review one of the basic principles Ed Tufte very effectively advocated in his famous book: use gridlines and data labels only if absolutely necessary. The enemy is redundancy. Here is a chart that appeared in the New York...

Read more »

Repetition factors versus frequency variables

June 17, 2013

A regular reader noticed my post on initializing vectors by using repetition factors and asked whether that technique would be useful to expand data that are given in value-frequency pairs. The short answer is "no." Repetition factors are useful for defining (static) matrix literals. However, if you want to expand [...]

Read more »
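
The expansion of value-frequency pairs mentioned in the excerpt above can be illustrated with a minimal sketch. It is written in Python rather than the language of the original post, and the function name `expand_by_frequency` is hypothetical; it only shows what "expanding" such data means, not how the linked post does it.

```python
from itertools import chain, repeat


def expand_by_frequency(values, freqs):
    """Repeat each value as many times as its frequency says.

    `values` and `freqs` are parallel sequences; frequencies are assumed
    to be non-negative integers.
    """
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    if any((not isinstance(f, int)) or f < 0 for f in freqs):
        raise ValueError("frequencies must be non-negative integers")
    # repeat(v, f) yields v exactly f times; chain glues the runs together.
    return list(chain.from_iterable(repeat(v, f) for v, f in zip(values, freqs)))


if __name__ == "__main__":
    print(expand_by_frequency([1, 2, 5], [2, 0, 3]))
```

A value with frequency zero simply disappears from the expanded vector, which is usually the desired behavior for frequency variables.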

Exploratory Data Analysis: Combining Box Plots and Kernel Density Plots into Violin Plots for Ozone Pollution Data

Introduction: Recently, I began a series on exploratory data analysis (EDA), and I have written about descriptive statistics, box plots, and kernel density plots so far. As previously mentioned in my post on box plots, there is a way to combine box plots and kernel density plots. This combination results in violin plots, and I […]

Read more »

Bayesian robust regression for Anscombe quartet

June 16, 2013

In 1973, Anscombe presented four data sets that have become a classic illustration of the importance of graphing the data rather than merely relying on summary statistics. The four data sets are now known as "Anscombe's quartet." Here I present a Bayesian ap...

Read more »

Why engineers and poets need to know about statistics

June 16, 2013

I’m kidding about poets. But lots of people need to understand the three basic areas of statistics: Chance, Data and Evidence. Recently Tony Greenfield, an esteemed applied statistician (with his roots in Operations Research), posted the following request on a … Continue reading →

Read more »

The scaling of Expected Shortfall

June 16, 2013

Getting Expected Shortfall given the standard deviation or Value at Risk. Previously: There have been a few posts about Value at Risk and Expected Shortfall. Properties of the stable distribution were discussed. Scaling: One way of thinking of Expected Shortfall is that it is just some number times the standard deviation, or some other number … Continue reading →

Read more »
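
The idea in the excerpt above, that Expected Shortfall is "some number times the standard deviation", can be made concrete under one specific assumption that is not taken from the linked post: losses are normally distributed with mean zero. In that case the multiplier at confidence level alpha is the standard normal density at the alpha-quantile divided by (1 - alpha), while the Value at Risk multiplier is the quantile itself. The function names below are illustrative, and other distributions would give different numbers.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def normal_var_multiplier(alpha):
    """VaR / sigma for zero-mean normal losses at confidence level alpha."""
    return _STD_NORMAL.inv_cdf(alpha)


def normal_es_multiplier(alpha):
    """ES / sigma for zero-mean normal losses: pdf(z_alpha) / (1 - alpha)."""
    z = _STD_NORMAL.inv_cdf(alpha)
    return _STD_NORMAL.pdf(z) / (1.0 - alpha)


if __name__ == "__main__":
    for level in (0.95, 0.975, 0.99):
        print(level, normal_var_multiplier(level), normal_es_multiplier(level))
```

At the 95% level this gives an Expected Shortfall of about 2.06 standard deviations against a Value at Risk of about 1.64, so under the normal assumption the ratio between the two is itself a fixed number for each level.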

Sunday data/statistics link roundup (6/16/13 – Father’s day edition!)

June 16, 2013

Datapalooza! I'm wondering where my invite is? I do health data stuff, pick me, pick me! Actually it does sound like a pretty good idea - in general giving a bunch of smart people access to interesting data and real … Continue reading →

Read more »

Evilicious: Why We Evolved a Taste for Being Bad

June 16, 2013

The other day, a friend told me that when he saw me blogging on Noam Chomsky, he was surprised not to see any mention of disgraced primatologist Marc Hauser. I was like, whaaaaaa? I had no idea these two had any connection. In fact, though, they wrote papers together. This made me wonder what Chomsky [...]

Read more »

Distribution of car weights

June 16, 2013

Two weeks ago I described car data, including the weight distribution of cars in the Netherlands. At that time it was purely plots. In the meantime I decided I wanted to model trends. As a first step, I decided to fit distributions for these da...

Read more »

Open Data Census

June 15, 2013

The Open Knowledge Foundation (OKFN) publishes the first results of its Open Data Census, just before the G8 … Continue reading »

Read more »

Exploratory multilevel analysis when group-level variables are of importance

June 15, 2013

Steve Miller writes: Much of what I do is cross-national analyses of survey data (largely World Values Survey). . . . My big question pertains to (what I would call) exploratory analysis of multilevel data, especially when the group-level predictors are of theoretical importance. A lot of what I do involves analyzing cross-national survey items [...]

Read more »
