## Types of Data on R

August 26, 2013

There are different types of data in R. I use "type" here as a technical term, rather than merely a synonym of “variety”. There are three main types of data. Numeric: ordinary numbers. Character: not treated as a number, but as a word. You cannot add two characters, even if they appear to be numerical. Characters […]

## An ignored issue in Big Data analysis

August 26, 2013

The Sunday Review section of the New York Times on August 11 contains two pieces I'd like to discuss. The first piece, by Ian Urbina, an investigative reporter, is called "I Flirt and Tweet. Follow Me at #Socialbot". He tells a fascinating story about how programmers create "bots" that impersonate people surfing the Web. For instance, researchers at a Brazilian university created a bot named "Carina Santos" who was rated by…

## Kosara wants to rescue infographics

August 26, 2013

Robert Kosara takes us back to the 1940s, and an incredible "infographics" project by the Lawrence Livermore Laboratory. (link) Here is one of the designs: Kosara laments: When did information graphics turn into ‘infographics,’ and when did we lose the...

## Determine the version of SAS software at run time

August 26, 2013

Recently I wrote about how to determine the age of your SAS release. Experienced SAS programmers know that you can programmatically determine information about your SAS release by using certain automatic macro variables that SAS provides. SYSVER: contains the major and minor version of the SAS release; SYSVLONG: contains the [...]

## Changeability of Value at Risk estimators

August 26, 2013

How does Value at Risk change through time for the same portfolio? Previously: There have been a number of posts on Value at Risk, including a basic introduction to Value at Risk and Expected Shortfall. The components garch model was also described. Issue: The historical method for Value at Risk is by far the most commonly …

## Statistics is not beautiful (sniff)

August 25, 2013

Statistics is not really elegant or even fun in the way that a mathematics puzzle can be. But statistics is necessary, and enormously rewarding. I like to think that we use statistical methods and principles to extract truth from data. …

## Exponential Smoothing Again: Structural Change

August 25, 2013

Here's another fascinating example of the ongoing and surprisingly modern magic of exponential smoothing (ES). In my last post I asked you to read the latest from Neil Shephard, on stochastic volatility and exponential smoothing. Now read the lates...

## A new Bem theory

August 25, 2013

The other day I was talking with someone who knows Daryl Bem a bit, and he was sharing his thoughts on that notorious ESP paper that was published in a leading journal in the field but then was mocked, shot down, and was repeatedly replicated with no success. My friend said that overall the Bem […] The post A new Bem theory appeared first on Statistical Modeling, Causal Inference, and Social…

## More REML exercise

August 25, 2013

Last week I tried exercise 1 of the SAS(R) proc mixed with R libraries lme4 and MCMCglm. So this week I aimed for exercise 2 but ended up redoing exercise 1 with nlme. Exercise 2 results gave me problems with library lme4 and latter parts of the ex...

## From SVG to probability distributions [with R package]

August 25, 2013

Hey, To illustrate generally complex probability density functions on continuous spaces, researchers always use the same examples, for instance mixtures of Gaussian distributions or a banana shaped distribution defined on with density function: If we draw a sample from this distribution using MCMC we obtain a [scatter]plot like this one: Clearly it doesn’t really look […]

## Did Faulty Software Shut Down the NASDAQ?

August 24, 2013

This past Thursday, the NASDAQ stock exchange shut down for just over 3 hours due to some technical problems. It's still not clear what the problem was because NASDAQ officials are being tight-lipped. NASDAQ has had a bad run of …

## All inference is about generalizing from sample to population

August 24, 2013

Jeff Walker writes: Your blog has skirted around the value of observational studies and chided folks for using causal language when they only have associations but I sense that you ultimately find value in these associations. I would love for you to expand this thought in a blog. Specifically: Does a measured association “suggest” a […] The post All inference is about generalizing from sample to population appeared first on Statistical…

## Measurement error in monkey studies

August 24, 2013

Following up on our recent discussion of combative linguist Noam Chomsky and disgraced primatologist Marc Hauser, here are some stories from Jay Livingston about monkey research. Don’t get me wrong—I eat burgers, so I’m not trying to get on my moral high horse here. But the stories do get you thinking about measurement error and […] The post Measurement error in monkey studies appeared first on Statistical Modeling, Causal Inference, and…

## Statistics professors on college football, Crimson Tide, and Buckeyes

August 23, 2013

The Crimson Tide and the Buckeyes are ranked first and second in the Associated Press preseason poll for the 2013 college football season. Alabama is going for its third consecutive national championship, which has not happened before. Meanwhile, Ohio State has never posted consecutive undefeated/untied seasons. Are they going to defeat the odds? Prof. Mark Berliner […]

## Residuals from a logistic regression

August 23, 2013

I always claim that graphs are important in econometrics and statistics! Of course, it is usually not that simple. Let me come back to a recent experience. I got an email from Sami yesterday, sending me a graph of residuals, and asking me what could be done with a graph of residuals obtained from a logistic regression. To get a better understanding, let us consider the following dataset…

## Discrimination

August 23, 2013

Sometimes it's cool to discriminate. (I mean, if Abercrombie does it...) There I said it. I'm sick and tired of people being all PC(A) about it. I think Linear Discriminant Analysis is a great way to classify your data, and I don't care who knows it. ...

## Stratifying PISA scores by poverty rates suggests imitating Finland is not necessarily the way to go for US schools

August 23, 2013

For the past several years a steady stream of articles and opinion pieces has been praising the virtues of Finnish schools and exhorting the US to imitate this system. One data point supporting this view comes from the most recent …

## “I mean, what exact buttons do I have to hit?”

August 23, 2013

This American Life reporter Gabriel Rhodes says: This is one of the big differences between Jon and Anthony, between scientist and non-scientist. For Jon, having a year’s worth of work suddenly thrown into question is a normal day at the office. But for Anthony, that’s not normal. And it’s not OK. The time in Jon’s […] The post “I mean, what exact buttons do I have to hit?” appeared first on…

## If you are near DC/Baltimore, come see Jeff talk about Coursera

August 23, 2013

I'll be speaking at the Data Science Maryland meetup. The title of my presentation is "Teaching Data Science to the Masses". The talk is at 6pm on Thursday, Sept. 19th. More info here.

## GitHub renders CSV in the browser, becomes even better for social data set creation

August 23, 2013

I've written in a number of places about how GitHub can be a great place to store data. Unlike basically all other web data storage sites (many of which I really like such as Dataverse and FigShare) GitHub enables deep social data set development and f...

## A critical look at “critical thinking”: deduction and induction

August 23, 2013

I’m cleaning away some cobwebs around my old course notes, as I return to teaching after 2 years off (since I began this blog). The change of technology alone over a mere 2 years (at least here at Super Tech U) might be enough to earn me techno-dinosaur status: I knew “Blackboard” but now it’s […]

## The Flavor Connection: Mapping Commonalities in Food Flavors

August 22, 2013

The Flavor Connection [scientificamerican.com] by Jan Willem Tulp for popular science magazine Scientific American maps the similarities of flavor compounds between different foods. Each food is represented by a blue dot, which is sized according to...

## Improvements to Kindle Version of BDA3

August 22, 2013

I let Andrew know about the comments about the defective Kindle version of BDA2 and he wrote to his editor at Chapman and Hall, Rob Calver, who wrote back with this info: I can guarantee that the Kindle version of the third edition will be a substantial improvement. We publish all of our mathematics and […] The post Improvements to Kindle Version of BDA3 appeared first on Statistical Modeling, Causal Inference,…