Basic Character Manipulation (Introduction to Statistical Computing)

January 2, 2014
By

Lecture 20: Text as data. Overview of the character data type, and of strings. Basic string operations: extracting and replacing substrings; splitting strings into character vectors; assembling character vectors into strings; tabulating counts of s...

Read more »

Regular Expressions (Introduction to Statistical Computing)

January 2, 2014
By

Lecture 21: Regular expressions. Why we need ways of describing patterns of strings, and not just specific strings. The syntax and semantics of regular expressions: constants, concatenation, alternation, repetition. Back-references and capture group...

Read more »

Importing Data from Webpages (Introduction to Statistical Computing)

January 2, 2014
By

Lecture 22: Importing data from webpages. Example: scraping weblinks. Using regular expressions again (with multiple capture groups). Building networks of political books. Introduction to Statistical Computing

Read more »

Change of Representation (Introduction to Statistical Computing)

January 2, 2014
By

(My notes for this lecture are too fragmentary to post. What follows is the sketch.) The "raw data" is often not in the format most useful for the model one wants to work with. Lots of statistical computing work is about moving the information from...

Read more »

Databases (Introduction to Statistical Computing)

January 2, 2014
By

Lecture 25: The idea of a relational database. Tables, fields, keys, normalization. Server-client model. Example of working with a database server. Intro to SQL, especially SELECT. Aggregation in databases is like split/apply/combine. Joining tables...

Read more »

Speed, Complexity and Interfacing with Other Systems (Introduction to Statistical Computing)

January 2, 2014
By

(My notes from this lecture are too fragmentary to type up; here's the sketch) Programmer time is (usually) much more valuable than computer time; therefore, "premature optimization is the root of all evil". That said, computer time isn't free, ...

Read more »

End of Year Inventory, 2013

January 2, 2014
By

Attention conservation notice: Navel-gazing. Paper manuscripts completed: 4; Papers accepted: 3; Papers rejected: 4 (fools! we'll show you all!); Papers in revise-and-resubmit purgatory: 2; Papers in refereeing limbo: 1; Papers with co-authors waitin...

Read more »

2013

January 2, 2014
By

There’s lots of overlap but I put each paper into only one category. Also, I’ve included work that has been published in 2013 as well as work that has been completed this year and might appear in 2014 or later. So you can think of this list as representing roughly two years’ work. Political […] The post 2013 appeared first on Statistical Modeling, Causal Inference, and Social Science.

Read more »

Teaching linear models

January 2, 2014
By

I teach several courses every year and the most difficult to pull off is FORE224/STAT202: regression modeling. The academic promotion application form in my university includes a section on one’s ‘teaching philosophy’. I struggle with that part because I suspect I lack anything as grandiose as a philosophy when teaching: as most university lecturers I […]

Read more »

Generalized linear models for predicting rates

January 1, 2014
By

I often need to build a predictive model that estimates rates. The example of our age is ad click-through rates (how often a viewer clicks on an ad, estimated as a function of the features of the ad and the viewer). Another timely example is estimating default rates of mortgages or credit cards. You […] Related posts: What does a generalized linear model do? The equivalence of logistic regression…

Read more »

“Though They May Be Unaware, Newlyweds Implicitly Know Whether Their Marriage Will Be Satisfying”

January 1, 2014
By

Etienne LeBel writes: You’ve probably already seen it, but I thought you could have a lot of fun with this one!! The article, with the admirably clear title given above, is by James McNulty, Michael Olson, Andrea Meltzer, Matthew Shaffer, and begins as follows: For decades, social psychological theories have posited that the automatic processes […] The post “Though They May Be Unaware, Newlyweds Implicitly Know Whether Their Marriage Will Be…

Read more »

No on Yes/No decisions

December 31, 2013
By

Just to elaborate on our post from last month (“I’m negative on the expression ‘false positives’”), here’s a recent exchange we had regarding the relevance of yes/no decisions in summarizing statistical inferences about scientific questions. Shravan wrote: Isn’t it true that I am already done if P(theta>0) is much larger than P(theta […] The post No on Yes/No decisions appeared first on Statistical Modeling, Causal Inference, and Social Science.

Read more »

Jeff Leek’s non-comprehensive list of awesome things other people did in 2013

December 31, 2013
By

Jeff Leek, biostats professor at Johns Hopkins and instructor of the Coursera Data Analysis course, recently posted on Simply Statistics this list of awesome things other people accomplished in 2013 in genomics, statistics, and data science. At risk of s...

Read more »

Blog recap of 2013

December 31, 2013
By

Posts by page views: Interview with a forced convert to R from Matlab; A first step towards R from spreadsheets; Plot ranges of data in R; A statistical review of ‘Thinking, Fast and Slow’ by Daniel Kahneman; The 3 dots construct in R; Translating between R and SQL: the basics; An R debugging example; R […] The post Blog recap of 2013 appeared first on Burns Statistics.

Read more »

An Animation of the Construction of a Confidence Interval

December 31, 2013
By

I’m playing blog ping-pong with John Kruschke’s Doing Bayesian Data Analysis blog as he was partly inspired by my silly post on Bayesian mascots when writing a nice piece on Icons for the essence of Bayesian and frequentist data analysis. That pi...

Read more »

MCMSki IV, Jan. 6-8, 2014, Chamonix (news #16)

December 30, 2013
By

I am now in Chamonix, a week before MCMSki IV. The town is packed with tourists from all over Europe, English being the dominant language. There is not much snow so far, even though some runs reach town. (I did ski in the nearby Les Houches today and the red runs were either icy or […]

Read more »

Some things R can do you might not be aware of

December 30, 2013
By

There is a lot of noise around the "R versus Contender X" for Data Science. I think the two main competitors right now that I hear about are Python and Julia. I'm not going to weigh into the debates because … Continue reading →

Read more »

Two good maps, considered part 2

December 30, 2013
By

This is a continuation of my previous post on the map of the age of Brooklyn's buildings, in which I suggested that aggregating the data would bring out the geographical patterns better. For its map illustrating the pattern of insurance...

Read more »

Two good maps, considered

December 30, 2013
By

A Reflection on the past year: Thanks to you for continuing to make this blog a success. Writing it has given me much enjoyment over the years, and I have learned much from your comments as well as from the...

Read more »

Brief introduction to Scala and Breeze for statistical computing

December 30, 2013
By

Introduction: In the previous post I outlined why I think Scala is a good language for statistical computing and data science. In this post I want to give a quick taste of Scala and the Breeze numerical library to whet the appetite of the uninitiated. This post certainly won’t provide enough material to get started […]

Read more »

Bill Gates’s favorite graph of the year

December 30, 2013
By

Under the subject line “Blog bait!”, Brendan Nyhan points me to this post at the Washington Post blog: For 2013, we asked some of the year’s most interesting, important and influential thinkers to name their favorite graph of the year — and why they chose it. Here’s Bill Gates’s. Infographic by Thomas Porostocky for WIRED. […] The post Bill Gates’s favorite graph of the year appeared first on Statistical Modeling, Causal…

Read more »

Blog year 2013 in review

December 30, 2013
By

Highlights of the blog over the past year. Most popular posts (the posts with the most hits during the year): A practical introduction to garch modeling (posted in 2012); A tale of two returns (posted in 2010); The top 7 portfolio optimization problems (posted in 2012); The number 1 novice quant mistake (posted in 2011); On smart beta … Continue reading →

Read more »

