## Homework: Several Hundred Degrees of Separation (Introduction to Statistical Computing)

January 2, 2014

Homework 10: in which we refine our web-crawler from the previous assignment, by way of further working with regular expressions, and improving our estimates of page-rank. (This assignment ripped off from Vince Vu, with permission.) Introduction...

## Simulation V: Matching Simulation Models to Data (Introduction to Statistical Computing)

January 2, 2014

(My notes for this lecture are too incomplete to be worth typing up, so here's the sketch.) Methods, Models, Simulations: Statistical methods try t...

## Computing for Statistics (Introduction to Statistical Computing)

January 2, 2014

(My notes from this lecture are too fragmentary to post; here's the sketch.) What should you remember from this class? Not: my mistakes (though remember that I made them). Not: specific packages and ways of doing things (those will change). Not: t...

## 36-350, Fall 2013: Self-Evaluation and Lessons Learned (Introduction to Statistical Computing)

January 2, 2014

This was not one of my better performances as a teacher. I felt disorganized and unmotivated, which is a bit perverse, since it's the third time I've taught the class, and I know the material very well by now. The labs were too long, and my attempts...

## Simulation IV: Quantifying Uncertainty with Simulations (Introduction to Statistical Computing)

January 2, 2014

(My notes for this lecture are too fragmentary to write up properly; here's the sketch.) Two forms of statistical uncertainty: (I) How much would our answers change if the data were different? (II) How diverse are the answers which don't make use hat...

## Optimization II: Deterministic, Unconstrained Optimization (Introduction to Statistical Computing)

January 2, 2014

Lecture 18: Deterministic, Unconstrained Optimization. The trade-off of approximation versus time. Newton's method: motivation from Taylor expansion; as gradient descent with adaptive step-size; pros and cons. Coordinate descent instead of multivar...
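
As a rough illustration of the Newton's-method idea summarized above, here is a minimal one-dimensional sketch. It is written in Python rather than the course's R, and the function name `newton_minimize` and the test function f(x) = x^2 + exp(x) are assumptions made for this example, not taken from the lecture.

```python
import math

def newton_minimize(grad, hess, x0, tol=1e-10, max_iter=50):
    """Minimize a smooth 1-D function from its first and second derivatives.

    Each update is a gradient step whose step size is 1 / (second derivative),
    which is what the second-order Taylor expansion suggests.
    """
    x = x0
    for _ in range(max_iter):
        step = grad(x) / hess(x)  # gradient scaled by inverse curvature
        x -= step
        if abs(step) < tol:       # stop once the updates become negligible
            break
    return x

# Hypothetical test function: f(x) = x**2 + exp(x), which is strictly convex.
grad = lambda x: 2 * x + math.exp(x)   # f'(x)
hess = lambda x: 2 + math.exp(x)       # f''(x), always positive here

x_star = newton_minimize(grad, hess, x0=2.0)
```

Because the second derivative is positive everywhere for this function, the step is always well defined; on a non-convex function the same code could divide by a near-zero or negative curvature, which is one of the "cons" the lecture summary alludes to.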

## Optimization III: Stochastic, Constrained, and Penalized Optimization (Introduction to Statistical Computing)

January 2, 2014

Lecture 19: Stochastic, Constrained, and Penalized Optimization. Constrained optimization: maximizing multinomial likelihood as an example of why constraints matter. The method of Lagrange multipliers for equality constraints. Lagrange multipliers...

## Basic Character Manipulation (Introduction to Statistical Computing)

January 2, 2014

Lecture 20: Text as data. Overview of the character data type, and of strings. Basic string operations: extracting and replacing substrings; splitting strings into character vectors; assembling character vectors into strings; tabulating counts of s...

## Regular Expressions (Introduction to Statistical Computing)

January 2, 2014

Lecture 21: Regular expressions. Why we need ways of describing patterns of strings, and not just specific strings. The syntax and semantics of regular expressions: constants, concatenation, alternation, repetition. Back-references and capture group...
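
To make the pieces named above concrete, here is a small sketch using Python's standard `re` module (the course itself uses R, so this is a translation of the ideas, not the lecture's code). The sample strings and variable names are invented for illustration.

```python
import re

# Repetition (+, ?) and alternation (|), with two capture groups:
# one or more digits, an optional space, then one of three units.
measurement = re.compile(r"(\d+) ?(cm|mm|m)\b")
found = measurement.findall("a 12 cm rod, a 7mm bolt, and 3 m of wire")

# Back-reference: \1 must match exactly what the first group captured,
# so this pattern finds an accidentally doubled word.
doubled = re.compile(r"\b(\w+) \1\b")
fixed = doubled.sub(r"\1", "we had an exchange exchange about it")
```

With multiple capture groups, `findall` returns one tuple per match, one element per group.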

## Importing Data from Webpages (Introduction to Statistical Computing)

January 2, 2014

Lecture 22: Importing data from webpages. Example: scraping weblinks. Using regular expressions again (with multiple capture groups). Building networks of political books.
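
A hedged sketch of the link-scraping step, again in Python rather than R: the HTML fragment, the `example.org` URLs, and the `seed-page` label are all made up for this example, and a regular expression this simple will miss links whose `href` is not the first attribute or is quoted differently.

```python
import re

# A hypothetical fragment of a downloaded page.
page = ('<p>See <a href="http://example.org/book-a">Book A</a> and '
        '<a href="http://example.org/book-b">Book B</a>.</p>')

# Two capture groups: the link target and the link text.
link_pattern = re.compile(r'<a\s+href="([^"]*)"[^>]*>(.*?)</a>')
links = link_pattern.findall(page)

# Each (source, target) pair is one edge in a network of pages.
edges = [("seed-page", url) for url, _text in links]
```

Collecting such edge lists over many pages is one way a network like the one mentioned above could be assembled.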

## Change of Representation (Introduction to Statistical Computing)

January 2, 2014

(My notes for this lecture are too fragmentary to post. What follows is the sketch.) The "raw data" is often not in the format most useful for the model one wants to work with. Lots of statistical computing work is about moving the information from...

## Databases (Introduction to Statistical Computing)

January 2, 2014

Lecture 25: The idea of a relational database. Tables, fields, keys, normalization. Server-client model. Example of working with a database server. Intro to SQL, especially SELECT. Aggregation in databases is like split/apply/combine. Joining tables...
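
The claim that aggregation in a database resembles split/apply/combine can be sketched with Python's built-in `sqlite3` module and an in-memory database. The table names, column names, and rows below are hypothetical, and SQLite is an embedded engine, not the server-client setup the lecture describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# Two normalized tables linked by a key (students.id = scores.student_id).
cur.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE scores (student_id INTEGER, homework INTEGER, points REAL)")
cur.executemany("INSERT INTO students VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
cur.executemany("INSERT INTO scores VALUES (?, ?, ?)",
                [(1, 1, 90.0), (1, 2, 80.0), (2, 1, 70.0)])

# JOIN matches rows across tables; GROUP BY splits by student,
# AVG is applied per group, and the result set is the combined table.
rows = cur.execute("""
    SELECT students.name, AVG(scores.points) AS mean_points
    FROM scores JOIN students ON scores.student_id = students.id
    GROUP BY students.name
    ORDER BY students.name
""").fetchall()
conn.close()
```

Here `rows` holds one `(name, mean_points)` tuple per student, the same shape a split/apply/combine over a data frame would give.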

## Speed, Complexity and Interfacing with Other Systems (Introduction to Statistical Computing)

January 2, 2014

(My notes from this lecture are too fragmentary to type up; here's the sketch.) Programmer time is (usually) much more valuable than computer time; therefore, "premature optimization is the root of all evil". That said, computer time isn't free, ...

## End of Year Inventory, 2013

January 2, 2014

Attention conservation notice: Navel-gazing. Paper manuscripts completed: 4. Papers accepted: 3. Papers rejected: 4 (fools! we'll show you all!). Papers in revise-and-resubmit purgatory: 2. Papers in refereeing limbo: 1. Papers with co-authors waitin...

## 2013

January 2, 2014

There’s lots of overlap but I put each paper into only one category. Also, I’ve included work that has been published in 2013 as well as work that has been completed this year and might appear in 2014 or later. So you can think of this list as representing roughly two years’ work. Political […] The post 2013 appeared first on Statistical Modeling, Causal Inference, and Social Science.

## Teaching linear models

January 2, 2014

I teach several courses every year and the most difficult to pull off is FORE224/STAT202: regression modeling. The academic promotion application form in my university includes a section on one’s ‘teaching philosophy’. I struggle with that part because I suspect I lack anything as grandiose as a philosophy when teaching: as most university lecturers I […]

## Generalized linear models for predicting rates

January 1, 2014

I often need to build a predictive model that estimates rates. The example of our age is ad click-through rates (how often a viewer clicks on an ad, estimated as a function of the features of the ad and the viewer). Another timely example is estimating default rates of mortgages or credit cards. You […] Related posts: What does a generalized linear model do? The equivalence of logistic regression…

## “Though They May Be Unaware, Newlyweds Implicitly Know Whether Their Marriage Will Be Satisfying”

January 1, 2014

Etienne LeBel writes: You’ve probably already seen it, but I thought you could have a lot of fun with this one!! The article, with the admirably clear title given above, is by James McNulty, Michael Olson, Andrea Meltzer, Matthew Shaffer, and begins as follows: For decades, social psychological theories have posited that the automatic processes […] The post “Though They May Be Unaware, Newlyweds Implicitly Know Whether Their Marriage Will Be…

## No on Yes/No decisions

December 31, 2013

Just to elaborate on our post from last month (“I’m negative on the expression ‘false positives’”), here’s a recent exchange we had regarding the relevance of yes/no decisions in summarizing statistical inferences about scientific questions. Shravan wrote: Isn’t it true that I am already done if P(theta>0) is much larger than P(theta […] The post No on Yes/No decisions appeared first on Statistical Modeling, Causal Inference, and Social Science.

## Jeff Leek’s non-comprehensive list of awesome things other people did in 2013

December 31, 2013

Jeff Leek, biostats professor at Johns Hopkins and instructor of the Coursera Data Analysis course, recently posted on Simply Statistics this list of awesome things other people accomplished in 2013 in genomics, statistics, and data science. At risk of s...

## Blog recap of 2013

December 31, 2013

Posts by page views: Interview with a forced convert to R from Matlab; A first step towards R from spreadsheets; Plot ranges of data in R; A statistical review of ‘Thinking, Fast and Slow’ by Daniel Kahneman; The 3 dots construct in R; Translating between R and SQL: the basics; An R debugging example; R […] The post Blog recap of 2013 appeared first on Burns Statistics.

## An Animation of the Construction of a Confidence Interval

December 31, 2013

I’m playing blog ping-pong with John Kruschke’s Doing Bayesian Data Analysis blog as he was partly inspired by my silly post on Bayesian mascots when writing a nice piece on Icons for the essence of Bayesian and frequentist data analysis. That pi...

## MCMSki IV, Jan. 6-8, 2014, Chamonix (news #16)

December 30, 2013

I am now in Chamonix, a week before MCMSki IV. The town is packed with tourists from all over Europe, English being the dominant language. There is not much snow so far, even though some runs reach town. (I did ski in the nearby Les Houches today and the red runs were either icy or […]