Save descriptive statistics for multiple variables in a SAS data set

March 28, 2016

Descriptive univariate statistics are the foundation of data analysis. Before you create a statistical model for new data, you should examine descriptive univariate statistics such as the mean, standard deviation, quantiles, and the number of nonmissing observations. In SAS, there is an easy way to create a data set that […] The post Save descriptive statistics for multiple variables in a SAS data set appeared first on The DO Loop.

Sample quantiles 20 years later

Almost exactly 20 years ago I wrote a paper with Yanan Fan on how sample quantiles are computed in statistical software. It was cited 43 times in the first 10 years, and 457 times in the next 10 years, making it my third paper to receive 500+ citations...

AI’s not AI

March 27, 2016

There has been a lot of commentary recently on issues relating to an experimental chat bot that Microsoft has (or had) launched named (after, perhaps, a river in Scotland) Tay. After a brief existence online, the bot was removed due...

“Rbitrary Standards”

March 27, 2016

Allen and Michael pointed us on the Stan list to these amusing documents by Oliver Keyes: Rbitrary Standards: “This is an alternate FAQ for R. Specifically, it’s an FAQ that tries to answer all the questions about R’s weird standards, format...

Who was Shirley Almon?

March 26, 2016

How often have you said to yourself, "I wonder what happened to Jane X"? (Substitute any person's name you wish.) Personally, I've noticed a positive correlation between my age and the frequency of occurrence of this event, but we all know that correlat...

A. Spanos: Talking back to the critics using error statistics

March 26, 2016

With all the recent attention given to kvetching about significance tests, it’s an apt time to reblog Aris Spanos’ overview of the error statistician talking back to the critics [1]. A related paper for your Saturday night reading is Mayo and Spanos (2011).[2] It mixes the error statistical philosophy of science with its philosophy of statistics, introduces severity, […]

He does mathematical modeling and is asking for career advice: wants to move from biology to social science

March 26, 2016

Rick Desper writes: I face some tough career choices. I have a background in mathematical modeling (got my Ph.D. in math from Rutgers back in the late ’90s) and spent several years working in the field of bioinformatics/computational biology (its name varies from place to place). I’ve worked on problems in modeling cancer progression and […] The post He does mathematical modeling and is asking for career advice: wants to…

Choice Modeling with Features Defined by Consumers and Not Researchers

March 26, 2016

Choice modeling begins with a researcher "deciding on what attributes or levels fully describe the good or service." This is consistent with the early neural networks in which features were precoded outside of the learning model. That is, choice modeli...

Not So Standard Deviations Episode 12 – The New Bayesian vs. Frequentist

March 26, 2016

In this episode, Hilary and I discuss the new direction for the journal Biostatistics, the recent fracas over ggplot2 and base graphics in R, and whether collecting more data is always better than collecting less (fewer?) data. Also, Hilary and Roger r...

MIDAS Regression is Now in EViews

March 25, 2016

The acronym, "MIDAS", stands for several things. In the econometrics literature it refers to "Mixed-Data Sampling" regression analysis. The term was coined by Eric Ghysels a few years ago in relation to some of the novel work that he, his students, and...

Data-dependent prior as an approximation to hierarchical model

March 25, 2016

Andy Solow writes: I have a question about Bayesian statistics. Why is it wrong to use the same data to formulate the prior and to update it to the posterior? I am having a hard time coming up with – or finding in the literature – a formal reason. I asked him to elaborate and […] The post Data-dependent prior as an approximation to hierarchical model appeared first on Statistical…

High school rankings of top NCAA wrestlers

March 25, 2016

Last weekend was the 2016 NCAA Division I wrestling tournament. In collegiate wrestling there are ten weight classes. The top eight wrestlers in each weight class are awarded the title "All-American" to acknowledge that they are the best wrestlers in the country. I saw a blog post on the InterMat […] The post High school rankings of top NCAA wrestlers appeared first on The DO Loop.

Le Monde puzzle [#954]

March 24, 2016

A square Le Monde mathematical puzzle: Given a triplet (a,b,c) of integers, with a<b<c, it satisfies the S property when a+b, a+c, b+c, a+b+c are perfect squares such that a+c, b+c, and a+b+c are consecutive squares. For a given a, is it always possible to find a pair (b,c) such that (a,b,c) satisfies S? Can you […]
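
The S property as stated can be explored with a short search. The sketch below is not taken from the original post; it assumes the three consecutive squares are n², (n+1)², (n+2)² for a non-negative integer n (the order is forced, since a<b implies a+c<b+c). Subtracting then gives a = 2n+3, b = 4n+4, and c = n² - a, so only the condition that a+b is a perfect square remains to be checked. The function name `s_triplets` and the bound `max_n` are hypothetical names chosen for illustration.

```python
from math import isqrt


def is_square(k: int) -> bool:
    """Return True when k is a non-negative perfect square."""
    return k >= 0 and isqrt(k) ** 2 == k


def s_triplets(max_n: int) -> list[tuple[int, int, int]]:
    """List triplets (a, b, c) with the S property for 0 <= n <= max_n.

    Assumes a+c = n^2, b+c = (n+1)^2 and a+b+c = (n+2)^2, which gives
    a = 2n+3, b = 4n+4, c = n^2 - a by subtraction.
    """
    found = []
    for n in range(max_n + 1):
        a = 2 * n + 3        # (n+2)^2 - (n+1)^2
        b = 4 * n + 4        # (n+2)^2 - n^2
        c = n * n - a        # from a + c = n^2
        # The consecutive-square conditions hold by construction;
        # the ordering a < b < c and the square a + b still need checking.
        if a < b < c and is_square(a + b):
            found.append((a, b, c))
    return found


if __name__ == "__main__":
    for triplet in s_triplets(100):
        print(triplet)
```

Under these assumptions the smallest triplet found is (41, 80, 320), with sums 121, 361, 400, and 441.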

Radial Graphs for Time Series

March 24, 2016

On How to: Weather Radials, there was a nice visualisation of temperatures. Since I am too old fashioned for ggplot2, I wanted to reproduce a similar graph with the old plot style. Assume that daily temperature is in a vector X (e.g. temperature in Montréal, QC, in 2009). To get a radial plot, use

> n=length(X)
> theta=seq(0,1-1/n,length=n)*2*pi
> r=30+X
> plot(r*cos(pi/2-theta),r*sin(pi/2-theta),type="l",xlab="",ylab="",axes=FALSE)
> for(t in 1:n){
+ if(X[t]>0) CL=rgb(0,0,1,.4)
+…

Multilevel regression

March 24, 2016

Mike Hughes writes: I have been looking at your blog entries from about 8 years ago in which you comment on the number of groups that is appropriate in multilevel regression. I have a research problem in which I have 6 groups and would like to use multilevel regression. Here is the situation. I have […] The post Multilevel regression appeared first on Statistical Modeling, Causal Inference, and Social Science.

The future of biostatistics

March 24, 2016

Starting in January my colleague Dimitris Rizopoulos and I took over as co-editors of the journal Biostatistics. We are pretty fired up to try some new things with the journal and to make sure that the most important advances in statistical methodology...

Upcoming Win-Vector LLC appearances

March 23, 2016

Win-Vector LLC will be presenting on statistically validating models using R and data science at:
Strata+Hadoop World “R Day” Tutorial, 9:00am–5:00pm Tuesday, March 29, 2016, San Jose, California.
ODSC San Francisco Meetup, 6:30pm–9:00pm Thursday, March 31, 2016, San Francisco, California.
We will share code and examples. Registration required (and Strata is a paid conference). Please …

In defense of endless arguments

March 23, 2016

A couple months ago (that is, yesterday; remember our 2-month delay) some commenters expressed exhaustion and irritation at the Kahneman-Gigerenzer catfight, or more generally the endless debate between those who emphasize irrationality in human decision making and those who emphasize the adaptive and functional qualities of our shortcuts. I would like to respond to this […] The post In defense of endless arguments appeared first on Statistical Modeling, Causal Inference,…

Sitting still against the myth that sitting kills

March 23, 2016

The fad of standing while working may die hard but science is catching up to it. The idea that standing at work will make one healthier has always been a tough one to believe. It requires a series of premises: using a standing desk increases the amount of standing; standing longer improves one's health; the health improvement is measurable using a well-defined metric; the incremental standing is of sufficient amount…

Plotting overlapping prediction intervals

I often see figures with two sets of prediction intervals plotted on the same graph using different line types to distinguish them. The results are almost always unreadable. A better way to do this is to use semi-transparent shaded regions. Here is an ...

Nonparametric regression for binary response data in SAS

March 23, 2016

My previous blog post shows how to use PROC LOGISTIC and spline effects to predict the probability that an NBA player scores from various locations on a court. The LOGISTIC procedure fits parametric models, which means that the procedure estimates parameters for every explanatory effect in the model. Spline bases […] The post Nonparametric regression for binary response data in SAS appeared first on The DO Loop.

Will. Not. Rise. To. Bait.

March 23, 2016

Someone sends me an email, “I don’t know what to do with this so I thought I would send it to you,” with a link to a university press release about a recently published research paper, full of silly statistical errors and signifying nothing. I replied: Can’t you just ignore this? Why give it any […] The post Will. Not. Rise. To. Bait. appeared first on Statistical Modeling, Causal Inference,…

