Posts Tagged ‘Data Analysis’

Exploratory Data Analysis of Ozone Pollution in New York City – Descriptive Statistics

May 19, 2013

Introduction: This is the first of a series of posts on exploratory data analysis (EDA). This post will calculate the common summary statistics of a univariate continuous data set – the data on ozone pollution in New York City that is part of the built-in “airquality” data set in R. This is a particularly good data set […]

Use regression for a univariate analysis? Yes!

May 13, 2013

I've conducted a lot of univariate analyses in SAS, yet I'm always surprised when the best way to carry out the analysis uses a SAS regression procedure. I always think, "This is a univariate analysis! Why am I using a regression procedure? Doesn't a regression require at least two variables?" [...]

“My” chromosome 8p inversion

May 8, 2013

There was a lot of discussion on Twitter yesterday about Graham Coop’s paper with Peter Ralph (or vice versa), “The geography of recent genetic ancestry across Europe”, particularly regarding the FAQ they’d created. I was eager to take a look, and, it’s slightly embarrassing to say, I first did a search to see if they’d […]

A three-panel visualization of a distribution

May 8, 2013

At a recent conference, I talked with a SAS customer who told me that he was using an R package to create a three-panel visualization of a distribution. Unfortunately, he couldn't remember the name of the package, and he has not returned my e-mails, so the purpose of today's article [...]

Compute confidence intervals for percentiles in SAS

May 6, 2013

PROC UNIVARIATE has provided confidence intervals for standard percentiles (quartiles) for eons. However, in SAS 9.3M2 (featuring the 12.1 analytical procedures) you can use a new feature in PROC UNIVARIATE to compute confidence intervals for a specified list of percentiles. To be clear, percentiles and quantiles are essentially the same [...]

Quantile regression: Better than connecting the sample quantiles of binned data

April 17, 2013

I often see variations of the following question posted on statistical discussion forums: I want to bin the X variable into a small number of values. For each bin, I want to draw the quartiles of the Y variable for that bin. Then I want to connect the corresponding quartile [...]

Data science is statistics

April 5, 2013

When physicists do mathematics, they don’t say they’re doing “number science”. They’re doing math. If you’re analyzing data, you’re doing statistics. You can call it data science or informatics or analytics or whatever, but it’s still statistics. If you say that one kind of data analysis is statistics and another kind is not, you’re not […]

The difference of density estimates: When does it make sense?

April 3, 2013

I was recently asked how to compute the difference between two density estimates in SAS. The person who asked the question sent me a link to a paper from The Review of Economics and Statistics that contains several examples of this technique (for example, see Figure 3 on p. 16 [...]

How do Dew and Fog Form? Nature at Work with Temperature, Vapour Pressure, and Partial Pressure

April 1, 2013

In the early morning, especially here in Canada, I often see dew – water droplets formed by the condensation of water vapour on outside surfaces, like windows, car roofs, and leaves of trees. I also sometimes see fog – water droplets or ice crystals that are suspended in air and often blocking visibility at great […]

Checking for Normality with Quantile Ranges and the Standard Deviation

March 31, 2013

Introduction: I was reading Michael Trosset’s “An Introduction to Statistical Inference and Its Applications with R”, and I learned a basic but interesting fact about the normal distribution’s interquartile range and standard deviation that I had not learned before. This turns out to be a good way to check for normality in a data set. […]
