How to Winsorize data in SAS

July 15, 2015

Recently a SAS customer asked how to Winsorize data in SAS. Winsorization is best known as a way to construct robust univariate statistics. The Winsorized mean is a robust estimate of location. The Winsorized mean is similar to the trimmed mean, and both are described in the documentation for PROC […] The post How to Winsorize data in SAS appeared first on The DO Loop.
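As a rough illustration of the idea in the teaser above (not the SAS code from the linked post), the following Python sketch Winsorizes a small sample by replacing the k smallest and k largest values with their nearest retained neighbors, then compares the Winsorized mean with the trimmed mean. The function names, the parameter `k`, and the sample data are assumptions made for this example.

```python
def winsorize(values, k):
    """Return a sorted copy in which the k smallest values are replaced by the
    (k+1)-th smallest and the k largest by the (k+1)-th largest."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must satisfy 0 <= 2*k < len(values)")
    xs = sorted(values)
    low, high = xs[k], xs[len(xs) - k - 1]
    # Clamp every value into the interval [low, high].
    return [min(max(x, low), high) for x in xs]


def winsorized_mean(values, k):
    """Mean of the Winsorized sample (extremes are replaced, not dropped)."""
    w = winsorize(values, k)
    return sum(w) / len(w)


def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest values."""
    xs = sorted(values)
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)


# Hypothetical sample with one gross outlier.
data = [1, 4, 5, 6, 7, 8, 9, 10, 12, 100]
print("ordinary mean  :", sum(data) / len(data))
print("Winsorized mean:", winsorized_mean(data, 1))
print("trimmed mean   :", trimmed_mean(data, 1))
```

Both robust estimates stay close to the bulk of the data, while the ordinary mean is pulled toward the outlier. The Winsorized version keeps the sample size fixed, whereas the trimmed version discards observations.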

Talk: How to Visualize Data

July 15, 2015

Last week, I gave one of the visualization primer talks at BioVis in Dublin. My goal was to show people some examples, but also criticize the rather poor visualization culture in bioinformatics and challenge people to do better. Here is a write-up of that talk. Seán O’Donoghue introduced me by calling me “infamous” for speaking … Continue reading Talk: How to Visualize Data

Spot the power howler: α = β?

July 15, 2015

Spot the fallacy! The power of a test is the probability of correctly rejecting the null hypothesis. Write it as 1 – β. So, the probability of incorrectly rejecting the null hypothesis is β. But the probability of incorrectly rejecting the null is α (the type 1 error probability). So α = β. I’ve actually […]

Leave the Pima Indians alone!

July 14, 2015

“…our findings shall lead us to be critical of certain current practices. Specifically, most papers seem content with comparing some new algorithm with Gibbs sampling, on a few small datasets, such as the well-known Pima Indians diabetes dataset (8 covariates). But we shall see that, for such datasets, approaches that are even more basic than […]

Easy Bayesian Bootstrap in R

July 14, 2015

A while back I wrote about how the classical non-parametric bootstrap can be seen as a special case of the Bayesian bootstrap. Well, one difference between the two methods is that, while it is straightforward to roll a classical bootstrap in R, there...
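To make the contrast in the teaser above concrete, here is a minimal Python sketch (the linked post itself works in R, and this is not its code). It assumes the usual formulation of the two methods: the classical bootstrap resamples the data with replacement, while the Bayesian bootstrap keeps every observation and draws Dirichlet(1, ..., 1) weights for them. The function names, defaults, and sample data are assumptions for illustration.

```python
import random


def classical_bootstrap_means(data, n_draws=2000, seed=0):
    """Resample the data with replacement and record the mean of each resample."""
    rng = random.Random(seed)
    n = len(data)
    return [sum(rng.choices(data, k=n)) / n for _ in range(n_draws)]


def bayesian_bootstrap_means(data, n_draws=2000, seed=0):
    """Draw Dirichlet(1, ..., 1) weights and record the weighted mean of each draw."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Normalised Exponential(1) variables are Dirichlet(1, ..., 1) distributed.
        g = [rng.expovariate(1.0) for _ in data]
        total = sum(g)
        draws.append(sum(gi * x for gi, x in zip(g, data)) / total)
    return draws


# Hypothetical sample.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
bb = bayesian_bootstrap_means(data)
cb = classical_bootstrap_means(data)
print("Bayesian bootstrap, average of draws :", sum(bb) / len(bb))
print("classical bootstrap, average of draws:", sum(cb) / len(cb))
```

The only structural difference is the weighting step: integer resampling counts in the classical version, continuous weights in the Bayesian one.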

Awesomest media request of the year

July 14, 2015

(Sent to all the American Politics faculty at Columbia, including me) RE: Donald Trump presidential candidacy Hi, Firstly, apologies for the group email but I wasn’t sure who would be best prized to answer this query as we’ve not had much luck so far. I am a Dubai-based reporter for **. Donald Trump recently announced […] The post Awesomest media request of the year appeared first on Statistical Modeling, Causal…

The 2015 Big Data Summit, 9-10 August 2015, collocated with ACM KDD 2015, Sydney

July 14, 2015

The 2015 Big Data Summit 9-10 August 2015 collocated with ACM KDD 2015, Sydney URL: http://2015.bigdatasummit.co/ We take this privileged opportunity to invite you to participate in the 2015 Big Data Summit: • Co-located with ACM KDD2015 • Plenary sessions … Continue reading →

Survey weighting and regression modeling

July 14, 2015

Yphtach Lelkes points us to a recent article on survey weighting by three economists, Gary Solon, Steven Haider, and Jeffrey Wooldridge, who write: We start by distinguishing two purposes of estimation: to estimate population descriptive statistics and to estimate causal effects. In the former type of research, weighting is called for when it is needed […] The post Survey weighting and regression modeling appeared first on Statistical Modeling, Causal Inference,…

ChainLadder 0.2.1 released

July 14, 2015

Over the weekend we released version 0.2.1 of the ChainLadder package for claims reserving on CRAN. New Features: New function PaidIncurredChain by Fabio Concina, based on the 2010 Merz & Wüthrich paper "Paid-incurred chain claims reserving method". Fun...

Flawed thinking about causes

July 13, 2015

One of the most misguided and dangerous ideas floated around by a group of Big Data enthusiasts is the notion that it is not important to understand why something happens, just because "we have a boatload of data". This is one of the central arguments in the bestseller Big Data, and it reached the mainstream much earlier when Chris Anderson, then chief editor of Wired, published his flamboyantly-titled op-ed proclaiming…

Will Millennials Ever Get Married?

July 13, 2015

At SciPy last week I gave a talk called "Will Millennials Ever Get Married? Survival Analysis and Marriage Data". I presented results from my analysis of data from the National Survey of Family Growth (NSFG). The slides I presented ar...

Don’t do the Wilcoxon

July 13, 2015

The Wilcoxon test is a nonparametric rank-based test for comparing two groups. It’s a cool idea because, if data are continuous and there is no possibility of a tie, the reference distribution depends only on the sample size. There are no nuisance parameters, and the distribution can be tabulated. From a Bayesian point of view, […] The post Don’t do the Wilcoxon appeared first on Statistical Modeling, Causal Inference, and…
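The claim above that the reference distribution depends only on the sample size can be checked by brute force. The Python sketch below assumes the rank-sum form of the two-sample test with no ties. It enumerates every way to assign ranks to the first group and tabulates the null distribution of the rank sum. The function names and example values are illustrative assumptions, not taken from the linked post.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def rank_sum_null_distribution(m, n):
    """Exact null distribution of the rank sum of a group of size m when it is
    pooled with a group of size n (no ties). Only m and n are needed."""
    ranks = range(1, m + n + 1)
    counts = Counter(sum(c) for c in combinations(ranks, m))
    total = sum(counts.values())
    return {w: Fraction(c, total) for w, c in sorted(counts.items())}


def rank_sum(x, y):
    """Observed rank sum of x within the pooled sample (assumes no ties)."""
    pooled = sorted(list(x) + list(y))
    return sum(pooled.index(v) + 1 for v in x)


# Hypothetical data: the table below depends on the group sizes (3 and 4) only.
x = [1.2, 3.4, 0.7]
y = [2.5, 4.1, 5.0, 3.9]
dist = rank_sum_null_distribution(len(x), len(y))
w = rank_sum(x, y)
p_lower = sum(p for s, p in dist.items() if s <= w)
print("observed rank sum:", w)
print("P(W <= observed) under the null:", p_lower)
```

Any other continuous data set with group sizes 3 and 4 would use exactly the same table, which is why the distribution can be tabulated in advance.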

On deck this week

July 13, 2015

Mon: Don’t do the Wilcoxon Tues: Survey weighting and regression modeling Wed: Prior information, not prior belief Thurs: Draw your own graph! Fri: Measurement is part of design Sat: Annals of Spam Sun: “17 Baby Names You Didn’t Know Were ...

What Seasonally-Adjusted U.S. Economic Data Needs Most…

July 13, 2015

...is non-adjustment. This is not a minor issue: there's not even an unadjusted U.S. GDP! Seasonal adjustment is sometimes desirable, but sometimes not. Sometimes it's done poorly, sometimes it's better done with extra care and transparency b...

Compare the performance of algorithms in SAS

July 13, 2015

As my colleague Margaret Crevar recently wrote, it is useful to know how long SAS programs take to run. Margaret and others have written about how to use the SAS FULLSTIMER option to monitor the performance of the SAS system. In fact, SAS distributes a macro that enables you to […] The post Compare the performance of algorithms in SAS appeared first on The DO Loop.

“Physical Models of Living Systems”

July 12, 2015

Phil Nelson writes: I’d like to alert you that my new textbook, “Physical Models of Living Systems,” has just been published. Among other things, this book is my attempt to bring Bayesian inference to undergraduates in any science or engineering major, and the course I teach from it has been enthusiastically received. The book is […] The post “Physical Models of Living Systems” appeared first on Statistical Modeling, Causal Inference,…

Going to iHEA

July 11, 2015

iHEA's conference is kind of a big deal in health economics: it's usually very big, with lots of sessions and lots of people participating. I have been to a few, on both sides of the Atlantic, and they are usually very good. This year it's in Milan (I think ...

Inauthentic leadership? Development and validation of methods-based criticism

July 11, 2015

Thomas Basbøll writes: I need some help with a critique of a paper that is part of the apparently growing retraction scandal in leadership studies. Here’s Retraction Watch. The paper I want to look at is here: “Authentic Leadership: Development and Validation of a Theory-Based Measure” By F. O. Walumbwa, B. J. Avolio, W. L. […] The post Inauthentic leadership? Development and validation of methods-based criticism appeared first on Statistical…

The Mozilla Fellowship for Science

July 10, 2015

This looks like an interesting opportunity for grad students, postdocs, and early career researchers: We’re looking for researchers with a passion for open source and data sharing, already working to shift research practice to be more collaborative, iterative and open. Fellows will spend 10 months starting September 2015 as community catalysts at their institutions, mentoring

Open, Useful, Reusable

July 10, 2015

In OECD’s brand new publication ‘Government at a Glance 2015’ we can find a new indicator: The OUR Index. It stands for ‘Open, Useful, Reusable Government Data’. ‘The new OECD OURdata Index reveals that many countries have made progress in making public data more available and accessible, but large variations remain, not least with respect to the quality … Continue reading Open, Useful, Reusable

Economists betting on replication

July 10, 2015

Mark Patterson writes: A bunch of folks are collaborating on a project to replicate 18 experimental studies published in prominent Econ journals (mostly American Economic Review, a few Quarterly Journal of Economics). This is already pretty exciting, but the really cool bit is they’re opening a market (with real money) to predict which studies will […] The post Economists betting on replication appeared first on Statistical Modeling, Causal Inference, and…

‘Student’, on Kurtosis

July 9, 2015

W. S. Gosset (Student) provided this useful aid to help us remember the difference between platykurtic and leptokurtic distributions: ('Student', 1927. Errors of routine analysis. Biometrika, 19, 151-164. See p. 160.) Here, β2 is the fourth standardized...
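For readers who want to compute the quantity mentioned in the teaser, the Python sketch below assumes Pearson's standard definition, β2 = m4 / m2², where m2 and m4 are the second and fourth central moments. It labels a sample as platykurtic (β2 < 3) or leptokurtic (β2 > 3). The function names, the tolerance, and the sample data are assumptions for illustration and do not come from the linked post.

```python
def beta2(values):
    """Moment-based sample estimate of Pearson's beta_2 = m4 / m2**2."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2


def kurtosis_label(values, tol=1e-9):
    """Compare beta_2 with 3, the value for a normal distribution."""
    b2 = beta2(values)
    if b2 < 3 - tol:
        return "platykurtic"
    if b2 > 3 + tol:
        return "leptokurtic"
    return "mesokurtic"


# Hypothetical samples: a flat one and a peaked one with heavy tails.
flat = [-2, -1, 0, 1, 2]
peaked = [-3, 0, 0, 0, 0, 0, 0, 0, 0, 3]
print(beta2(flat), kurtosis_label(flat))
print(beta2(peaked), kurtosis_label(peaked))
```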
