Michael Porter as new pincushion

August 20, 2016

Some great comments on this post about TED talk visionary Michael Porter. Most rewarding was this from Howard Edwards: New Zealand seems to score well on his index so perhaps I shouldn’t complain, but Michael Porter was well known in this part of the world 25 years ago when our government commissioned him to write […] The post Michael Porter as new pincushion appeared first on Statistical Modeling, Causal Inference,…

vtreat 0.5.27 released on CRAN

August 19, 2016

Win-Vector LLC, Nina Zumel and I are pleased to announce that ‘vtreat’ version 0.5.27 has been released on CRAN. vtreat is a data.frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. (from the package documentation) Very roughly vtreat accepts an arbitrary “from the wild” data frame (with different column types, … Continue reading vtreat 0.5.27 released on CRAN

Things that sound good but aren’t quite right: Art and research edition

August 19, 2016

There are a lot of things you can say that sound very sensible but, upon reflection, are missing something. For example, consider this blog comment from Chris G: Years ago I heard someone suggest these three questions for assessing a work of art: 1. What was the artist attempting to do? 2. Were they successful? […] The post Things that sound good but aren’t quite right: Art and research edition…

CFP: AusDM 2016 paper submission extended to 2 Sept

August 19, 2016

14th Australasian Data Mining Conference (AusDM 2016) Canberra, Australia, 6-8 December 2016 URL: http://ausdm16.ausdm.org/ Join us on LinkedIn: http://www.linkedin.com/groups/AusDM-4907891 The Australasian Data Mining Conference has established itself as the premier Australasian meeting for both practitioners and researchers in data mining. … Continue reading →

Trading strategy: Making the most of the out of sample data

August 19, 2016

When testing trading strategies, a common approach is to divide the initial data set into in-sample data, the part of the data used to calibrate the model, and out-of-sample data, the part of the data used to validate the calibration and ensure that the performance created in sample will be reflected in the […]
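
A minimal Python sketch of the in-sample/out-of-sample split described in this excerpt. The function name `split_in_out_of_sample`, the price series, and the 7/3 split are assumptions made for illustration and do not come from the original post. The split is chronological, since shuffling time-ordered data would leak future information into calibration.

```python
def split_in_out_of_sample(observations, n_in_sample):
    """Split a time-ordered series into calibration and validation parts.

    The first n_in_sample observations form the in-sample set; everything
    after them forms the out-of-sample set. Order is preserved.
    """
    if not 0 < n_in_sample < len(observations):
        raise ValueError("n_in_sample must leave both parts non-empty")
    return observations[:n_in_sample], observations[n_in_sample:]


# Hypothetical daily closing prices, oldest first.
prices = [100.0, 101.5, 99.8, 102.3, 103.1, 104.0, 102.7, 105.2, 106.0, 107.4]

in_sample, out_of_sample = split_in_out_of_sample(prices, n_in_sample=7)
print(len(in_sample), len(out_of_sample))
```

A strategy would be calibrated on `in_sample` only, then evaluated once on `out_of_sample`.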

GMO labeling is good science

August 18, 2016

A GMO labeling law has arrived in the US, albeit one that has no teeth (link). For those who don't want to click on the link, the law was passed in haste to pre-empt a more stringent Vermont law. The federal law defines GMO narrowly, businesses do not need to put word labels on packages (they can, for example, provide an 800-number), and violators will not be punished. One of…

My criticism of R numeric summary

August 18, 2016

My criticism of R’s numeric summary() method is: it is unfaithful to numeric arguments (due to bad default behavior) and frankly it should be considered unreliable. It is likely the way it is for historic and compatibility reasons, but in my opinion it does not currently represent a desirable set of tradeoffs. summary() likely represents … Continue reading My criticism of R numeric summary

An ethnographic study of the “open evidential culture” of research psychology

August 18, 2016

Claude Fischer points me to this paper by David Peterson, “The Baby Factory: Difficult Research Objects, Disciplinary Standards, and the Production of Statistical Significance,” which begins: Science studies scholars have shown that the management of natural complexity in lab settings is accomplished through a mixture of technological standardization and tacit knowledge by lab workers. Yet […] The post An ethnographic study of the “open evidential culture” of research psychology appeared…

"Forecasting with R" short course in Eindhoven

August 18, 2016

I will be giving my 3-day short-course/workshop on “Forecasting with R” in Eindhoven (Netherlands) from 19-21 October. Details at https://www.win.tue.nl/~adriemel/shortcourse.html Register here

Stan Course up North (Anchorage, Alaska) 23–24 Aug 2016

August 17, 2016

Daniel Lee’s heading up to Anchorage, Alaska to teach a two-day Stan course at the Alaska chapter of the American Statistical Association (ASA) meeting. Here’s the rundown: Information and Free Registration. I hear Alaska’s beautiful in the summer: 16-hour days in August and high temps of 17 degrees Celsius. Plus Stan! More Upcoming […] The post Stan Course up North (Anchorage, Alaska) 23–24 Aug 2016 appeared first on…

Two Ideas for a Better Visualization Web

August 17, 2016

There is a reasonable amount of information about visualization available on the web. There are still huge gaps, though, especially when it comes to bridging the gap between academic research and the rest of the world. Here are two ideas: one simple, one rather involved. Ben Shneiderman has recently been talking to a number of […]

On the Evils of Hodrick-Prescott Detrending

August 17, 2016

[If you're reading this in email, remember to click through on the title to get the math to render.] Jim Hamilton has a very cool new paper, "Why You Should Never Use the Hodrick-Prescott (HP) Filter". Of course we've known of the pitfalls of HP ever si...

What’s gonna happen in November?

August 17, 2016

Nadia Hassan writes: 2016 may be strange with Trump. Do you have any thoughts on how people might go about modeling a strange election? When I asked you about predictability and updating election forecasts, you stated that models that rely on polls at different points should be designed to allow for surprises. You have touted […] The post What’s gonna happen in November? appeared first on Statistical Modeling, Causal Inference,…

NBC has a problem with bar lengths

August 17, 2016

Seems like reader Conor H. has found a pattern. He alerted us to the problem with bar lengths in the daily medals chart on NBC, which I blogged about the other day. Through twitter (@andyn), I was sent the following,...

The smooth bootstrap method in SAS

August 17, 2016

Last week I showed how to use the simple bootstrap to randomly resample from the data to create B bootstrap samples, each containing N observations. The simple bootstrap is equivalent to sampling from the empirical cumulative distribution function (ECDF) of the data. An alternative bootstrap technique is called the smooth […] The post The smooth bootstrap method in SAS appeared first on The DO Loop.
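
A minimal Python sketch of the simple bootstrap described in this excerpt. The original post uses SAS; Python, the function name `simple_bootstrap`, and the sample data are assumptions made for illustration. Each of the B samples draws N observations with replacement from the data, which is the same as sampling from the ECDF.

```python
import random


def simple_bootstrap(data, num_samples, seed=None):
    """Return num_samples bootstrap samples of the data.

    Each sample has len(data) observations drawn uniformly with
    replacement, i.e. drawn from the empirical distribution of the data.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    n = len(data)
    return [rng.choices(data, k=n) for _ in range(num_samples)]


# Hypothetical data: N = 5 observations, B = 1000 bootstrap samples.
data = [2.1, 3.4, 1.9, 5.0, 4.2]
samples = simple_bootstrap(data, num_samples=1000, seed=1)

# Bootstrap distribution of the sample mean.
means = [sum(s) / len(s) for s in samples]
print(min(means), max(means))
```

The spread of `means` gives a rough picture of the sampling variability of the mean. The smooth bootstrap mentioned in the excerpt is not shown here.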

National lottery

August 17, 2016

Yesterday, many British newspapers covered the news of the new Dementia Atlas, released by the Department of Health. As far as I can see, the atlas uses data from a variety of sources (including the Quality Outcomes Framework, QOF, scheme...

How schools that obsess about standardized tests ruin them as measures of success

August 16, 2016

Mark Palko and I wrote this article comparing the Success Academy chain of charter schools to Soviet-era factories: According to the tests that New York uses to evaluate schools, Success Academies ranks at the top of the state — the top 0.3 percent in math and the top 1.5 percent in English, according to the […] The post How schools that obsess about standardized tests ruin them as measures of…

Statistical thinking on my subway commute

August 16, 2016

So I recently moved and needed to find the optimal subway ride up to Columbia. I have been going back and forth between my two choices to collect some data to help make up my mind. Both routes require two train exchanges but only the first leg differs. In other words: Route 1: A -> B -> C; Route 2: X -> B -> C. Here, the "nodes"…

The Win-Vector parallel computing in R series

August 16, 2016

With our recent publication of “Can you nest parallel operations in R?” we now have a nice series of “how to speed up statistical computations in R” that moves from application, to larger/cloud application, and then to details. For your convenience here they are in order: A gentle introduction to parallel computing in R Running … Continue reading The Win-Vector parallel computing in R series

Calorie labeling reduces obesity Obesity increased more slowly in California, Seattle, Portland (Oregon), and NYC, compared to some other places in the west coast and northeast that didn’t have calorie labeling

August 16, 2016

Ted Kyle writes: I wonder if you might have some perspective to offer on this analysis by Partha Deb and Carmen Vargas regarding restaurant calorie counts. [Thin columnist] Cass Sunstein says it proves “that calorie labels have had a large and beneficial effect on those who most need them.” I wonder about the impact of […] The post Calorie labeling reduces obesity Obesity increased more slowly in California, Seattle, Portland…

Probably the most useful R function I’ve ever written

August 15, 2016

The function in question is scriptSearch. I’m not much for superlatives — “most” and “best” imply one dimension, but we live in a multi-dimensional world. I’m making an exception. The statistic I have in mind for this use of “useful” is the waiting time between calls to the function divided by the human time saved […] The post Probably the most useful R function I’ve ever written appeared first on…

The history of characterizing groups of people by their averages

August 15, 2016

Andrea Panizza writes: I stumbled across this article on the End of Average. I didn’t know about Todd Rose, thus I had a look at his Wikipedia entry: Rose is a leading figure in the science of individual, an interdisciplinary field that draws upon new scientific and mathematical findings that demonstrate that it is not […] The post The history of characterizing groups of people by their averages appeared first…
