Posts Tagged ‘classifier quality’

Does Balancing Classes Improve Classifier Performance?

February 27, 2015

It’s a folk theorem I sometimes hear from colleagues and clients: that you must balance the class prevalence before training a classifier. Certainly, I believe that classification tends to be easier when the classes are nearly balanced, especially when the class you are actually interested in is the rarer one. But I have always been …

The Geometry of Classifiers

December 19, 2014

As John mentioned in his last post, we have been quite interested in the recent study by Fernandez-Delgado et al., “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?” (the “DWN study” for short), which evaluated 179 popular implementations of common classification algorithms over 120 or so data sets, mostly from the UCI …

Can a classifier that never says “yes” be useful?

March 8, 2014

Many data science projects and presentations are needlessly derailed because shared, business-relevant quantitative expectations were not set early on (for some advice, see “Setting expectations in data science projects”). One of the most common issues is the layman’s expectation of “perfect prediction” from classification projects. It is important to set expectations correctly so […]

More on ROC/AUC

January 18, 2013

A bit more on the ROC/AUC. The receiver operating characteristic curve (or ROC) is one of the standard methods to evaluate a scoring system. Nina Zumel has described its application, but we would like to emphasize some additional details. In my opinion, while the ROC is a useful tool, the “area under the curve” [...]
