Posts Tagged ‘R’

Next Kölner R User Meeting: Friday, 6 March 2015

March 3, 2015

The next Cologne R user group meeting is scheduled for this Friday, 6 March 2015, and we have an exciting agenda with two talks, followed by networking drinks: Using R in Excel via R.NET (Günter Faes and Matthias Spix). MS Office and Excel are the 'de-facto'...


Playing around with #rstats twitter data

February 28, 2015

As a bit of weekend fun, I decided to briefly look into the #rstats twitter data that Stephen Turner collected and made available (thanks!). Essentially, this data set contains some basic information about over 100,000 tweets that contain the hashtag…


Career NBA: The Road Least Traveled

February 27, 2015

The bell rings: time to go to practice. Jarnell Stokes heads over to the gym, changes, and starts warming up with his teammates. It's his junior year in high school. The Memphis, Tennessee native has a lot on his mind; soon he'll have to mak...


Does Balancing Classes Improve Classifier Performance?

February 27, 2015

It’s a folk theorem I sometimes hear from colleagues and clients: that you must balance the class prevalence before training a classifier. Certainly, I believe that classification tends to be easier when the classes are nearly balanced, especially when the class you are actually interested in is the rarer one. But I have always been …


Using and Abusing Data Visualization: Anscombe’s Quartet and Cheating Bonferroni

February 26, 2015

Anscombe’s quartet comprises four datasets that have nearly identical simple statistical properties, yet appear very different when graphed. Each dataset consists of eleven (x,y) points. They were constructed in 1973 by the statistician Francis Ansco...


R: How to Layout and Design an Infographic

February 26, 2015

As promised in my recent article, here's my tutorial on how to lay out and design an infographic in R. This article will serve as a template for more infographic designs that I plan to share in future posts. Hence, we will go through the following sect...


Announcing: Introduction to Data Science video course

February 25, 2015

Win-Vector LLC’s Nina Zumel and John Mount are proud to announce that their new data science video course, Introduction to Data Science, is now available on Udemy. We designed the course as an introduction to an advanced topic. The course description is: Use the R Programming Language to execute data science projects and become a data …


Minimal examples help

February 24, 2015

The other day I got stuck working with a huge data set using data.table in R. It took me a little while to realise that I had to produce a minimal reproducible example to actually understand why I got stuck in the first place. I know, this is the mantr...


Applied Nonparametric Econometrics

February 19, 2015

Recently, I received a copy of a new econometrics book, Applied Nonparametric Econometrics, by Daniel Henderson and Christopher Parmeter. The title is pretty self-explanatory and, as you'd expect with any book published by CUP, this is a high-quali...


amazing Gibbs sampler

February 18, 2015

While playing with Peter Rossi’s bayesm R package during Jean-Michel Marin's visit to Paris last week, we came up with the above Gibbs outcome. The setting is a Gaussian mixture model with three components in dimension 5, and the prior distributions are standard conjugate. In this case, with 500 observations and 5000 Gibbs […]


