Jocular Disbelief

January 22, 2012

Posted by Yann LeCun - Jan 19, 2012 - Public Google+
Joke of the day (true story, circa 2004):
Radford Neal (giving a talk): I don't necessarily think that the Bayesian method is the best thing to do in all cases...
Geoff Hinton: S...

Read more »

R Regression Diagnostics Part 1

January 20, 2012

Linear regression can be a fast and powerful tool to model complex phenomena. However, it makes several assumptions about your data, and quickly breaks down when these assumptions, such as the assumption that a linear relationship exists between the ...

Read more »
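
As a rough illustration of what a violated linearity assumption looks like, here is a minimal sketch in Python (standard library only; the post itself works in R). The toy data and the helper name `fit_line` are assumptions made for this example and do not come from the post. A straight line is fit to data that is actually quadratic, and the residuals are inspected for a systematic pattern.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical toy data: y is quadratic in x, so a straight line is the wrong model.
xs = [k / 2 for k in range(-10, 11)]
ys = [x ** 2 for x in xs]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Residuals of a well-specified linear model scatter around zero with no pattern.
# Here they are positive at both ends and negative in the middle (a U shape).
for x, r in zip(xs[::5], residuals[::5]):
    print(f"x = {x:5.1f}   residual = {r:7.2f}")
```

A curved residual pattern like this one is a common hint that the relationship is not linear, even though the residuals still sum to roughly zero.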

Six Fundamental Methods to Generate a Random Variable

January 20, 2012

Introduction: To implement many numerical simulations you need a sophisticated source of instances of random variables. The question is: how do you generate them? The literature is full of algorithms requiring random samples as inputs or drivers (conditional random fields, Bayesian network models, particle filters and so on). The literature is also full of competing [...]

Read more »

Joint Techs Netcast: Enhancing Infrastructure Support for Data Intensive Science

January 20, 2012

The winter Joint Techs meeting is next week in Baton Rouge. I'm not going, but I plan on participating via a netcast to see what's going on. Jim Bottum, Clemson's CIO, is moderating an entire day devoted to the topic Enhancing Infrastructure Suppo...

Read more »

Some of my best friends are crackpots

January 20, 2012

I have a soft spot for crank science. Recently I visited Norumbega Tower, which is an enduring monument to the crackpot theories of Eben Norton Horsford, inventor of double-acting baking powder and faux history. But that's not what this art...

Read more »

Refereeing a journal article

January 20, 2012

I’ve written briefly on this before. For a more detailed discussion of what is involved, there is a series of excellent posts on Pat Thomson’s blog: Refereeing a journal article part 1: reading; Refereeing a journal article ...

Read more »

Scoping functions in R

January 19, 2012

I want to test embedding source code in the blog using the handy Gist tool provided by GitHub, and these two R functions are a good opportunity to try it. These functions allow for threshold testing within a vector in R...

Read more »

Analyzing Federal Government Bailout Recipients in R

January 19, 2012

I was searching for open data recently, and stumbled on Socrata. Socrata has a lot of interesting data sets, and while I was browsing around, I found a data set on federal bailout recipients. Here is the data set. However, data sets on Socrata are n...

Read more »

An Intro to Ensemble Learning in R

January 19, 2012

Introduction: This post incorporates parts of yesterday’s post about bagging. If you are unfamiliar with bagging, I suggest that you read it before continuing with this article. I would like to give a basic overview of ensemble learning. Ensemb...

Read more »

Internet surveys

January 18, 2012

I received the following email today: I am preparing a thesis … I need to conduct the widest possible poll, and it occurred to me that perhaps you could guide me toward an internet-based way in which this can be done easily. I have a ten-questi...

Read more »

Specialized Workshop, Feb 17-18, Seton Hall University

January 18, 2012

I'll be doing a day-and-a-half workshop at Seton Hall University, including a specialized analysis of data produced by researchers at SHU. Details can be found here. A list of future and past workshops can be found here.

Read more »

SOPA / PIPA

January 18, 2012

Graph of the Week is blacked out today (January 18, 2012) to join in the online protest to the SOPA and PIPA bills. Helpful links: http://blog.reddit.com/2012/01/technical-examination-of-sopa-and.html http://yro.slashdot.org/story/12/01/18/0834...

Read more »

Parameterizing a gamma distribution by mode and sd

January 18, 2012

When trying to fashion a gamma-shaped prior, I've found it more intuitive to start with the mode and standard deviation, instead of the mean and standard deviation as used in the book. The reason is that the gamma distribution is typically very skewed,...

Read more »
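
For readers who want to try the mode-and-sd idea from the teaser above, here is a minimal sketch in Python (standard library only). It assumes the shape/rate parameterization of the gamma distribution, where mode = (shape - 1) / rate for shape >= 1 and sd = sqrt(shape) / rate. Solving those two equations for the rate gives a quadratic whose positive root is used below. The function name `gamma_shape_rate_from_mode_sd` is made up for this example and is not taken from the post or the book.

```python
import math


def gamma_shape_rate_from_mode_sd(mode, sd):
    """Return (shape, rate) of a gamma distribution with the given mode and sd.

    Assumes the shape/rate parameterization:
        mode = (shape - 1) / rate   (valid for shape >= 1)
        sd   = sqrt(shape) / rate
    Substituting shape = 1 + mode * rate into the sd equation gives
        sd^2 * rate^2 - mode * rate - 1 = 0,
    and the positive root of that quadratic is the rate.
    """
    if mode < 0 or sd <= 0:
        raise ValueError("mode must be >= 0 and sd must be > 0")
    rate = (mode + math.sqrt(mode ** 2 + 4 * sd ** 2)) / (2 * sd ** 2)
    shape = 1 + mode * rate
    return shape, rate


# Example: a broad, right-skewed prior with its peak at 10.
shape, rate = gamma_shape_rate_from_mode_sd(mode=10.0, sd=100.0)
print("shape =", shape, "rate =", rate)

# Round-trip check: recover the mode and sd from the computed parameters.
print("mode  =", (shape - 1) / rate)
print("sd    =", math.sqrt(shape) / rate)
```

Because the shape always comes out at 1 or more, the mode formula stays valid for any non-negative mode and positive sd.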

Improve Predictive Performance in R with Bagging

January 18, 2012

Bagging, aka bootstrap aggregation, is a relatively simple way to increase the power of a predictive statistical model by taking multiple random samples (with replacement) from your training data set, and using each of these samples to construct a sepa...

Read more »
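
As a rough illustration of the bagging idea in the teaser above, here is a minimal sketch in Python (standard library only; the post itself works in R). The straight-line base model, the toy data, and the names `fit_line`, `bagged_fit`, and `bagged_predict` are assumptions made for this example, not taken from the post. Each model is fit to a bootstrap sample drawn with replacement, and the predictions of all models are averaged.

```python
import random
from statistics import mean


def fit_line(points):
    """Least-squares fit of y = a + b*x to (x, y) pairs; returns (a, b)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample (all x identical): fall back to a flat line.
        return y_bar, 0.0
    b = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
    return y_bar - b * x_bar, b


def bagged_fit(points, n_models=50, seed=0):
    """Fit one line per bootstrap sample of the training data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the data, drawn with replacement.
        sample = rng.choices(points, k=len(points))
        models.append(fit_line(sample))
    return models


def bagged_predict(models, x):
    """Aggregate by averaging the predictions of the individual models."""
    return mean(a + b * x for a, b in models)


# Hypothetical toy training set: y = 2x + 1 plus Gaussian noise.
data_rng = random.Random(1)
train = [(x, 2.0 * x + 1.0 + data_rng.gauss(0, 1)) for x in range(20)]

models = bagged_fit(train)
print("bagged prediction at x = 10:", bagged_predict(models, 10))
```

Averaging is the usual aggregation step for numeric predictions; for classification, a majority vote across the models plays the same role.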

