scikit-learn’s EuroScipy 2011 coding sprint — day two

August 24, 2011

Today's coding sprint was a bit more crowded, with some notable SciPy hackers such as Ralf Gommers, Stéfan van der Walt, David Cournapeau, and Fernando Perez from IPython joining in. Here is what got done: - We merged Jake's new BallTree code. This is a pur...

Crowd sourcing forecasts

August 24, 2011

Forecasting Ace is looking for participants to develop improved methods for predicting future events and outcomes. Their goal is to develop methods for aggregating many individual judgments in a manner that yields more accurate predictions than any one...

scikit-learn EuroScipy 2011 coding sprint — day one

August 23, 2011

As a warm-up for the upcoming EuroScipy conference, some of the scikit-learn developers decided to gather and work together for a couple of days. Today was the first day and there was only a handful of us, as the real kickoff is expected tomorrow. Som...

Common Table Expressions

August 23, 2011

It's been a while since I posted. My new role at TripAdvisor has been keeping me pretty busy! My first post after a long absence is about a feature of SQL that I have recently fallen in love with. Usually, I leave it to Gordon to write about SQL since ...

useR! Conference 2011 highlights

August 20, 2011

I was at the useR! Conference at The University of Warwick in Coventry, UK, last week. My goal in going was to learn the latest things regarding (simple) dynamic graphics, (simple) web-based apps, parallel computing, and memory management (dealing with big data sets). I got just what I was hoping for and more. There are [...]

Benchmark Regression Procedures using OLS Regression

August 18, 2011

Rick Wicklin discussed on his blog the performance of solving a linear system using the SOLVE() and INV() functions from IML. Since regression analysis is an integral part of SAS applications and there are many SAS procedures in SAS/STAT that are c...

Another application of R getting press

August 18, 2011

Prof. Atul Butte of Stanford University and colleagues just published two articles in Science Translational Medicine, which got a fair amount of press. In fact, I heard about the work on the radio on my commute to work. The research involves developing a computational method which can look at drug-disease interactions based on the NCBI GEO […]

The stupidest R code ever

August 17, 2011

Let’s start this blog off right, with the stupidest R mistake I’ve ever made (I think). In the R package that I write, R/qtl, one of the main file formats is a comma-delimited file, where the blank cells in the second row are important, as they distinguish the initial phenotype columns from the genetic marker [...]

The fun Package: Use R for Fun!

August 16, 2011

A couple of days ago we released a package named fun to CRAN, but I did not dare to send an announcement to r-packages@r-project.org as usual. This package is a collection of some classical computer games (e.g. the Mine sweeper and Five in a row) as we...

Learn Machine Learning at Stanford for free

August 16, 2011

Andrew Ng’s machine learning course at Stanford is being offered free to anyone online in the (northern) fall of 2011. I’ve seen some of the notes from this course and it looks to be an excellent broad introduction to machine learning and d...

Beware of junk journals and publishers

August 12, 2011

Today I received the following email: Dear Professor, Antarctica Journal of Mathematics Archimedes Journal of Mathematics Bessel Journal of Mathematics Cayley Journal of Mathematics Diophantus Journal of Mathematics We are charging only $3 per page,...

Markov chains don’t converge

August 10, 2011

I often hear people say they’re using a burn-in period in MCMC to run a Markov chain until it converges. But Markov chains don’t converge, at least not the Markov chains that are useful in MCMC. These Markov chains wander around forever exploring the domain they’re sampling from. Any point that makes a “bad” [...]
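
To make the point in the excerpt concrete, here is a minimal sketch that is not taken from the original post. It assumes a random-walk Metropolis sampler targeting a standard normal; the function name `metropolis_normal`, the step size, the deliberately poor starting point, and the burn-in length are all illustrative choices. The averages computed from the chain settle down, while the chain's states keep moving long after burn-in.

```python
import math
import random


def metropolis_normal(n_steps, step_size=1.0, start=10.0, seed=0):
    """Random-walk Metropolis chain targeting a standard normal (illustrative)."""
    rng = random.Random(seed)
    x = start
    chain = []
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step_size, step_size)
        # Log of the density ratio p(proposal) / p(x) for N(0, 1).
        log_ratio = 0.5 * (x * x - proposal * proposal)
        # Accept with probability min(1, ratio); otherwise stay at x.
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        chain.append(x)
    return chain


chain = metropolis_normal(50_000)
kept = chain[1_000:]  # drop the burn-in, which only forgets the bad start

# The average over the kept samples approaches the target mean of 0 ...
mean = sum(kept) / len(kept)

# ... but the states themselves never settle: the last 1000 still wander.
late = kept[-1_000:]
late_range = max(late) - min(late)

print(f"mean of kept samples: {mean:.3f}")
print(f"range of the last 1000 states: {late_range:.3f}")
```

In this sketch the burn-in does not wait for the chain to stop moving; it only discards the early states that reflect the arbitrary starting value.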

Rolling Window Regression of Time Series

August 10, 2011

We compare various implementations of rolling regression in SAS and show that the fastest implementation is over 3200X faster than the traditional BY-processing approach. More often than not, we encounter a problem where an OLS over a...

The single big jump principle

August 9, 2011

Suppose you’re playing a game where you take 10 steps of a random size. Here are two variations on the game. Which will give you a better chance of ending up far from where you started? You take your steps one at a time, starting each new step from where the last one took you. You return [...]

Kernel Smoothing with Break(s) in SAS

August 9, 2011

Here is a handy little SAS macro for anyone who wants to do kernel smoothing. In addition, it handles variables that exhibit breaks (1 or 2 at most), a case well known to econometricians who run regression discontinuity analyses. Let us first recall the principle of kernel smoothing (or “lissage [...]

Programmers Should Know R

August 6, 2011

Programmers should definitely know how to use R. I don’t mean they should switch from their current language to R, but they should think of R as a handy tool during development. Again and again I find myself working with Java code like the following. public class SomeBigProject1 { public static double logStirlingApproximation(final int n) { [...]

Outlier Detection with DPM Slides from JSM 2011

August 5, 2011

Here are the 14 slides I used during my talk at the Joint Statistical Meetings 2011: shotwell-jsm-2011.pdf. I’m trying hard to minimize the text in my presentation slides. But this usually requires that I practice more. Hence, you will know which talks I have practiced thoroughly by the amount of text in the slides. [...]

Faster Gibbs sampling MCMC from within R

July 31, 2011

Introduction: This post follows on from the previous post on Gibbs sampling in various languages. In that post a simple Gibbs sampler was implemented in various languages, and speeds were compared. It was seen that R is very slow for iterative simulation algorithms characteristic of MCMC methods such as the Gibbs sampler. Statically typed languages [...]

RStudio 0.94.92 visited

July 30, 2011

I just updated my RStudio version to the latest, v.0.94.92 (will this asymptotically approach 1, or actually get to 1?). It was nice to see the number of improvements the development team has implemented, based I’m sure on community feedback. The team has, in my experience, been extraordinarily responsive to user feedback, and I’m sure […]

A ggplot trick to plot different plot types in facets

July 29, 2011

At the DC useR meetup last week, Marck Vaisman (@wahalulu) showed me a neat trick he’d learned to allow different facets in a faceted ggplot graph to have different plot types. The basis for this trick is this blog post in the Learn-R blog. Marck was trying to plot different statistics on our Meetup group’s […]

Word Cloud in R

July 27, 2011

A word cloud (or tag cloud) can be a handy tool when you need to highlight the most commonly cited words in a text using a quick visualization. Of course, you can use one of the several online services, such as wordle or tagxedo,...

Math superhero in training

July 27, 2011

Steve Yegge has a new project. He’s in training to become a math superhero. Or at least a sidekick. He said that math/stat folks are superheroes and he wants to join them. In his presentation at OSCON Data 2011 on Monday, Yegge said that all the hard problems require math and statistics. So he’s quitting his job [...]

Recommended survey papers

Survey articles are particularly helpful in getting a foothold in a new research area, or in looking for important papers that you may have overlooked. Whatever area of research you are in, look out for survey papers and journals dedicated to publishin...
