Blog Archives

How not to reveal your MySQL DB login/password when sharing code on GitHub or BitBucket?

March 19, 2013

Solution: use your ~/.my.cnf

Inside your ~/.my.cnf file, define the connection parameters to your databases. For example, here I define two groups called local and toto:

[local]
user = root
password = ultra_secret
host = localhost

[toto]
user = capitaine_flamp...

Read more »
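As a rough illustration of the idea in the post above, here is a minimal Python sketch of how a script could look up one group of connection parameters instead of hard-coding them. It is not the code from the original post: the function name `read_group` and the in-memory `SAMPLE_CNF` string are assumptions made for the example, and only the [local] group shown in the excerpt is reproduced.

```python
import configparser

# Sample option-file contents, mirroring the [local] group from the excerpt.
# In real use this text would come from the file ~/.my.cnf instead.
SAMPLE_CNF = """
[local]
user = root
password = ultra_secret
host = localhost
"""


def read_group(cnf_text, group):
    """Return the connection parameters of one group as a plain dict.

    Hypothetical helper written for this example only.
    """
    # allow_no_value: MySQL option files may contain bare flags without "=".
    # interpolation=None: keep "%" characters in passwords untouched.
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(cnf_text)
    return dict(parser[group])


params = read_group(SAMPLE_CNF, "local")
print(params["user"], params["host"])
```

To read the real file, the same parser could be pointed at `os.path.expanduser("~/.my.cnf")` with `parser.read(...)`. The shared code then only names the group, and the credentials stay out of the repository.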

Large correlation in parallel

February 24, 2013

A small improvement to the bigcor function proposed on Rmazing for computing huge correlation matrices in R: I made the function work in parallel, using all the CPU cores available on the machine. The code is here. Here is a benchmark of the 2 func...

Read more »

Air quality analysis from Beijing twitter feed.

January 14, 2013

As air pollution in Beijing reaches new highs [NYT article], I re-ran the analysis I put online a few months ago. "Crazy bad" is a good description when it reaches those levels. But I am sure there are other places, like Mexico City, LA, etc., that also look...

Read more »

Computing an empirical pFDR in R

December 21, 2012

The positive false discovery rate (pFDR) has become a classical procedure to test for false positives. It is one of my favourites because it relies on a re-sampling approach. I base my implementation on John Storey's PNAS paper and the technical report he pub...

Read more »

Religious restrictions index: how do countries compare?

September 21, 2012

Yesterday the Guardian DataBlog published an interesting article that graphically explores religious intolerance across the world. The data come from a report published by the Pew Research Center's Forum on Religion and Public Life. I like the DataBlog philosophy a lot: it provides the raw data for everyone to look at. However, I felt that the visualization could be improved. First, the data are longitudinal and no temporal representation is provided.…

Read more »

Twitter analysis of air pollution in Beijing

July 31, 2012

One of the air pollution detection machines in Beijing (at the American Embassy) is connected to Twitter and tweets about the air quality in real time. By default, the machine in Beijing outputs the 24hr summary PM2.5 air pollution information. What is PM2...

Read more »

Rcpp vs. R implementation of cosine similarity

June 10, 2012

While speeding up some code the other day, working on a project with a colleague, I ended up trying Rcpp for the first time. I re-implemented the cosine distance function using RcppArmadillo relatively easily, using bits and pieces of code I found scatter...

Read more »
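For readers who want the quantity being benchmarked in the post above spelled out, here is a minimal sketch of cosine similarity between two vectors, written in plain Python. It is neither the R nor the RcppArmadillo implementation from the original post. The function name `cosine_similarity` and the choice to raise an error for zero vectors are assumptions made for this example.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    # Dot product divided by the product of the Euclidean norms.
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        # The angle is undefined for a zero vector (a choice made here).
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


# Orthogonal vectors have similarity 0, parallel vectors have similarity 1.
orthogonal = cosine_similarity([1, 0], [0, 1])
parallel = cosine_similarity([1, 2, 3], [2, 4, 6])
print(orthogonal, round(parallel, 6))
```

A cosine distance is commonly taken as one minus this value, though conventions vary, so check which definition a given implementation uses.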

A new approach to discover pain related genes

June 8, 2012

Our latest paper in PLoS Computational Biology is out. The project spanned over 2 years, from the end of my first year of postdoctoral training until now. It has been a truly collaborative endeavor across institutions but also across sub-disciplin...

Read more »

Obtaining a protein-protein interaction network for a gene list in R

June 4, 2012

Building a network of interactions between a bunch of genes can help a great deal in understanding the relationships between the seemingly disparate elements of your list. It can seem challenging at first to build such a network, but it's less complicat...

Read more »

Another look at over-representation analysis interpretation

May 21, 2012

Interpreting a list of differentially regulated genes can take many forms. One of the most widely used methods is to look for enrichment of functional groups of genes compared to a random sampling of genes from the same universe, namely an over-representa...

Read more »

