Blog Archives

Using and Abusing Data Visualization: Anscombe’s Quartet and Cheating Bonferroni

February 26, 2015

Anscombe’s quartet comprises four datasets that have nearly identical simple statistical properties, yet appear very different when graphed. Each dataset consists of eleven (x,y) points. They were constructed in 1973 by the statistician Francis Ansco...

Microbial Genomics: the State of the Art in 2015

February 4, 2015

Current Opinion in Microbiology recently published a special issue in genomics. In an excellent editorial overview, “Genomics: The era of genomically-enabled microbiology”, Neil Hall and Jay Hinton give an overview of the state of the field in micr...

R + ggplot2 Graph Catalog

February 3, 2015

Joanna Zhao’s and Jenny Bryan’s R graph catalog is meant to be a complement to the physical book, Creating More Effective Graphs, but it’s a really nice gallery in its own right. The catalog shows a series of different data visualizations, all ma...

Microbiome Digest Blog

January 20, 2015

I have a noteworthy blogs tag on this blog that I sort of forgot about, and haven't used in years. But I started reading one recently that's definitely qualified for the distinction. The Microbiome Digest is written by Elisabeth Bik, a scienti...

Using the microbenchmark package to compare the execution time of R expressions

January 14, 2015

I recently learned about the microbenchmark package while browsing through Hadley’s advanced R programming book. I’ve done some quick benchmarking using system.time() in a for loop and taking the average, but the microbenchmark function in the micr...

Importing Illumina BeadArray data into R

December 8, 2014

A colleague needed some help getting Illumina BeadArray gene expression data loaded into R for data analysis with limma. Hopefully whoever ran your arrays can export the data as text files formatted as described in the code below. If so, you can import...

RNA-seq Data Analysis Course Materials

November 20, 2014

Last week I ran a one-day workshop on RNA-seq data analysis in the UVA Health Sciences Library. I set up an AWS public EC2 image with all the necessary software installed. Participants logged into AWS, launched the image, and we kicked off the morning ...

Operate on the body of a file but not the header

October 14, 2014

Sometimes you need to run some UNIX command on a file but only want to operate on the body of the file, not the header. Create a file called body somewhere in your $PATH, make it executable, and add this to it:

#!/bin/bash
IFS= read -r header
printf '%s\n...
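
The script in the excerpt above is cut off, so what follows is not the post's code. It is a minimal sketch of the same idea, written as a bash function with the hypothetical name body_only: print the first line of standard input unchanged, then hand the remaining lines to whatever command is given as arguments.

```shell
#!/bin/bash
# Hypothetical helper (the name body_only is an assumption, not the post's script):
# pass the first line of stdin through untouched, then run the given
# command on the remaining lines.
body_only() {
    local header
    # Read the header verbatim: no field splitting, no backslash processing.
    IFS= read -r header
    printf '%s\n' "$header"
    # Run the supplied command; it inherits whatever is left on stdin.
    "$@"
}

# Usage: sort the rows of a small table while keeping the header on top.
printf 'name\nzeta\nalpha\nmike\n' | body_only sort
```

The usage line prints name first, followed by alpha, mike, and zeta in sorted order. Running the command as "$@" rather than through eval keeps arguments with spaces intact, at the cost of not accepting a whole pipeline as a single string.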

R package to convert statistical analysis objects to tidy data frames

September 16, 2014

I talked a little bit about tidy data in my recent post about dplyr, but you should really go check out Hadley’s paper on the subject. R expects inputs to data analysis procedures to be in a tidy format, but the model output objects that you get back are...

