Big Data, R and HANA: Analyze 200 Million Data Points and Later Visualize in HTML5 Using D3 – Part II

April 25, 2012
By

In my last blog post, Big Data, R and SAP HANA: Analyze 200 Million Data Points and Later Visualize Using Google Maps, I analyzed a historical airline performance data set using R and SAP HANA and put the aggregated analysis on Google Maps. Undoub...

Fog warning system: part three

April 25, 2012
By

Background: I am trying to evaluate the effect on traffic safety of a fog warning system deployed in California in November 1996.  The system was installed by CalTrans on a section of I-5 and SR-120 near Stockton where the accident rate is g...

Live Longer – Choose Your Country Wisely (if you can)

April 25, 2012
By

Full democracy countries are the ones in which to live. This week's story could start and end with the above graph with almost no further explanation. But that wouldn't do it justice. So, like so many of the past articles on "Graph of the Week", a bit of ...

How do I know if my figure is too complicated?

April 25, 2012
By

One of the key things every statistician needs to learn is how to create informative figures and graphs. Sometimes, it is easy to use off-the-shelf plots like barplots, histograms, or, if one is truly desperate, a pie chart. But sometimes the informat...

Trying Out WordPress

April 25, 2012
By

I’ve had my site http://r4stats.com on Google Sites for a few years now and it’s time to try something new. Most of the articles there are not very blog-like. For example, The Popularity of Data Analysis Software is an article …

One dollar = one dollar?

April 25, 2012
By

A dollar is a dollar, no more, no less. But why did someone spend $70,000 on advertisements inside the D.C. subway station for keeping the dollar bill, instead of going with a one-dollar coin? One ad says: “Tell Congress to stop wasting time trying to eliminate the dollar bill.” Another asks: “Do you heart the dollar?” Apparently, one [...]

Workshop in Chicago, May 24

April 24, 2012
By

I'll be doing a workshop at the meeting of the Association for Psychological Science in Chicago, Thursday May 24. Details can be found here. (I'm also doing a workshop in Chicago on May 4; details here.) A list of future and past workshops can be found h...

R, Julia and genome wide selection

April 24, 2012
By

— “You are a pussy” emailed my friend. — “Sensu cat?” I replied. — “No. Sensu chicken” blurbed my now ex-friend. What was this about? He read my post on R, Julia and the shiny new thing, which prompted him …

Insights into Quantile Regression from Arthur Charpentier

April 24, 2012
By

At this Monday’s Montreal R User Group meeting, Arthur Charpentier gave an interesting talk on the subject of quantile regression. One of the main messages I took away from the workshop was that quantile regression can be used to determine if extreme events are becoming more extreme. The example given was hurricane intensity since 1978.

Priors on probability measures

April 24, 2012
By

Hi, for the next GTB meeting at Crest, on 3rd May, I will present Peter Orbanz's work on Projective limit random probabilities on Polish spaces. It will follow my previous presentation about Bayesian nonparametrics on the Dirichlet process. The article provides a means of constructing any arbitrary prior distribution on the set of probability measures by […]

【Bio-Glossary】Biological Replicates vs Technical Replicates

April 24, 2012
By

The meaning of the term “Biological Replicate” unfortunately often does not get adequately addressed in many publications. “Biological Replicate” can have multiple meanings, depending upon the context of the study. A general definition could be that biological replicates are when the same type of organism is grown/treated under the same conditions. For example, if one [...]

On the future of personalized medicine

April 24, 2012
By

Jeff Leek, Reeves Anderson, and I recently wrote a correspondence to Nature (subscription req.) regarding the Supreme Court decision in Mayo v. Prometheus and the recent Institute of Medicine report related to the Duke Clinical Trials Saga.  The bas...

Update

April 24, 2012
By

I never got round to doing a post last week as I’ve been sidetracked by a plethora of free courses being offered. The Stanford professors that ran the AI course I did last year are now offering courses through udacity.com. …

References to Object, Functional and Structured Data

April 24, 2012
By

M. A. Álvarez, L. Rosasco and N. D. Lawrence, Kernels for vector-valued functions: a review,  tech report, 2011. A. Argyriou, M. Pontil, and C.A. Micchelli, When is there a representer theorem? Vector versus matrix regularizers, Journal of Machine Learning Research, 10:2507-2529, 2009. G. Bakir, T. Hofmann, B. Schölkopf, A. Smola, B. Taskar and S. Vishwanathan (Eds.), Predicting Structured Data, MIT [...]

line-by-line memory usage of a Python program

April 24, 2012
By

My newest project is a Python library for monitoring the memory consumption of an arbitrary process, and one of its most useful features is the line-by-line analysis of memory usage for Python code. I wrote a basic prototype six months ago after being surpris...

Q-A Section 9–Information Geometry and Statistics

April 24, 2012
By

Information Geometry is applying differential geometry to families of probability distributions, and so to statistical models. Information does however play two roles in it: Kullback-Leibler information, or relative entropy, features as a measure of divergence (not quite a metric, because it’s asymmetric), and Fisher information takes the role of curvature. One very nice thing about [...]

Simple Moving Average Strategy with a Volatility Filter: Follow-Up Part 1

April 24, 2012
By

Analyzing transactions in quantstrat: This post will be part 1 of a follow-up to the original post, Simple Moving Average Strategy with a Volatility Filter. In this follow-up, I will take a closer look at the individual trades of each strategy. This may provide valuable information to explain the difference in performance of the SMA …

Example 9.28: creating datasets from tables

April 23, 2012
By

There are often times when it is useful to create an individual level dataset from aggregated data (such as a table). While this can be done using the expand.table() function within the epitools package, it is also straightforward to do directly with...

Coursera (and other online classes)

April 23, 2012
By

A revolution is taking place in education. Last fall, Stanford University premiered three online classes in Artificial Intelligence, Machine Learning, and Introduction to Databases. I took Machine Learning and Intro to Databases, and this spring I’m ...

Please Vote on the "Top Confusing Stats Terms"

April 23, 2012
By

Please tell me what stats terms you think are the most confusing! Please order the terms you choose, according to how confusing they are (with #1 being most confusing). The results will dictate what topics are covered in future blogs! Blog entries for Confusing Stats Terms #10, #9, and #8 are already posted, so I'm only asking for terms #7 through #1. Thanks for your input! http://www.statsmakemecry.com/confusing-stats-terms/

The Explanatory Power of Data Points

April 23, 2012
By

As newspaper graphics go, scatterplots are a fairly advanced technique. They tend to show a reasonably large amount of data as single points, and they require the reader to have an idea what to look for. Most newspapers never bother using scatterplots for that reason, which is really too bad. With some explanation, a scatterplot can be a very effective means of displaying data, and in particular to allow the…

