More on “data science” and “statistics”

November 19, 2013

After reading Rachel and Cathy’s book, I wrote that “Statistics is the least important part of data science . . . I think it would be fair to consider statistics as a subset of data science. . . . it’s not the most important part of data science, or even close.” But then I received […] The post More on “data science” and “statistics” appeared first on Statistical Modeling, Causal Inference,…

A letter to high-school students

November 19, 2013

Imagine Magazine, a youth-focused journal published by the Johns Hopkins Center for Talented Youth, invited me to contribute an article in celebration of statistics. I try to convey the fun and joy of working with numbers and charts. You can read it...

R and Solr Integration Using Solr’s REST APIs

November 19, 2013

Solr is the most popular, fast and reliable open source enterprise search platform from the Apache Lucene project. Among many other features, we love its powerful full-text search, hit highlighting, faceted search, and near real-time indexing. ...

Predicting claims with a Bayesian network

November 19, 2013

Here is a little Bayesian network to predict the claims for two different types of drivers over the next year; see also example 16.15 in [1]. Let's assume there are good and bad drivers. The probabilities that a good driver will have 0, 1 or 2 claims i...

Lucien Le Cam: “The Bayesians hold the Magic”

November 18, 2013

Today is Lucien Le Cam’s birthday. He was an error statistician whose remarks in an article, “A Note on Metastatistics,” in a collection on foundations of statistics (Le Cam 1977)* had some influence on me. A statistician at Berkeley, Le Cam was a co-editor with Neyman of the Berkeley Symposia volumes. I hadn’t mentioned him on […]

Binomial regression model

November 18, 2013

Most of the time, when we introduce binomial models, such as the logistic or probit models, we discuss only Bernoulli variables. This year (actually also the year before), I discuss extensions to multinomial regressions, where the response is a function on some simplex. The multinomial logistic model was mentioned here. The idea is to consider, for instance with three possible classes, the following model […] Now, what about a real Binomial model, where the numbers of trials are known? How…

Feeling optimistic after the Future of the Statistical Sciences Workshop

November 18, 2013

Last week I participated in the Future of the Statistical Sciences Workshop. I arrived feeling somewhat pessimistic about the future of our discipline. My pessimism stemmed from the emergence of the term Data Science and the small role academic …

Graduate Course on Copulas and Extreme Values

November 18, 2013

This Winter, I will be giving a (graduate) course on extreme values, and copulas (more generally multivariate models and dependence), MAT8595. It is an ISM course, and even if it will probably be given in French, I will upload information here, in English. I will upload the (detailed) syllabus of the course during the Christmas holidays. But to give an overview, for those willing to register, the first part of the course will…

What’s my Kasparov number?

November 18, 2013

A colleague writes: Personally my Kasparov number is two: I beat ** in a regular tournament game, and ** beat Kasparov! That’s pretty impressive, especially given that I didn’t know this guy played chess at all! Anyway, this got me thinking, what’s my Kasparov number? OK, that’s easy. I beat Magnus Carlsen the other day […] The post What’s my Kasparov number? appeared first on Statistical Modeling, Causal Inference, and Social…
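
The idea of a Kasparov number can be sketched as a shortest-path search over a "who beat whom" graph. The snippet below is only an illustration and is not taken from the post: the function name `kasparov_number`, the `wins` dictionary, and the player names are all hypothetical.

```python
from collections import deque


def kasparov_number(wins, player, target="Kasparov"):
    """Length of the shortest chain 'player beat X beat ... beat target'.

    `wins` maps each player to the players they have beaten (hypothetical data).
    Returns None when no such chain exists.
    """
    if player == target:
        return 0
    seen = {player}
    queue = deque([(player, 0)])
    while queue:
        current, dist = queue.popleft()
        for beaten in wins.get(current, ()):
            if beaten == target:
                return dist + 1
            if beaten not in seen:
                seen.add(beaten)
                queue.append((beaten, dist + 1))
    return None


# Hypothetical results: the colleague beat PlayerX, and PlayerX beat Kasparov.
wins = {"Colleague": ["PlayerX"], "PlayerX": ["Kasparov"], "Novice": []}

if __name__ == "__main__":
    print(kasparov_number(wins, "Colleague"))
```

Breadth-first search is used because every win counts as one step, so the first time the target is reached the chain is guaranteed to be a shortest one.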

The e-Writing Jungle Part 2: The MathML Impasse and the MathJax Solution

November 18, 2013

Back to LaTeX and MathJax and MathML and Python and Sphinx and IPython and R and Knitter and Firefox and Chrome and ... In Part 1, I praised e-books done as LaTeX to PDF to the web, perhaps surprisingly. Now let's go the other way, to an e-boo...

Historical Value at Risk versus historical Expected Shortfall

November 18, 2013

Comparing the behavior of the two on the S&P 500. Previously: there have been a few posts about Value at Risk (VaR) and Expected Shortfall (ES), including an introduction to Value at Risk and Expected Shortfall. Data and model: the underlying data are daily returns for the S&P 500 from 1950 to the present. The VaR and …
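
For readers unfamiliar with the two measures, here is a minimal sketch of how historical VaR and ES are commonly estimated from a return series. It is not the post's own code or necessarily its method: the function name `historical_var_es`, the 5% tail level, the sign convention (losses reported as positive numbers), and the simulated returns are all assumptions made for illustration.

```python
import random


def historical_var_es(returns, level=0.05):
    """Return (VaR, ES) as positive loss fractions from the empirical distribution.

    VaR is the smallest loss among the worst `level` share of observations;
    ES is the average loss over that same tail.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    k = max(1, int(len(losses) * level))  # number of observations in the tail
    tail = losses[:k]
    var = tail[-1]      # threshold loss at the edge of the tail
    es = sum(tail) / k  # mean loss beyond (and including) that threshold
    return var, es


if __name__ == "__main__":
    random.seed(0)
    # Simulated daily returns, standing in for real index data.
    simulated = [random.gauss(0.0003, 0.01) for _ in range(1000)]
    var, es = historical_var_es(simulated, level=0.05)
    print(f"VaR: {var:.4f}  ES: {es:.4f}")
```

Because ES averages the losses at or beyond the VaR threshold, this estimator always gives an ES at least as large as the VaR for the same tail level.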

Vectorizing the construction of a structured matrix

November 18, 2013

In using a vector-matrix language such as SAS/IML, MATLAB, or R, one of the challenges for programmers is learning how to vectorize computations. Often it is not intuitive how to program a computation so that you avoid looping over the rows and columns of a matrix. However, there are a [...]

Some Options for Testing Tables

November 18, 2013

Contingency tables are a very good way to summarize discrete data.  They are quite easy to construct and reasonably easy to understand. However, there are many nuances with tables and care should be taken when making conclusions related to the data. Here are just a few thoughts on the topic. Dealing with sparse data On […]

Alpha testing shinyapps.io – first impressions

November 18, 2013

ShinyApps.io is a new server, currently in alpha testing, for hosting Shiny applications. It is being designed by the RStudio team and provides some features distinct from those of the [...] ShinyApps.io is intended for larger applications ...

Analysis of “Deal or No Deal” results

November 18, 2013

My son, Jonathan, loves game shows, and his current favourite is Deal or No Deal, the Australian version. It has been airing now for over ten years, and there is at least one episode available every weeknight …

Hello North Carolina

November 17, 2013

This Wednesday, I'm giving the Big Data Seminar at NC State. Here is the announcement. *** In his new book Numbersense: How to Use Big Data to Your Advantage, Kaiser Fung (NYU & Vimeo statistician) calls attention to one aspect of the Big Data phenomenon that has not received media attention: the consumers of Big Data analyses, i.e. everyone, will face more confusion and less clarity as the volume of…

Probability and geometry (Probabilité et géométrie)

November 17, 2013

One of the most important formulas in probability (in my view) is the “law of total probability,” which simply says that […] which can also be written, using Bayes’ formula, as […] One consequence of this result is the “law of total expectation,” often called the double projection theorem, which is often written in the shortened form […] (in the right-hand formula, the first symbol is an expectation, it is…
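
In standard textbook notation, the two results named in this excerpt read as follows. The symbols are assumptions rather than the post's own: $A$ is an event, $(B_i)$ is a partition of the sample space, and $X$, $Y$ are random variables with $X$ integrable.

```latex
% Law of total probability, then the same sum rewritten with conditional probabilities
\[
\mathbb{P}(A) \;=\; \sum_i \mathbb{P}(A \cap B_i)
            \;=\; \sum_i \mathbb{P}(A \mid B_i)\,\mathbb{P}(B_i)
\]

% Law of total expectation (shortened form): the outer symbol is an expectation
% taken over Y, the inner one is a conditional expectation of X given Y
\[
\mathbb{E}[X] \;=\; \mathbb{E}\bigl[\,\mathbb{E}[X \mid Y]\,\bigr]
\]
```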

Big bad education bureaucracy does big bad things

November 17, 2013

In response to some big new push for testing schoolchildren, Mark Palko writes: The announcement of a new curriculum is invariably followed by a hearty round of self-congratulations and talk of “keeping standards high” as if adding a slide to a PowerPoint automatically made students better informed. It doesn’t work that way. […] The post Big bad education bureaucracy does big bad things appeared first on Statistical Modeling,…

Dutch Rainwater Composition 1992-2005.

November 17, 2013

After reading Blog About Stats' Open Data Index Blog Post I decided to browse a bit in the Open Data Index. Choosing Netherlands and following Emission of Pollutants I ended on a page from National Institute for Public Health. The page ...

What should statistics do about massive open online courses?

November 17, 2013

Marie Davidian, the President of the American Statistical Association, writes about the JHU Biostatistics effort to deliver massive open online courses. She interviewed Jeff, Brian Caffo, and me and summarized our thoughts. All acknowledge that the future is unknown. How …

Stein’s Method

November 16, 2013

I have mentioned Stein’s method in passing, a few times on this blog. Today I want to talk about Stein’s method in a bit of detail. 1. What Is Stein’s Method? Stein’s method, due to Charles Stein, is actually quite old, going back to 1972. But there has been a great deal of interest in […]

S. Stanley Young: More Trouble with ‘Trouble in the Lab’ (Guest post)

November 16, 2013

Stanley Young’s guest post arose in connection with Kepler’s Nov. 13, and my November 9 post, and associated comments. S. Stanley Young, PhD, Assistant Director for Bioinformatics, National Institute of Statistical Sciences, Research Triangle Park, NC. Much is made by some of the experimental biologists that their art is oh so sophisticated that mere mortals do not have […]

