Mister P for surveys in epidemiology — using Stan!

Jon Zelner points us to this new article in the American Journal of Epidemiology, “Multilevel Regression and Poststratification: A Modelling Approach to Estimating Population Quantities From Highly Selected Survey Samples,” by Marnie Downes, Lyle Gurrin, Dallas English, Jane Pirkis, Dianne Currier, Matthew Spittal, and John Carlin, which begins: Large-scale population health studies face increasing difficulties […]

Diary For Statistical War Correspondents on the Latest Ban on Speech

When science writers, especially “statistical war correspondents”, contact you to weigh in on some article, they may talk to you until they get something spicy, and then they may or may not include the background context. So a few writers contacted me this past week regarding this article (“Retire Statistical Significance”)–a teaser, I now suppose, […]

foundations of data science [editorial]

The American Institute of Mathematical Sciences, one of eight NSF-funded mathematical institutes, is supporting a new journal on data sciences called Foundations of Data Science, with editors in chief Ajay Jasra, Kody Law, and Vasileios Maroulas. Since I know them reasonably well (!), I have asked the editors for an editorial and they obliged by […]

Monads and generalized elements

Paolo Perrone gives a nice, succinct motivation for monads in the introduction to his article on probability and monads. … a monad is like a consistent way of extending spaces to include generalized elements of a specific kind. He develops this idea briefly, and links to his dissertation where he gives a longer exposition (pages […]

Should we talk less about bad social science research and more about bad medical research?

Paul Alper pointed me to this news story, “Harvard Calls for Retraction of Dozens of Studies by Noted Cardiac Researcher: Some 31 studies by Dr. Piero Anversa contain fabricated or falsified data, officials concluded. Dr. Anversa popularized the idea of stem cell treatment for damaged hearts.” I replied: Ahhh, Harvard . . . the reporter […]

How cdata Control Table Data Transforms Work

With all of the excitement surrounding cdata-style control-table-based data transforms (the cdata ideas being named as the “replacements” for tidyr’s current methodology, by the tidyr authors themselves!), I thought I would take a moment to describe how they work. cdata defines two primary data manipulation operators: rowrecs_to_blocks() and blocks_to_rowrecs(). These are the …

Mixing error-correcting codes and cryptography

Secret codes and error-correcting codes have nothing to do with each other. Except when they do! Error-correcting codes Error-correcting codes make digital communication possible. Without some way to detect and correct errors, the corruption of a single bit could wreak havoc. A simple example of an error-detection code is a checksum. A more sophisticated […]
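To make the checksum idea concrete, here is a minimal sketch in Python. It is not taken from the post: the byte-sum-modulo-256 scheme and the names `checksum` and `flip_bit` are assumptions chosen for illustration. It shows that a single flipped bit changes the checksum, so the receiver can tell that something went wrong, though not where.

```python
def checksum(data: bytes) -> int:
    """Hypothetical toy checksum: sum of all bytes, modulo 256."""
    return sum(data) % 256


def flip_bit(data: bytes, byte_index: int, bit_index: int) -> bytes:
    """Return a copy of data with one bit flipped, simulating corruption."""
    corrupted = bytearray(data)
    corrupted[byte_index] ^= 1 << bit_index
    return bytes(corrupted)


# The sender computes the checksum and transmits it with the message.
message = b"hello, world"
sent_checksum = checksum(message)

# One bit is corrupted in transit.
received = flip_bit(message, byte_index=3, bit_index=0)

# The receiver recomputes the checksum and compares it with the one sent.
error_detected = checksum(received) != sent_checksum
print("error detected:", error_detected)
```

A checksum like this only detects the error. It cannot say which bit changed, and some multi-bit corruptions cancel out and go unnoticed, which is part of why more sophisticated codes exist.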

Yes, I really really really like fake-data simulation, and I can’t stop talking about it.

Rajesh Venkatachalapathy writes: Recently, I had a conversation with a colleague of mine about the virtues of synthetic data and their role in data analysis. I think I’ve heard a sermon/talk or two where you mention this and also in your blog entries. But having convinced my colleague of this point, I am struggling to […]

Why we Did Not Name the cdata Transforms wide/tall/long/short

We recently saw this UX (user experience) question from the tidyr author as he adapts tidyr to cdata techniques. The terminology that he is not adopting from cdata is “unpivot_to_blocks()” and “pivot_to_rowrecs()”. One of the research ideas in the cdata package is that the important thing to call out is record structure. The important point …

Support Rotary to Support our World

Thank you to Win-Vector LLC General Partner Nina Zumel for stepping up her workload, allowing me to take some time off from Win-Vector LLC (and time off from revising chapter 8 of Practical Data Science with R 2nd Edition) to make time to help administer the Vietnam Rotary Global Grant mentioned below. This project is …

abandon ship [value]!!!

The Abandon Statistical Significance paper we wrote with Blakeley B. McShane, David Gal, Andrew Gelman, and Jennifer L. Tackett has now appeared in a special issue of The American Statistician, “Statistical Inference in the 21st Century: A World Beyond p < 0.05”. A 400 page special issue with 43 papers available on-line and open-source! Food […]

Postdoc in Chicago on statistical methods for evidence-based policy

Beth Tipton writes: The Institute for Policy Research and the Department of Statistics is seeking applicants for a Postdoctoral Fellowship with Dr. Larry Hedges and Dr. Elizabeth Tipton. This fellowship will be a part of a new center which focuses on the development of statistical methods for evidence-based policy. This includes research on methods for […]

US Army applying new areas of math

Many times on this blog I’ve argued that the difference between pure and applied math is motivation. As my graduate advisor used to say, “Applied mathematics is not a subject classification. It’s an attitude.” Traditionally there was general agreement regarding what is pure math and what is applied. Number theory and topology, for example, are […]