A/B Testing Primer and the DEED framework

November 14, 2017

(This article was originally published at Big Data, Plainly Spoken (aka Numbers Rule Your World), and syndicated at StatsBlogs.)

My short lecture on A/B testing went live yesterday, streamed on Facebook Live from Harvard Business Review's office in Boston. You can watch the recording here.

The outline of the talk is as follows:

  • Expands upon the HBR article that Amy Gallo and I collaborated on titled "A refresher on A/B testing"
  • Uses a real A/B test as a running example throughout: this test was run about five years ago in reaction to an alarming trend of website users migrating en masse to mobile devices. We tested a mobile-optimized conversion page, which is the webpage where our users sign up and pay for our service
  • Explains the two distinguishing features of an A/B test: the comparison with a control group, and the random assignment of treatment
  • Discusses why a pre-post arrangement in which version A is shut down and version B is turned on at a point in time is not an A/B test, and not as good as an A/B test
  • Introduces the DEED framework - four key stages in a practical A/B testing process: Design, Execute, Examine, Deploy
  • Most teams focus energy on Execute and Examine but the more sophisticated, analytically advanced organizations spend a lot of time on the Design and the Deployment phases. I reckon my teams spend at least 50% of our time on Design
  • Design is much more than linear algebra. It's an organizational alignment exercise.
  • Explains why many executives believe that most of their A/B tests "fail". I previously wrote about this here and here
  • It is a mistake to think that a test has failed just because it did not prove out a hypothesis
  • Uses the DEED framework to describe real reasons for tests failing: bad Design, bad Execution, bad Examination, bad Deployment
  • Gives an example of each type of failure in the context of the mobile optimization test
  • In particular, you should not inspect a report with a hundred metrics and pick out the ones that pass the statistical significance filter. For more on researcher's degrees of freedom, see the materials about the replication crisis here or on Andrew Gelman's blog
  • Describes why A/B testing is an organization-wide endeavor that requires bringing in a broad set of players from the outset, facilitated by a project manager
  • Discusses what skills are required at the different stages of DEED, and how to build the project team
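
To make the point about a report with a hundred metrics concrete, here is a minimal sketch in Python. It is not taken from the talk. It simulates a test in which versions A and B are truly identical, assigns users at random, and counts how many of 100 metrics still pass a 5% significance filter by chance. The function names, the 10% base rate, the sample size, and the pooled two-proportion z-test are all assumptions chosen for illustration, and the metrics are simulated as independent, which real metrics rarely are.

```python
import random
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def count_false_positives(n_users=2000, n_metrics=100, base_rate=0.10,
                          alpha=0.05, seed=0):
    """Simulate a test with no true effect and count 'significant' metrics."""
    rng = random.Random(seed)

    # Random assignment: each user lands in A or B by a fair coin flip.
    groups = [rng.choice("AB") for _ in range(n_users)]
    n_a = groups.count("A")
    n_b = n_users - n_a

    significant = 0
    for _ in range(n_metrics):
        # Hypothetical binary metric with the SAME rate in both groups,
        # so any "significant" difference is a false positive.
        conversions = {"A": 0, "B": 0}
        for group in groups:
            if rng.random() < base_rate:
                conversions[group] += 1
        p = two_proportion_p_value(conversions["A"], n_a,
                                   conversions["B"], n_b)
        if p < alpha:
            significant += 1
    return significant


if __name__ == "__main__":
    hits = count_false_positives()
    print(f"{hits} of 100 metrics look significant with no real effect")
```

Under these assumptions, roughly alpha times the number of metrics, so around five out of a hundred, will clear the filter on average even though nothing changed. Picking those out of the report after the fact is exactly the practice warned against above.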

 Now watch the video here.

