Optional Stopping

This page was last edited on 05/19/02 by Malcolm R Forster

The Problem

 Suppose you are determined to "prove" that green apples cause cancer. An Optional Stopping strategy (OS) is one in which you keep sampling experimental data until the observed correlation between eating green apples and cancer is significantly different from 0 (where "significantly" means that the null hypothesis is rejected by standard statistical tests). That is, you follow a rule that says "Don’t stop until you reject the null hypothesis". This is also the best strategy for confirming the existence of UFOs or establishing the phenomenon of extrasensory perception (ESP). If the data are ‘noisy’ (and whose data are not?), then this will probably always work in principle, though not always in practice, because you won’t live long enough to collect enough data.
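The "don’t stop until you reject" rule can be tried out in a small simulation. What follows is only a sketch under stated assumptions, not the analysis in the paper: the data are taken to be standard normal draws with a true mean of 0 (so the null hypothesis is true), the test is a two-sided z-test at the 5% level applied after every new observation, and sampling is capped at max_n observations to stand in for "you won’t live long enough". The names os_rejects, fs_rejects, and rejection_rate are hypothetical helpers introduced for illustration.

```python
import math
import random

def os_rejects(max_n, z_crit=1.96, rng=random):
    """Optional stopping: draw N(0, 1) data one at a time and test after each draw.
    Returns True if the z-test rejects 'mean = 0' at any sample size up to max_n."""
    total = 0.0
    for n in range(1, max_n + 1):
        total += rng.gauss(0.0, 1.0)
        if abs(total) / math.sqrt(n) > z_crit:  # z statistic with known variance 1
            return True
    return False  # assumed cap reached without a rejection

def fs_rejects(n, z_crit=1.96, rng=random):
    """Fixed sample size: draw exactly n observations, then test once."""
    total = sum(rng.gauss(0.0, 1.0) for _ in range(n))
    return abs(total) / math.sqrt(n) > z_crit

def rejection_rate(rejects, trials, *args):
    """Fraction of simulated experiments in which the true null is rejected."""
    return sum(rejects(*args) for _ in range(trials)) / trials

if __name__ == "__main__":
    random.seed(0)
    print("OS false rejection rate:", rejection_rate(os_rejects, 1000, 1000))
    print("FS false rejection rate:", rejection_rate(fs_rejects, 1000, 1000))
```

Under these assumptions the FS rate should stay near the nominal 5%, while the OS rate should come out well above it and keep growing as max_n is raised.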

[Figure: OSplot3.gif]

     There are two schools of thought about optional stopping examples of the kind I consider. The classical hypothesis testers say that it is a bad strategy if the probability of falsely rejecting the null hypothesis when it is true is 1. Some Bayesians and likelihood theorists say that all that matters is how well the hypotheses fit the data, and that it makes no difference whether you collect n data by an OS strategy or collect n data with the prior intention of stopping at a sample size of n (strategy FS). About the only thing that has never been said about OS is that it is better than FS (with the same n).
     The graph above shows that OS is better (leads to less error, on average) than FS most of the time, at least in some situations. In the paper, I explain this remarkable result in terms of the analogy below.

Publication Data

"Optional Stopping" (in preparation).  Not yet submitted


Draft 5 is a PDF file. Last updated 12/20/98.

An Analogy

Suppose that a blindfolded person starts at one place and is asked to head in the direction of the sun. He cannot see the sun, but he can feel the sun's rays on his face. We expect him to walk roughly in the direction of the sun, but with some random error in every step. You (the experimenter) do not know the direction of the sun, and your job is to infer it from the behavior of the blindfolded subject.

[Figure: OSdiagram.gif]

     There are two subjects, whose initials are OS and FS. (OS stands for Optional Stopping, while FS stands for Fixed Sample size.)  OS stops when he hits either of the side lines. Then you will record his position, and draw a line from that position to his starting point. That line is your estimate of the direction of the sun.
     Now draw a finish line (the vertical line in the figure) that passes through the point at which OS stopped. FS begins walking from the start, and is stopped as soon as he hits this line, and not before. Unlike OS, FS is allowed to cross the side lines any number of times. Now record FS’s position at the finish line. If it is between the side lines, you will infer that the sun is on the center line. If he is outside the side lines, then draw a line from that position to his starting point and use this line to estimate the direction of the sun.
     The question is: Which inference is the more reliable? That is, on average, which method of estimating the sun’s direction is associated with the least error, where error is measured by the square of the discrepancy between the estimated line and the true line?
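The comparison between the two subjects can be sketched as a simulation. The page does not specify the geometry, so everything below is an assumption made for illustration: each step moves one unit forward and a sideways amount equal to true_slope plus normal noise; the side lines are taken to be the curves y = ±z·noise_sd·√x (the boundary at which a z-test would reject "the sun is on the center line"); a direction is represented by its slope y/x; and trials in which OS never reaches a side line within max_steps are dropped. The names walk_until and compare are hypothetical.

```python
import math
import random

def walk_until(stop, true_slope, noise_sd, rng, max_steps=10_000):
    """Walk one unit forward per step, drifting sideways by true_slope plus noise.
    Returns (x, y) at the first step where stop(x, y) holds, or None at the cap."""
    y = 0.0
    for x in range(1, max_steps + 1):
        y += true_slope + rng.gauss(0.0, noise_sd)
        if stop(x, y):
            return x, y
    return None

def compare(true_slope=0.2, noise_sd=1.0, z=1.96, trials=2000, seed=0):
    """Return (mean squared error of OS, mean squared error of FS) in slope units."""
    rng = random.Random(seed)

    def outside(x, y):
        # Assumed side lines: y = +/- z * noise_sd * sqrt(x)
        return abs(y) > z * noise_sd * math.sqrt(x)

    os_err = fs_err = 0.0
    done = 0
    for _ in range(trials):
        os_stop = walk_until(outside, true_slope, noise_sd, rng)
        if os_stop is None:
            continue  # assumption: drop trials where OS never hits a side line
        n, y_os = os_stop
        # FS walks independently and is stopped only at the finish line x = n.
        _, y_fs = walk_until(lambda x, y: x == n, true_slope, noise_sd, rng)
        os_est = y_os / n                              # line from start to OS's position
        fs_est = y_fs / n if outside(n, y_fs) else 0.0  # center line if between side lines
        os_err += (os_est - true_slope) ** 2
        fs_err += (fs_est - true_slope) ** 2
        done += 1
    if done == 0:
        raise ValueError("OS never stopped within max_steps in any trial")
    return os_err / done, fs_err / done

if __name__ == "__main__":
    os_mse, fs_mse = compare()
    print("OS mean squared error:", os_mse)
    print("FS mean squared error:", fs_mse)
```

Which subject comes out ahead will depend on true_slope, noise_sd, and the assumed shape of the side lines, so the output should be read as an exploration of the analogy rather than as a reproduction of the graph.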