Inference and evidence
16/07/2003
Nearly a month ago I came across a conference paper by Timothy Gregoire, heavily based on the work of Richard Royall. It was an interesting read that revived some of my concerns about the inferences we normally make when doing statistical analyses. The paper touches on a few issues that we confront regularly when working in biometrics, particularly the question of when something counts as significant and the growing interest in power calculations (and the fallacy of calculating them a posteriori).
Many organisations that work as forest land managers are now required to present ecological indicators. The point of the exercise is to show whether forest operations change the value of those indicators and, because the direction of the burden of proof has changed, people are really wary of Type II errors. People do not want to hear that there was no change when, in fact, there was some. This takes us back to the start of this comment: How big a sample should we take to detect a change in the indicators with a given power? Do we use the appropriate techniques for making inferences? How do we handle evidence in a sensible way? Gregoire (and Royall) present a good case for the use of likelihood ratios.
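To make the likelihood ratio idea a bit more concrete, here is a minimal sketch in Python. It is my own toy illustration, not code from Gregoire's paper or Royall's work: the indicator changes in `changes` are made up, the two competing means and the known standard deviation are assumptions, and the function names are hypothetical.

```python
import math


def normal_log_likelihood(data, mean, sd):
    """Log-likelihood of the data under a normal model with known sd."""
    n = len(data)
    squared_error = sum((x - mean) ** 2 for x in data)
    return -n * math.log(sd * math.sqrt(2 * math.pi)) - squared_error / (2 * sd ** 2)


def likelihood_ratio(data, mean_a, mean_b, sd):
    """How many times better hypothesis A (mean_a) explains the data than B (mean_b)."""
    log_ratio = (normal_log_likelihood(data, mean_a, sd)
                 - normal_log_likelihood(data, mean_b, sd))
    return math.exp(log_ratio)


# Hypothetical changes in an ecological indicator after forest operations
# (made-up numbers, for illustration only).
changes = [-0.8, -1.4, 0.3, -1.1, -0.6, -1.9, -0.2, -1.0]

# Assumed hypotheses: a mean change of -1.0 versus no change at all,
# with an assumed known standard deviation of 1.0.
lr = likelihood_ratio(changes, mean_a=-1.0, mean_b=0.0, sd=1.0)
print(f"Likelihood ratio (change of -1 vs no change): {lr:.2f}")
```

With these invented numbers the data favour a mean change of -1 over no change by a factor of roughly 15. The ratio compares two specific hypotheses directly, so it says how strongly the data support one over the other without involving a significance threshold.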
Oh, I almost forgot! The calculation of power a posteriori is just another useless exercise. For a given test and significance level there is a one-to-one relationship between the p-value and the “observed power”, so presenting the latter does not add anything to the discussion (see Hoenig, J.M. and Heisey, D.M. 2001. The abuse of power: the pervasive fallacy of power calculations for data analysis. The American Statistician 55: 19-24, for a good explanation). Using Russell Lenth’s analogy: “If my car made it to the top of the hill, then it is powerful enough to climb that hill; if it didn’t, then it obviously isn’t powerful enough. Retrospective power is an obvious answer to a rather uninteresting question.”
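The one-to-one relationship is easy to see in a small sketch. I am assuming the simplest possible setting here, a two-sided z-test, where the "observed power" is obtained by treating the observed test statistic as if it were the true effect; the function name `observed_power` is my own and the p-values in the loop are arbitrary.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def observed_power(p_value, alpha=0.05):
    """Observed power of a two-sided z-test, computed from the p-value alone.

    Assumes 0 < p_value <= 1 and that the observed z statistic is plugged
    in as the true standardised effect.
    """
    # Recover the absolute observed z statistic from the two-sided p-value.
    z_obs = _STD_NORMAL.inv_cdf(1 - p_value / 2)
    # Critical value of the test at significance level alpha.
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Probability of rejecting in either tail if the true effect were z_obs.
    upper_tail = 1 - _STD_NORMAL.cdf(z_crit - z_obs)
    lower_tail = _STD_NORMAL.cdf(-z_crit - z_obs)
    return upper_tail + lower_tail


for p in (0.01, 0.05, 0.20, 0.50):
    print(f"p = {p:.2f} -> observed power = {observed_power(p):.3f}")
```

Nothing but the p-value and alpha goes into the function, which is the whole point: a large p-value always comes with a low observed power, and a result sitting exactly at p = alpha has an observed power of about one half.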