Rodger's method

Rodger's method is a statistical procedure for examining research data post hoc following an 'omnibus' analysis (e.g., after an analysis of variance, or ANOVA). The various components of this methodology were fully worked out by R. S. Rodger in the 1960s and 70s, and seven of his articles about it were published in the British Journal of Mathematical and Statistical Psychology between 1967 and 1978.[1][2][3][4][5][6][7]

Statistical procedures for finding differences between groups, along with interactions between the groups that were included in an experiment or study, can be classified along two dimensions: 1) whether the statistical contrasts to be evaluated were decided upon before collecting the data (planned) or while examining what those data reveal (post hoc), and 2) whether the procedure uses a decision-based (i.e., per contrast) error rate or an experiment-wise error rate. Rodger's method, and some others, are classified according to these dimensions in the table below.

Table 1: Some multiple comparison procedures

Error rate                    Planned contrasts                            Post hoc contrasts
Decision-based error rate     t tests                                      Duncan's method; Rodger's method
Experiment-wise error rate    Bonferroni's inequality; Dunnett's method    Newman–Keuls method; Tukey's range method; Scheffé's method

Statistical power

In the early 1990s, one set of researchers made this statement about their decision to use Rodger's method: "We chose Rodger’s method because it is the most powerful post hoc method available for detecting true differences among groups. This was an especially important consideration in the present experiments in which interesting conclusions could rest on null results" (Williams, Frame, & LoLordo, 1992, p. 43).[8] The most definitive evidence for the statistical power advantage that Rodger's method possesses (as compared with eight other multiple comparison procedures) is provided in a 2013 article by Rodger and Roberts.[9]

Type 1 error rate

Statistical power is an important consideration when choosing a statistical procedure, but it is not the only one. All statistical procedures permit researchers to make statistical errors, and they are not all equal in their ability to control the rate of occurrence of several important types of statistical error. As Table 1 shows, statisticians do not agree on how error rate ought to be defined, but particular attention has traditionally been paid to what are called 'type 1 errors' and to whether or not a statistical procedure is susceptible to type 1 error rate inflation.

On this matter, the facts about Rodger's method are straightforward and unequivocal. Rodger's method permits an absolutely unlimited amount of post hoc data snooping and this is accompanied by a guarantee that the long run expectation of type 1 errors will never exceed the commonly used rates of either 5 or 1 percent. Whenever a researcher falsely rejects a true null contrast (whether it is a planned or post hoc one) the probability of that being a type 1 error is 100%. It is the average number of such errors over the long run that Rodger's method guarantees cannot exceed Eα = 0.05 or 0.01. This statement is a logical tautology, a necessary truth, that follows from the manner in which Rodger's method was originally conceived and subsequently built. Type 1 error rate inflation is statistically impossible with Rodger's method, but every statistical decision a researcher makes that might be a type 1 error will either actually be one or it won't.

Decision-based error rate

The two features of Rodger's method that have been mentioned thus far, its increased statistical power and the impossibility of type 1 error rate inflation when using it, are direct by-products of the decision-based error rate that it utilizes. "An error occurs, in the statistical context, if and only if a decision is made that a specified relationship among population parameters either is, or is not, equal to some number (usually, zero), and the opposite is true. Rodger’s very sensible, and cogently argued, position is that statistical error rate should be based exclusively on those things in which errors may occur, and that (necessarily, by definition) can only be the statistical decisions that researchers make" (Roberts, 2011, p. 69).[10]
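The difference between the two error-rate definitions can be illustrated with a small simulation. The following Python sketch is only an illustration under simplifying assumptions: every null contrast is true, the tests are independent, and each rejects with probability alpha. It does not implement Rodger's method or its critical values, and the function name and parameters are hypothetical.

```python
import random


def simulate_error_rates(alpha=0.05, n_contrasts=3, n_experiments=20000, seed=0):
    """Simulate experiments in which every null contrast is true.

    Assumption for illustration only: each test is independent and rejects
    with probability alpha. This is not how Rodger's method sets its
    critical values.
    """
    rng = random.Random(seed)
    total_errors = 0            # type 1 errors summed over all decisions
    experiments_with_error = 0  # experiments containing at least one error
    for _ in range(n_experiments):
        errors = sum(rng.random() < alpha for _ in range(n_contrasts))
        total_errors += errors
        if errors > 0:
            experiments_with_error += 1
    # Decision-based rate: errors per statistical decision made.
    decision_based = total_errors / (n_experiments * n_contrasts)
    # Experiment-wise rate: proportion of experiments with any error.
    experiment_wise = experiments_with_error / n_experiments
    return decision_based, experiment_wise


decision_based, experiment_wise = simulate_error_rates()
print(f"decision-based rate:  {decision_based:.3f}")
print(f"experiment-wise rate: {experiment_wise:.3f}")
```

Under these assumptions the decision-based rate stays close to alpha however many contrasts are tested, while the experiment-wise rate for m independent tests approaches 1 − (1 − alpha)^m, roughly 0.143 for alpha = 0.05 and m = 3.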

Implied true population means

There is a unique aspect of Rodger's method that is statistically valuable and is not dependent on its decision-based error rate. As Bird stated: "Rodger (1965, 1967a, 1967b, 1974) explored the possibility of examining the logical implications of statistical inferences on a set of J − 1 linearly independent contrasts. Rodger’s approach was formulated within the Neyman-Pearson hypothesis-testing framework [...] and required that the test of each contrast Ψi (i = 1, ... , J − 1) should result in a ‘decision’ between the null hypothesis (iH0: Ψi = 0) and a particular value δi specified a priori by the alternative hypothesis (iH1: Ψi = δi). Given the resulting set of decisions, it is possible to determine the implied values of all other contrasts" (Bird, 2011, p. 434).[11]

The statistical value that Rodger derived from the ‘implication equation’ that he invented is prominently displayed in the form of 'implied means' that are logically implied, and mathematically entailed, by the J − 1 statistical decisions that the user of his method makes. These implied true population means constitute a very precise statement about the outcome of one's research, and assist other researchers in determining the size of effect that their related research ought to seek.
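The idea that J − 1 decisions about linearly independent contrasts determine a full set of means can be sketched with ordinary linear algebra. The Python sketch below is not Rodger's published implication equation. It assumes that the means are expressed as deviations from the grand mean (so they sum to zero) and that the decided contrast values are in the same units as the means. The function names and the example values are hypothetical.

```python
from fractions import Fraction


def implied_means(contrasts, decided_values):
    """Solve C * mu = delta together with sum(mu) = 0.

    contrasts: J - 1 rows of J coefficients, each row summing to zero.
    decided_values: the value decided for each contrast (0 for a null).
    The sum-to-zero constraint is an assumption made for this sketch.
    """
    j = len(contrasts[0])
    if len(contrasts) != j - 1 or len(decided_values) != j - 1:
        raise ValueError("expected J - 1 contrasts and J - 1 decided values")
    # Augmented matrix: one row per contrast, plus the sum-to-zero row.
    rows = [[Fraction(c) for c in row] + [Fraction(d)]
            for row, d in zip(contrasts, decided_values)]
    rows.append([Fraction(1)] * j + [Fraction(0)])
    # Gauss-Jordan elimination with exact rational arithmetic.
    for col in range(j):
        pivot = next((r for r in range(col, j) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("contrasts are not linearly independent")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [x / pivot_value for x in rows[col]]
        for r in range(j):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def contrast_value(coefficients, means):
    """Value of any other contrast implied by a set of means."""
    return sum(Fraction(c) * m for c, m in zip(coefficients, means))


# Hypothetical decisions for J = 3 groups:
#   mu1 - mu2 = 0 (null accepted) and mu1 + mu2 - 2*mu3 = 3.
means = implied_means([[1, -1, 0], [1, 1, -2]], [0, 3])
print(means)
print(contrast_value([1, 0, -1], means))
```

For these hypothetical decisions the sketch gives implied means of 1/2, 1/2 and −1, and the contrast μ1 − μ3, which was not among the decisions, takes the implied value 3/2.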

Whither Rodger’s method?

Since the inception of Rodger's method, some researchers who use it have had their work published in prestigious scientific journals, and this continues to happen. Nevertheless, it is currently fair to conclude that "Rodger’s work on deduced inference has been largely ignored" (Bird, 2011, p. 434). Bird uses implication equations, similar to Rodger's, to deduce interval inferences concerning any contrasts not included in an analysis from the upper and lower limits of confidence intervals on J − 1 linearly independent planned contrasts; a procedure that Rodger himself opposes.[12]

A very different desired outcome for Rodger's method was conveyed in this statement by Roberts: "Will Rodger’s method continue to be used by only a few researchers, become extinct, or supplant most or all of the currently popular post hoc procedures following ANOVA? This article and the SPS computer program constitute an attempted intervention in the competition for dominance and survival that occurs among ideas. My hope is that the power and other virtues of Rodger’s method will become much more widely known and that, as a consequence, it will become commonly used. ... Better ideas and the ‘mousetraps’ they are instantiated in, ought, eventually, to come to the fore" (Roberts, 2011, p. 78).

The possible futures for Rodger's method mentioned in the two previous paragraphs are not exhaustive, and the possibilities on a more comprehensive list are not mutually exclusive.

References

  1. ^ Rodger, R. S. (1974). Multiple contrasts, factors, error rate and power. British Journal of Mathematical and Statistical Psychology, 27, 179–198.
  2. ^ Rodger, R. S. (1975a). The number of non-zero, post hoc contrasts from ANOVA and error-rate I. British Journal of Mathematical and Statistical Psychology, 28, 71–78.
  3. ^ Rodger, R. S. (1975b). Setting rejection rate for contrasts selected post hoc when some nulls are false. British Journal of Mathematical and Statistical Psychology, 28, 214–232.
  4. ^ Rodger, R. S. (1978). Two-stage sampling to set sample size for post hoc tests in ANOVA with decision-based error rates. British Journal of Mathematical and Statistical Psychology, 31, 153–178.
  5. ^ Rodger, R. S. (1969). Linear hypotheses in 2xa frequency tables. British Journal of Mathematical and Statistical Psychology, 22, 29–48.
  6. ^ Rodger, R. S. (1967a). Type I errors and their decision basis. British Journal of Mathematical and Statistical Psychology, 20, 51–62.
  7. ^ Rodger, R. S. (1967b). Type II errors and their decision basis. British Journal of Mathematical and Statistical Psychology, 20, 187–204.
  8. ^ Williams, D. A., Frame, K. A., & LoLordo, V. M. (1992). Discrete signals for the unconditioned stimulus fail to overshadow contextual or temporal conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 18(1), 41–55.
  9. ^ Rodger, R. S., & Roberts, M. (2013). Comparison of power for multiple comparison procedures. Journal of Methods and Measurement in the Social Sciences, 4(1), 20–47.
  10. ^ Roberts, M. (2011). Simple, Powerful Statistics: An instantiation of a better ‘mousetrap’. Journal of Methods and Measurement in the Social Sciences, 2(2), 63–79.
  11. ^ Bird, K. D. (2011). Deduced inference in the analysis of experimental data. Psychological Methods, 16(4), 432–443.
  12. ^ Rodger, R. S. (2012). Paired comparisons, confusion, constraint, contradictions, and confidence intervals. Unpublished manuscript.