Chapter 14

Repeated Measures ANOVA

 


 All the ANOVA stuff we have done so far has had different subjects in the various cells of the experimental design

 That kind of experiment is called a between-subjects design

 Sometimes, however, we run the same subjects in some or all cells of the design

 Such a within-subjects (or repeated measures) design has two advantages:

 It requires fewer subjects, since each subject serves in more than one condition

 It is usually more powerful statistically, because (as we will see) it lets us remove subject-to-subject variability from the error term

Memories of the ANOVA logic

Recall that the purpose of doing an ANOVA is to see if some difference between treatment means is sufficiently large as to be unlikely to have occurred by chance (less than a 5% chance)

When we test that … we get an estimate of the difference we are interested in, and divide it by an estimate of variation due to chance

Specifically:

 F = MStreatment / MSerror

Notice that this F value will increase if the difference between the means is large OR if the measurement of error is small

As you will see, repeated-measures designs allow us to reduce the error term, thereby resulting in larger Fs (more power)

 

An example: Within versus Between

This experiment will show the importance of the "articulatory loop" for retaining information in short-term memory

Between-Subjects Version

bla-bla group                      no bla-bla group

Subject    X     X²                Subject    X     X²
   1       5     25                   1       7     49
   2       4     16                   2       6     36
   3       6     36                   3       5     25
   4       4     16                   4       6     36
   5       7     49                   5       6     36

 

Within-Subject Version

           bla-bla           no bla-bla
Subject    X     X²          X     X²
   1       5     25          7     49
   2       3      9          4     16
   3       7     49          7     49
   4       2      4          4     16
   5       3      9          5     25

 

Computations for the Between-Subjects Version

ΣX = 26 + 30 = 56          ΣX² = 324          N = 10

SStotal = ΣX² - (ΣX)²/N = 324 - 56²/10 = 324 - 313.6 = 10.4

SStreatment = Σ(T²/n) - (ΣX)²/N = (26² + 30²)/5 - 313.6 = 315.2 - 313.6 = 1.6

SSerror = SStotal - SStreatment = 10.4 - 1.6 = 8.8

MStreatment = 1.6 / 1 = 1.6          MSerror = 8.8 / 8 = 1.1

F = MStreatment / MSerror = 1.6 / 1.1 = 1.45

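As a check on the arithmetic, here is a minimal sketch (not from the text) of the same between-subjects ANOVA in Python with numpy; the arrays simply retype the example scores above.

```python
# Minimal sketch: one-way between-subjects ANOVA for the bla-bla example.
import numpy as np

bla    = np.array([5, 4, 6, 4, 7], dtype=float)   # bla-bla group
no_bla = np.array([7, 6, 5, 6, 6], dtype=float)   # no bla-bla group

grand_mean = np.concatenate([bla, no_bla]).mean()

# SS treatment: n * squared deviation of each group mean from the grand mean
ss_treat = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in (bla, no_bla))
# SS error: pooled squared deviations of scores from their own group mean
ss_error = sum(((g - g.mean()) ** 2).sum() for g in (bla, no_bla))

df_treat, df_error = 1, 8
F = (ss_treat / df_treat) / (ss_error / df_error)
print(ss_treat, ss_error, F)   # 1.6  8.8  F ≈ 1.45
```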

How is the within-subjects version different from the between-subjects version?

An assumption of the between-subjects ANOVA is that the observations in one level of the treatment are independent of those in the other level(s)

Hopefully you will notice that this assumption does not hold in our within-subjects version of the experiment 

The use of the same subject in more than one level of the treatment almost always builds in a dependency because subjects who do well in one level tend to also do well in the other(s)

Can we remove this dependency? In fact we can, and when we do, there is a bonus! (the kind of thing that makes statistics geeks really happy)

 

Getting Rid of the Variability Due to Subjects

The "dependency in observations" is due to some subjects doing better than others

What we are going to do to deal with this is to literally remove the variation due to subjects from the error term

For demonstration purposes only … you can think of this as subtracting each subject's mean from all the scores that subject contributes

Using the data from our class:

           bla-bla           no bla-bla
Subject    X     X′          X     X′
   1       5     -1          7     +1
   2       3    -0.5         4    +0.5
   3       7      0          7      0
   4       2     -1          4     +1
   5       3     -1          5     +1

 

Where X′ is each score minus that subject's mean across the two conditions (e.g., Subject 1's mean is (5 + 7)/2 = 6, so the 5 becomes -1 and the 7 becomes +1)

Within-Subjects Computations

ΣX = 20 + 27 = 47          ΣX² = 251          N = 10

SStotal = ΣX² - (ΣX)²/N = 251 - 47²/10 = 251 - 220.9 = 30.1

SSsubjects = Σ(S²/k) - (ΣX)²/N = (12² + 7² + 14² + 6² + 8²)/2 - 220.9 = 244.5 - 220.9 = 23.6
  (S = each subject's total across the two conditions; k = 2 conditions)

SStreatment = Σ(T²/n) - (ΣX)²/N = (20² + 27²)/5 - 220.9 = 225.8 - 220.9 = 4.9

SSerror = SStotal - SSsubjects - SStreatment = 30.1 - 23.6 - 4.9 = 1.6

MStreatment = 4.9 / 1 = 4.9          MSerror = 1.6 / 4 = 0.4

F = MStreatment / MSerror = 4.9 / 0.4 = 12.25


Source Tables

Between-subjects version:

Source        SS      df     MS      F
Treatment     1.60     1     1.60    1.45
Error         8.80     8     1.10
Total        10.40     9
   

 

Within-subjects version:

Source        SS      df     MS      F
Subject      23.60     4     5.90
Treatment     4.90     1     4.90   12.25
Error         1.60     4     0.40
Total        30.10     9
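
And a companion sketch (again, not from the text) that carries out the repeated-measures partition on the within-subjects version, pulling the subject-to-subject variability (23.6 of the 30.1 total) out of the error term:

```python
# Minimal sketch: one-way repeated-measures ANOVA for the within-subjects
# version of the bla-bla example (5 subjects x 2 conditions).
import numpy as np

# rows = subjects, columns = conditions (bla-bla, no bla-bla)
scores = np.array([[5, 7],
                   [3, 4],
                   [7, 7],
                   [2, 4],
                   [3, 5]], dtype=float)

n_subj, k = scores.shape
grand_mean = scores.mean()

ss_total    = ((scores - grand_mean) ** 2).sum()
ss_subjects = k * ((scores.mean(axis=1) - grand_mean) ** 2).sum()
ss_treat    = n_subj * ((scores.mean(axis=0) - grand_mean) ** 2).sum()
ss_error    = ss_total - ss_subjects - ss_treat   # subject variability removed

F = (ss_treat / (k - 1)) / (ss_error / ((n_subj - 1) * (k - 1)))
print(ss_subjects, ss_treat, ss_error, F)   # 23.6  4.9  1.6  F = 12.25
```

Notice how small the error term becomes once the subject variability is removed from it.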
   

 

The Advantage of Within-Subject Designs

Remember, F values are increased if the difference of interest is larger OR if the measure of error variance (MSerror) gets smaller

While removing the sum of squares due to subjects does make the observations independent across levels of the treatment variable, it OFTEN reduces the MSerror, thereby resulting in increased power (larger F values)

This only occurs, though, if the reduction in MSerror more than compensates for the loss of dferror … so it is not always true

Note that you cannot remove the variance (sum of squares) due to subjects when using a between subjects design because you only have one observation per subject … thus the variance due to subjects must remain as part of the error term

Moral: Usually, it is better to use within-subject (repeated measures) designs … not only do they let you use fewer subjects, but they are also more powerful, statistically speaking

 

Assumption of Compound Symmetry

Remember that when we did between-subjects ANOVAs, one of the assumptions was that the variances in our various treatment groups were homogeneous (i.e., roughly equivalent)

A similar but slightly more complex assumption underlies repeated measures designs

Specifically, we need to satisfy the "compound symmetry" assumption which is that in addition to the variances being equal, the covariances between pairs of variables are also equal

For this to make sense, I think we may have to do a B07 time travel to re-introduce the notion of covariance ….

Imagine any two variables such as …

 

Subject    Height (X)    Weight (Y)
   1           69           108
   2           61           130
   3           68           135
   4           66           135
   5           66           120
   6           63           115
   7           72           150
   8           62           105
   9           62           115
  10           67           145
  11           66           132
  12           63           120
Mean          65.42        125.83

Sum(X) = 785          Sum(Y) = 1510
Sum(X²) = 51473       Sum(Y²) = 192238

Sum(XY) = 99064

 

 The covariance of these variables is computed as:

 covXY = Σ(X - X̄)(Y - Ȳ) / (N - 1)

But what does it mean?

The covariance formula should look familiar to you. If all the Ys were exchanged for Xs, the covariance formula would be the variance formula

Note what this formula is doing, however: it captures the degree to which pairs of points systematically vary around their respective means

If paired X and Y values tend to both be above or below their means at the same time, this will lead to a high positive covariance

However, if the paired X and Y values tend to be on opposite sides of their respective means, this will lead to a high negative covariance

If there are no systematic tendencies of the sort mentioned above, the covariance will tend towards zero

 

The Computational Formula for Cov

Given its similarity to the variance formula, it shouldn’t surprise you that there is also a computationally more workable version of the covariance formula:

 covXY = [ΣXY - (ΣX)(ΣY)/N] / (N - 1)

For our height versus weight example then:

 covXY = [99064 - (785)(1510)/12] / 11 = (99064 - 98779.17) / 11 = 284.83 / 11 = 25.89

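If you want to verify that number, here is a minimal sketch (not from the text) that computes the covariance with both the definitional and the computational formula; the arrays retype the height/weight table above.

```python
# Minimal sketch: the height/weight covariance, computed two ways.
import numpy as np

height = np.array([69, 61, 68, 66, 66, 63, 72, 62, 62, 67, 66, 63], dtype=float)
weight = np.array([108, 130, 135, 135, 120, 115, 150, 105, 115, 145, 132, 120], dtype=float)
N = len(height)

# definitional formula: sum of cross-products of deviations, over N - 1
cov_def = ((height - height.mean()) * (weight - weight.mean())).sum() / (N - 1)

# computational formula: [sum(XY) - sum(X)sum(Y)/N] / (N - 1)
cov_comp = ((height * weight).sum() - height.sum() * weight.sum() / N) / (N - 1)

print(cov_def, cov_comp)               # both ≈ 25.89
print(np.cov(height, weight)[0, 1])    # numpy agrees
```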

Back to Compound Symmetry

OK, now let’s assume we ran a repeated measures study in which we were looking at practice effects on some task over 3 days

 

 
          Day 1    Day 2    Day 3
Sub 1      700      650      620
Sub 2      520      450      430
Sub 3      600      540      500
Sub 4      650      630      620
Sub 5      750      700      690
Variance  7930     9830    10970

Computing the covariance between each pair of days (with the computational formula above) gives:

cov(Day 1, Day 2) = 8705        cov(Day 1, Day 3) = 9065        cov(Day 2, Day 3) = 10315


The Covariance (Variance/Covariance) Matrix

These variances and covariances are often presented in a matrix such as the following:

           Day 1     Day 2     Day 3
Day 1      7930      8705      9065
Day 2      8705      9830     10315
Day 3      9065     10315     10970

So, the assumption of compound symmetry is simply that the variances must all be approximately equal and the covariances must all be approximately equal

The covariances need not (and often do not) equal the variances, though
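
Here is a minimal sketch (not from the text) of the variance/covariance matrix for the practice data above, which is exactly the object the compound symmetry assumption is about:

```python
# Minimal sketch: variance/covariance matrix for the 3-day practice data.
import numpy as np

days = np.array([[700, 650, 620],
                 [520, 450, 430],
                 [600, 540, 500],
                 [650, 630, 620],
                 [750, 700, 690]], dtype=float)   # rows = subjects, cols = days

vc = np.cov(days, rowvar=False)   # 3 x 3 variance/covariance matrix
print(vc)
# diagonal = variances   (7930, 9830, 10970)
# off-diag = covariances (8705, 9065, 10315)

# rough eyeball check of compound symmetry: variances roughly equal to each
# other, and covariances roughly equal to each other?
print(np.diag(vc))                     # the three variances
print(vc[np.triu_indices(3, k=1)])     # the three covariances
```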

 

Complicating it all

So far in this chapter, we have been dealing with only one variable that has been manipulated in a within-subject manner

However, as we saw in Chapter 13, studies usually manipulate more than one variable which raises several possibilities

2 variables

2 between subject variables .. Chapter 13
1 within - 1 between
2 within

 

3 variables

3 between … Chapter 13
1 within - 2 between
2 within - 1 between
3 within

 

Computationally, we will only focus on the 2 new "2 variable" situations

However, as was the case with 3 between subject variables, I will expect you to be able to interpret 3 variable results … we will spend time doing this as well

 

One Between - One Within

Imagine the following study (raw data are presented in the text, p. 459)

Similar to Siegel’s morphine tolerance study, King (1986) was interested in conditioned tolerance to another drug … midazolam

 

The Data, Steve Style

(SS = ΣX² for the scores in that row; in the group Mean rows, the SS entry is the group total)

Group      Sub      1      2      3      4      5      6        SS    Mean

Control     1     150     44     71     59    132     74     55858      88
            2     335    270    156    160    118    230    301885     212
            3     149     52     91    115     43    154     71976     101
            4     159     31    127    212     71    224    142532     137
            5     159      0     35     75     71     34     38328      62
            6     292    125    184    246    225    170    274786     207
            7     297    187     66     96    209     74    185907     155
            8     170     37     42     66    114     81     55946      85
   Mean           214     93     97    129    123    130   1127218     131

Same        1     346    175    177    192    239    140    295255     212
            2     426    329    236     76    102    232    415417     234
            3     359    238    183    123    183     30    268532     186
            4     272     60     82     85    101     98    111338     116
            5     200    271    263    216    241    227    338876     236
            6     366    291    263    144    220    180    389342     244
            7     371    364    270    308    219    267    557151     300
            8     497    402    294    216    284    255    687386     325
   Mean           355    266    221    170    199    179   3063297     232

Differ      1     282    186    225    134    189    169    246983     198
            2     317     31     85    120    131    205    182261     148
            3     362    104    144    114    115    127    204946     161
            4     338    132     91     77    108    169    186103     153
            5     263     94    141    142    120    195    170475     159
            6     138     38     16     95     39     55     34315      64
            7     329     62     62      6     93     67    129103     103
            8     292    139    104    184    193    122    201390     172
   Mean           290     98    109    109    124    139   1355576     145

Grand Mean        286    153    142    136    148    149               169

 

The Dreaded Computations

Just like when we had two between-subject variables, there are three effects of interest in the current experiment:

  • The main effect of Group
  • The main effect of Interval
  • The Group x Interval interaction

However, recall that we can (and do) use a different error term when testing within-subject effects than when testing between-subject effects

     

    SStotal (by the way) = 1432293

     

    So, the first thing we must do is to decide which effects are purely between-subjects, and which have a within-subject component

For this study, Group was manipulated between-subjects, but both Interval and the Group x Interval interaction have a within-subject component (i.e., Interval)

    OK, now we separately deal with our between and within-subject effects

     

    Between-Subject Effects

We treat between-subjects effects like we always have: we calculate SSgroup from the squared deviations of the group means around the grand mean (times the number of observations contributing to each group mean), and we calculate the error term, SSSs/grp, from the variability of the subject means within each group

SSgroup = nk · Σ(X̄group - X̄grand)² = 48 · [(131 - 169)² + (232 - 169)² + (145 - 169)²] = 287472

SSSs/grp = k · Σ(X̄subject - X̄group)² = 384726

(where g = 3 groups, n = 8 subjects per group, and k = 6 intervals)

dfgroup = g - 1 = 3 - 1 = 2
dfSs/grp = g(n - 1) = 3(7) = 21
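
As a quick check, a sketch (not from the text) that recomputes SSgroup from the rounded group means in the data table; it reproduces the 287472 above:

```python
# SSgroup from the (rounded) group means: 48 observations contribute to each
# group mean (8 subjects x 6 intervals).
group_means = [131, 232, 145]     # Control, Same, Differ (rounded)
grand_mean  = 169                 # rounded grand mean

ss_group = 48 * sum((m - grand_mean) ** 2 for m in group_means)
print(ss_group)   # 287472
```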

     

    Within-Subject Effects

OK, for starters, the sums of squares for the Interval and interaction effects are calculated just like we did in the two between-subjects case

SSint = ng · Σ(X̄interval - X̄grand)² = 399744

SScells = n · Σ(X̄cell - X̄grand)² = 766368
(one cell mean for each of the 3 x 6 = 18 Group x Interval combinations)

SSgrp * int = SScells - SSint - SSgrp
            = 766368 - 399744 - 287472
            = 79152

dfint = k - 1 = 6 - 1 = 5
dfgrp * int = dfgrp * dfint = 2 * 5 = 10

     

    The Within-Subject Error Term

Remember that when we are dealing with within-subject effects, we use a different error term (one that does not include subject-to-subject variability)

    Given the computations we have done so far, we can get the rest by subtraction …

     

Source               SS        df      MS        F
Between subjects     672198¹
  Group              287472     2      143736     7.85
  Ss/Group           384726    21       18320
Within subjects      760095²
  Interval           399744     5       79949    29.85
  Grp X Int           79152    10        7915     2.96
  Ss/Grp X Int       281199³   105⁴      2678
Total               1432293   143

¹ Obtained by adding SSGroup and SSSs/Group
² Obtained by subtracting SSBetween from SSTotal
³ Obtained by subtracting SSInterval and SSGrp X Int from SSWithin
⁴ Obtained by subtracting dfGroup, dfSs/Group, dfInterval, and dfGrp X Int from dfTotal

     

     

    Critical F’s:

    F(2,21) = 3.49 F(5,105) = 2.37 F(10,105) = 1.99
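
For anyone who wants to check this kind of partition by machine, here is a sketch (not from the text) of the sums-of-squares breakdown for a one between, one within design. It assumes the raw scores sit in an array `data` of shape (groups, subjects per group, intervals); because the source table above was assembled from rounded means, values computed this way from the raw data will differ slightly from it.

```python
# Sketch: sums of squares for a mixed (one between, one within) design.
# data has shape (g groups, n subjects per group, k repeated measures).
import numpy as np

def mixed_anova_ss(data):
    g, n, k = data.shape
    grand = data.mean()

    ss_total = ((data - grand) ** 2).sum()

    # between-subjects part
    subj_means  = data.mean(axis=2)                # (g, n)
    group_means = data.mean(axis=(1, 2))           # (g,)
    ss_between_subj = k * ((subj_means - grand) ** 2).sum()
    ss_group        = n * k * ((group_means - grand) ** 2).sum()
    ss_subj_within  = ss_between_subj - ss_group   # Ss/Group (error for Group)

    # within-subjects part
    int_means  = data.mean(axis=(0, 1))            # (k,)
    cell_means = data.mean(axis=1)                 # (g, k)
    ss_interval = g * n * ((int_means - grand) ** 2).sum()
    ss_cells    = n * ((cell_means - grand) ** 2).sum()
    ss_grp_int  = ss_cells - ss_group - ss_interval
    ss_error    = ss_total - ss_between_subj - ss_interval - ss_grp_int  # Ss/Grp x Int

    ss = dict(group=ss_group, subj_within=ss_subj_within, interval=ss_interval,
              grp_int=ss_grp_int, error=ss_error, total=ss_total)
    df = dict(group=g - 1, subj_within=g * (n - 1), interval=k - 1,
              grp_int=(g - 1) * (k - 1), error=g * (n - 1) * (k - 1))
    return ss, df

# F ratios would then be:
#   F_group    = (ss group / df group)       / (ss subj_within / df subj_within)
#   F_interval = (ss interval / df interval) / (ss error / df error)
#   F_grp_int  = (ss grp_int / df grp_int)   / (ss error / df error)
```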

     

    Conclusions from the Anova

    Main Effect of Group 

We can reject the null hypothesis that there was no effect of group. The F-obtained for the main effect of group was greater than the critical F, suggesting that there are differences among the three group means. From looking at the means, it appears that this is mostly due to the mean for the "Same" group being much higher than the other two means.

     

     

    Main Effect of Interval

We can also reject the null hypothesis that there was no effect of interval. The F-obtained for the main effect of interval was greater than the critical F, suggesting that there are differences among the six interval means. From the means, it appears as though activity was very high in the first interval, then dropped off and stayed relatively constant.

     

Interval:    1      2      3      4      5      6
Mean:       286    153    142    136    148    149

     

    Interaction of Group * Interval

    Finally, we can also reject the null hypothesis that the effect of interval was the same for the three groups. The F-obtained for the interaction was greater than the critical F suggesting that the effect of interval is different for the three groups. From the means, it appears as though the "Same" group stayed active longer (across more of the early intervals) than the other groups.

     

     
            1      2      3      4      5      6
Control    214     93     97    129    123    130
Same       355    266    221    170    199    179
Differ     290     98    109    109    124    139

     

     

     

    *** Chapter 13 Flashback***

     

     
            2 mins    5 mins    10 mins    Mean
Phobic         7         8          9        8
Control        5         5          5        5
Mean           6         6.5        7        6.5

     

Source     df     SS     MS       F
Time        2      8      4      13.79
Group       1    108    108     372.41
T x G       2      8      4      13.79
Within     42     12      0.29
Total      47    136
       

    *** Chapter 13 Flashback***

     

    Simple Effects for the effect of time at each level of group.

     

     

Source          df     SS     MS      F
T for Phob       2     16      8     27.59
T for Cont       2      0      0      0
Within          42     12      0.29
Total           47    136
     
     

     

    So, we could describe the interaction by saying that fear increased over time for phobics, but fear did not change at all over time for the controls

    *** Chapter 13 Flashback***

     

    Simple Effects

    As was the case when we had two between subject variables, we will often want to do simple-effects analyses to gain a better understanding of the interaction 

Recall that there are two ways we could approach these analyses; we could ask

  • At which intervals was there a significant effect of group (i.e., a difference between the groups)?, or
  • For which groups was there a significant effect of interval?

Here it makes sense to look at the interaction and consider the experimental predictions to determine which of these approaches is likely to yield the information you want

    Since the predictions are focused primarily on potential differences between groups (or lack of differences), the first approach is the one we would want to take in this case 

    Nonetheless, we will briefly consider both situations

    Simple Effects for Within-Subject Variables

We had decided that in our situation we were not interested in looking at the effect of interval separately for each group

    But, if we had been, then we would have been examining the effect of a within-subject variable (interval)

For reasons that are not important here, whenever you are doing simple effects that focus on a within-subject variable, you cannot use a general (pooled) error term (like, for example, the Ss/Grp X Int term from the overall analysis)

    Instead, what you do is a separate one-way, repeated measures analysis of variance for each simple effect

    So, for example, if you were interested in the effect of interval for the control group, you would run a complete repeated measures ANOVA examining the interval variable but using only the data from the control group
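
A sketch of what that would look like in code (not from the text; `control` is a hypothetical 8 x 6 array holding the control group's scores from the data table): the same one-way repeated-measures partition sketched earlier, wrapped as a function and applied to that group alone.

```python
# Sketch: simple effect of Interval for one group = a one-way repeated-measures
# ANOVA run on that group's data alone (here, an 8-subjects x 6-intervals array).
import numpy as np

def rm_anova(scores):
    """One-way repeated-measures ANOVA; scores has shape (subjects, levels)."""
    n, k = scores.shape
    grand = scores.mean()
    ss_total    = ((scores - grand) ** 2).sum()
    ss_subjects = k * ((scores.mean(axis=1) - grand) ** 2).sum()
    ss_treat    = n * ((scores.mean(axis=0) - grand) ** 2).sum()
    ss_error    = ss_total - ss_subjects - ss_treat
    F = (ss_treat / (k - 1)) / (ss_error / ((n - 1) * (k - 1)))
    return F, (k - 1), (n - 1) * (k - 1)

# hypothetical usage: control is the 8 x 6 array of control-group scores
# F, df1, df2 = rm_anova(control)
```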

     

    Simple Effects for Between-Subject Variables

Step 1: Computing sums of squares for the effect of group at each interval

SSGrp at Int j = n · Σ(X̄cell - X̄int j)², where n = 8 subjects per cell

For example, at Interval 1 (using the rounded cell means from the table above):

SSGrp at Int1 = 8 · [(214 - 286)² + (355 - 286)² + (290 - 286)²] = 8 · 9961 = 79688

The SS for the other intervals are computed the same way; all six appear in Step 2.

Step 2: Mean Squares for the group effects at each interval

    Since there are three groups at each interval, there are 2 degrees of freedom for each contrast

     

    MS = SS/df, so …

    MSGrp at Int1 = 79688 / 2 = 39844.00
    MSGrp at Int2 = 155125 / 2 = 77562.50
    MSGrp at Int3 = 74840 / 2 = 37420.00
    MSGrp at Int4 = 15472 / 2 = 7736.00
    MSGrp at Int5 = 30416 / 2 = 15208.00
    MSGrp at Int6 = 10888 / 2 = 5444.00
    Step 3: The error term

    OK, here is where we differ from the Chapter 13 way of doing things

    The appropriate error term SS is the SSSs/Cell

    We could calculate that by hand but it would take a lot of work

    In the "trust me" category, I give you the following:

    SSSs/Cell = SSSs/Group + SSSs/Grp X Int, and

    dfSs/Cell = dfSs/Group + dfSs/Grp X Int

     

    So, for our example …

     

    SSSs/Cell = SSSs/Group + SSSs/Grp X Int

    = 384726 + 281199 = 665925

     

    dfSs/Cell = dfSs/Group + dfSs/Grp X Int

    = 21 + 105 = 126

     

    MSSs/Cell = SSSs/Cell / dfSs/Cell

    = 665925 / 126 = 5285.12

     

Step 4: Source table depicting results

Source          df      SS        MS         F
Grp at Int1      2     79688     39844       7.54
Grp at Int2      2    155125     77562.5    14.68
Grp at Int3      2     74840     37420       7.08
Grp at Int4      2     15472      7736       1.46
Grp at Int5      2     30416     15208       2.88
Grp at Int6      2     10888      5444       1.03
Ss/Cell        126    665925      5285.12
Total          143   1432293
       

    Fcrit(2,126) = 3.07
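
As a last sketch (not from the text), the F values and the critical F above can be reproduced from the sums of squares in a few lines; scipy's F distribution supplies the critical value.

```python
# Sketch: F ratios for the simple effects of Group at each interval,
# using the SS values from the source table above.
from scipy.stats import f

ss_grp_at_int = [79688, 155125, 74840, 15472, 30416, 10888]
df_effect, df_error = 2, 126
ms_error = 665925 / df_error          # MS Ss/Cell = 5285.12

for i, ss in enumerate(ss_grp_at_int, start=1):
    F = (ss / df_effect) / ms_error
    print(f"Grp at Int{i}: F = {F:.2f}")

print("F crit (2, 126) =", round(f.ppf(0.95, df_effect, df_error), 2))   # 3.07
```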