Chapter 13

Factorial Analysis of Variance

 


 

In Chapter 12 we focused on "one-way" analysis of variance, which is the appropriate analysis when you have only one variable (or factor) with multiple levels.

In the current chapter we will focus instead on situations where we have multiple variables, each with multiple levels.

For example, perceived "fairness" of the midterm as a function of gender and year:

 

 
Year       Males    Females
First
Second
Third
Fourth+

 

We can ask three questions: (1) Do opinions of fairness differ across the genders? (2) Do opinions of fairness differ across the years? (3) Is the effect of year on opinions different for the different genders?

Terminology

Main effects. The first two questions are examples of what we will be calling "main effects". One way to think of main effects is the following. Assume we have variables A & B; the main effect of A would be whether there was an effect of A collapsing across levels of B. It is as if the variable B were of no interest.

Interactions. The third question is an example of an interaction. In words an interaction can be stated in the following manner. Two variables are interacting when the effect of the first variable is different at different levels of the second variable.

Simple Effects. We will also be talking about simple effects. Simple effects relate to questions like: if we only consider second-year students, do opinions concerning the exam differ depending on gender?

In-class example with memory for words of different imageability and frequency.

 

Stimuli

High Frequency

High imageability    Low imageability
bible                extra
chest                theme
snake                skill
china                ideal
shore                quick
grass                trust
beach                allow
smile                brief
dress                trent
chain                worse

Low Frequency

High imageability    Low imageability
yacht                audit
vault                wrath
dummy                tally
thorn                scorn
satin                hasty
witch                dread
brook                bland
coral                fraud
berry                proxy

 

Notation

                         Frequency
Imageability    High                    Low
High            4 4 3 3 4 3 4 5         2 4 1 3 6 2 4 4
                mean = 3.63             mean = 3.25
                SS = 5.88               SS = 17.50
Low             1 5 2 0 3 2 1 1         1 3 2 2 2 2 2 4
                mean = 1.88             mean = 2.25
                SS = 16.88              SS = 5.50

 

Plotting the Data


Main Effects:

Interaction:

Is the effect of frequency different at different levels of imageability?

Simple Effects:

Factorial Designs

The experiment we just ran used a factorial design. 

What that means is that we included all combinations of different levels of our two variables (sometimes called a fully crossed design)

Between- versus Within-Subjects Designs

We also had different subjects in each cell of the design. When you do that, you have a between-subjects design.

We could instead have tested all subjects in all conditions; that would be called a complete within-subjects design, because all the variables would be manipulated within subjects.

Finally, we could have a mixed design, in which one (or more) variables are within subjects and one (or more) other variables are between subjects.

Chapter 13 only considers between-subjects designs; Chapter 14 will consider within-subjects and mixed designs.

Computations in two-way ANOVA

Warning: Once again, my way of presenting this stuff will be different from the way the text does it.

 Logic:

Things that Stay the Same

SStotal - SStotal is still calculated as just the total sum of squares. Thus, ignoring all of the manipulated variables, just sum the data points and sum the squares of each data point, then:

SStotal = ΣX² − (ΣX)²/N

(The same formula, applied to the scores within a single cell with the cell n in place of N, gives the SS for that cell.)

SSwithin - SSwithin again simply equals the sum of the SS for each cell. Thus, you must first calculate the SS for each cell using the formula above, then sum them. I have done the individual SSs; from there:

 

SSwithin = SS11 + SS12 + SS21 + SS22

= 5.88 + 17.50 + 16.88 + 5.50 = 45.76

 

So, SStotal = 62.00 and SSwithin = 45.76 … so far, easy right?
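As a sanity check, the per-cell computation can be sketched in a few lines of Python (a hypothetical helper, not something from the text):

```python
# SS = sum(X^2) - (sum(X))^2 / n: the sum of squared deviations from the mean.
def ss(scores):
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n

# Two cells from the notation table above (high-imageability/low-frequency
# and low-imageability/low-frequency):
print(ss([2, 4, 1, 3, 6, 2, 4, 4]))  # 17.5
print(ss([1, 3, 2, 2, 2, 2, 2, 4]))  # 5.5
```

SSwithin is then just the sum of the four cell SS values.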

 

Things that Stay Pretty Much the Same

The SStreat is also calculated in pretty much the same way as before EXCEPT now you need to do an SStreat for each variable in the design.

Thus, for each variable we are going to compute an SS, which is simply the sum of squares representing the degree to which the means at each level of the variable deviate from the grand mean, multiplied by the number of scores contributing to each level mean (here 16: 8 per cell × 2 cells per level … because of CLT, remember?)

The grand mean for our data is 2.75.

For the frequency variable, the high-frequency mean is 2.76 and the low-frequency mean is 2.75. So:

SSfreq = 16 × [(2.76 − 2.75)² + (2.75 − 2.75)²] = 0.00

Similarly, for imageability the high mean is 3.44 and the low mean is 2.06:

SSimg = 16 × [(3.44 − 2.75)² + (2.06 − 2.75)²] = 15.02

Now for Something Completely Different

The last thing we want is the sum of squares due to the interaction between frequency and imageability

To get that we first calculate the SS for all of the cells in the design around the grand mean (multiplied by the n per cell):

SScells = 8 × [(3.63 − 2.75)² + (3.25 − 2.75)² + (1.88 − 2.75)² + (2.25 − 2.75)²] = 16.24

This "variance" is due to the interaction plus the two main effects, so by subtracting the main effects we are left with the SS for the interaction. So:

SSFxI = SScells − SSfreq − SSimg
      = 16.24 − 0.00 − 15.02 = 1.22
 

Now on to the source table … with a brief stop-over at degrees of freedom

 

Demonstration

 

 
        M   F                    M   F
Lo F    4   4   4       Lo F     6   2   4
Hi F    4   4   4       Hi F     2   6   4
        4   4                    4   4

(left: no main effects, no interaction; right: still no main effects, but a crossover interaction)
 

 

 

 

 
        M   F                    M   F
Lo F    6   6   6       Lo F     8   4   6
Hi F    4   4   4       Hi F     2   6   4
        5   5                    5   5

(left: a main effect of frequency only; right: a main effect of frequency plus an interaction)
 

 

 

Degrees of Freedom

dftotal is again N-1 … 32 - 1 = 31
dffreq is number of levels minus 1 … 2 - 1 = 1
dfimg is number of levels minus 1 … 2 - 1 = 1

dfFxI is dffreq x dfimg … 1 x 1 = 1

 dfwithin = dftotal - (dffreq + dfimg + dfFxI) = 31 - 3 = 28

 

 Source Table

Source    df    SS      MS      F
Freq       1    0.00    0.00    0.00
Image      1    15.02   15.02   9.21
F x I      1    1.22    1.22    0.75
Within    28    45.76   1.63
Total     31    62.00
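Assembling the source table is mechanical once the SS and df values are in hand: MS = SS/df for each row, and each effect's F is its MS divided by MSwithin. A sketch (my own; because MSwithin is not rounded here, the F for imageability prints as 9.19 rather than the 9.21 the notes get from the rounded 1.63):

```python
# Build source-table rows from the SS and df values computed above.
effects = [("Freq", 0.00, 1), ("Image", 15.02, 1), ("F x I", 1.22, 1)]
ss_within, df_within = 45.76, 28
ms_within = ss_within / df_within   # about 1.63

for name, ss, df in effects:
    ms = ss / df                    # mean square = SS / df
    f = ms / ms_within              # F = MS_effect / MS_within
    print(f"{name:8s} df={df:2d}  SS={ss:5.2f}  MS={ms:5.2f}  F={f:4.2f}")
```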
 

Remember Hypothesis Testing

Remember all this ANOVA stuff is done in the context of experimental hypotheses

In the case of a 2 × 2 ANOVA there are actually three null hypotheses: one for each main effect and one for the interaction.

For example:

H0: low frequency words are recalled as well as high frequency words
H0: low imageability words are recalled as well as high imageability words
H0: any possible effect of frequency is the same for high and low imageability items

Once a source table is obtained, each of these hypotheses is then tested by comparing the obtained F for that hypothesis to its appropriate critical F

 Simple Effects

Often the ANOVA will tell you that there is a significant interaction, but it stops there. 

To properly interpret an interaction we usually need more specific information than that.

For example, consider the following interactions:

In order to accurately describe these interactions, we have to know whether the effect of variable B is significant at each level of variable A

This involves simple effects tests

 

Computing Simple Effects

The computation of simple effects is no different from the other sums of squares we have been calculating, except that we focus on one row or column.

SS freq at hi image:

SSfreq at hi img = 8 × [(3.63 − 3.44)² + (3.25 − 3.44)²] = 0.58

SS freq at lo image:

SSfreq at lo img = 8 × [(1.88 − 2.06)² + (2.25 − 2.06)²] = 0.55

You evaluate these simple effects just like any other sum of squares: divide them by their df (number of cells compared minus 1) to get a MS, then divide that by MSerror to get an F.

For the above two examples, Fobtained = 0.36 and 0.34, respectively.

Neither is significant, implying that there was no frequency effect at either level of imageability.
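A simple-effects test can be sketched as a small function (mine, not the text's) that takes the cell means in one row and tests them against the overall MSwithin:

```python
# Simple effect: SS for one row of the design, tested against MS_within.
n_cell = 8
ms_within = 45.76 / 28   # from the source table

def simple_effect_f(cell_means):
    row_mean = sum(cell_means) / len(cell_means)
    ss = n_cell * sum((m - row_mean) ** 2 for m in cell_means)
    ms = ss / (len(cell_means) - 1)   # df = number of means compared - 1
    return ms / ms_within

f_hi = simple_effect_f([3.63, 3.25])   # frequency at high imageability
f_lo = simple_effect_f([1.88, 2.25])   # frequency at low imageability
print(round(f_hi, 2), round(f_lo, 2))
```

Both Fs come out far below any critical value, matching the conclusion above.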

 

An Example From the Top

 

 
            2 mins     5 mins     10 mins
Phobic      mean = 7   mean = 8   mean = 9    8
Control     mean = 5   mean = 5   mean = 5    5
            6          6.5        7           6.5

 

Say that I give you the following information:

n per cell = 8
SStotal = 136
SSwithin = 12

From the marginal and cell means:

SStime = 16 × [(6 − 6.5)² + (6.5 − 6.5)² + (7 − 6.5)²] = 8
SSgroup = 24 × [(8 − 6.5)² + (5 − 6.5)²] = 108
SScells = 8 × [(7 − 6.5)² + (8 − 6.5)² + (9 − 6.5)² + 3 × (5 − 6.5)²] = 124
SSTxG = SScells − SStime − SSgroup = 124 − 8 − 108 = 8

Source    df    SS     MS      F
Time       2    8      4       13.79
Group      1    108    108     372.41
T x G      2    8      4       13.79
Within    42    12     0.29
Total     47    136
   
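The whole partition can be recovered from the cell means alone; a sketch (my own naming) using the means in the table above:

```python
# Recover the SS partition for the phobia example from the cell means.
n_cell, grand = 8, 6.5
phobic, control = [7, 8, 9], [5, 5, 5]   # cell means at 2, 5, and 10 mins

time_means = [(p + c) / 2 for p, c in zip(phobic, control)]   # 6, 6.5, 7
group_means = [sum(phobic) / 3, sum(control) / 3]             # 8, 5

ss_time = 2 * n_cell * sum((m - grand) ** 2 for m in time_means)    # 16 scores per time mean
ss_group = 3 * n_cell * sum((m - grand) ** 2 for m in group_means)  # 24 scores per group mean
ss_cells = n_cell * sum((m - grand) ** 2 for m in phobic + control)
ss_txg = ss_cells - ss_time - ss_group
print(ss_time, ss_group, ss_txg)   # 8.0 108.0 8.0
```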

Simple Effects

To better understand the interaction, we could either look at the effect of group at each level of time, or look at the effect of time at each level of group.

We will do the latter, as it seems to make the most sense. So:

SStime for phobics = 8 × [(7 − 8)² + (8 − 8)² + (9 − 8)²] = 16
SStime for controls = 8 × [(5 − 5)² + (5 − 5)² + (5 − 5)²] = 0

Source        df    SS    MS     F
T for Phob     2    16    8      27.59
T for Cont     2    0     0      0.00
Within        42    12    0.29
Total         47    136
 

 

Multiple Comparisons

C-T1    C-T2    C-T3    Ph-T1    Ph-T2    Ph-T3
5       5       5       7        8        9 

 

 

 
 
         mean   C-T2   C-T3   Ph-T1   Ph-T2   Ph-T3    Wr
C-T1      5      0      0      2*      3*      4*      0.81
C-T2      5             0      2*      3*      4*       …
C-T3      5                    2*      3*      4*       …
Ph-T1     7                            1*      2*       …
Ph-T2     8                                    1*       …
Ph-T3     9

(* = difference exceeds the critical range Wr)

 

C-T1    C-T2    C-T3    Ph-T1    Ph-T2    Ph-T3
5       5       5       7        8         9 
--------------------    -----    -----    -----

 

In words then, these results suggest the following.

First, for control subjects time had no effect at all: their mean fear level did not differ across the three times examined.

At all of the times tested, the phobic subjects showed more fear than the control subjects.

Each additional amount of time significantly increased the fear level of the phobic subjects such that they were more scared at 5 mins than 2 mins, and even more scared at 10 mins than 5 mins

The moral of the multiple-comparisons part of this chapter is that when you do multiple comparisons in a factorial design, you basically act as though it were a single-factor design, with each cell of the multi-factor design being a level of that single factor.

 

Magnitude of the Effect

As described in Chapter 11, it is often desirable to quantify the magnitude of an observed effect

This is also true in factorial designs, the only difference being that you now have multiple effects that can be quantified.

Once again, one can use η² (the SS relevant to the effect divided by SStotal) as a quick and dirty way of calculating how much of the total variation in the data was due to the variable of interest.

However, as mentioned, η² is biased in that it overestimates the true magnitude of an effect.

The textbook goes into a description of a revised ω² estimate that can be calculated for factorial designs.

However, for our purposes, you don't have to worry about understanding that.

Instead, know why you would want to calculate the magnitude of an effect, know how to do so via η², know that η² is a biased estimator and that ω² is better, and know that if you ever need to calculate ω² the text shows you how.
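As a quick sketch (my own, using the source-table values from the word-memory example), η² for each effect is just its SS over SStotal:

```python
# eta^2 = SS_effect / SS_total for each effect in the word-memory example.
ss_total = 62.00
for name, ss in [("Freq", 0.00), ("Image", 15.02), ("F x I", 1.22)]:
    print(f"eta^2 for {name}: {ss / ss_total:.3f}")
```

So imageability accounts for roughly a quarter of the total variation, and frequency for essentially none.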

 

Power Analysis for Factorial Experiments

Recall again that power is the probability that you will be able to reject a null hypothesis.

Power depends on the size of the effect you expect AND the number of subjects you plan to run

In Chapter 11 we said that to calculate power in a one-way ANOVA, we do the following:

Step 1: Calculate

φ′ = √[ Σ(μj − μ̄)² / k ] / σe

Step 2: Convert to

φ = φ′ √n

Step 3: Get the associated β value from the noncentral F (ncF) table

Power = 1 − β

Now focus on Step 1. That formula can be restated in sum-of-squares terms: take the sum of squares relevant to the effect we are interested in, divide it by n and by k, divide that by the mean squared error, and take the square root.

So, let's say we are using a 2-way factorial design. Now we have 3 null hypotheses: (1) the main effect of A, (2) the main effect of B, and (3) the interaction of A & B.

Assuming you have some estimate of the mean squared error …

All you need to do to find the power associated with these nulls is to estimate (based on past research or an educated guess) what you think your final means will look like. With those estimates, in combination with your intended n, you can compute the sums of squares and use the exact same logic as before.

The only real difference is that we now have 3 power analyses we could do (assuming 2 variables)
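The first two steps can be sketched for one planned effect. Everything numeric below is an illustrative guess (imageability level means of 3.4 and 2.1, error variance 1.63, 16 subjects per level), not a prescription:

```python
import math

# Power inputs phi' and phi for one planned effect, from guessed means.
def phi_prime(level_means, ms_error):
    k = len(level_means)
    grand = sum(level_means) / k
    # sqrt( sum of squared mean deviations / k ) / sigma_e
    return math.sqrt(sum((m - grand) ** 2 for m in level_means) / k) / math.sqrt(ms_error)

def phi(level_means, ms_error, n_per_level):
    # Step 2: scale by the square root of n
    return phi_prime(level_means, ms_error) * math.sqrt(n_per_level)

p = phi([3.4, 2.1], 1.63, n_per_level=16)
print(round(p, 2))
```

You would then enter the ncF table with this φ and the effect's df to read off β, and power = 1 − β.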

Note: Read the meat of these sections in the text (ignoring their computations if you like)

 

Unequal Sample Sizes

Unequal sample sizes cause big problems for factorial designs because they break the independence of the two variables, allowing effects of one variable to produce apparent effects in the other.

Consider the following example from the text:

 
            Non-Drink           Drink
Michigan    13 15 14 16 12      18 20 22 19 21
                                23 17 18 22 20
            mean = 14           mean = 20           18.0

Arizona     13 15 18 14 10      24 25 17 16 18
            12 16 17 15 10      mean = 20           15.9
            mean = 14

            14.0                20.0

 

If you look at the actual cell means, there is clearly no effect of state. However, if you look at the row means, there appears to be an effect of state.

The apparent effect of state is due to the "drinking" effect combined with the unequal ns in the various cells.

Rough Solution to Unequal ns

The row and column means we calculated are what are called "weighted" means

We could similarly compute an "unweighted" row mean, which would simply be the mean of the cell means, as opposed to the mean of all the numbers that went into the cell means.

Note that when ns are equal, the weighted and unweighted means are the same

However, if we calculate unweighted means in the previous example, notice that they seem to provide a better depiction of the cell data (means of 17 for both states)

We could then do our analysis using the unweighted means instead

However, in order to do this we have to "act as though" we were in an equal-n situation with those row and column means … but what n do we use? The conventional answer is the harmonic mean of the cell sizes:

nh = k / (1/n1 + 1/n2 + … + 1/nk)

which here is nh = 4 / (1/5 + 1/10 + 1/10 + 1/5) = 6.67
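A sketch of the weighted versus unweighted marginal means for the drinking example, plus the harmonic mean n. It works from the rounded cell means, so the weighted values may differ slightly from those printed in the table:

```python
# Weighted vs. unweighted row means, and the harmonic mean of the cell ns.
cells = {  # (state, condition): (cell mean, cell n)
    ("Michigan", "non-drink"): (14, 5),  ("Michigan", "drink"): (20, 10),
    ("Arizona", "non-drink"):  (14, 10), ("Arizona", "drink"):  (20, 5),
}
for state in ("Michigan", "Arizona"):
    m1, n1 = cells[(state, "non-drink")]
    m2, n2 = cells[(state, "drink")]
    weighted = (m1 * n1 + m2 * n2) / (n1 + n2)   # mean of all raw scores
    unweighted = (m1 + m2) / 2                   # mean of the cell means
    print(state, weighted, unweighted)

# Harmonic mean of the cell ns: the "acting n" for an unweighted analysis.
ns = [n for (_, n) in cells.values()]
n_h = len(ns) / sum(1 / n for n in ns)
print(round(n_h, 2))
```

The unweighted means come out at 17 for both states, removing the spurious state effect.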

Higher-Order Factorial Designs

So far we have been focusing on experiments that manipulate 2 variables at a time … however, often an experimenter will manipulate three or more variables

Say we have three variables; then we actually have 3 main effects, 3 two-way interactions, and 1 three-way interaction.

 For example:

A prof wants to better understand the factors that affect performance in Psych C08. He thinks three variables are important: 1) understanding of basic statistics, which he thinks is reflected in the student's B07 mark, 2) the textbook, and 3) the use of quizzes to keep the students' attention.

So, he chooses to teach 4 versions of his class next year, representing the cells of a textbook (old vs. new) by quiz (have vs. not have) design. However, he also splits performance by mark in B07 (B or better vs. less than B).

Assume he gets the following data:

 

         Less than B             B or better
            Text                    Text
         Old     New             Old     New
Quiz      79      74     Quiz     85      80
No Qz     68      63     No Qz    75      80
 

Assuming there was an equal number of subjects in each cell … then what about the following?

Main Effect of B07 grade?
Main Effect of Text Book?
Main Effect of Quiz?
Interaction of B07 by Text?
Interaction of B07 by Quiz?
Interaction of Text by Quiz?

 3-way interaction (B07 by Text by Quiz)?

P.S. - Forget about the computations for now; just worry about being able to interpret the data.
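One way to eyeball the three-way interaction: compute the Text × Quiz interaction contrast separately at each B07 level. If the contrast differs across levels, the three variables interact. A sketch (cell labels are mine; values are from the table above):

```python
# Text x Quiz interaction contrast at each B07 level.
cells = {  # (grade, quiz, text): cell mean
    ("<B", "quiz", "old"): 79,  ("<B", "quiz", "new"): 74,
    ("<B", "noqz", "old"): 68,  ("<B", "noqz", "new"): 63,
    (">=B", "quiz", "old"): 85, (">=B", "quiz", "new"): 80,
    (">=B", "noqz", "old"): 75, (">=B", "noqz", "new"): 80,
}
contrasts = {}
for grade in ("<B", ">=B"):
    quiz_effect_old = cells[(grade, "quiz", "old")] - cells[(grade, "noqz", "old")]
    quiz_effect_new = cells[(grade, "quiz", "new")] - cells[(grade, "noqz", "new")]
    contrasts[grade] = quiz_effect_old - quiz_effect_new
print(contrasts)   # {'<B': 0, '>=B': 10}
```

A contrast of 0 at one grade level but 10 at the other is exactly the pattern a three-way interaction describes: the Text × Quiz interaction itself depends on B07 grade.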

For example, on a test you might get something like we have been discussing along with the following source table:

Note: I made up the entire source table below; if you did the computations on the data above you would not get these numbers.

 

Source         SS    df    MS    F
B07 grade      65     1    65    6.50
Text           20     1    20    2.00
Quiz           38     1    38    3.80
B07 x Text     15     1    15    1.50
B07 x Quiz     34     1    34    3.40
Text x Quiz    12     1    12    1.20
B07 x T x Q    41     1    41    4.10
Within        560    56    10
Total         785    63
 
 

Based on this I could ask you to describe the results of the experiment … could you?