Multivariate Statistics: Concepts, Models, and Applications
David W. Stockburger
Suppose a statistics teacher gives an essay final to his class. He randomly divides the class in half such that half the class writes the final in a bluebook and half on notebook computers. In addition, the students are partitioned into three groups: no typing ability, some typing ability, and highly skilled at typing. Answers written in bluebooks will be transcribed to word processors and scoring will be done blindly. Not with a blindfold, but the instructor will not know the method or skill level of the student when scoring the final. The dependent measure will be the score on the essay part of the final exam.
The first factor will be called Method and will have two levels, bluebook and computer. The second factor will be designated as Ability and will have three levels: none, some, and lots. Each subject will be measured a single time. Any effects discovered will necessarily be between subjects or groups and hence the designation "between groups" designs.
In the case of the example data, the Method factor has two levels while the Ability factor has three. The X variable is the score on the final exam. The example data file appears below.
The analysis is done in SPSS/WIN by selecting "Statistics", "General Linear Model", and then "GLM - General Factorial." In the next screen, the Dependent Variable is X and the Fixed Factors are Ability and Method. The screen will appear as follows.
The only option selected under "Options" in this example is "Descriptive Statistics", found under "Display." This will produce the table of means and standard deviations.
The interpretation of the output from the General Linear Model command will focus on two parts: the table of means and the ANOVA summary table. The table of means is the primary focus of the analysis while the summary table directs attention to the interesting or statistically significant portions of the table of means.
Often the means are organized and presented in a slightly different manner than the form of the output from the GENERAL LINEAR MODEL command. The table of means may be rearranged and presented as follows:

            None     Some     Lots     Mean
bluebook    26.67    31.00    33.33    30.33
computer    28.00    36.67    27.00    30.56
Mean        27.33    33.83    30.17    30.44
The means inside the boxes are called cell means, the means in the margins are called marginal means, and the number in the bottom right-hand corner is called the grand mean. An analysis of these means reveals that there is very little difference between the marginal means for the different levels of Method across the levels of Ability (30.33 vs. 30.56). The marginal means of Ability over levels of Method are different (27.33 vs. 33.83 vs. 30.17), with the mean for "Some" being the highest. The cell means show an increasing pattern for levels of Ability using a bluebook (26.67 vs. 31.00 vs. 33.33) and a different pattern for levels of Ability using a computer (28.00 vs. 36.67 vs. 27.00).
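The relationship between cell means, marginal means, and the grand mean can be sketched in a few lines of Python. This is only an illustration: it assumes equal cell sizes, so that each marginal mean is the plain average of the cell means in its row or column, and the variable names are made up for the example. Because it starts from cell means already rounded to two decimals, the last digit may differ slightly from the table.

```python
# Cell means copied from the table of means; each list runs over Ability.
ability_levels = ["None", "Some", "Lots"]
cell_means = {
    "bluebook": [26.67, 31.00, 33.33],
    "computer": [28.00, 36.67, 27.00],
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Assumption: equal cell sizes, so a marginal mean is the plain average
# of the cell means in its row or column.
method_marginals = {method: mean(row) for method, row in cell_means.items()}
ability_marginals = {
    level: mean(row[j] for row in cell_means.values())
    for j, level in enumerate(ability_levels)
}
grand_mean = mean(v for row in cell_means.values() for v in row)

for method, value in method_marginals.items():
    print(f"Method  {method:<8} {value:.2f}")
for level, value in ability_marginals.items():
    print(f"Ability {level:<8} {value:.2f}")
print(f"Grand mean       {grand_mean:.2f}")
```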
Graphs of means are often used to present information in a manner that is easier to comprehend than the tables of means. One factor is selected for presentation on the X-axis and its levels are marked on that axis. A separate line is drawn for each level of the second factor, with each point at the height of the corresponding cell mean. In the following graph, the Ability, or keyboard experience, factor was selected for the X-axis and the Method factor was selected for the different lines.
Presenting the information in an opposite fashion would be equally correct, although some graphs are more easily understood than others, depending upon the values for the means and the number of levels of each factor. The second possible graph is presented below.
If there is any doubt, it is recommended that both versions of the graph be attempted and the one which best illustrates the data be selected for inclusion in the statistical report. In this case it appears that the graph with Ability on the X-axis is easier to understand than the one with Method on the X-axis.
The results of the analysis are presented in the ANOVA summary table, presented below for the example data.
The items of primary interest in this table are the effects listed under the "Source" column and the values under the "Sig." column. As in the previous hypothesis test, if the value of "Sig." is less than the value of α as set by the experimenter, then that effect is significant. If α = .05, then the Ability main effect and the Ability BY Method interaction would be significant in this table.
Main effects are differences in means over levels of one factor collapsed over levels of the other factor. This is actually much easier than it sounds. For example, the main effect of Method is simply the difference between the means of final exam score for the two levels of Method, ignoring or collapsing over Ability (keyboard experience). As seen in the second method of presenting a table of means, the main effect of Method is whether the two marginal means associated with the Method factor are different. In the example case these means were 30.33 and 30.56, and the difference between them was not statistically significant.
As can be seen from the summary table, the main effect of Ability is significant. This effect refers to the differences between the three marginal means associated with Ability. In this case the values for these means were 27.33, 33.83, and 30.17 and the differences between them may be attributed to a real effect.
A simple main effect is a main effect of one factor at a given level of a second factor. In the example data it would be possible to talk about the simple main effect of Ability at Method equal to bluebook. That effect would be the difference between the three cell means at level a_{1} (26.67, 31.00, and 33.33). One could also talk about the simple main effect of Method at Ability equal to lots (33.33 and 27.00). Simple main effects are not directly tested in this analysis. They are, however, necessary to understand an interaction.
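As a small illustration, a simple main effect amounts to reading one row or one column of the table of cell means. The sketch below uses the example cell means; the function names are hypothetical and chosen only for this example.

```python
# Cell means from the example; each list runs over the Ability levels.
ability_levels = ["None", "Some", "Lots"]
cell_means = {
    "bluebook": [26.67, 31.00, 33.33],
    "computer": [28.00, 36.67, 27.00],
}


def simple_effect_of_ability(method):
    """Cell means across Ability at one fixed level of Method (one row)."""
    return dict(zip(ability_levels, cell_means[method]))


def simple_effect_of_method(ability):
    """Cell means across Method at one fixed level of Ability (one column)."""
    j = ability_levels.index(ability)
    return {method: row[j] for method, row in cell_means.items()}


print("Ability at Method = bluebook:", simple_effect_of_ability("bluebook"))
print("Method at Ability = Lots:    ", simple_effect_of_method("Lots"))
```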
An interaction effect is a change in the simple main effect of one variable over levels of the second. An A X B or A BY B interaction is a change in the simple main effect of B over levels of A or the change in the simple main effect of A over levels of B. In either case the cell means cannot be modeled simply by knowing the size of the main effects. An additional set of parameters must be used to explain the differences between the cell means. These parameters are collectively called an interaction.
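One way to see why the main effects alone cannot reproduce the cell means is to build the purely additive prediction (grand mean plus the two main effects) for each cell and compare it with the observed cell mean. The sketch below does this for the example data. It assumes equal cell sizes, the variable names are illustrative, and the leftover term is the usual sample estimate of the interaction parameter rather than anything printed by SPSS.

```python
# Cell means from the example; each list runs over the Ability levels.
ability_levels = ["None", "Some", "Lots"]
cell_means = {
    "bluebook": [26.67, 31.00, 33.33],
    "computer": [28.00, 36.67, 27.00],
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Assumption: equal cell sizes, so marginal means are unweighted averages.
grand = mean(v for row in cell_means.values() for v in row)
method_effect = {m: mean(row) - grand for m, row in cell_means.items()}
ability_effect = {
    level: mean(row[j] for row in cell_means.values()) - grand
    for j, level in enumerate(ability_levels)
}

# Interaction term = observed cell mean minus the additive prediction.
interaction = {}
for method, row in cell_means.items():
    for j, level in enumerate(ability_levels):
        additive = grand + method_effect[method] + ability_effect[level]
        interaction[(method, level)] = row[j] - additive
        print(f"{method:<8} {level:<4} cell={row[j]:6.2f} "
              f"additive={additive:6.2f} "
              f"interaction={interaction[(method, level)]:+6.2f}")
```

If there were no interaction in the sample, every interaction term would be zero and the additive prediction would match each cell mean exactly.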
The change in the simple main effect of one variable over levels of the other is most easily seen in the graph of the interaction. If the lines describing the simple main effects are not parallel, then a possibility of an interaction exists. As can be seen from the graph of the example data, the possibility of a significant interaction exists because the lines are not parallel. The presence of an interaction was confirmed by the significant interaction in the summary table. The following graph overlays the main effect of Ability on the graph of the interaction.
Two things can be observed from this presentation. The first is that the main effect of Ability is possibly significant, because the means are different heights. Second, the interaction is possibly significant because the simple main effects of Ability using bluebook and computer are different from the main effect of Ability.
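The "not parallel" observation can also be checked numerically. If the bluebook and computer lines were parallel, the vertical gap between them would be the same at every level of Ability. The following sketch, using the example cell means and illustrative variable names, computes that gap. Unequal gaps only suggest an interaction; whether it is significant is still decided by the summary table.

```python
# Cell means from the example, listed over the Ability levels.
ability_levels = ["None", "Some", "Lots"]
bluebook = [26.67, 31.00, 33.33]
computer = [28.00, 36.67, 27.00]

# Vertical distance between the two lines at each level of Ability.
gaps = {
    level: c - b
    for level, b, c in zip(ability_levels, bluebook, computer)
}
for level, gap in gaps.items():
    print(f"{level:<4} computer - bluebook = {gap:+.2f}")

# Parallel lines would give a range of zero.
gap_range = max(gaps.values()) - min(gaps.values())
print(f"Range of gaps: {gap_range:.2f}")
```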
One method of understanding how main effects and interactions work is to observe a wide variety of data and data analyses. With three effects, A, B, and A X B, each of which may or may not be significant, there are eight possible combinations of effects. All eight are presented on the following pages.
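The count of eight follows from three effects that can each be significant or not (2 x 2 x 2). A minimal sketch that enumerates the combinations:

```python
from itertools import product

effects = ["A", "B", "A X B"]

# Each effect is either significant (True) or not (False).
patterns = list(product([False, True], repeat=len(effects)))

for number, pattern in enumerate(patterns, start=1):
    significant = [name for name, flag in zip(effects, pattern) if flag]
    label = ", ".join(significant) if significant else "no significant effects"
    print(f"{number}. {label}")
```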
Note that the means and graphs of the last two example data sets were identical. The ANOVA table, however, provided a quite different analysis of each data set. The data in this final set were constructed such that there was a large standard deviation within each cell. In this case the marginal and cell means were not different enough to warrant rejecting the hypothesis of no effects; thus no significant effects were observed.
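The role of within-cell variability can be illustrated with a small sketch. The data below are hypothetical and are not the example data sets from this chapter: two made-up data sets share the same cell means but differ in spread within each cell. The function computes the F ratios for a balanced two-factor between-groups design from the usual sums of squares; its name and the data layout are assumptions made for this illustration.

```python
def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


def two_way_f_ratios(data):
    """F ratios for a balanced two-factor between-groups design.

    data[i][j] is the list of scores in the cell at level i of factor A
    and level j of factor B; every cell must hold the same number of scores.
    """
    a, b, n = len(data), len(data[0]), len(data[0][0])
    cell = [[mean(data[i][j]) for j in range(b)] for i in range(a)]
    row = [mean(cell[i]) for i in range(a)]
    col = [mean(cell[i][j] for i in range(a)) for j in range(b)]
    grand = mean(row)

    ss_a = n * b * sum((r - grand) ** 2 for r in row)
    ss_b = n * a * sum((c - grand) ** 2 for c in col)
    ss_ab = n * sum(
        (cell[i][j] - row[i] - col[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_within = sum(
        (x - cell[i][j]) ** 2
        for i in range(a) for j in range(b) for x in data[i][j]
    )
    ms_within = ss_within / (a * b * (n - 1))
    return {
        "A": (ss_a / (a - 1)) / ms_within,
        "B": (ss_b / (b - 1)) / ms_within,
        "A X B": (ss_ab / ((a - 1) * (b - 1))) / ms_within,
    }


# Hypothetical cell means for a 2 x 2 design (not taken from the chapter).
hypothetical_means = [[10, 14], [16, 28]]


def make_data(spread):
    """Three scores per cell: the cell mean and one score on either side."""
    return [[[m - spread, m, m + spread] for m in row]
            for row in hypothetical_means]


tight = two_way_f_ratios(make_data(1))   # small within-cell spread
loose = two_way_f_ratios(make_data(10))  # large within-cell spread

for effect in tight:
    print(f"{effect:<6} F (tight) = {tight[effect]:8.2f}   "
          f"F (loose) = {loose[effect]:6.2f}")
```

Because the cell means are identical in the two hypothetical data sets, the sums of squares for A, B, and A X B are the same; only the within-cell mean square changes, and every F ratio shrinks as the spread grows.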