Linear Models and Analysis of Variance:

CONCEPTS, MODELS, AND APPLICATIONS

 

Volume II

 

 

 

 

 

First Edition

 

 

 

 

 

 

 

 

David W. Stockburger

 

Southwest Missouri State University

 

 

 

 

 

 

© Copyright 1993


 

TABLE OF CONTENTS

EXPERIMENTAL DESIGNS
Notation
Kinds of Factors
Treatment
Group Factors
Trials Factors
Blocking
Unit Factors
Error Factors
Fixed and Random Factors
Fixed Factors
Random Factors
Relationships Between Factors
Crossed
Nested
An Example Design
A Second Example Design
A Third Example Design
Determining the Number of Subjects and Measures per Subject
Setting up the Data Matrix
A Note of Caution

One Between Group ANOVA
Why Multiple Comparisons Using t-tests is NOT the Analysis of Choice
The Bottom Line - Results and Interpretation of ANOVA
HYPOTHESIS TESTING THEORY UNDERLYING ANOVA
The Sampling Distribution Reviewed
Two Ways of Estimating the Population Parameter σX²
The F-ratio and F-distribution
Non-significant and Significant F-ratios
Similarity of ANOVA and t-test
EXAMPLE OF A NON-SIGNIFICANT ONE-WAY ANOVA
EXAMPLE OF A SIGNIFICANT ONE-WAY ANOVA
USING MANOVA
The Data
Example Output
Dot Notation
POST-HOC Tests of Significance
Example SPSS Program using MANOVA
Interpretation of Output
Graphs of Means
The ANOVA Summary Table
Main Effects
Simple Main Effects
Interaction Effects
Example Data Sets, Means, and Summary Tables
No Significant Effects
Main Effect of A
Main Effect of B
AB Interaction
Main Effects of A and B
Main Effect of A, AB Interaction
Main Effect of B, AB Interaction
Main Effects of A and B, AB Interaction
No Significant Effects
Dot Notation Revisited

Nested Two Factor Between Groups Designs B(A)
The Design
The Data
SPSS commands
The Analysis
The Table of Means
Graphs
The ANOVA Table
Interpretation of Output
Similarities to the A X B Analysis

Contrasts, Special and Otherwise
Definition
Sets of Contrasts
Orthogonal Contrasts
Non-orthogonal Contrasts
Sets of Orthogonal Contrasts
Finding Sets of Orthogonal Contrasts
The Data
SPSS commands
The Analysis
The Table of Means
The ANOVA table
Interpretation of Output
Constants
Contrasts, Designs, and Effects
Non-Orthogonal Contrasts
Smaller than Total Sum of Squares
Larger than Total Sum of Squares
Standard Types of Orthogonal Contrasts
DIFFERENCE
SIMPLE
POLYNOMIAL
Conclusion

ANOVA and Multiple Regression
ONE FACTOR ANOVA
ANOVA and Multiple Regression
Example
Example Using Contrasts
Dummy Coding
ANOVA, Revisited
TWO FACTOR ANOVA
Example
Example Using Contrasts
Regression Analysis using Dummy Coding
Conclusion

Unequal Cell Frequencies
Equal Cell Frequency - Independence of Effects
Unequal Cell Frequency - Dependent Effects
Solutions for Dealing with Dependent Effects
UNEQUAL CELL SIZES FROM A MULTIPLE REGRESSION VIEWPOINT
REGRESSION ANALYSIS OF UNEQUAL N ANOVA
RECOMMENDATIONS

Subjects Crossed With Treatments S X A
The Design
The Data
SPSS commands
The Correlation Matrix
The Table of Means
Graphs
The ANOVA Table
Interpretation of Output
Additional Assumptions for Univariate S X A Designs
SS, MS, and Expected Mean Squares (EMS)

Subjects Crossed With Two Treatments - S X A X B
The Design
The Data
SPSS commands
The Correlation Matrix
The Table of Means
Graphs
The ANOVA Table
Interpretation of Output
Additional Assumptions for Univariate S X A X B Designs
SS, MS, and Expected Mean Squares (EMS)

Mixed Designs - S ( A ) X B
The Design
The Data
SPSS commands
The Table of Means
Graphs
Interpretation of Output
Expected Mean Squares (EMS)

Three Factor ANOVA
Effects
Main Effects
Two-Way Interactions
Three-Way Interaction
Additional Examples
All Effects Significant
Example 3 - B, AC, and BC
Two More Examples
Expected Mean Squares
Tests of Significance
Error Terms
SPSS Output
Examples
S ( A X B X C)
S ( A X B ) X C
S ( A ) X B X C

BIBLIOGRAPHY

INDEX

 

 





Chapter 8

EXPERIMENTAL DESIGNS

 

Experimental design refers to the manner in which the experiment was set up.  Experimental design includes the way the treatments were administered to subjects, how subjects were grouped for analysis, and how the treatments and groupings were combined.

 

In ANOVA there is a single dependent variable or score.  In psychology the dependent measure is usually some measure of behavior.  If more than one measure of behavior is taken, multivariate analysis of variance, or MANOVA, may be the appropriate analysis.  Because the ANOVA model breaks the score into component parts, or effects, which sum to the total score, one must assume the interval property of measurement for this variable.  Since in real life the interval property is never really met, one must be satisfied that at least an approximation of an interval scale exists for the dependent variable.  To the extent that this assumption is unwarranted, the ANOVA hypothesis testing procedure will not work.

 

In ANOVA there is at least one independent variable or factor.  There are different kinds of factors: treatment, trial, blocking, and group.  Each will be discussed in the following section.  All factors, however, have some finite number of different levels.  All scores within a level share some quality or quantity.  The only restriction on the number of levels is that there are fewer levels than scores, although in practice one seldom sees more than ten levels in a factor unless the data set is very large.  It is not necessary that the independent variables or factors be measured on an interval scale.  If the factors are measured on an (approximate) interval scale, then some flexibility in analysis is gained.  The continued popularity of ANOVA can partially be explained by the fact that the interval assumption is not required for the factors.

 


Notation     

 

Every writer of an introductory, intermediate, or advanced statistics text has his or her own pet notational system.  I have taught using a number of different systems and have unabashedly borrowed the one to be described below from Lee (1975).  In my opinion it is the easiest for students to grasp.

 

The dependent variable or score will be symbolized by the letter X.  Subscripts (usually multiple) will be tagged on this letter to differentiate the different scores.  For example, to designate a single score from a group of scores a single subscript would be necessary and the symbol Xs could be used.  In this case X1 would indicate the first subject, X2 the second, X3 the third, and so forth.

 

When it is desired to indicate a single score belonging to a given combination of factors, multiple subscripts must be used.  For example, Xabs would describe a given score for a combination of a and b.  Thus, X236 would describe the sixth score when a=2 and b=3.  Another example, X413, would describe the third score when a=4 and b=1.
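The subscript convention can be tried out in a few lines of Python.  This is only a sketch: the dictionary `scores`, the helper `X`, and the numbers of levels are hypothetical, and each placeholder score is built from its own subscripts so that the lookup is easy to check by eye.

```python
# Hypothetical layout: A = 4 levels, B = 3 levels, 6 scores per combination.
A, B, S = 4, 3, 6

# Scores are keyed by the subscript triple (a, b, s), counted from 1 as in the text.
# The stored value is a placeholder that simply encodes the subscripts.
scores = {
    (a, b, s): 100 * a + 10 * b + s
    for a in range(1, A + 1)
    for b in range(1, B + 1)
    for s in range(1, S + 1)
}

def X(a, b, s):
    """Return the s-th score observed under level a of A and level b of B."""
    return scores[(a, b, s)]

print(X(2, 3, 6))  # sixth score when a=2 and b=3; prints 236
print(X(4, 1, 3))  # third score when a=4 and b=1; prints 413
```

Keying on the full triple mirrors the notation: no score can be identified without stating every subscript.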

 

Bolded capital letters will be used to symbolize factors.  Example factors are A, B, C, ..., Z.  Some factor names are reserved for special factors.  For example, S will always refer to  the subject factor, E will always be the error factor, and G will be the group factor.

 

Small letters with a numerical subscript are used to indicate specific levels of a factor.  For example, c1 will indicate the first level of factor C, while cc will indicate a specific but unspecified level of factor C.  The number of levels of a factor is given by the unbolded capital letter of that factor.  For example, factor C has levels 1, 2, ..., C.

 


In an example experiment, let X, the score, be the dollar amount after playing Windows™ Solitaire for an hour.  In this experiment the independent variable (factor) is the amount of practice, called factor A.  Let nine subjects each participate in one of four (A=4) levels of training.  The first level, a1, consists of no practice, a2 = one hour of practice, a3 = five hours of practice, and a4 = twenty hours of practice.  A given score (dollar amount) would be symbolized by Xas, where X35 would be the score of the fifth subject in the group that received five hours of practice.

 

Kinds of Factors

 

Treatment

 

Treatments will be defined as quantitatively or qualitatively different levels of experience.  For example, in an experiment on the effects of caffeine, the treatment levels might be exposure to different amounts of caffeine, from none to .0375 milligrams.  In a very simple experiment there are two levels of treatment, none, called the control condition, and some, called the experimental condition. 

 

Treatment factors are usually the main focus of the experiment.  A treatment factor is characterized by the following two attributes (Lee, 1975):

 

1.         An investigator could assign any of his experimental subjects to any one of the levels of the factor.

 

2.         The different levels of the factor consist of explicitly distinguishable stimuli or situations in the environment of the subject.

 

In the solitaire example, practice time would be a treatment factor if the experimenter controlled the amount of time that the subject practiced.  If subjects came to the experiment having already practiced a given amount, then the experimenter could not arbitrarily or randomly assign a subject to a given practice level.  In that case the factor would no longer be considered a treatment factor.

 


In an experiment where subjects are run in groups, it sometimes is valuable to treat each group as a separate level of a factor.  There might be, for example, an obnoxious subject who affects the scores of all other subjects in that group.  In this case the second attribute would not hold and the factor would be called a group factor.

 

 

Group Factors

 

As described above, a group factor is one in which the subjects are arbitrarily assigned to a given group which differs from other groups only in that different subjects are assigned to it.  If each group had some type of distinguishing feature, other than the subjects assigned to it, then it would no longer be considered as a group factor.  If a group factor exists in an experimental design, it will be symbolized by G.

 

Trials Factors

 

If each subject is scored more than once under the same condition and the separate scores are included in the analysis, then a trials factor exists.  If the different scores for a subject are found under different levels of a treatment, then the factor would be called a treatment factor rather than a trials factor.  Trials factors will be denoted by T.

 

Trials factors are useful in examining practice or fatigue effects.  Any change in scores over time may be attributed to having previously experienced similar conditions.

 

Blocking

 


If subjects are grouped according to some pre-existing  subject similarity, then that grouping is called a blocking factor.  The experimenter has no choice but to assign the subject to one or the other of the levels of a blocking factor.  For example, gender (sex) is often used as a blocking factor.  A subject enters the experiment as either a male or female and the experimenter may not arbitrarily (randomly) assign that individual to one gender or the other.

 

Because the experimenter has no control over the assignment of subjects to a blocking factor, causal inference is made much more difficult.  For example, if in the solitaire experiment, the practice factor was based on a pre-existing condition, then any differences between the groups may be due either to practice or to the fact that some subjects liked to play solitaire, were better at the game and thus practiced more.  Since the subjects are self-selected, it is not possible to attribute the differences between groups to practice, enjoyment of the game, natural skill in playing the game, or some other reason.  It is possible, however, to say that the groups differed.

 

Even though causal inference is not possible, a blocking factor can be useful.  A factor which accounts for differences in the scores adds power to the experiment.  That is, a blocking factor which explains some of the differences between scores may make it more likely to find treatment effects.  For example, if males and females performed significantly differently in the solitaire experiment, it might be useful to include sex as a blocking factor because differences due to gender would otherwise be included in the error variance.

 

In other cases blocking factors are interesting in their own right.  It may be interesting to know that freshmen, sophomores, juniors, and seniors differ in attitude toward university authority, even though causal inferences may not be made.

 

In some cases the pre-existing condition is quantitative, as in an IQ score or weight.  In these cases it is possible to use a median split where the scores above the median are placed in one group and the scores below the median are placed in another.  Variations of this procedure divide the scores into three, four, or more approximately equal sized groups.  Such procedures are not recommended as there are better ways of handling such data (Edwards, 1985).
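For readers who want to see exactly what a median split does, the following Python sketch may help.  The IQ scores are made up, the function name `median_split` is hypothetical, and setting aside scores that fall exactly on the median is an assumption, since the text does not say how ties are handled.

```python
from statistics import median

def median_split(scores):
    """Split scores into those below and those above the median.

    Scores exactly equal to the median are returned separately; how to
    treat them is an assumption left to the analyst.
    """
    cut = median(scores)
    low = [x for x in scores if x < cut]
    high = [x for x in scores if x > cut]
    at_cut = [x for x in scores if x == cut]
    return low, high, at_cut

# Hypothetical IQ scores for six subjects.
iq_scores = [95, 110, 102, 120, 88, 131]
low, high, at_cut = median_split(iq_scores)
print(low, high, at_cut)
```

The sketch shows the procedure only; the caution above about preferring other ways of handling quantitative pre-existing conditions still applies.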


 

Unit Factors

 

The unit factor is the entity from which a score is taken.  In experimental psychology, the unit factor is usually a subject (human or animal), although classrooms, dormitories, or other units may serve the same function.  In this text, the unit factor will be designated as S, with the understanding that it might be some other type of unit than subject.

 

Error Factors

 

The error factor, designated as E, is not a factor in the sense of the previous factors and is not included in the experimental design.  It is necessary for future theoretical development.

 

Fixed and Random Factors

 

Each factor in the design must be classified as either a fixed or random factor.  This is necessary in order to find the correct error term for each effect.  The MANOVA program in SPSS does not require that the user designate the type for each factor.  If the user is willing to accept the program defaults, which are correct in most cases, no problem is encountered.  There are situations, however, where the program defaults are incorrect and additional coding is necessary to do the correct hypothesis tests.

 

Fixed Factors

 

A factor is fixed if  (Lee, 1975)

 

1.         The results of the factor generalize only to the levels that were included in the experimental design.  The experimenter may wish to generalize to other levels not included in the factor, but it is done at his or her own peril.

 


2.         Any procedure is allowable to select the levels of the factor.

 

3.         If the experiment were replicated, the same levels of that factor would be included in the new experiment.

 

Random Factors

 

A factor is random if

 

1.         The results of the factor generalize both to levels that were included in the factor and to levels that were not.  The experimenter wishes to generalize to a larger population of possible factor levels.

 

2.         The levels of the factor used in the experiment were selected by a random procedure.

 

3.         If the experiment were replicated, different levels of that factor would be included in the new experiment.

 

In many cases an exact determination of whether a factor is fixed or random is not possible.  In general, the subjects (S) and groups (G) factors will be treated as random factors and all other factors will be considered fixed.  The default designation of MANOVA will set the subjects factor as random and all other factors as fixed.

 

Some reflection on the assumption of a random selection of subjects may cause the experimenter to question whether it is in fact a random factor.  Suppose, as often happens, subjects volunteered to participate in the experiment.  In this case the assumptions underlying the ANOVA are violated, but the procedure is used anyway.  Seldom, if ever, will all the assumptions necessary to do an ANOVA be completely satisfied.  The experimenter must examine how badly the assumptions were violated and then make a decision as to whether or not the ANOVA is useful.


In general, when in doubt as to whether a factor is fixed or random, consider it fixed.  One should never have so much doubt, however, as to consider the subjects factor as a fixed factor.

 

Relationships Between Factors

 

The following two relationships between factors describe a large number of useful designs.  Not all possible experimental designs fit neatly into categories described by the following two relationships, but most do.

 

Crossed

 

When two factors are crossed, each level of each factor appears with each level of the other factor.  A crossing relationship is indicated by an "X". 

 

For example, consider two factors, A and B, where A is gender (a1 = Females, a2 = Males) and B is practice (b1 = none, b2 = one hour, b3 = five hours, and b4 = twenty hours).  If gender was crossed with practice, A X B, then both males and females would participate in all four levels of practice.  There would be eight groups of subjects including:  ab11, females who had no practice, ab12, females who had one hour of practice, and so forth to ab24, males who practiced twenty hours.  An additional factor may be added to the design, say handedness (C), where c1 = right handed and c2 = left handed.  If the design of the experiment was A X B X C, then there would be sixteen groups, including abc231, right-handed males who practiced five hours.
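The groups produced by crossing can be enumerated mechanically.  The sketch below uses Python's standard `itertools.product`; the dictionary names and the helpers `cross` and `cell_label` are hypothetical conveniences, while the level descriptions are taken from the example above.

```python
from itertools import product

# Levels of each factor, keyed by level number as in the text.
gender = {1: "females", 2: "males"}                       # factor A
practice = {1: "no practice", 2: "one hour",
            3: "five hours", 4: "twenty hours"}           # factor B
handedness = {1: "right-handed", 2: "left-handed"}        # factor C

def cross(*factors):
    """Return every combination of level numbers, one level per factor."""
    return list(product(*(sorted(f) for f in factors)))

def cell_label(letters, levels):
    """Build a label such as 'ab24' from factor letters and level numbers."""
    return letters + "".join(str(level) for level in levels)

ab_cells = cross(gender, practice)                 # A X B
abc_cells = cross(gender, practice, handedness)    # A X B X C

print(len(ab_cells), len(abc_cells))  # prints 8 16
print(cell_label("ab", ab_cells[0]), cell_label("ab", ab_cells[-1]))  # prints ab11 ab24
```

The number of groups in a crossed design is the product of the numbers of levels, which is why adding a two-level factor doubles the eight groups to sixteen.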

 


If subjects (S) are crossed with treatments (A), S X A, each subject sees each level of the treatment conditions.  In a very simple experiment such as the effects of caffeine on alertness (A), each subject would be exposed to both a caffeine condition (a1) and a no caffeine condition (a2).  For example, using the members of a statistics class as subjects, the experiment might be conducted as follows.  On the first day of the experiment the class is divided in half with one half of the class getting coffee with caffeine and the other half getting coffee without caffeine.  A measure of alertness is taken for each individual, such as the number of yawns during the class period.  On the second day the conditions are reversed, that is, the individuals who received coffee with caffeine are now given coffee without and vice-versa. 

 

The distinguishing feature of crossing subjects with treatments is that each subject will have more than one score.  Because of this feature, this class of designs is sometimes referred to as repeated measures designs.  The effect also occurs within each subject, thus these designs are sometimes referred to as within subjects designs.

 

Crossing subjects with treatments has two advantages.  One, such designs generally require fewer subjects, because each subject is used a number of times in the experiment.  Two, they are more likely to result in a significant effect, given the effects are real.  This is because the effects of individual differences between subjects are partitioned out of the error term.

 

Crossing subjects with treatments also has disadvantages.  One, the experimenter must be concerned about carry-over effects.  For example, individuals not used to caffeine may still feel the effects of caffeine on the second day, when they did not receive the drug.  Two, the first measurements taken may influence the second.  For example, if the measurement of interest was score on a statistics test, taking the test once may influence performance the second time the test is taken.  Three, the assumptions necessary when more than two treatment levels are employed in a subjects crossed with treatments design may be restrictive.

 

When a factor is a blocking factor, it is not possible to cross that factor with subjects.  It is difficult to find subjects for an S X A design where A is gender.  I generally will take points off if a student attempts such a design.

 

Nested

 


Factor B is said to be nested within factor A if each meaningful level of factor B occurs in conjunction with only one level of A.  This relationship is symbolized as B(A), and is read as "B nested within A".  Note that B(A) is considerably different from A(B).  In the latter, each meaningful level of A would occur in one and only one level of B.  These types of designs are also designated as hierarchical designs in some textbooks.

 

A B(A) design occurs, for example, when the first three levels of factor B (b1, b2, and b3) appear only under level a1 of factor A and the next three levels of B (b4, b5, and b6) appear only under level a2 of factor A.  Depending upon the labelling scheme, b4, b5, and b6 may also be called b1, b2, and b3, respectively.  It is understood by the design designation that the b1 occurring under a1 is different from the b1 occurring under a2.

 

Nested or hierarchical designs can appear because many aspects of society are organized hierarchically.  For example, within the university, classes (sections) are nested within courses, courses are nested within departments, departments within colleges, and colleges within the university.

 

In experimental research it is also possible to nest treatment conditions within other treatment conditions.  For example, suppose a researcher was interested in the effect of diet on health in hamsters.  One factor (A) might be a high cholesterol (a1) or low cholesterol (a2) diet.  A second factor (B) might be type of food: peanut butter (b1), cheese (b2), red meat (b3), chicken (b4), fish (b5), or vegetables (b6).  Because type of food may be categorized as being either high or low in cholesterol, a B(A) experimental design would result.  Chicken, fish, and vegetables would be relabelled as b1, b2, and b3, respectively, but it would be clear from the experimental design specification that peanut butter and chicken, cheese and fish, and red meat and vegetables, were qualitatively different, even though they share the same labels.
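The distinction between B(A) and A X B can be checked mechanically from a list of which levels occur together.  The following is a minimal Python sketch, not part of the original text; the function name relationship and the list names are hypothetical, and the levels of B must be given their distinct names (peanut butter, chicken, and so on) rather than the reused labels b1, b2, and b3.

```python
from collections import defaultdict

def relationship(pairs):
    """Classify factor B relative to factor A from observed (a, b) level pairs."""
    a_levels = {a for a, _ in pairs}
    seen_with = defaultdict(set)            # b level -> set of a levels it occurs with
    for a, b in pairs:
        seen_with[b].add(a)
    if all(len(s) == 1 for s in seen_with.values()):
        return "B(A): B nested within A"    # each b level occurs under only one a level
    if all(s == a_levels for s in seen_with.values()):
        return "A X B: crossed"             # each b level occurs under every a level
    return "neither fully nested nor fully crossed"

# Hamster example: each food belongs to exactly one cholesterol level.
diet = [("high", "peanut butter"), ("high", "cheese"), ("high", "red meat"),
        ("low", "chicken"), ("low", "fish"), ("low", "vegetables")]

# A fully crossed layout for comparison (made-up levels).
crossed = [(a, b) for a in ("a1", "a2") for b in ("b1", "b2", "b3")]

print(relationship(diet))
print(relationship(crossed))
```

The same check applies to subjects: if S takes the place of B, a "nested" result corresponds to a between subjects factor and a "crossed" result to a within subjects factor.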

 


While any factor may possibly be nested within any other factor, the critical nesting relationship is with respect to subjects.  If S is nested within some combination of other factors, then each subject appears under one, and only one, combination of the factors within which subjects are nested.  These effects are often called the Between Subjects effects.  If S is crossed with some combination of other factors, then each subject sees all combinations of the factors with which subjects are crossed.  These effects are referred to as Within Subjects effects.

 

As mentioned earlier, subjects are necessarily nested within blocking factors.  Subjects are necessarily nested within the effects of gender and current religious preference, for example.

 

Treatment factors, however, may be nested or crossed with subjects.  The effect of caffeine on alertness could be studied by dividing the subjects into two groups, with one group receiving a beverage with caffeine and one group not.  This design would nest subjects within caffeine and be specified as S(A), or simply A, as the S is often dropped when the design is completely between subjects.

 

If subjects appeared under both caffeine conditions, receiving caffeine on one day and no caffeine on the other, then subjects would be crossed with caffeine.  The design would be specified as S X A.  In this case the S would remain in the design.

 

An Example Design

 

A psychologist (McGuire, 1993) was interested in studying adults' memory for medical information presented by a videotape.  She included one hundred and four participants, of whom sixty-seven ranged in age from 18 to 44 years and thirty-seven ranged in age from 60 to 82 years.  Participants were randomly assigned to one of two conditions, either an organized presentation condition or an unorganized presentation condition.  Following observation of the videotape, each participant completed an initial recall sequence consisting of free-recall and probed recall retrieval tasks.  A probed recall is like a multiple-choice test and a free-recall is like an essay test.  Following a one-week interval, participants completed the recall sequence again.

 


This experimental design provides four factors in addition to subjects (S).  The age factor (A) has two levels, a1=young and a2=old, and would necessarily be a blocking factor.  The type of videotape factor (B) would be a treatment factor and would consist of two levels, b1=organized and b2=unorganized.  The recall method factor (C) would be a form of trials factor and would have two levels, c1=free-recall and c2=probed recall.  The fourth factor (D) would be another trials factor where d1=immediate and d2=one week delay.

 

Each level of B appears with each level of A, thus A is crossed with B.  Since each subject appears in one and only one combination of A and B, subjects are nested within A X B.  That is, each subject is either young or old and sees either an organized or unorganized videotape.  The design notation thus far would be S ( A X B ).

 

Each type of recall (C) was done by each subject at both immediate and delayed intervals (D).  Thus subjects would be crossed with recall method and interval.  The complete design specification would be S ( A X B ) X C X D.   In words this design would be subjects nested within A and B and crossed with C and D.

 

In preparation for entering the data into a data file, the design could be viewed in a different perspective.  Listing each subject as a row and each measure as a column, the design would appear as follows:


 

                                   Immediate          Week Later
Age      Videotape     Subject   Free   Probed      Free   Probed

Young    Organized     S1
                       S2
                       ...
         Unorganized   S1
                       ...
Old      Organized     S1
                       S2
                       ...
         Unorganized   S1
                       ...

 

In this design, two classification variables would be needed: one to classify each subject as either young or old, and one to document which type of videotape the subject saw.  In addition to the classification variables, each subject would require four variables to record the two types of measures taken at the two different times.

 

A score taken from the design presented above could be represented as Xabscd.  For example, the immediate probed test score taken from the third subject in the old group who viewed an organized videotape would be X21321 (a=2 for old, b=1 for organized, s=3 for the third subject, c=2 for probed recall, and d=1 for immediate).

 

A Second Example Design

 


The Lombard effect is a phenomenon in which a speaker or singer involuntarily raises his or her vocal intensity in the presence of high levels of sound.  In a study of the Lombard effect in choral singing (modified from Tonkinson, 1990), twenty-seven subjects, some experienced choral singers and some not, were asked to sing the national anthem along with a choir heard through headphones.  The performances were recorded and vocal intensity readings from three selected places in the song were obtained from a graphic level recorder chart.  Each subject sang the song four times:  with no accompaniment, or with a soft, medium, or loud choir accompaniment.  After some brief instructions to resist increasing vocal intensity as the choir increased, each subject again sang the national anthem four times with the four different accompaniments.  The order of accompaniments was counterbalanced over subjects.

 

In this design, there would be four factors in addition to subjects.  Subjects would be nested within experience level (A), with a1=inexperienced and a2=experienced choral singers.  This factor would be a blocking factor.  Subjects would be crossed with instructions (B), where b1=no instructions and b2=resist Lombard effect.  In addition, subjects would be crossed with accompaniment (C) and place in song (D).  The accompaniment factor would include four levels: c1=soft, c2=medium, c3=loud, and c4=none.  This factor would be considered a treatment factor.  The place in song factor could be considered a trials factor and would have three levels.

 

The experimental design could be written as S ( A ) X B X C X D.  In words, subjects were nested within experience level and crossed with instructions, accompaniment, and place in song.  In this design, one variable would be needed for the classification of each subject and twenty-four variables would be needed for each subject, one for each combination of instructions, accompaniment, and place in song.  The design could be written:

 


               No Instructions                   Resist Lombard Effect

        Soft  Medium  Loud   None   Soft  Medium  Loud   None  

Exp  S  1 2 3  1 2 3  1 2 3  1 2 3  1 2 3  1 2 3  1 2 3  1 2 3

 1   1

 1   2

  ...

 2   1

 2   2

  ...

 

A Third Example Design

 

From the Springfield News-Leader, March 1, 1993:

 

Images of beauty such as those shown by Sports Illustrated's annual swimsuit issue, are harmful to the self-esteem of all women and contribute to the number of eating disorder cases in the U. S., says a St. Louis professor who researches women's health issues.

 

In a recent study at Washington University, two groups of women - one with bulimia and one without - watched videotapes of SI models in swimsuits.

 

Afterwards, both groups reported a more negative self-image than they did before watching the tape, describing themselves as "feeling fat and flabby" and "feeling a great need to diet."

 

The experiment described above has a number of inadequacies, the lack of control conditions being the most obvious.  The original authors, unnamed in the article, may have designed a much better experiment than is described in the popular press.  In any case, this experiment will now be expanded to illustrate a complex experimental design.


The dependent measure, apparently a rating of "feeling fat and flabby" and "feeling a great need to diet", will be retained.  In addition, two neutral questions will be added, say "feeling anxious" and "feeling good about the environment."  These four statements will be rated by all subjects, thus subjects will be crossed with ratings.  The first two statements deal with body image and diet and the last two do not, thus statement type will form a factor in the design (called D).  Since the statements within one level of D share no similarity with the statements within the other level of D, these statements (A) are nested within D.  For example, the ratings of "feeling a great need to diet" and "feeling good about the environment" share no qualitative relationship.  At this point the design may be specified as S X A(D).

 

Suppose the researcher runs the subjects in groups of six to conserve time and effort, thus creating a groups (G) factor.  In addition to the two types of subjects, with bulimia and without (B), suppose the subjects viewed one of the following videotapes (V):  SI models, Roseanne Barr, or a show about the seals of the great northwest.  Assuming that all the subjects in each level of group either had bulimia or did not, then the design could be specified as

 

S(G(B X V)).

 

The factor B is crossed with V because each level of B appears with each level of V.  That is, subjects with and without bulimia viewed all three videotapes.  Because each group viewed only a single videotape and was composed of subjects either with bulimia or without, the groups factor is nested within the cross of B and V.  Because subjects appeared in only one group, subjects are nested within groups.

 

Combining the between subjects effects, S(G(B X V)), and the within subjects effects, A(D), yields the complete design specification

 

S(G(B X V)) X A(D).

 

 


 

Determining the Number of Subjects and Measures per Subject

 

It is important to be able to determine the number of subjects and the number of measures per subject for practical reasons, namely, is the experiment feasible?  After listening to a student propose an experiment and a little figuring, I remarked "according to my calculations, you should be able to complete the experiment sometime near the middle of the next century."  If an experimenter is limited in the time a subject is available, then the number of measures per subject is another important consideration.

 

To determine the number of subjects, multiply the number of levels of the between subjects factors together.  In the previous example, S = 6 because the subjects were run in groups of six.  Let G=4, that is, let there be four groups of six for each combination of bulimia and videotape.  There were two levels of bulimia, so B=2, and three levels of videotape, so V=3.  Since the design is S(G(B X V)), the total number of subjects needed would be S * G * B * V or 6*4*2*3 or 144.  Since half of the subjects must have bulimia, the question of whether or not 72 subjects with bulimia are available must be asked before the experiment proceeds.

 

To find the number of measures per subject, multiply the number of levels of the within subjects factors together.  In the previous example A(D), where A=2 and D=2, there would be A * D or 2 * 2 or 4 measures per subject.
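The two multiplication rules can be written out as a short calculation.  This is only an illustrative Python sketch using the level counts given in the example; the dictionary names between and within are hypothetical.

```python
from math import prod

# Number of levels of each between subjects factor in S(G(B X V)).
between = {"S": 6, "G": 4, "B": 2, "V": 3}
# Number of levels of each within subjects factor in A(D).
within = {"A": 2, "D": 2}

n_subjects = prod(between.values())          # 6 * 4 * 2 * 3
n_measures = prod(within.values())           # 2 * 2
n_with_bulimia = n_subjects // between["B"]  # half of the subjects

print(n_subjects, n_measures, n_with_bulimia)   # 144 4 72
```

Changing a single entry, for example letting G=2, shows immediately how the number of subjects required responds to a change in the design.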

 

Setting up the Data Matrix

 

             Columns

         1   2   ...   C

      1

Rows  2

      .

      R

 


A few rules simplify setting up the data matrix.  First, each subject appears on a single row of the data matrix.  Second, each measure or combination of within subjects factors appears in a column of data.  Third, each subject must be identified as to the combination of between subjects factors under which he or she appears.

 

In the previous example, since there would be 144 subjects in the experiment, there would be 144 rows of data.  Each subject would be identified as to the level of G, B, and V to which she belonged.  For example, a subject who appeared under g3 of b1 and v3 would be labelled as 3 1 3.  Since there are four measures per subject, these would appear as columns in addition to the identifiers.  In the example data matrix below, the level of G is in the first column, B in the second, and V in the third.  The four combinations of within subjects factors appear next as ad11 ad12 ad21 ad22.

 1 1 1 3 5 4 3
 1 1 1 2 5 5 3
 1 1 1 5 5 5 4
 1 1 1 3 2 1 3
 1 1 1 2 5 3 1
 1 1 1 3 5 4 3
 2 1 1 5 4 5 5
 ...
 4 2 3 3 5 5 4
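The identifier columns of such a data matrix follow directly from the design, so they can be generated before any data are collected.  The sketch below is one possible way of doing this in Python and is not taken from the original text; the ordering of the rows (G changing fastest, then B, then V) and the use of None for ratings not yet collected are assumptions made for illustration.

```python
from itertools import product

S, G, B, V = 6, 4, 2, 3      # subjects per group, groups per cell, bulimia, videotape
columns = ["G", "B", "V", "ad11", "ad12", "ad21", "ad22"]

rows = []
# product() varies its last argument fastest, so G changes before B and V.
for v, b, g in product(range(1, V + 1), range(1, B + 1), range(1, G + 1)):
    for _ in range(S):                       # one row per subject in the group
        rows.append([g, b, v, None, None, None, None])

print(len(rows))        # one row per subject
print(rows[0][:3])      # identifiers for the first subject
print(rows[-1][:3])     # identifiers for the last subject
```

Each row would then have its four None entries replaced by the subject's ratings once the experiment has been run.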

 

A Note of Caution

 

It is fairly easy to design complex experiments.  Running the experiments and interpreting the results are a different matter.  Many complex experiments are never completed because of such difficulties.  This is from personal experience.


                                                             Chapter

                                                                9

 

 

 

                One Between Group ANOVA

 

Why Multiple Comparisons Using t-tests is NOT the Analysis of Choice

 

Suppose a researcher has performed a study of the effectiveness of various methods of individual therapy.  The methods used were:  Reality Therapy, Behavior Therapy, Psychoanalysis, Gestalt Therapy, and, of course, a control group.  Twenty patients were randomly assigned to each group.  At the conclusion of the study, changes in self-concept were found for each patient.  The purpose of the study was to determine if one method was more or less effective than the other methods.

At the conclusion of the experiment the researcher organizes the collected data in the following manner:

  Group   Therapy Method      X̄       sX       sX²

    1       Reality         20.53   3.45   11.9025
    2       Behavior        16.32   2.98    8.8804
    3       Psychoanalysis  10.39   5.89   35.7604
    4       Gestalt         24.65   7.56   57.1536
    5       Control         10.56   5.75   33.0625

 

The researcher wishes to compare the means of the groups with each other to decide about the effectiveness of the therapy. 

 


One method of performing this analysis is by doing all possible t-tests, called multiple t-tests.  That is, Reality Therapy is first compared with Behavior Therapy, then Psychoanalysis, then Gestalt Therapy, then the Control Group.  Behavior Therapy is then individually compared with the last three groups, and so on.  Using this procedure there would be ten different t-tests performed.  Therein lies the difficulty with multiple t-tests.

 

First, because the number of t-tests increases rapidly as a function of the number of groups (a groups require a(a-1)/2 tests), analysis becomes cognitively difficult somewhere in the neighborhood of seven different tests.  An analysis of variance organizes and directs the analysis, allowing easier interpretation of results.

 

Secondly, by doing a greater number of analyses the probability of committing at least one type I error somewhere in the analysis greatly increases.  The probability of committing at least one type I error in an analysis is called the experiment-wise error rate.  The researcher may desire to perform fewer hypothesis tests in order to reduce the experiment-wise error rate.  The ANOVA procedure performs this function.
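Both difficulties can be made concrete with a small calculation.  The Python sketch below counts the pairwise t-tests for a given number of groups and approximates the experiment-wise error rate as 1 - (1 - α)^k for k tests.  That formula assumes the k tests are independent, which is not strictly true of pairwise t-tests that share groups, so the values should be read as rough approximations; the function names are hypothetical.

```python
from math import comb

def n_pairwise_tests(groups):
    """Number of distinct pairs of groups, a(a-1)/2."""
    return comb(groups, 2)

def experimentwise_error(n_tests, alpha=0.05):
    """Approximate chance of at least one type I error.

    Assumes the tests are independent; pairwise t-tests are not,
    so this is only a rough guide.
    """
    return 1 - (1 - alpha) ** n_tests

for a in (2, 3, 5, 7):
    k = n_pairwise_tests(a)
    print(a, "groups:", k, "tests, error rate about", round(experimentwise_error(k), 3))
```

For the five therapy groups this gives the ten t-tests mentioned above and, under the independence assumption, roughly a 40 percent chance of at least one type I error when each test uses α=.05.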

 

The Bottom Line - Results and Interpretation of ANOVA

 

Results of an ANOVA are usually presented in an ANOVA table.  This table contains  columns labelled "Source", "SS or Sum of Squares", "df - for degrees of freedom", "MS - for mean square", "F or F-ratio", and "p, prob,  probability, sig., or sig. of F".  The only columns that are critical for interpretation are the first and the last, the others are used mainly for intermediate computational purposes.

 

An example of an ANOVA table appears below:

    Source       SS      df      MS        F    sig of F

  BETWEEN    5212.960     4   1303.240   4.354   .0108
  WITHIN     5986.400    20    299.320
  TOTAL     11199.360    24

 


The row labelled "BETWEEN" under "Source", having a probability value associated with it, is the only one of any great importance at this time.  The other rows are used mainly for computational purposes.  The researcher then would most probably first look at the value ".0108" located under the "sig of F" column.

 

Of all the information presented in the ANOVA table, the major interest of the researcher will most likely be focused on the value located in the "sig of F" column.  If the number (or numbers) found in this column is (are) less than the significance level (α) set by the experimenter, then the effect is said to be significant.  Since this value is usually set at .05, any value less than this will result in significant effects, while any value greater than this value will result in nonsignificant effects.
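The intermediate columns of the example ANOVA table are related by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and the F-ratio is the ratio of the two mean squares.  The following Python sketch reproduces those steps and the decision rule; the "sig of F" value is copied from the table rather than computed, and the variable names are hypothetical.

```python
# Values copied from the example ANOVA table.
ss_between, df_between = 5212.960, 4
ss_within, df_within = 5986.400, 20
sig_of_f = 0.0108          # taken from the table, not computed here
alpha = 0.05               # significance level set by the experimenter

ms_between = ss_between / df_between      # mean square = SS / df
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

significant = sig_of_f < alpha            # decision rule described in the text

print(round(ms_between, 3), round(ms_within, 3), round(f_ratio, 3))
print("significant" if significant else "nonsignificant")
```

With α=.05 the effect in the table is significant; had the experimenter set α=.01, the same table would lead to a nonsignificant result, since .0108 is greater than .01.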

 

If the effects are found to be significant using the above procedure, it implies that the means differ more than would be expected by chance alone.  In terms of the above experiment, it would mean that the treatments were not equally effective.  This table does not tell the researcher anything about what the effects were, just that there most likely were real effects.

 

If the effects are found to be nonsignificant, then the differences between the means are not great enough to allow the researcher to say that they are different.  In that case no further interpretation is attempted.

 

When the effects are significant, the means must then be examined in order to determine the nature of the effects.  There are procedures called "post-hoc tests" to assist the researcher in this task, but often the analysis is fairly evident simply by looking at the size of the various means.  For example, in the preceding analysis Gestalt and Reality Therapy were the most effective in terms of mean improvement.

 


In the case of significant effects, a graphical presentation of the means can sometimes assist in analysis.  For example, in the preceding analysis, the graph of mean values would appear as follows:

 

 

HYPOTHESIS TESTING THEORY UNDERLYING ANOVA

 

       The Sampling Distribution Reviewed

 

In order to explain why the above procedure may be used to simultaneously analyze a number of means, the following presents the theory on ANOVA in relation to the hypothesis testing approach discussed in earlier chapters.

 

First, a review of the sampling distribution is necessary.  If you have difficulty with this summary, please go back and read the more detailed chapter on the sampling distribution.

 

A sample is a finite number (N) of scores.  Sample statistics are numbers which describe the sample.  Example statistics are the mean (X̄), mode (Mo), median (Md), and standard deviation (sX).

 

Probability models exist in a theoretical world where complete information is unavailable.  As such, they can never be known except in the mind of the mathematical statistician.  If an infinite number of infinitely precise scores were taken, the resulting distribution would be a probability model of the population.  Population models are characterized by parameters.  Two common parameters are µX and σX.

 


Sample statistics are used as estimators of the corresponding parameters in the population model.  For example, the mean and standard deviation of the sample are used as estimates of the corresponding population parameters µX and σX.

 

The sampling distribution is a distribution of a sample statistic.  It is a model of a distribution of scores, like the population distribution, except that the scores are not raw scores, but statistics.  It is a thought experiment; "what would the world be like if a person repeatedly took samples of size N from the population distribution and computed a particular statistic each time?"  The resulting distribution of statistics is called the sampling distribution of that statistic. 

 

The sampling distribution of the mean is a special case of a sampling distribution.  It is a distribution of sample means, described with the parameters µX̄ and σX̄.  These parameters are closely related to the parameters of the population distribution, the relationship being described by the CENTRAL LIMIT THEOREM.  The CENTRAL LIMIT THEOREM essentially states that the mean of the sampling distribution of the mean (µX̄) equals the mean of the population (µX) and that the standard error of the mean (σX̄) equals the standard deviation of the population (σX) divided by the square root of N.  These relationships may be summarized as follows:

     µX̄ = µX

     σX̄ = σX / √N
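The thought experiment behind the sampling distribution of the mean can be approximated on a computer by taking a large (but finite) number of samples.  The Python sketch below is an illustration only: the population values µX=50 and σX=10, the sample size N=25, the use of a normal population, and the seed are all arbitrary assumptions.

```python
import random
from statistics import fmean, pstdev

random.seed(1993)                  # fixed seed so the run is repeatable
mu, sigma, n = 50.0, 10.0, 25      # assumed population parameters and sample size
n_samples = 5000                   # stands in for "an infinite number" of samples

# Take many samples of size N and record the mean of each one.
sample_means = [fmean(random.gauss(mu, sigma) for _ in range(n))
                for _ in range(n_samples)]

print("mean of the sample means:", round(fmean(sample_means), 2))
print("standard deviation of the sample means:", round(pstdev(sample_means), 2))
print("sigma / sqrt(N):", sigma / n ** 0.5)
```

The mean of the sample means should fall close to 50 and their standard deviation close to 10 / √25 = 2, in line with the two relationships stated above.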

 

Two Ways of Estimating the Population Parameter σX²

 


When the data have been collected from more than one sample, there exist two independent methods of estimating the population parameter σX², called respectively the between and the within method.  The collected data are usually first described with sample statistics as demonstrated in the following example:

 

 Group   Therapy Method      X̄       sX       sX²

 

   1       Reality         20.53   3.45   11.9025

   2       Behavior        16.32   2.98    8.8804

   3       Psychoanalysis  10.39   5.89   35.7604

   4       Gestalt         24.65   7.56   57.1536

   5       Control         10.56   5.75   33.0625

 

           Mean            16.49          29.3519

           Variance        38.83         387.8340

 

 

            THE WITHIN METHOD

 

Since each of the sample variances may be considered an independent estimate of the parameter σX², finding the mean of the variances provides a method of combining the separate estimates of σX² into a single value.  The resulting statistic is called the MEAN SQUARES WITHIN, often represented by MSW.  It is called the within method because it computes the estimate by combining the variances within each sample.  In the above example, the Mean Squares Within would be equal to 29.3519.

 

THE BETWEEN METHOD

 

The parameter σX² may also be estimated by comparing the means of the different samples, but the logic is slightly less straightforward and employs both the concept of the sampling distribution and the Central Limit Theorem. 

 


First, the standard error of the mean squared (σX̄²) is the population variance of a distribution of sample means.  In real life, in the situation where there is more than one sample, the variance of the sample means may be used as an estimate of the standard error of the mean squared (σX̄²).  This is analogous to the situation where the variance of the sample (sX²) is used as an estimate of σX².  The relationship is demonstrated below:

               Sampling Distribution        Actual Data

                       X̄                       X̄1
                       X̄                       X̄2
                       X̄                       X̄3
                       .                       X̄4
                       .                       X̄5
                       .

Mean                  µX̄                       grand mean
Variance              σX̄²                      sX̄²

 

In this case the Sampling Distribution consists of an infinite number of means and the real life data consists of A (in this case 5) means.  The computed statistic sX̄² is thus an estimate of the theoretical parameter σX̄².

 

The relationship expressed in the Central Limit Theorem may now be used to obtain an estimate of σX².  Squaring both sides of the relationship and solving for σX² gives:

     σX̄ = σX / √N

     σX̄² = σX² / N

     σX² = N * σX̄²

Thus the variance of the population may be found by multiplying the standard error of the mean squared (σX̄²) by N, the size of each sample.

 


Since the variance of the means, sX̄², is an estimate of the standard error of the mean squared, σX̄², the variance of the population, σX², may be estimated by multiplying the size of each sample, N, by the variance of the means.  This value is called the Mean Squares Between and is often symbolized by MSB.  The computational procedure for MSB is presented below:

 

MSB = N * sX̄²

The expressed value is called the Mean Squares Between because it uses the variance between the samples, that is the sample means, to compute the estimate.  Using the above procedure on the example data yields:

MSB = N * sX̄²

MSB = 20 * 38.83

MSB = 776.60

 

At this point it has been established that there are two methods of estimating σX², Mean Squares Within and Mean Squares Between.  It could also be demonstrated that these estimates are independent.  Because of this independence, when both are computed using the same data, in almost all cases different values will result.  For example, in the presented data MSW=29.3519 while MSB=776.60.  This difference provides the theoretical background for the F-ratio and ANOVA. 

 

 The F-ratio and F-distribution

 

A new statistic, called the F-ratio, is computed by dividing the MSB by MSW.  This is illustrated below:

 

 Fobs =  MSB / MSW 

       

 

      

 

Using the example data described earlier the computed F-ratio becomes:


 

              Fobs =  MSB / MSW

              Fobs =  776.60 / 29.3519

              Fobs =  26.4582
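The within method, the between method, and the F-ratio can all be computed from the table of group means and variances.  The Python sketch below works from the summary statistics of the therapy example; the variable names are hypothetical.  Because it carries the variance of the means at full precision rather than rounding it to 38.83 first, its MSB and F-ratio differ slightly in the later decimal places from the hand calculation above.

```python
from statistics import fmean, variance

n_per_group = 20
group_means = [20.53, 16.32, 10.39, 24.65, 10.56]
group_variances = [11.9025, 8.8804, 35.7604, 57.1536, 33.0625]

# Within method: the mean of the sample variances.
msw = fmean(group_variances)

# Between method: N times the variance of the sample means
# (variance() divides by the number of means minus one).
msb = n_per_group * variance(group_means)

f_obs = msb / msw

print("MSW  =", round(msw, 4))
print("MSB  =", round(msb, 2))
print("Fobs =", round(f_obs, 2))
```

The same three lines of arithmetic apply to any one between group design with equal sample sizes, given the mean and variance of each group.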

 

The F-ratio can be thought of as a measure of how different the means are relative to the variability within each sample.  The larger this value, the greater the likelihood that the differences between the means are due to something other than chance alone, namely real effects.  The size of the F-ratio necessary to make a decision about the reality of effects is the next topic of discussion.

 

If the difference between the means is due only to chance, that is, there are no real effects, then the expected value of the F-ratio would be approximately one (1.00).  This is true because both the numerator and the denominator of the F-ratio are estimates of the same parameter, σX².  Seldom will the F-ratio be exactly equal to 1.00, however, because the numerator and the denominator are estimates rather than exact, known values.  Therefore, when there are no effects the F-ratio will sometimes be greater than one, and other times less than one.

 

To review, the basic procedure used in hypothesis testing is that a model is created in which the experiment is repeated an infinite number of times when there are no effects.  A sampling distribution of a statistic is used as the model of what the world would look like if there were no effects.  The result of the experiment, a statistic, is compared with what would be expected if the model of no effects were true.  If the computed statistic is unlikely given the model, then the model is rejected along with the hypothesis that there were no effects.

 

In an ANOVA the F-ratio is the statistic used to test the hypothesis that the effects are real, in other words, that the means are significantly different from one another.  Before the details of the hypothesis test may be presented, the sampling distribution of the F-ratio must be discussed.

 


If  the experiment were repeated an infinite number of times, each time computing the F-ratio, and there were no effects, the resulting distribution could be described by the F-distribution.  The F-distribution is a theoretical probability distribution characterized by two parameters, df1 and df2, both of which affect the shape of the distribution.  Since the F-ratio must always be positive, the F-distribution is non-symmetrical, skewed in the positive direction.

 

Two examples of an F-distribution are presented below; the first with df1=1 and df2=5, and the second with df1=10 and df2=25. 

 

 The F-distribution has a special relationship to the t-distribution described earlier.  When df1=1, the F-distribution is equal to the t-distribution squared (F=t²).  Thus the t-test and the ANOVA will always return the same decision when there are two groups.  That is, the t-test is a special case of ANOVA. 

 

 Non-significant and Significant F-ratios

 

Theoretically, when there are no real effects, the F-distribution is an accurate model of the distribution of F-ratios.  The F-distribution will have the parameters df1=a-1 and df2=a(N-1), where a is the number of groups and N is the number in each group.  In this case an assumption is made that sample size is equal for each group.  For example, if five groups of five subjects each were run in an experiment and there were no effects, then the F-ratios would be distributed with df1=a-1=5-1=4 and df2=a(N-1)=5(5-1)=5*4=20.  A visual representation of the preceding appears as follows:


 

The F-ratios which cut off various proportions in the upper tail of the above distribution may be computed for different values of α.  These F-ratios are called Fcrit values.  In the above example the Fcrit value for α=.25 is 1.46, for α=.10 the value is 2.25, for α=.05 the value is 2.87, and for α=.01 the value is 4.43.  These values are illustrated in the figure below:
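The meaning of the Fcrit values can be checked by simulating the model of no effects.  The Python sketch below repeats a five group experiment with five subjects per group many times, drawing every score from the same normal population, and reports the proportion of F-ratios that exceed each Fcrit value.  The number of repetitions, the seed, and the standard normal population are arbitrary assumptions, and the function name f_ratio is hypothetical.

```python
import random
from statistics import fmean, variance

def f_ratio(groups):
    """F = MSB / MSW for equal-sized groups."""
    n = len(groups[0])
    msb = n * variance([fmean(g) for g in groups])   # N times variance of the means
    msw = fmean([variance(g) for g in groups])       # mean of the group variances
    return msb / msw

random.seed(9)                 # fixed seed so the run is repeatable
a, n, reps = 5, 5, 10000       # groups, subjects per group, simulated experiments

# Every group is drawn from the same population, so there are no real effects.
f_values = [f_ratio([[random.gauss(0, 1) for _ in range(n)] for _ in range(a)])
            for _ in range(reps)]

for f_crit in (1.46, 2.25, 2.87, 4.43):
    proportion = sum(f > f_crit for f in f_values) / reps
    print(f_crit, round(proportion, 3))
```

The proportions should come out near .25, .10, .05, and .01 respectively, although they will vary a little from one seed to another.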

 

When there are real effects, that is, the means of the groups are different due to something other than chance, then the F-distribution no longer describes the distribution of F-ratios.  In almost all cases the observed F-ratio will be larger than would be expected when there were no effects.  The rationale for this situation is presented below.

 

First, an assumption is made that any effects are an additive transformation of the score.  That is, the scores for each group can be modelled as a constant (aa, the effect) plus error (eae).  The scores appear as follows:

                       Xae = aa + eae

where Xae is the score, aa is the treatment effect, and eae is the error.  The eae, or error, is different for each subject, while aa is constant within a given group.

 


As described in the chapter on transformations, an additive transformation changes the mean, but not the standard deviation or the variance.  Because the variance of each group is not changed by the nature of the effects, the Mean Square Within, as the mean of the variances, is not affected.  The Mean Square Between, as N times the variance of the means, will in most cases become larger because the variance of the means will most likely become larger.

 

Imagine three individuals taking a test.  An instructor first finds the variance of the three scores.  He or she then adds five points to one random individual and subtracts five from another random individual.  In most cases the variance of the three test scores will increase, although it is possible that the variance could decrease if the points were added to the individual with the lowest score and subtracted from the individual with the highest score.  If the constant added and subtracted was 30 rather than 5, then the variance would almost certainly be increased.  Thus, the greater the size of the constant, the greater the likelihood of a larger increase in the variance.

With respect to the sampling distribution, the model differs depending upon whether or not there are effects.  The difference is presented below:

           No Effects                     Real Effects

Group     Mean    Variance       Group     Mean       Variance

  1        µ        σ²             1       µ + a1       σ²
  2        µ        σ²             2       µ + a2       σ²
  3        µ        σ²             3       µ + a3       σ²
  4        µ        σ²             4       µ + a4       σ²
  5        µ        σ²             5       µ + a5       σ²

Mean       µ        σ²                       µ          σ²
Variance  σ²/N                             >σ²/N

 


 Since the MSB usually increases and MSW remains the same, the F-ratio (F=MSB/MSW) will most likely increase.  Thus, if there are real effects, then the F-ratio obtained from the experiment will most likely be larger than the critical level from the F-distribution.  The greater the size of the effects, the larger the obtained F-ratio is likely to become.
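The illustration of the three test scores can be worked through by brute force.  The sketch below uses three hypothetical scores (70, 74, and 82, chosen only for illustration) and tries every way of adding the constant to one individual and subtracting it from another, counting how often the variance increases.

```python
from itertools import permutations
from statistics import variance

scores = [70, 74, 82]  # hypothetical test scores, chosen for illustration

def count_increases(scores, constant):
    """Count the (add, subtract) assignments that raise the variance."""
    base = variance(scores)
    pairs = list(permutations(range(len(scores)), 2))
    increases = 0
    for plus, minus in pairs:
        changed = list(scores)
        changed[plus] += constant    # points added to one individual
        changed[minus] -= constant   # points subtracted from another
        if variance(changed) > base:
            increases += 1
    return increases, len(pairs)

print(count_increases(scores, 5))   # constant of 5
print(count_increases(scores, 30))  # constant of 30
```

With these particular scores the variance increases in four of the six assignments when the constant is 5 and in all six when it is 30.  The variance falls only when the points are added to a lower score and taken from a higher score that lies more than the constant above it.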

 

Thus, when there are no effects, the obtained F-ratio will be distributed as an F-distribution which may be specified.  If effects exist, then the obtained F-ratio will most likely become larger.  By comparing the obtained F-ratio with that predicted by the model of no effects, an hypothesis test may be performed to decide on the reality of effects.  If the obtained F-ratio is greater than the critical F-ratio, then the decision will be that the effects are real.  If not, then no decision about the reality of effects can be made.
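The logic of the test can also be sketched as a small simulation.  Everything specific here is an assumption made for illustration: five groups of five scores each (4 and 20 degrees of freedom, for which the tabled α=.05 critical value is 2.87), normally distributed errors with a standard deviation of one, and a hypothetical set of effects.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def f_ratio(groups):
    """F = MSB / MSW for equal-sized groups."""
    n = len(groups[0])
    ms_between = n * variance([mean(g) for g in groups])
    ms_within = mean([variance(g) for g in groups])
    return ms_between / ms_within

def exceed_rate(effects, n=5, f_crit=2.87, reps=2000, seed=1):
    """Proportion of simulated experiments whose F exceeds f_crit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Each score is the group's effect plus random error.
        groups = [[a + rng.gauss(0, 1) for _ in range(n)] for a in effects]
        if f_ratio(groups) > f_crit:
            hits += 1
    return hits / reps

null_rate = exceed_rate([0, 0, 0, 0, 0])             # no effects
effect_rate = exceed_rate([-1, -0.5, 0, 0.5, 1])     # hypothetical real effects
print(null_rate, effect_rate)
```

When there are no effects the proportion of F-ratios beyond the critical value should be close to α; when effects are present the proportion is much larger, which is what makes the comparison with FCRIT informative.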

 

Similarity of ANOVA and t-test

 

When the number of groups (A) equals two (2), an ANOVA and t-test will give similar results, with tCRIT²=FCRIT and tOBS²=FOBS.  This equality is demonstrated in the example below:

 


Given the following numbers for two groups:

 

                                                  Mean    Variance

     Group 1 - 12 23 14 21 19 23 26 11 16        18.33     28.50

     Group 2 - 10 17 20 14 23 11 14 15 19        15.89     18.11

 

Computing the t-test

 

     s_(X̄1-X̄2) = √((s1² + s2²)/9) = √((28.50 + 18.11)/9) = √5.18 = 2.28

     tOBS = (X̄1 - X̄2) / s_(X̄1-X̄2) = (18.33 - 15.89) / 2.28 = 1.07

     t(df=16) = 2.12        for α=.05 and two-tailed test

 

Computing the ANOVA

 

     MSBETWEEN = N * s_X̄² = 9 * 2.9768 = 26.7912

     MSWITHIN  = Mean of the Variances = ( 28.50 + 18.11 ) / 2 = 23.305

     FOBS = MSBETWEEN/MSWITHIN = 1.1495

     F(1,16) = 4.49         for α=.05 - two-tailed test is assumed

 

Comparing the results

 

     tOBS² = 1.1449      FOBS  = 1.1495

     t(16)²  = 4.49        F(1,16) = 4.49

    

The small difference between tOBS² and FOBS can be attributed to rounding error (close enough for government work).
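The rounding explanation can be checked by repeating the calculation without rounding the intermediate results.  The sketch below follows the same hand formulas as the example (equal group sizes assumed) and uses only the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

group1 = [12, 23, 14, 21, 19, 23, 26, 11, 16]
group2 = [10, 17, 20, 14, 23, 11, 14, 15, 19]
n = len(group1)  # nine scores per group

# t-test: difference between the means over its standard error.
se_diff = sqrt((variance(group1) + variance(group2)) / n)
t_obs = (mean(group1) - mean(group2)) / se_diff

# ANOVA: N times the variance of the means over the mean of the variances.
ms_between = n * variance([mean(group1), mean(group2)])
ms_within = mean([variance(group1), variance(group2)])
f_obs = ms_between / ms_within

print(round(t_obs, 4), round(t_obs ** 2, 4), round(f_obs, 4))
```

Carried to full precision, the squared t and the F agree with each other (both are about 1.15), so the discrepancy in the hand calculation comes entirely from rounding the means and the standard error.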

 

Because the t-test is a special case of the ANOVA and will always yield similar results, most researchers perform the ANOVA, since the technique extends to complex experimental designs that the t-test cannot handle.

 

 

 


EXAMPLE OF A NON-SIGNIFICANT ONE-WAY ANOVA

 

Given the following data for five groups, perform an ANOVA:

 

                                                         MEAN   VARIANCE

     Group 1   7 7 5 4 2 7 5 4 1 7 5 6 7 6 3 5 2 5 1 4   4.65   4.03

     Group 2   6 9 3 6 9 4 9 8 9 3 4 4 7 2 2 7 7 7 9 3   5.90   6.52

     Group 3   5 5 2 5 6 2 3 3 6 8 2 1 1 2 5 7 9 6 5 7   4.50   5.63

     Group 4   4 1 4 8 9 5 2 8 6 8 2 9 6 6 7 8 4 3 1 4   5.25   6.93

     Group 5   3 6 1 2 3 5 8 4 1 5 4 5 6 9 4 2 4 8 9 3   4.60   6.04

 

Computing the ANOVA

 

    MSBETWEEN = N * s_X̄² = 20 * .351 = 7.015

    MSWITHIN  = Mean of the Variances = 5.83

    FOBS = MSBETWEEN/MSWITHIN = 1.20

    F(4,95) = 2.53  for α=.05 - non-directional test is assumed

 

Since the FCRIT is greater than the FOBS, the means are not significantly different and no effects are said to be discovered.
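The hand calculation for this example can be reproduced with a short sketch.  It uses the same two quantities as the text (N times the variance of the means, and the mean of the variances) and takes the critical value of 2.53 from the text rather than computing it; the variable names are illustrative.

```python
from statistics import mean, variance

groups = [
    [7, 7, 5, 4, 2, 7, 5, 4, 1, 7, 5, 6, 7, 6, 3, 5, 2, 5, 1, 4],
    [6, 9, 3, 6, 9, 4, 9, 8, 9, 3, 4, 4, 7, 2, 2, 7, 7, 7, 9, 3],
    [5, 5, 2, 5, 6, 2, 3, 3, 6, 8, 2, 1, 1, 2, 5, 7, 9, 6, 5, 7],
    [4, 1, 4, 8, 9, 5, 2, 8, 6, 8, 2, 9, 6, 6, 7, 8, 4, 3, 1, 4],
    [3, 6, 1, 2, 3, 5, 8, 4, 1, 5, 4, 5, 6, 9, 4, 2, 4, 8, 9, 3],
]
n = len(groups[0])   # 20 scores per group
f_crit = 2.53        # critical value quoted in the text for alpha = .05

ms_between = n * variance([mean(g) for g in groups])
ms_within = mean(variance(g) for g in groups)
f_obs = ms_between / ms_within

print(round(ms_between, 3), round(ms_within, 2), round(f_obs, 2))
print("significant" if f_obs > f_crit else "not significant")
```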

 

EXAMPLE OF A SIGNIFICANT ONE-WAY ANOVA

 

Given the following data for five groups, perform an ANOVA.  Note that the numbers are similar to the previous example except that one has been subtracted from all scores in Group 3 and one has been added to all scores in Group 4.

 

                                                           MEAN   VARIANCE

     Group 1   7 7 5 4  2 7 5 4 1 7 5  6 7 6 3 5 2 5 1 4   4.65   4.03

     Group 2   6 9 3 6  9 4 9 8 9 3 4  4 7 2 2 7 7 7 9 3   5.90   6.52

     Group 3   4 4 1 4  5 1 2 2 5 7 1  0 0 1 4 6 8 5 4 6   3.50   5.63

     Group 4   5 2 5 9 10 6 3 9 7 9 3 10 7 7 8 9 5 4 2 5   6.25   6.93

     Group 5   3 6 1 2  3 5 8 4 1 5 4  5 6 9 4 2 4 8 9 3   4.60   6.04

 

Computing the ANOVA

 

     MSBETWEEN = N * s_X̄² = 20 * 1.226 = 24.515

     MSWITHIN  = Mean of the Variances = 5.83

     FOBS = MSBETWEEN/MSWITHIN = 4.20

     F(4,95) = 2.53    for α=.05 - two-tailed test is assumed

 

In this case the FOBS is greater than FCRIT, thus the means are significantly different and we decide that the effects are real.

 

USING MANOVA

 

While a single-factor between-groups ANOVA may be done using the MEANS command in SPSS, the MANOVA command is a general-purpose command which allows the statistician to do almost any type of multifactor univariate or multivariate ANOVA.

 

The Data

 

The data are entered into a data file containing two columns.  One column contains the level of the factor to which the observation belongs and the second the score for the dependent variable.  A third column containing the observation number, in the example a number from one to nine, is optional.  As in all SPSS data files, the number of rows in the data file corresponds to the number of subjects, and each variable occupies the same columns in every row.  In the example data file presented below, there are two groups of nine each.  The level of the independent variable is given in the first column of the data file, a space is entered, and the dependent variable is entered in columns 3 and 4.

 

1 23
1 31
1 25
1 29
1 30
1 28
1 31
1 31
1 33
2 32
2 28
2 36
2 34
2 41
2 35
2 32
2 28
2 31

 


RUN NAME   EXAMPLE FOR ANOVA BOOK - DESIGN A.
DATA LIST  FILE='DESIGNA DATA  A' /1 A 1 X 3-4.
VALUE LABELS
   A 1 'BLUE BOOK' 2 'COMPUTER'.
LIST.
MANOVA X BY A(1,2)
   /PRINT CELLINFO(MEANS)
   /DESIGN.

 

The RUN NAME command of the example program gives a general description of the purpose of the program.  The second command, DATA LIST, reads in the data file.  Note that the group factor is called "A" and the dependent variable is called "X".  The VALUE LABELS command then describes the different levels of the group variable.  The LIST command gives a description of the data as the computer understands it.

 

The MANOVA command is followed by the name of the dependent variable, here X, and the keyword BY.  The factor name "A" is then entered, followed by the beginning and ending levels of that factor.  In this case there were only two levels, defined by a beginning value of "1" and an ending value of "2".  The second line of the command is preceded by a slash "/" and contains the subcommand PRINT CELLINFO(MEANS).  This subcommand will print the means of the respective groups.  The last subcommand, "/DESIGN", is optional at this point, but omitting it will generate a WARNING.  The WARNING does not alter the results, but it clutters the output.
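For readers who want to check the cell means by hand, the fixed-column layout described above can be read with a few lines of Python.  This is only a sketch: it is not SPSS, it does not reproduce the MANOVA output, and the variable names are illustrative.  The eighteen rows are those of the example data file, and the labels are the ones given in the VALUE LABELS command.

```python
from statistics import mean

# Rows of the example data file: level of A in column 1, a space,
# and the score X in columns 3-4.
data_lines = """\
1 23
1 31
1 25
1 29
1 30
1 28
1 31
1 31
1 33
2 32
2 28
2 36
2 34
2 41
2 35
2 32
2 28
2 31
""".splitlines()

labels = {1: "BLUE BOOK", 2: "COMPUTER"}

# Collect the scores for each level of the factor.
cells = {}
for line in data_lines:
    level = int(line[0:1])   # column 1
    score = int(line[2:4])   # columns 3-4
    cells.setdefault(level, []).append(score)

# Print the number of observations and the mean for each cell.
for level in sorted(cells):
    print(labels[level], len(cells[level]), mean(cells[level]))
```

For these data the two cell means are 29 and 33, each based on nine observations.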