Handbook for a Statistics Project
David W. Stockburger
WRITING THE RESULTS
This handbook is not intended for the researcher who wishes to publish original research in the professional journals; that requires too much attention to detail for the introductory statistics student. This section will present an overall structure which may be useful to the student in presenting quantitative information to others. An attempt will be made to be compatible with the style required in professional journals, theses, or dissertations. This chapter, however, will not be adequate for such a presentation. The interested reader may find the details of such a writing style in the APA Publication Manual (1974).
Most research papers in psychology contain four major sections: Introduction, Methods, Results, and Discussion. Each major section begins on a new page and, except for the introduction, is titled by centering and capitalizing all letters in the heading.
INTRODUCTION
The introduction provides the reader with a general frame of reference and structure for the rest of the paper. The first paragraph should introduce the central theme and state why you think it is important to study the topic. Following this, any past research should be cited which is relevant to the present study. Results of cited studies should be summarized and the relationship to the present study should be described.
The next paragraphs elaborate on the central theme. That is, they describe the manner in which you approached the problem. Is your project an extension of previous studies? If so, how is it similar, and how is it different? If not, what kinds of variables have you selected in relation to the central theme? In either case, the reason why the particular variables were selected should be explained to the reader.
If any hypotheses concerning the outcome of the study are being made they should occur in the introduction. That is, if an outcome will be interpreted as support of a theory or hypothesis about the world, that should be reported. If there are any predictions from the theory about how the study will turn out, they should be told to the reader in advance and their derivation explained.
In a correctly performed study, the introduction, or at least an outline of the introduction, will be written before the data are collected. This ensures that some thought has been put into the project before data collection begins, and it makes the final project much easier to write.
METHODS
The methods section is generally composed of a number of subsections, including one or more of the following: Subjects, Apparatus, Procedure, and Hypotheses. Each subsection is labeled as such by left-justifying the title and underlining it.
Subjects
- Even though subjects is the traditional expression, participants would perhaps be a better description. The subjects subsection usually appears first in the methods section. It describes both the population of interest and the subset of that population from which data were collected. The method used to select the sample should be described, as well as characteristics of the sample such as number, sex, and location. For this project it is not necessary to select a random sample, but the sampling procedure should be specified, and the effect of the particular sampling procedure used should be addressed in the discussion section.
Stimuli and Apparatus
- This subsection describes the physical apparatus used to measure the variables in the study. This may include such things as meter sticks, scales, timers, psychological tests, questionnaires, etc. If possible, brand or trade names should be used, and in the case of tests and other survey-type instruments borrowed from previous studies, appropriate literature citations should be given to indicate their source. An example data collection sheet or questionnaire should appear on the page immediately following the one on which it is first mentioned.
Procedure
- The procedure subsection describes the way the apparatus, described in the preceding subsection, was used to collect the data. The description should be detailed enough so that another researcher could carry out a study equivalent to yours. Included in this subsection are instructions to the subjects, perhaps not in exact detail, but summarized with the important points highlighted. Also included is the order and procedure for collecting data. For instance, if a questionnaire was utilized, did the researcher fill it in or did the participant? Was the participant anonymous? How long did it take? Was all the information collected at one time, or were several different time periods used? How were individuals approached and asked to participate? How many refused?
RESULTS
Two general rules for organizing the results section are:
(1) The results of the study are usually presented in order of decreasing importance with the most important results presented first.
(2) The more general results are presented before specific results.
If these rules are conflicting, the logical development of the section takes precedence.
In most cases tables and figures will be used to convey most of the information to the reader. The text should serve to highlight important features of the tables and figures and direct the reader's attention to the most interesting findings, but should not duplicate the information presented by these means. A number of methods of presenting the results are appropriate depending upon the type of data collected and the kinds of questions asked. In general, no one study will utilize all the following methods, but most will use more than one.
When faced with a mass of computer output and the requirement that it be condensed, organized, and analyzed, the student has a difficult task. A few tricks of the trade reduce the magnitude of the problem. The significance level or probability value of an analysis allows the reader to decide whether or not it is worth his or her time to interpret the results. Basically, if the significance level is .05 or greater, then the results are non-significant and could plausibly have occurred by chance. In that case any interpretation is open to question, because chance alone could explain the results. In general, then, the researcher will place faith in those analyses that have a significance level of less than .05. This does not mean that non-significant results should be left out of the report, only that interpretations of relationships which are not significant should not be given a great deal of credibility. The first step in an analysis, then, is to page through the output and mark any significant results. Some will be unimportant; for example, a significant relationship between age and class rank is of little interest. The others, however, will form the core of the report.
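The screening step described above can be sketched in a few lines of Python. This is only an illustration: the analysis labels and significance levels below are hypothetical, and the helper name `split_by_significance` is an assumption, not part of any statistical package.

```python
# A minimal sketch of the "mark the significant results" step.
# All labels and probability values here are hypothetical.
ALPHA = 0.05  # the cutoff used in the text


def split_by_significance(results, alpha=ALPHA):
    """Separate (label, p) pairs into significant and non-significant lists.

    A result counts as significant only when p is strictly less than alpha,
    so a significance level of exactly .05 is treated as non-significant.
    """
    significant = [(label, p) for label, p in results if p < alpha]
    non_significant = [(label, p) for label, p in results if p >= alpha]
    return significant, non_significant


results = [
    ("sleep by study", 0.012),
    ("age by rank", 0.003),
    ("gender by budget", 0.430),
    ("gender by age", 0.050),
]

significant, non_significant = split_by_significance(results)

print("Interpret these first:")
for label, p in significant:
    print(f"  {label}: p = {p:.3f}")

print("Report, but interpret with caution:")
for label, p in non_significant:
    print(f"  {label}: p = {p:.3f}")
```

The list of significant results still has to be read by a person: the code cannot tell that a relationship such as age by rank is uninteresting.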
In some cases the researcher is interested in the results of a single variable, for example, the proportion of students who are satisfied with the service they receive at the university health clinic, or the average number of minutes students would be willing to walk to class if they had to pay $1.00 to park. The presentation of the data in the project will depend upon the level of measurement of the variable and the quality and quantity of the data. Ordinal data will not be discussed because information of this type is seldom collected for student projects.
(a) Nominal Variables
When the measurement level is clearly nominal and the variable has fewer than eight levels, the preferred method of presentation is tabular. One successful approach has been to write the proportion or percentage of responses for each level on an example questionnaire. If eight or more levels have been used for a single variable, then the RECODE command is appropriate to reduce the number of levels to fewer than eight.
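For readers working outside SPSS, the two operations just described, tabulating percentages for each level and collapsing levels, might look like the following Python sketch. The response lists, the mapping, and the function names `percentage_table` and `recode` are hypothetical; the `recode` helper only imitates the idea of the RECODE command and is not its syntax.

```python
from collections import Counter


def percentage_table(responses):
    """Return the percentage of responses at each level, rounded to one decimal."""
    counts = Counter(responses)
    total = len(responses)
    return {level: round(100 * n / total, 1) for level, n in counts.items()}


def recode(responses, mapping, default="Other"):
    """Collapse levels: each response is replaced by its mapped level.

    Levels missing from the mapping fall into the default category.
    """
    return [mapping.get(response, default) for response in responses]


# Hypothetical source-of-support responses from 19 students.
support = (["Family"] * 3 + ["Personal"] * 6
           + ["Scholarships"] * 7 + ["Other"] * 3)
for level, pct in percentage_table(support).items():
    print(f"{level:<14}{pct:5.1f}%")

# Hypothetical majors collapsed into broader groups.
majors = ["Biology", "Chemistry", "History", "English", "Dance", "Physics"]
groups = {
    "Biology": "Science", "Chemistry": "Science", "Physics": "Science",
    "History": "Humanities", "English": "Humanities",
}
print(percentage_table(recode(majors, groups)))
```

Collapsing levels loses information, so the grouping should be chosen on substantive grounds before looking at the results.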
(b) Interval or Ratio Variables
In this case a table of means and standard deviations is most often used. They can be most easily obtained from the CORRELATIONS command using the STATISTICS = DESCRIPTIVES option. Where the distribution of a particular variable is critical to the study, a frequency polygon or histogram is often used.
In some cases it is not at all clear whether the variables are measured on a nominal, ordinal, interval, or ratio level. Some question exists, for example, when there are fewer than eight response categories, such as in scales with the following five categories:
strongly agree agree no opinion disagree strongly disagree.
In these cases either tables of proportions of responses or tables of means and standard deviations may be used, depending upon the discretion of the writer. In some cases both methods may be used. The method of choice is the one which the researcher believes most simply and accurately presents the data. The next table presents an example of organizing the results around the questionnaire with both methods.
| | Mean | s.d. |
|---|---|---|
| Age | 18.78 | 2.37 |

| | Males | Females | Mean | s.d. |
|---|---|---|---|---|
| Gender | 50% | 50% | .50 | .51 |

| | Freshman | Sophomore | Junior | Senior | Other | Mean | s.d. |
|---|---|---|---|---|---|---|---|
| Rank | 10% | 20% | 30% | 25% | 15% | 3.15 | 1.23 |

| | Family | Personal | Scholarships | Other |
|---|---|---|---|---|
| Support | 15.8% | 31.6% | 36.8% | 15.8% |

| | Strong Disagree | Disagree | No Opinion | Agree | Strong Agree | Mean | s.d. |
|---|---|---|---|---|---|---|---|
| Student Role of Apprentice | 25% | 30% | 30% | 5% | 10% | 2.45 | 1.23 |
| Student Role of Ward | 35% | 25% | 30% | 10% | 0% | 2.15 | 1.04 |
| Student Role of Client | 0% | 35% | 40% | 25% | 0% | 2.90 | .79 |

| | None | Membership | Voting | Equal | Total | Mean | s.d. |
|---|---|---|---|---|---|---|---|
| Curriculum | 20% | 30% | 45% | | 5% | 2.40 | .99 |
| Faculty | 45% | 25% | 5% | 15% | 5% | 2.05 | 1.31 |
| Budget | 15% | 15% | 20% | 30% | 20% | 3.25 | 1.37 |
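Both presentations in the table above can be produced from the same response counts. The sketch below does so for the Student Role of Apprentice item. It rests on three assumptions that are consistent with the reported figures but are not stated in the text: there were 20 respondents, the categories are coded 1 through 5 from left to right, and the standard deviation is the sample version with n - 1 in the denominator.

```python
import math


def summarize(counts):
    """Summarize a rating item from its category counts.

    counts[i] is the number of respondents choosing the category coded i + 1.
    Returns (percentages, mean, sample standard deviation).
    """
    n = sum(counts)
    percents = [round(100 * c / n) for c in counts]
    mean = sum((i + 1) * c for i, c in enumerate(counts)) / n
    sum_sq = sum(c * ((i + 1) - mean) ** 2 for i, c in enumerate(counts))
    sd = math.sqrt(sum_sq / (n - 1))
    return percents, mean, sd


# Assumed counts for 20 respondents: 25%, 30%, 30%, 5%, 10%.
apprentice = [5, 6, 6, 1, 2]
percents, mean, sd = summarize(apprentice)

print("Percentages:", percents)
print(f"Mean = {mean:.2f}, s.d. = {sd:.2f}")
```

Under these assumptions the function gives a mean of 2.45 and a standard deviation of 1.23, matching the table row.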
A related difficulty occurs when the data are dichotomous. In this case, however, either method of presentation, proportions in each category or means and standard deviations, presents the reader with identical information.
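The equivalence holds because, for a variable coded 0 and 1, the mean is the proportion of 1s and the standard deviation is a function of that proportion and the sample size alone. The following Python sketch checks this against a direct calculation. The coding (0 and 1), the sample of 10 and 10, and the use of the sample standard deviation are assumptions chosen for illustration.

```python
import math
import statistics


def dichotomous_summary(n_ones, n):
    """Mean and sample s.d. of a 0/1 variable, computed from the proportion alone."""
    p = n_ones / n
    mean = p  # the mean of 0/1 scores is the proportion of 1s
    sd = math.sqrt(n * p * (1 - p) / (n - 1))
    return mean, sd


# Assumed sample: 10 respondents coded 1 and 10 coded 0.
scores = [1] * 10 + [0] * 10
mean, sd = dichotomous_summary(n_ones=10, n=20)

# The same values computed directly from the raw scores.
direct_mean = statistics.mean(scores)
direct_sd = statistics.stdev(scores)

print(f"From the proportion: mean = {mean:.2f}, s.d. = {sd:.2f}")
print(f"From the raw scores: mean = {direct_mean:.2f}, s.d. = {direct_sd:.2f}")
```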
The presentation of results dealing with relationships between variables presents many of the same difficulties as presenting results from single variables. Because combinations of variable types must be considered, the number of possible alternatives is increased.
(a) Both Variables are nominal
In this case a contingency table is clearly the most appropriate analysis. In all cases the percentages and totals of the cells, rows, and columns should be given, along with the chi-square value. Depending upon their importance, conditional cell percentages may also be presented to the reader. The following table is taken from the analysis of nominal variables in the example data matrix. It shows two contingency tables produced by the CROSSTABS command of the SPSS statistical computer package.
Rank

| | Freshman | Sophomore | Junior | Senior | Other |
|---|---|---|---|---|---|
| Male | 20% | 10% | 20% | 30% | 20% |
| Female | 0% | 30% | 40% | 20% | 10% |

Support

| | Personal | Family | Scholarships | Other |
|---|---|---|---|---|
| Male | 10% | 30% | 40% | 10% |
| Female | 20% | 39% | 30% | 20% |
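The quantities a contingency table should report (row percentages, totals, and the chi-square value) can be computed by hand as in the Python sketch below. The counts are an assumption: they are what the gender by Rank percentages above would correspond to if the sample held 10 males and 10 females, which the text does not state. The function names are illustrative only.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


def row_percentages(table):
    """Each cell as a percentage of its row total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]


# Assumed counts: rows are Male and Female; columns are
# Freshman, Sophomore, Junior, Senior, Other.
rank_by_gender = [
    [2, 1, 2, 3, 2],
    [0, 3, 4, 2, 1],
]

statistic, df = chi_square(rank_by_gender)
print(f"chi-square = {statistic:.2f} with {df} degrees of freedom")
for label, row in zip(["Male", "Female"], row_percentages(rank_by_gender)):
    print(label, [f"{pct:.0f}%" for pct in row])
```

With expected counts as small as these, the chi-square approximation is rough, so a value computed from a sample of this size should be read cautiously.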
(b) One Nominal and One Interval or Ratio Variable
In this case means and standard deviations of the interval or ratio data can be calculated for each category or level of the nominal variable. In many cases a number of separate analyses may be combined into a single table by cutting and pasting the results of a number of simple tables into a single table. The following table presents the mean of each interval variable for each level of a nominal variable, gender, along with the significance level of each comparison. As can be seen from this table, a great deal of information may be presented in a simple manner utilizing this type of presentation.
| | Male | Female | Sig. |
|---|---|---|---|
| Age | 18.6 | 19.0 | .733 |
| Rank | 3.2 | 3.1 | .861 |
| Student Role of Apprentice | 2.3 | 2.6 | .600 |
| Student Role of Ward | 2.1 | 2.2 | .836 |
| Student Role of Client | 2.9 | 2.9 | 1.000 |
| Curriculum | 2.2 | 2.6 | .383 |
| Faculty | 2.33 | 1.8 | .391 |
| Budget | 3.0 | 3.5 | .430 |
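The group means behind a table of this kind can be obtained by splitting the interval scores on the levels of the nominal variable. The Python sketch below shows one way to do that. The records and the function name `group_summary` are hypothetical, and the sketch stops at means and standard deviations; it does not compute the significance level, which would come from the statistical package.

```python
from collections import defaultdict
from statistics import mean, stdev


def group_summary(records, group_key, value_key):
    """Mean, sample s.d., and count of value_key within each level of group_key."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])
    return {
        level: (mean(values), stdev(values), len(values))
        for level, values in groups.items()
    }


# Hypothetical records, one per respondent.
records = [
    {"gender": "Male", "age": 18},
    {"gender": "Male", "age": 19},
    {"gender": "Male", "age": 20},
    {"gender": "Female", "age": 19},
    {"gender": "Female", "age": 21},
]

summary = group_summary(records, "gender", "age")
for level, (m, s, n) in summary.items():
    print(f"{level:<8} mean = {m:.1f}  s.d. = {s:.2f}  n = {n}")
```

Running the same function once per interval variable and pasting the rows together yields a combined table like the one above.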
(c) Both Variables are Interval or Ratio
The appropriate summary statistic in this case is the correlation coefficient and the most convenient presentation is the correlation matrix. In the correlation matrix, all possible correlations between variables are presented as entries in a table with rows and columns consisting of the names of the variables. Only one-half of the matrix must be given because the entries are symmetric, that is, the correlation between SLEEP and STUDY is the same as the correlation between STUDY and SLEEP. As noted before, dichotomous nominal variables may be considered to be interval measures and correlations with other variables presented in the correlation matrix.
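A correlation matrix of this form can be built with a short Python sketch. The variable names SLEEP and STUDY follow the text; GPA, all of the scores, and the function names are hypothetical. Only the lower half of the matrix is produced, since the upper half would repeat the same values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


def lower_triangle(data):
    """Rows of the correlation matrix below the diagonal.

    Returns a list of (variable name, correlations with each earlier variable).
    """
    names = list(data)
    return [
        (name, [pearson(data[name], data[names[j]]) for j in range(i)])
        for i, name in enumerate(names)
    ]


# Hypothetical scores for six students.
data = {
    "SLEEP": [7, 6, 8, 5, 9, 6],
    "STUDY": [10, 14, 8, 16, 6, 12],
    "GPA": [3.0, 3.4, 2.8, 3.6, 2.7, 3.1],
}

for name, correlations in lower_triangle(data):
    print(f"{name:<6}" + "".join(f"{r:7.2f}" for r in correlations))
```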

If a more detailed presentation of the relationship between two interval or ratio variables is desired, the scatterplot is the appropriate means. Because some relationships are more critical to the central hypothesis than others, especially if they are unexpectedly high or low, they should be shown in scatterplots.
The three basic methods of presenting relationships given above are not limited to only the scale types that were presented. In some cases it may be useful to break down interval or ratio variables by levels of another interval or ratio variable. It is possible to treat interval or ratio data as nominal, but not vice-versa. In other words, scale types are only a general guideline to type of preferred analysis. Type of presentation of results depends largely on the type and amount of information the writer wishes to convey to the reader.
DISCUSSION
The purpose of the discussion section is to summarize, discuss, and conclude the paper. The discussion section allows the researcher to present ideas, impressions, and possible artifacts related to the research project. The writer is not limited to facts, as in the results section, but is allowed to discuss the implications of the research.
The first paragraph of the discussion section usually summarizes the important results without recourse to numbers, figures, or tables. The content of the rest of this section is generally variable, but the following presents some possible areas which might be discussed.
One such area is whether or not the hypotheses and predictions made in the introduction were upheld. Now that more information is known about the area, what conclusions have you reached? How have your ideas and opinions been changed in light of the evidence? The readers will have probably drawn their own conclusions, but will be interested to see if your conclusions agree with their own. This part of the project should not be taken lightly because the end goal of statistical usage is to make rational conclusions and decisions. The conclusions of a good scientist are based on the evidence collected and the inductive power of the statistical method.
A second area which is sometimes included in the discussion section is an acknowledgement of factors which might have affected the study. These factors include such things as sampling method, measurement instruments, and procedure. If any of these factors were changed, how do you think the results would have been affected? If you had to do the study over again, how would you do it differently?
A third area is that of implications. Some useful questions in this area are: What implications exist for further research? What kind of changes in policy decisions of either your own life or that of society would you make on the basis of your results? A strong concluding statement will let the reader know both where you stand and that the paper is finished.
Knowing about statistics is different from applying them to a real-world situation. Following the successive stages to the completion of the project, even if little worthwhile information is obtained, may deepen the appreciation of the statistical tools available for knowing about the world.