The 2000 presidential election was frustrating for many, who woke each day to new impediments to a final decision. Many viewed the process as an educational experience in election and constitutional law. It also provided real-life examples of issues psychologists encounter when designing and conducting research. Elements of the research process may be a bit easier to comprehend when considered in the context of the recent election events.
Psychologists conduct experiments to evaluate how environmental factors (independent variables) influence behaviors (dependent variables). For example, if we were interested in determining whether descriptions of real-life examples might help students learn about psychology, we could compare test scores of students who were taught with examples to those taught without them. In this example, teaching style is the independent variable and grades the dependent variable. In a sense, an election is like an experiment. We see the relative number of votes cast for the candidates as the dependent variable and the platforms and impressions made by the respective candidates as the independent variable.
Experimenters try to minimize factors other than the independent variable that influence the dependent variable. We refer to these other factors as extraneous variables. In a study of how study time might influence learning, the small influence of an extraneous variable such as noise in the hallway outside an experimental room is not going to change the difference in the number of words recalled from a list studied for a minute compared to the number recalled after studying for only 10 seconds. When we introduce a powerful manipulation that causes a large change in behavior, we can accept a large amount of extraneous error. That is not true when we study more subtle effects. Noise, educational level, mood, or other variables that influence memory might mask the influence of very small differences in study time, say 60 seconds of study time compared to 70 seconds of study time. Experimenters must maximize the impact of the independent variable relative to the influence of extraneous variables.
In most elections, we find the difference in votes for the candidates to be rather substantial, and typically the error in casting and counting ballots is small enough that it will not change the outcome. Of course, we would like each ballot cast to accurately reflect the intention of the voter, but when the difference between the votes cast for the candidates is very large, a close estimate of the vote totals is good enough. The tallies for the two candidates in Florida's election were extremely close. Only a thousand or so votes separated the two, so balloting error that would be inconsequential in most elections became very important. A few ballots stuck together, wayward chads, flawed ballots, or machine errors could account for the difference between the two candidates.
As in close elections, experiments designed to demonstrate subtle effects must eliminate or otherwise account for sources of error in order to show statistically significant differences between groups. Note that error in experiments typically is less tolerable than in elections because in most experiments we are predicting from a small sample, whereas elections evaluate responses of the entire population.
Recounts and the Florida Courts
Experiments employ the very simple logical assumption that if we treat two groups differently and they then behave differently, what we did differently to the groups must have caused the change in behavior. We must be careful when designing studies to ensure that the only difference between the groups is the change in the environment we introduced. Failure to keep everything else equal results in a confound. A confound is a situation in which one or more factors could have caused the results, confusing the interpretation of the findings. A classic confound in the psychology literature is the placebo effect discovered with drug research. Patients in early drug studies received a drug and attention. The control group received neither of these, leading to questions about whether it was the drug or attention that caused people to get better.
A very important assumption underlying statistical tests is that the sampling of individuals be done at random. This helps ensure that the errors individuals bring into the study, such as variations in abilities, perception, or social background, will be independent of each other, and that the error term is itself a random normal variable. In statistical jargon, we say that the errors must be independently and identically distributed. When sampling is not done randomly, the results are not apt to yield an unbiased estimate of the treatment effects.
Every estimated effect has an error term. For example, consider the analysis of variance, a statistical procedure commonly used to test for differences between three or more group means. In a simple one-way analysis of variance, the model statement shows that the dependent variable or measured response, Y, can be decomposed as: Y = m + a + e, where m = the grand mean of the population, a = a fixed deviation from the grand mean, and e = the error term, an independent, normally distributed random variable with mean equal to zero. Because the error terms are random with a mean of zero, their expected net effect across all observations is zero. That is, on average they cancel out and do not distort the experiment. If this were not the case (i.e., the errors were not random), then it would be difficult, if not impossible, to obtain a valid estimate of the treatment effect.
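The decomposition Y = m + a + e can be illustrated with a small simulation. This is a minimal sketch, not any particular study: the grand mean, group effects, and error variance below are all made-up numbers chosen for illustration.

```python
import random
import statistics

# Minimal sketch of the one-way ANOVA model Y = m + a + e.
# All numbers here are hypothetical, chosen only for illustration.
random.seed(42)

m = 50.0                                     # grand mean of the population
effects = {"A": -5.0, "B": 0.0, "C": 5.0}    # fixed deviations a (they sum to zero)
n_per_group = 1000

scores = {}
for group, a in effects.items():
    # e: independent, normally distributed error with mean zero
    scores[group] = [m + a + random.gauss(0.0, 3.0) for _ in range(n_per_group)]

# With random errors, each observed group mean lands close to m + a,
# because the errors cancel out on average.
for group, a in effects.items():
    print(f"group {group}: true mean {m + a:.1f}, "
          f"estimated {statistics.mean(scores[group]):.2f}")
```

Running the sketch shows each estimated group mean hovering near its true value of m + a; if the errors were instead systematic (say, always positive for one group), the estimates for that group would be biased.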
In the Florida election, if the voting machine and ballot errors were randomly applied to the two candidates, then the estimates of votes for the candidates would be accurate. However, if the balloting and counting errors were systematic, then the estimates would be flawed or biased, and therefore inaccurate.
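The contrast between random and systematic error can be made concrete with a toy simulation. The vote totals and error rates below are entirely hypothetical; the point is only that equal error rates leave the margin roughly intact, while unequal (systematic) error can reverse it.

```python
import random

# Toy simulation with hypothetical numbers: random counting error preserves
# the margin on average; systematic error biases or even reverses it.
random.seed(7)

true_votes = {"Candidate A": 100_500, "Candidate B": 99_500}  # true margin: 1,000

def count(votes, error_rate):
    """Each ballot is independently lost with the candidate's error rate."""
    return {c: sum(random.random() >= error_rate[c] for _ in range(n))
            for c, n in votes.items()}

def margin(v):
    return v["Candidate A"] - v["Candidate B"]

# Random error: ballots for both candidates fail at the same 2% rate.
random_count = count(true_votes, {"Candidate A": 0.02, "Candidate B": 0.02})

# Systematic error: the leading candidate's ballots fail twice as often.
biased_count = count(true_votes, {"Candidate A": 0.04, "Candidate B": 0.02})

print("true margin:", margin(true_votes))          # 1000
print("random error margin:", margin(random_count))    # still close to 1,000
print("systematic error margin:", margin(biased_count))  # outcome reversed
```

With equal error rates the expected margin shrinks only slightly (to about 98% of its true value), but when error falls disproportionately on one candidate, a 1,000-vote lead can turn into a deficit.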
Attorneys offered legal arguments that confounds existed. Voting officials in one county said that defective voting booths did not allow voters to push the hole through the space for candidate Gore. Republicans claimed that absentee ballots sent from foreign military posts were discarded for not having a postmark, which resulted in an advantage for Gore because military personnel tend to vote Republican. Democrats claimed that the butterfly ballot in another county created confusion for those voting for Gore but not for Bush. Some claimed that other characteristics of voters, such as age and education, may have influenced the ability to complete a ballot, and that because more people with these characteristics supported one of the candidates, a bias in results occurred.
Here then, the outcome of the election was being differentially influenced by variables other than the intent of the voter (i.e., extraneous variables). Hand recounts were initiated in several counties to inspect undervote ballots for evidence of intent that was not detected by voting machines.
Ideally, the outcome of an election should be a direct reflection of the intent of the voters, with the qualifications of the candidates as the only intervention. The presence of extraneous variables in the Florida election may have confounded the results by obscuring the true intent of the voters. This is an example of an "experiment" that went awry because variables that should have been controlled were not, an important point not to be overlooked by researchers when designing and conducting experiments.
Equal Protection and the Supreme Court
In any study we first must operationally define our variables. An operational definition of a dependent variable is simply a description of exactly how to measure the response. In his approach to studying operant conditioning in rats, Skinner operationally defined a lever press as the closure of an electrical switch connected to the lever. The definition clearly communicated how the response was measured, and others could replicate the measurement procedures. This type of objective measure is preferred by psychologists because it minimizes interpretive errors.
Poor or subjective definitions often lead to unreliable and invalid measures that ultimately invalidate the results of a study. For example, most psychological tests evaluate personality or other variables by quantifying responses on objectively scored tests, often using scoring mechanisms similar to voting machines to judge what answers the test-taker made on a scoring sheet. Considerable controversy can arise when subjective interpretation is required for scoring, as we have seen in discussions of the validity of projective tests (Groth-Marnat, 1997).
As in many research studies, the election was flawed by a problem with an operational definition--what constitutes a vote? Ultimately, the United States Supreme Court ruled that recounts could not continue because the "standards" or definitions used to interpret the intent of the voters were not clearly specified by the Florida Supreme Court. Definitions could vary from county to county, introducing further error into the evaluation of the election outcome. The court ruled that it would take too long to develop and apply a uniform set of standards to interpret votes in a manual recount of the undervotes.
The confusion in Florida has led to a call for election reform. Some of the problems experienced in the Florida election were related to errors in measurement of votes. For example, some machines did not read votes because they were clogged with chad from the ballots. Election boards will have to seek funds for equipment that more accurately interprets the voter's intent, just as psychological researchers seek more accurate timing devices, physiological recording devices, and sensitive psychological tests. If voting machines are not improved, then voting officials must clarify the definitions used if a manual recount is necessary in the future. For example, they must decide whether a "dimpled ballot" is a vote and if it is, what constitutes a dimple. Similarly, as psychologists we must strive to use precise operational definitions.
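The effect of varying standards can be sketched in code. The ballots and the two standards below are hypothetical, but they show how the same stack of undervote ballots yields different totals under a strict versus a lenient operational definition of a vote.

```python
# Hypothetical undervote ballots: each records whether the chad was fully
# punched and whether it was merely dimpled.
ballots = [
    {"punched": True,  "dimpled": False},
    {"punched": False, "dimpled": True},
    {"punched": False, "dimpled": True},
    {"punched": False, "dimpled": False},
]

def strict_standard(ballot):
    # A vote only if the chad was fully punched through.
    return ballot["punched"]

def lenient_standard(ballot):
    # A dimpled chad also counts as evidence of voter intent.
    return ballot["punched"] or ballot["dimpled"]

print("strict count:", sum(map(strict_standard, ballots)))    # 1
print("lenient count:", sum(map(lenient_standard, ballots)))  # 3
```

Two counties applying these different definitions to identical ballots would report different vote totals, which is precisely the inconsistency the Supreme Court cited.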
Other biases in the election results had nothing to do with the measuring device or definitions. A portion of the election error was related to human factors. For example, the split-ballot format used on the butterfly ballots appears to have created confusion for voters accustomed to reading left to right, resulting in errors in selecting candidates listed on the right side of the ballot. A good experimenter might have reduced the problem by having half of the ballots list Bush on the left and the other half list Gore on the left. While this would not have eliminated the problem, it might have balanced the disadvantage between the two candidates. Psychologists should be drawn into an analysis of the voting procedure to identify and reduce this type of problem (see Baron, Roediger, & Anderson, 2000).
Just as psychologists want to be confident that the variables they manipulated caused the observed change in behavior, voters want election results to reflect the opinion of the electorate. Often our progress in the science of psychology comes as a result of our making mistakes. Let's hope the errors in this election will better our election system and maybe help psychological researchers better understand good experimental design.
Baron, J., Roediger, H. L., & Anderson, M. C. (2000). Human factors and the Palm Beach ballot. APS Observer, 13, 5-7.
Groth-Marnat, G. (1997). Handbook of psychological assessment (3rd ed.). New York: Wiley.
ABOUT THE AUTHORS: Richard Wesp, PhD, recently joined the faculty of the Department of Psychology at East Stroudsburg University where he teaches Experimental Psychology, Cognition, and Perception. Previously, while a professor at Elmira College, he served as Psi Chi advisor for over 15 years. He has written several articles on teaching the science of psychology and also has published and presented work on learning and cognition. He is the current national chairperson of the Council of Undergraduate Psychology Programs.
David Rheinheimer, EdD, is professor and tutorial coordinator for the Campus-Wide Tutoring Program at East Stroudsburg University. He has a doctorate in educational statistics and measurement and an MS in statistics from Rutgers University. His current research interests include validity studies of statistical tests, student self-efficacy, adult and nontraditional students, and other issues involving developmental education and evaluation and assessment. He serves as a statistical advisor and consultant to numerous faculty, colleagues, and several departments at ESU.
Spring 2001 issue of Eye on Psi Chi (Vol. 5, No. 3, pp. 37-39), published by Psi Chi, The National Honor Society in Psychology (Chattanooga, TN). Copyright, 2001, Psi Chi, The National Honor Society in Psychology. All rights reserved.