Paper presented at the Western Political Science Association Annual Meeting,
San Francisco, March 29-31, 2018
The publication in SCIENCE in 2009 of “Predicting Elections: Child’s Play!” by Antonakis and Dalgas rekindled interest among a good number of political scientists, psychologists and students of electoral politics in the question of the role played by facial cues in voting behavior. Substantial prior research exists on this topic relating both to electoral choices and to the process of leadership selection within administrative systems. The findings reported in those early studies were based upon subjective judgments made on the basis of photographic images by a variety of types of coders (Bull et al., 1983; Huddy and Terkildsen, 1993; Chiao et al., 2008). In more recent years machine learning technology and biometric software have given rise to the capacity to analyze vast numbers of facial images and apply uniform metrics to image classification (Lewis et al., 2012; Re and Perrett, 2014). This application of machine learning technology remains in its relative infancy, but monumental strides are being made by data mining researchers whose tools will, in due course, be made available to researchers dealing with the enormous wealth of video image data being collected, transmitted via social media, and archived in contemporary society (Hai-Jew, 2017; Gates, 2011).
This paper represents a RESEARCH NOTE regarding ongoing studies at Washington State University, the University of Utah, National Taiwan University and the Taiwan Central Police University in Taipei regarding the role of facial dominance in leadership selection – in both electoral contexts and in meritocratic administrative leadership selection settings alike. This paper sets out the broader area of interest in brief detail, and moves on to present preliminary findings from the application of facial image analysis to the specific area of contested nonpartisan judicial elections in the State of Washington. Using candidate images extracted from archival voters’ pamphlets (1968-2015) provided by the Office of the Secretary of State of Washington, we used undergraduate student coders to assess judicial candidate image traits of attractiveness, trustworthiness, typicality/uniqueness, subservience/dominance, and masculinity/femininity. We likewise inquired about the likelihood of voting for the candidates portrayed, and in actual candidate pairings we asked the question: “For which candidate would you vote?” Reported here are preliminary findings for 114 students working with facial images of 22 contested nonpartisan judicial elections (44 facial images) on: 1) mean scores on facial dominance for winners and losers in these contested elections; and, 2) the degree of correspondence between the facial image assessment-based voting preferences and the actual outcome of the elections in which the candidates in the paired images contested. We are also working on the application of machine learning and biometric analysis to the images collected for judicial candidates, and the results of that work will be reported in future papers. The appendices to this paper indicate the core image scoring elements of the machine learning and biometric work currently under way.
The Antonakis and Dalgas Article in SCIENCE: The Power of Appearances
John Antonakis and Olaf Dalgas, Professors of Organizational Behavior and Economics, respectively, on the Faculty of Business and Economics at the University of Lausanne, reported on the results of two related experiments involving the use of pairs of competing candidates in the 2002 French parliamentary elections. In the first experiment the researchers presented facial images of the two finalists in these French elections to Swiss undergraduate students who had virtually no information on candidate political party, preferred public policy positions, prior record of voting or political advocacy, or personal character. The Swiss students were asked: “Which of the two candidates appears the most competent to you?” A total of 684 students provided responses on 57 candidate pairings. In 72% of the cases the image judged “most competent in appearance” was that of the actual winner in the election. Clearly, the facial cues present in the candidate images have substantial meaning for BOTH French voters and the Swiss students who lacked any substantive knowledge of the candidates in question.
In the second related experiment the same set of 57 candidate pairings was presented to a large group of elementary and secondary school students ranging in age from 5 to 13. A total of 681 students participated in the experiment, which entailed a computer-mounted “game” in which the students were asked “Which of these two persons would be the better captain for a voyage from Troy to Ithaca?” In the case of the Swiss school children, their preferences coincided with those of French voters 71% of the time! The pattern of preferences in the candidate pairings for BOTH university students and grade school children was virtually identical, suggesting that the traits being read into the facial images in question are ones acquired in early youth and retained over the life course.
The Mazur & Mueller 1984 and 1996 Articles in ASR, PAR, and SOCIAL FORCES
Decades prior to the publication of the “Child’s Play” experiments reported upon by Antonakis and Dalgas, another highly commented upon study was published by Allan Mazur (Syracuse University) and Ulrich Mueller (University of Marburg, Germany) involving administrative leadership selection. Mazur and Mueller published a series of three articles in the American Sociological Review, Public Administration Review, and Social Forces (Mazur et al., 1984; Mazur and Mueller, 1996a; 1996b). In the widely cited Social Forces publication (225 citations) Mazur and Mueller reported the results of their study of “who makes General” among the cadets at West Point (the U.S. Military Academy).
Mazur and Mueller collected data for the West Point class of 1950 (n=416). In 1989 they administered a mail survey from The Maxwell Graduate School of Citizenship and Public Affairs at Syracuse University and collected data from the 1950 class. Their questionnaire featured information on parental background, family tradition of military service, pre-academy life course experiences growing up, academy-period performance, post-academy service postings and advanced educational opportunities, career placements, and record of promotion up the general officer ranks. Upon their analysis of all the possible factors that might explain making it to the top level of ranks they noted: “…to our surprise, facial dominance – measured from cadet portraits taken 20+ years earlier – significantly predicted promotion to the highest rank – to various levels of general officer. Having a dominant face was an advantage in reaching the top…” (Mazur and Mueller, 1996c: 101). No other factor rivaled that of facial dominance in the data collected from the 1950 West Point cadets.
The collection of data on facial dominance by Mazur and Mueller entailed the use of the official cadet graduation portraits, with copies of the archival photographs being loaded on to slides. All slides were made uniform, in vertical format, with the cadet’s head nearly filling the slide frame. These slides were then projected in front of 20-40 “judges” (principally undergraduate students at Syracuse University) who viewed each slide for 10 seconds and independently rated faces on a seven-point scale of dominance-submissiveness (1=very submissive; 4=neutral; 7=very dominant). The student judges were instructed that dominant persons tend to tell others what to do, are typically respected by others, are viewed as influential, and often are situated in formal leadership positions. As for submissive persons, the student judges were instructed that submissive persons are neither assertive nor viewed as influential, do not enjoy much respect from others, and are usually willing to defer to others in group decision-making settings.
The student judges were first shown several images varying widely in traits to accustom them to the range of images to be anticipated, and to avoid undue fatigue they were shown a set of 24 images for coding. Mazur and Mueller report that ratings were not affected by the order of image presentation or the right or left orientation of the image, or by the sex of the judge. The median score for each image was taken as the value of facial dominance. For 85% of the images used in the study at least 50% of the scale scores clustered between two adjacent scale points, indicating the presence of a high level of agreement among coders. In follow-up studies Mazur and Mueller report that facial dominance appears to be a rather durable trait; in the comparison of 30 original photo portraits and 30 portraits taken 20 years later (then middle-aged), for West Point cadets virtually the same rank ordering of the 30 cases on facial dominance is reported (though uniformly a bit lower in score for the later portraits).
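The scoring procedure described above can be illustrated with a brief sketch. It assumes ratings on the 1-7 scale and reads the agreement criterion as the largest share of a portrait's ratings that fall on any two adjacent scale points. The function names and the ratings are hypothetical illustrations; they are not data or code from the Mazur and Mueller study.

```python
from collections import Counter
from statistics import median


def dominance_score(ratings):
    """Median of the judges' 1-7 ratings for a single portrait."""
    return median(ratings)


def adjacent_agreement(ratings, scale_min=1, scale_max=7):
    """Largest share of ratings that fall on any two adjacent scale points."""
    counts = Counter(ratings)
    best = max(counts[p] + counts[p + 1] for p in range(scale_min, scale_max))
    return best / len(ratings)


# Hypothetical ratings from ten judges for two portraits (illustration only).
ratings_by_image = {
    "portrait_A": [5, 5, 6, 6, 6, 4, 7, 5, 6, 5],
    "portrait_B": [1, 3, 5, 7, 2, 6, 4, 1, 7, 3],
}

for image_id, ratings in ratings_by_image.items():
    share = adjacent_agreement(ratings)
    # A portrait meets the agreement criterion when at least half of its
    # ratings cluster on two adjacent scale points.
    print(image_id, dominance_score(ratings), share, share >= 0.5)
```

In this illustration portrait_A would meet the 50% criterion and portrait_B would not; the 85% figure in the text would then be the share of all portraits that meet it.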
Mazur and Mueller conclude that “…facial dominance is a robust variable, which can be measured from a variety of photographs, even when the portraiture is not uniform in style or dating.” They go on to suggest that “Interviewers might easily be trained to take portrait photos during standard face-to-face interviews, thus incorporating facial dominance as a variable in survey data sets. Portraits in historical archives, taken from diverse print and electronic media, should be adequate for facial ratings…” (1996c: 111).
Gladwell, Blink (2007), Thinking, Fast and Slow, and Ubiquitous Instinctual Thin-Slicing
In a very widely read book on the functionality of split-second, instinctual decision-making published in 2007 [Blink: The Power of Thinking without Thinking], Malcolm Gladwell sets forth many examples where our proclivity to rely upon intuition and to make split-second decisions on many aspects of our lives is highly functional. As a whole, the book argues for a heightened appreciation of judgments based on less information rather than more – on expert intuition or layman instinct rather than assiduous fact seeking. In addressing the many situations wherein people commonly rely upon their “gut reactions” and “intuitions” to their benefit he elaborates the concept of “thin-slicing” whereby we can place reliance upon the unconscious mind’s ability to identify patterns and deeper meanings in minute “slices” of experience and impressions. Gladwell notes, as an example, how one prominent psychologist who maintains a profitable marriage counseling practice has the ability to predict, with 95% accuracy, whether a couple will remain together over the course of the next 15 years after a single 60-minute session. These impressions are clearly a good guide to action and must be based on the productive type of thin-slicing Gladwell describes.
Nobel laureate economist Daniel Kahneman importantly expanded our understanding of the thin-slicing phenomenon in Thinking, Fast and Slow published to wide acclaim in 2011. Kahneman relied on years of research with longtime colleague Amos Tversky to specify “system 1 (fast) thinking” and “system 2 (slow) thinking” as the two major ways in which the human brain forms thoughts underlying social and private behavior alike. BEYOND THAT, he goes on to detail the primary heuristics and biases that are associated with both types of thinking. He notes quite prominently that trouble is to be expected when situations requiring slow thinking and information seeking are approached with fast thinking shortcuts. In this regard, both Daniel Kahneman and Malcolm Gladwell agree that VOTING and PERSONNEL PROMOTION DECISIONS require going beyond thin-slicing and most properly require sustained information gathering and careful balancing of positive and negative aspects of choice. Gladwell uses Chapter 3 of Blink, entitled “The Warren Harding Error,” to illustrate how thin-slicing is to be considered inappropriate to voting. He argues that his research of historical documents from the 1920 presidential election suggests that Americans overwhelmingly voted for the Republican candidate Warren G. Harding (404 vs. 127 electoral college votes) primarily because he was more “presidential in appearance” than either the Democrat James M. Cox or the Socialist Eugene Debs. President Harding proved to be arguably among the very worst U.S. Presidents ever elected to that noble and demanding office; in the judgment of most scholars of the U.S. presidency, Harding is generally classified among the least capable holders of that office (Murray and Blessing, 1993; Faber and Faber, 2013).
What can be said in this regard of the “thin-slicing” outcomes evident in the Antonakis and Dalgas experiments? Those two experiments documented the similarity of electoral outcomes for “voters” and for politically naïve students – college students and grade school students alike. Was it perhaps the case that both many French voters and the Swiss students were using thin slicing in reading into the facial images of the candidates attributes of competence they may or may not possess? Likewise, what can be said of the remarkable evidence of the advantage facial dominance posed for the 1950 U.S. Military Academy graduates? Are the student coders of dominance in facial images reading into those images the same “leadership” traits as did successive U.S. Army promotion boards?
Deep Psychological Origins of Thin-Slicing for Facial Image Interpretation
In his book Biosociology of Dominance and Deference (Rowman & Littlefield, 2005) Allan Mazur lays out a line of argument as to WHY thin slicing behavior in the practice of reading presumed traits into facial images is so common among people. According to Mazur virtually all known social species engage in leadership designation – be that birds in migratory flight, wolves living in packs, elk roaming the plains and foothills in herds, or primate groups living in the wild. In each case a single leader is typically “in charge” to whom deference is accorded in decisions affecting the clan, herd, or group. Those showing deference to the leaders in question have a sense, BASED ON APPEARANCES, that the leader is better suited to lead than are they, and that the actions decided by the leader are likely to be better for them than if they relied upon their own judgment (Mazur, 1985). Those persons to whom others defer, in turn, must come to enjoy the status of being deferred to and strive to make good decisions and “lead” when group-related decisions must be made (Jacobsen and Anderson, 2015).
Mazur argues powerfully that such a leadership selection dynamic involving dominance and deference is at play in human groups from our earliest experiences on the kindergarten playground. The older we get the more accustomed we become to a world in which a few are “leaders” and the majority are “followers” in the journey of life. Some get to serve as captains of sport teams, as class presidents, as school principals, as police chiefs, as judges, as elected officials – and most others only get to make choices about who among them possesses “leadership qualities” based upon thin slicing heuristics. It is clear that the student coders involved in the Mazur and Mueller study and the college and grade school students involved in the Antonakis and Dalgas experiment had little trouble in reading into the facial images they were coding the traits of good character and competence that would justify deference.
In the area of evolutionary psychology a number of scholars have sought to delve deeply into these dominance/deference group dynamics in a variety of settings to understand why such selection processes are so ubiquitous (Lefevre et al., 2013; Carre et al., 2009; Carre and Olmstead, 2015; Weston et al. 2007; Geniole and McCormick, 2015). It is NOT the role of this research note to provide a thorough review and critique of this literature, but rather to indicate that there is good reason to suspect that the thin slicing, system 1 fast thinking modality prevails in much human leadership selection activity.
The Simon, Benjamin & Lovrich Research Agenda: Exploring Facial Dominance in Electoral & Leadership Selection Processes – Phase I, Phase II, and Phase III
The authors of this paper are engaged in a long-term, multi-university study to explore the potential for insight that might come from testing the limits of facial dominance in both electoral and administrative leadership selection processes. We conceptualize the long term study in terms of three distinct phases.
Phase I: Student Coders Data Source, Contested Judicial Elections
GOAL: Test the utility of facial dominance scores in contested nonpartisan judicial elections on the model of the Antonakis and Dalgas experiments. This research note reports the findings from this initial Phase of our work.
Phase II: Student Coders Data Source, Law Enforcement Academy Outcomes
GOAL: Test the utility of facial dominance scores in predicting outcomes for law enforcement academy graduates on the model of the Mazur and Mueller study. Preliminary work in this area is going on in two settings – in the U.S. context in Houston TX (featuring the Houston Police Department Academy and Sam Houston State University) and in the Taiwan context in Taipei (Taiwan Central Police University and National Taiwan University). The next paper will feature analysis of data collected in Taipei by Yu-Sheng (Linus) Lin and Edward Y. Lai; Professors Lin and Lai will be co-authors on that paper. A second paper on law enforcement selection outcomes will involve Prof. Solomon Zhao and Dr. Hector Garcia making use of data from graduates of the Houston PD Academy 20 years out from their commissioning.
Phase III: Machine Learning and Biometrics Data Source, Reanalysis of Electoral Outcomes
The final phase of the study will entail replicating the analyses of the electoral outcome prediction study using machine learning and biometric scoring in place of student coders. The first steps along this path are being developed in Francis Benjamin’s Political Interaction Lab, Department of Psychology at Washington State University. This work is taking place in collaboration with biometric software development work done at the University of Utah.
Some of the critical pieces of the process of moving toward digitization of the image coding process are attached to this paper as appendices. Appendix 1, labeled “Facial Coding,” provides a listing of elements and preliminary formulas for image scoring. Appendix 2 is entitled “For Each Candidate Code the Following Information” and it provides a listing of ‘controls’ to be used in the analysis of judicial candidate image scoring in subsequent studies. And finally, Appendix 3 constitutes the Coder Survey Instrument (Qualtrics format) used to collect the data for this paper.
Findings on Contested Nonpartisan Judicial Elections in Washington
Over the course of two decades Charles Sheldon and Nicholas Lovrich and several of their doctoral students studied the judicial selection process in Washington and Oregon to contribute to the debate on the advisability of electing trial court and appellate court judges by a vote of the people (Sheldon and Lovrich, 1991). The values of accountability and judicial independence are clearly in conflict in this matter, and persons of good character often disagree on which cherished value is to be accorded primacy (Sheldon & Lovrich, 1982; 1983; Lovrich & Sheldon, 1983; 1984).
Critics of elective judiciaries can easily point to “bad choices” made – i.e., the election of judges who turn out to be an embarrassment to the bench either due to incompetence or malfeasance, or occasionally BOTH. They can likewise point to the rapidly rising costs of judicial elections and the danger of corruption attendant to judges being beholden to private interests of one type or another (Goldberg, 2008). Defenders of elective judiciaries of the nonpartisan sort, including Sheldon and Lovrich, argue that “bad judges” are usually challenged successfully in elections and are removed from the bench. They argue further that electoral accountability FORCES judicial candidates to campaign among the public, to visit with editorial boards and groups which endorse judicial candidates, to publish timely information on their campaign expenditures and contributions and contributors, and to articulate the reasons why JUDICIAL INDEPENDENCE is a proper value and how they will work hard to balance accountability to the people while maintaining their independence and objectivity in the hearing and disposing of cases coming before them (Bonneau and Hall, 2009). Such nonpartisan civic engagement activities on the part of judges serving on the bench in Washington and Oregon cause them to be connected with rather than separated from the people whose courts they administer and for whom they work (Sheldon and Maule, 1997).
WHAT IF, however, it is the case that only relatively few voters casting ballots in judicial elections have actually considered any of the slow thinking subjects of thought such as prior experience and the endorsements of editorial boards and professional associations? What if, instead, most voters have engaged in thin slicing thinking? What if they have decided to rely upon their intuition by reading the trait of facial dominance into the facial images of candidates set forth in Washington State’s Official Voter Pamphlets distributed to every registered voter prior to general elections? Is it possible, ON THE OTHER HAND, that the large roll off that occurs on down-ballot elections – including judicial races – serves to concentrate the attentive electorate in the active voter population? Is it possible that slow thinking voters might out-number fast thinking voters and that facial dominance perceptions are less important than judicial qualifications and prior experience as featured on the candidate statements in the Voters’ Pamphlet for the judicial electorate? WHICH IS IT?
If it is the case that thin slicing prevails, the judgments of the student coders should prove predictive of outcomes in the same way that Swiss students could predict outcomes of the 2002 French parliamentary elections. The winners of these elections should score higher on mean facial dominance, and the students should be able to “pick winners” at a high rate when asked to indicate for whom they would vote. In 22 contested nonpartisan judicial elections facial images of sufficient quality are available to collect coding from 114 student judges providing ratings of facial dominance and their own choice between the two candidates. In addition, we know which candidate was the victor in balloting.
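The two quantities at issue here can be computed along the following lines. This is a minimal sketch under stated assumptions: the record layout, the field names, and the numbers are hypothetical, and counting a race as a "match" when a strict majority of coders chose the actual winner is one plausible reading of the measure rather than a documented procedure of this study.

```python
from statistics import mean

# Hypothetical records, one per contested race (illustration only).
# Each holds the coders' 1-7 dominance ratings for the winner and the loser,
# plus how many coders said they would vote for the actual winner.
races = [
    {"winner_dominance": [5, 6, 4], "loser_dominance": [3, 4, 5],
     "votes_for_winner": 2, "votes_total": 3},
    {"winner_dominance": [4, 4, 4], "loser_dominance": [5, 5, 5],
     "votes_for_winner": 1, "votes_total": 3},
    {"winner_dominance": [6, 6, 6], "loser_dominance": [3, 3, 3],
     "votes_for_winner": 3, "votes_total": 3},
]


def summarize(races):
    """Return (mean winner dominance, mean loser dominance, match rate)."""
    winner_mean = mean(mean(r["winner_dominance"]) for r in races)
    loser_mean = mean(mean(r["loser_dominance"]) for r in races)
    # Assumption: a race is a match when a strict majority of coders chose
    # the actual winner; an exact tie is counted as a non-match.
    matched = sum(1 for r in races
                  if 2 * r["votes_for_winner"] > r["votes_total"])
    return winner_mean, loser_mean, matched / len(races)


winner_mean, loser_mean, match_rate = summarize(races)
print(winner_mean, loser_mean, round(match_rate, 2))
```

If thin slicing prevails, the first value should exceed the second and the match rate should sit well above one half.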
The results for the contested nonpartisan judicial elections are in line with those reported in the Antonakis and Dalgas study. Namely,
Mean Score on 7-point facial dominance scale
% of Races in which the Winner was selected as preferred choice = 63%
It seems clear that the effects of facial dominance are once more to be seen in the nonpartisan judicial election context. The student judges were quite able to ascribe to candidates, based solely on a facial image, important attributes of leadership. Likewise, the student judges were inclined to select the same “winners” as the voters had done when presented with the same choice -- and much more relevant information. It seems that many of the voters likely engaged in the same thin slicing, fast thinking behavior as the student facial image coders.
This Phase I analysis of the role of facial dominance in electoral and administrative leadership selection processes is but a first step toward a comprehensive assessment and replication of the prior Mazur and Mueller study. In the coming months we hope to analyze the data being gathered in Taipei on graduates of the Taiwan Central Police University now 20 years beyond graduation. Likewise, we hope to gather comparable data for the Houston Police Department Academy for Phase II of our work. Finally, we anticipate continued progress in Phase III in applying machine learning technology and biometric software to the coding of images from our current sources, and then adding additional images from additional professions (e.g., school principals and superintendents, law school deans) and additional national settings.
The evidence of our strong inclination to read into the faces of others particular traits – both favorable and unfavorable – with no regard for actual qualities of character and mind raises the prospect of much unfairness and social harm. Just as implicit bias bedevils our efforts to bridge the gap between Black Lives Matter folks and Blue Lives Matter folks (Banks et al., 2006), so does the tendency to use fast thinking in place of slow thinking in coming to our choices of leaders -- in our polity, in our professional associations, in our workplaces, and in our social groupings -- portend much harm in society.
Antonakis, J. & Dalgas, O. (2009). Predicting Elections: Child’s Play! Science 323: 1183.
Banks, R.R., Eberhardt, J.L. & Ross, L. (2006). Discrimination and Implicit Bias in a Racially Unequal Society. California Law Review. 94(July): 1169-1190.
Bull, R., Jenkins, M. & Stevens, J. (1983). Evaluations of Politicians’ Faces. Political Psychology 4(4): 713-716.
Carre, J.M. & Olmstead, N.A. (2015). Social Neuroendocrinology of Human Aggression: Examining the Role of Competition-Induced Testosterone Dynamics. Journal of Neuroscience 11: 29.
Chiao, J.Y., Bowman, N.E. and Gill, H. (2008). The Political Gender Gap: Gender Bias in Facial Inferences that Predict Voting Behavior. PLoS ONE 3(10): e3666.
Faber, C.F. & Faber, R.B. (2013). The American Presidents Ranked by Performance, 1789-2012, 2nd Ed. McFarland.
Gates, K.A. (2011). Our Biometric Future: Facial Recognition Technology and the Culture of Surveillance. NYU Press.
Geniole, S.N. & McCormick, C.M. (2015). Facing our Ancestors: Judgments of Aggression are Consistent and Related to the Facial Width-to-Height Ratio in Men Irrespective of Beards. Evolution and Human Behavior. 36(4): 279-285.
Gladwell, M. (2007). Blink: The Power of Thinking Without Thinking, Little Brown.
Goldberg, D. (2008). New Politics of Judicial Elections: How 2000 was a Watershed Year for Big Money, Special Interest Pressure, and TV Advertising in State Supreme Court Campaigns. Diane Publishing.
Hai-Jew, S. (2017). Data Analytics in Digital Humanities. Springer.
Huddy, L. & Terkildsen, N. (1993). Gender Stereotypes and the Perception of Male and Female Candidates. American Journal of Political Science. 37: 119-147.
Jacobsen, C.B. & Anderson, L.B. (2015). Is Leadership in the Eye of the Beholder? A Study of Intended and Perceived Leadership Practices and Organizational Performance. Public Administration Review. 75(6): 829-841.
Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus & Giroux.
Lefevre, C.E., Lewis, G.J., Perrett, D.I. & Penke, L. (2013). Telling Facial Metrics: Facial Width is Associated with Testosterone Levels in Men. Evolution and Human Behavior. 34(4): 273-279.
Lewis, G.J., Lefevre, C.E. & Bates, T.C. (2012). Facial Width-to-Height Ratio Predicts Achievement Drive in U.S. Presidents. Personality and Individual Differences. 52(7): 855-857.
Lovrich, N.P. and Sheldon, C.H. (1983). Voters in Contested, Non-Partisan Elections: A Responsible Electorate or a Problematic Public? Western Political Quarterly 36(2): 241-256.
Lovrich, N.P. and Sheldon, C.H. (1984). Voters in Judicial Elections: An Attentive Public or an Uninformed Electorate? The Justice System Journal. 9(1): 23-39.
Mazur, A. (1985). A Biosocial Model of Status in Face-to-Face Primate Groups. Social Forces. 64: 377-402.
Mazur, A. (2005). Biosociology of Dominance and Deference. Rowman and Littlefield.
Mazur, A., Mazur J. & Keating, C. (1984). Military Rank Attainment of a West Point Cadet: Physical Features. American Sociological Review. 90: 125-150.
Mazur, A. & Mueller, U. (1996c). Facial Dominance. Somit, A. & Peterson, S. (eds.), Research in Biopolitics, 4: 99-111.
Mazur, A. & Mueller, U. (1996a). Channel Modeling: From West Point to General. Public Administration Review. 56(2): 191-198.
Mazur, A. & Mueller, U. (1996b). Facial Dominance of West Point Cadets as a Predictor of Later Military Rank. Social Forces. 74(3): 823-850.
Murray, K. & Blessing, T.H. (1993). Greatness in the White House: Rating the Presidents, from Washington Through Ronald Reagan. Pennsylvania State University Press.
Re, D.E. & Perrett, D.I. (2014). The Effects of Facial Adiposity on Attractiveness and Perceived Leadership Ability. Quarterly Journal of Experimental Psychology. 67(4): 676-686.
Sheldon, C.H. & Lovrich, N.P. (1982). Judicial Accountability vs. Responsibility: Balancing the Views of Voters and Judges. Judicature. 64: 470-479.
Sheldon, C.H. & Lovrich, N.P. (1983). Knowledge and Judicial Voting: The Oregon and Washington Experience. Judicature 65: 235-244.
Sheldon, C.H. & Lovrich, N.P. (1991). A Model for the Study of State Judicial Recruitment. In Gates, J. and Johnson, C. (eds.), American Courts: A Critical Assessment. CQ Press.
Sheldon, C.H. & Maule, L.S. (1997). Choosing Justice: The Recruitment of State and Federal Judges. Washington State University Press.
Weston, E.M., Friday, A.E. & Lio, P. (2007). Biometric Evidence that Sexual Selection has Shaped the Hominin Face. PLoS ONE 2(8): e710. http://dx.doi.org/10.1371/journal.pone.0000710.