This literature survey gives a theoretical overview of Bayesian networks (BN) and discusses their application to the problem of facial expression recognition (FER). A Bayesian network is a graphical representation of a probabilistic system that models the full joint probability distribution over a set of variables as a product of conditional distributions. Facial expression recognition aims to determine the emotional state of a subject by analyzing facial features. Different Bayesian network structures are reviewed, such as the Naïve Bayesian network (NB), the Tree-Augmented Naïve Bayesian network (TAN) and hybrid Bayesian networks. The Bayesian approach is compared to other methods such as Support Vector Machines (SVM), Relevance Vector Machines (RVM) and AdaBoost. Furthermore, the small sample case is discussed with regard to the Bayesian approach, which is particularly vulnerable to a lack of sufficient training data. We discuss several solutions for the small sample case, and some final recommendations are made for future research.
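The factorization mentioned above can be made concrete with the simplest structure the survey reviews, the Naïve Bayesian network, in which the class variable (here, the emotion) is the sole parent of every feature, so that P(C, f1, …, fn) = P(C) · ∏ P(fi | C). The following sketch is purely illustrative: the feature names and all probability values are made up for the example and do not come from the survey.

```python
import math

# Illustrative Naive Bayes sketch for emotion classification.
# The class variable "emotion" is the parent of each binary facial
# feature, assumed conditionally independent given the class, so the
# joint factorizes as P(C, f1..fn) = P(C) * prod_i P(fi | C).
# All numbers below are invented for illustration only.

priors = {"happy": 0.5, "sad": 0.5}

# P(feature is present | emotion) -- hypothetical values
likelihoods = {
    "happy": {"mouth_corners_up": 0.9, "brows_lowered": 0.1},
    "sad":   {"mouth_corners_up": 0.1, "brows_lowered": 0.7},
}

def posterior(observed):
    """Return P(emotion | observed features) via Bayes' rule."""
    log_scores = {}
    for emotion, prior in priors.items():
        log_p = math.log(prior)
        for feature, present in observed.items():
            p = likelihoods[emotion][feature]
            log_p += math.log(p if present else 1.0 - p)
        log_scores[emotion] = log_p
    # Normalize in probability space to obtain the posterior.
    total = sum(math.exp(s) for s in log_scores.values())
    return {e: math.exp(s) / total for e, s in log_scores.items()}

print(posterior({"mouth_corners_up": True, "brows_lowered": False}))
```

With these invented numbers, observing raised mouth corners and no lowered brows yields a posterior strongly in favour of "happy"; the structures discussed later (TAN, hybrid networks) relax the conditional-independence assumption that this sketch relies on.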
Summary of Contents
1 Introduction
2 Bayesian Networks
3 Facial Expression Recognition
4 The Small Sample Case
5 Conclusions and Recommendations
6 References
The smile of the Mona Lisa is perhaps the most illustrious facial expression known to man. Although there are numerous theories as to the cause of the smile, ranging from the ‘highway blues’ theory by Bob Dylan to the alleged self-portrait theory defended by Dr. L. Schwartz, there seems to be no controversy over Mona Lisa’s portrayed facial expression. This is quite interesting: we may safely assume that no one alive today has actually known Mona, that is to say, no one has extensive knowledge of her facial features, yet almost every person recognizes the slight smile upon her face. Apparently, we humans are uniquely qualified to recognize certain facial expressions and to attribute an emotional state to the person portraying the observed expression. Of course, we may argue that pattern recognition technology can be used to determine key points on non-rigid objects such as the human face, that the relative positions of these key points can be used to systematically analyze observed facial expressions, and, as has been argued, that facial expressions can be mapped directly to emotional states. Yet when we want to use modern technology to perform this recognition in an automated way, we encounter a true challenge.
Humans communicate. We do so in many different ways, often using more than one mode of communication at a time. We converse with words, gestures and facial expressions, none of which is insignificant. Although humans are capable of communicating with written words alone, we often need more than an exchange of words to achieve a robust level of communication. Mere words are often context dependent, so we need some indication of their context, meaning we need to assess the current mental state of the person who is trying to communicate with us. One way humans portray their mental state is through facial expressions. When in conversation, we often use our face to clarify our words. For instance, if a person with a happy face told us that he had lost his wallet, we would be inclined to think that this person was telling a joke, whereas if that person had a sad face, we would be inclined to believe his or her statement to be true.
Figure 1.1: Sony’s AIBO
The field of Human-Computer Interaction strives towards a comparable level of communication. A computer that could interact with humans through facial expressions would advance human-computer interfaces towards a standard comparable to human-human interaction. Today, computers can still be seen as ‘emotionally challenged’, as they fail to recognize the emotions of the humans with whom they interact. However, if it were possible in some way for computers to become emotionally sensitive, this would greatly enrich the possible communication between humans and computers. Obviously, non-verbal communication plays a big part in everyday life. Not only the expressions on faces are important; gestures such as the wink of an eye or a stuck-out tongue are also a form of non-verbal communication. Facial expression recognition can be applied in a number of different contexts. It can easily be imagined that an intelligent agent designed to be aware of its environment, for example Sony’s AIBO, would be greatly improved if it could say something about the emotional state of the subject with whom it is interacting. For instance, it might try to cheer up a person if it detects sadness, or increase its awareness if it senses fear. Another area of application might be face recognition to identify certain persons, in which case a reverse mapping can be useful to match a person portraying a particular facial expression in a video stream back to the expression on the queried image of a neutral face. In a nutshell: facial expression recognition provides us with a way to improve the quality of communication between humans and machines.