Animation has traditionally provided a way for filmmakers to express themselves without having to use real people, objects and scenarios. Characters can be any shape, size or entity, and don’t have to move or act according to any specific rules. Very early animation almost never tried to be realistic, but there is a recent trend towards realism in animation, fuelled by the advent of CGI [Computer Generated Imagery] animation. Faster processing speeds and more sophisticated graphics techniques yield smoother, more detailed animation.
Real objects, such as animals and vegetation, do not move in very regular patterns. They behave erratically as a result of internal and external influences. An example of an internal influence would be motion compelled by conscious thought rather than by reflex, and an external influence could be wind. To achieve realism, depictions of living entities must move in the same slightly erratic way as the entities on which they are modelled.
Humans are especially difficult to animate realistically. Even when we are sitting still, we are seldom totally stationary.
Lip synchronisation and facial animation are two of the most vital elements in shots focused on the upper body. Facial expression plays a large part in human communication, and within the context of animation so does lip synchronisation. Facial expressions are fairly difficult to recreate perfectly, and lip synchronisation has always been both difficult to emulate accurately and time-consuming when accuracy is the goal. This project attempts to create a system that will make accurate lip synchronisation automatic and easy.
The first known animated film, “Un bon bock”, was produced by Emile Reynaud in 1888. It consisted of a series of drawings made on cellulose that were projected by a device of his own invention. Emile Cohl (1857-1938) made the first animated film produced by photographing drawings on paper, called “Fantasmagorie”, in 1908 [Crandol, 1999].
The first true pioneer of character animation was Winsor McCay (1867-1934). From 1911 to 1921, he worked to elevate the art of animation from a simple camera trick to full-blown character animation. He conceived and produced all his animations by himself, often taking more than a year to complete a five-minute sequence. His film “Gertie the Dinosaur” is considered the first landmark animated film.
Drawings were originally done on paper, but a John Bray Studio employee called Earl Hurd developed a process in 1914 whereby the drawings were done on colourless cellulose [the origin of “cel”] and then photographed to produce the final image.
Max Fleischer patented the Rotoscope in 1916. This allowed animators to trace over live action footage, which gave the characters realistic movement.
Animation studios soon sprang up, with teams working to produce animations. In the studio process, a senior artist draws keyframes, which are cels depicting a notable event, such as the beginning or ending of a certain motion. The keyframe cels are then passed down to junior artists, who fill in the intermediate cels that produce the actual movement. After the line drawings are completed, the cel sequences are handed over to a colourist to be filled in with colour. This process has remained unchanged for decades. Cel animation is still in use by many animation houses, but is occasionally mixed with or replaced by CGI animation.
The Walt Disney Studio was founded in 1923, and has dominated animation ever since. Walt Disney himself, unlike many other heads of animation houses, was involved in the production of his films. He steered his artists away from the “rubber hose” style of animation, so called because of the boneless motion given to animated characters until then, and toward more realistic and natural movement. Disney Studios produced the first full-length animated movie, “Snow White and the Seven Dwarfs”, in 1937.
CGI animation is a relatively new animation method. Although CGI has been in development since the 1950s, using it for animation has only truly taken off in the last fifteen years.
CGI animation can be used in either a 2D or a 3D environment. 2D CGI animation is very similar to traditional cel animation, but for the notable difference of medium. Instead of being drawn on paper, the cels are produced using graphics programs on a computer. These programs can handle keyframes and tend to animate the images by morphing between them.
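The idea of morphing between keyframes can be illustrated with a minimal sketch. It assumes, purely for illustration, that each keyframe stores a shape as a list of 2D points and that in-between frames are produced by simple linear interpolation; the function name `interpolate_keyframes` and the data layout are hypothetical, and real graphics programs typically use more sophisticated curves and easing.

```python
from bisect import bisect_right


def interpolate_keyframes(keyframes, frame):
    """Return the shape at `frame` by linear interpolation between keyframes.

    `keyframes` is a list of (frame_number, points) pairs sorted by frame
    number, where `points` is a list of (x, y) tuples. Every keyframe is
    assumed to contain the same number of points (an illustrative assumption).
    """
    frame_numbers = [number for number, _ in keyframes]

    # Before the first or after the last keyframe, hold the nearest shape.
    if frame <= frame_numbers[0]:
        return list(keyframes[0][1])
    if frame >= frame_numbers[-1]:
        return list(keyframes[-1][1])

    # Find the pair of keyframes that surround the requested frame.
    i = bisect_right(frame_numbers, frame)
    start_frame, start_shape = keyframes[i - 1]
    end_frame, end_shape = keyframes[i]

    # t runs from 0.0 at the earlier keyframe to 1.0 at the later one.
    t = (frame - start_frame) / (end_frame - start_frame)
    return [
        (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        for (x0, y0), (x1, y1) in zip(start_shape, end_shape)
    ]


# Two keyframes ten frames apart: a line segment that rises and tilts.
keys = [
    (0, [(0.0, 0.0), (2.0, 0.0)]),
    (10, [(0.0, 4.0), (2.0, 8.0)]),
]
halfway = interpolate_keyframes(keys, 5)
```

In this sketch the computer plays the role of the junior artists in cel animation: the animator supplies only the keyframes, and every in-between frame is computed from them.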
The first CGI to be used on film was created in 1958, when John Whitney Snr. devised the introduction to Alfred Hitchcock’s “Vertigo” on an analogue computer. The first digital human was created by an employee of Boeing to be used in cockpit studies in 1963 [Carlson, 2001]. “Westworld”, released in 1973, featured a view through the eyes of a robot generated by 2D CGI, the first to be used in-film and not just to create the opening logo. Its sequel, “Futureworld”, displayed the first use of 3D CGI in a motion picture [Morie, 1998]. 1982 brought “Tron”, which made extensive use of 3D CGI and required four major animation companies to fulfil the vision of a world inside a computer. The first commercially-released CGI animation appeared in 1992, titled “Gas Planet”1. The first full-length CGI motion picture was “Toy Story”, released by Pixar2 in 1995.
Lip synchronisation was a skill that took years for an animator to perfect. The animator would have to work frame by frame to produce convincing lip motion. A voice actor would then have to fit a vocal performance to the animation. One mistake in timing could ruin the entire sequence.
Apart from needing a very skilled animator, the big drawback to good lip synch was the amount of time it took. Creating the preliminary animations was an arduous task, and an inept voice actor could add days to a schedule by making mistakes.
With the advent of CGI, an obvious solution to the problem of accurate lip synch is to record lip motion during the actor’s performance, and then generate the appropriate lip motions using the data. This makes lip synch almost instantaneous, and frees animators up to perform other tasks.
This approach was partly tried by animators at Industrial Light and Magic during work on “Dragonheart” in 1996. The film required a dragon that could talk, with Sean Connery supplying the voice. Lip synchronisation was achieved by making 2D drawings based on Sean Connery’s lip movements while recording his part, and then using those drawings to animate the dragon’s speech [Cotta Vaz, 1996]. The problem with this approach is that it still requires an animator to draw the input pictures and then animate the model. This makes it a more realistic solution than previous attempts at lip synch, but still time-consuming.
Inaccurate lip synchronisation can effectively ruin an animation. To enjoy a movie, the viewers must get involved. To do this, they must identify and empathise with the characters. If they perceive that the characters are acting in a non-human way, they find it extremely difficult to feel any empathy with them and to become truly involved in the movie. Serious scenes may become humorous simply because the speech is slightly out of synch with the characters’ lip movements. Years of badly-dubbed Japanese films have created a running joke among the movie-going community.