On the naturalness of touchless: putting the “interaction” back into NUI

Kenton O’Hara, Richard Harper, Helena Mentis, Abigail Sellen, Alex Taylor

Microsoft Research, Cambridge, UK


Categories and Subject Descriptors: H 5.3 [Information Interfaces and Presentation] – Group and Organization Interfaces – Computer Supported Cooperative Work.

General Terms: Collaboration, Telepresence, Media Spaces

Additional Key Words and Phrases:

ACM File Format:

O’Hara, K., Harper, R., Mentis, H., Sellen, A. and Taylor, A. 2012. On the Naturalness of Touchless Interaction. ACM Trans. on Computer Human Interaction, x, x, Article x (2009), x pages. DOI =


Authors’ addresses: K. O’Hara, R. Harper, H. Mentis, A. Sellen, A. Taylor, Microsoft Research, Cambridge, UK. E-mail: v-keohar@microsoft.com;

Permission to make digital/hard copy of part of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date of appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Permission may be requested from the Publications Dept., ACM, Inc., 2 Penn Plaza, New York, NY 11201-0701, USA, fax: +1 (212) 869-0481, permission@acm.org

© 2009 ACM

1. Introduction

After many decades of research, the ability to interact with technology through touchless gestures and sensed body movements is becoming an everyday reality. The emergence of Microsoft Kinect, among a host of other related technologies, has had a profound effect on the collective imagination, inspiring and creating new interaction paradigms beyond traditional input mechanisms such as mouse and keyboard. Kinect and other technologies form part of the broader suite of innovations that have come to be characterised as Natural User Interfaces (NUI) (e.g. Wigdor and Wixon, 2011; Norman, 2011). This moniker includes not only the vision techniques that form the basis of Kinect, but also natural language interfaces, pen-based input and multi-touch gestural input, amongst other techniques. The excitement around touchless and body-based interfaces has been accompanied by an increasingly powerful narrative, one that makes the eponymous claim that these new technologies offer an intuitive interface modality, one that does not require users to develop specialist techniques for communicating with computers. What users need to do, instead, is what comes naturally. Consider, for example, the following quote from Saffer (2009):

“The best, most natural designs, then, are those that match the behaviour of the system to the gesture humans might actually do to enable that behaviour” (Saffer, 2009, p29)

The essential argument is that, by drawing on existing gestures in everyday life and identifying the physical movements used to manipulate and understand the world, new interaction paradigms can be developed that will allow people to act and communicate in ways they are naturally predisposed to. They will not have to adapt their actions or communications to the peculiarities and limitations of technology; the interface will no longer be a barrier to users, the interface will be them and their gestures.

Such a narrative, of course, does serve a number of purposes: it’s good for marketing, for example, making a technology appeal in ways that it might not otherwise do. Many people do not like to use a keyboard, as a case in point, and so Kinect might be especially appealing to them. Such a narrative can also help express high-level visions that set out design and engineering challenges: these can inspire research and development communities not just in HCI, but in hardware and software engineering too; NUI can appeal across the board.

However, elements of this narrative are becoming so deeply embedded in how new forms of interaction are thought about and described that important, albeit apparently minor, distinctions are being elided. Indeed, it is not uncommon for papers on touchless gestural and body-based interaction to deploy the term natural (and its cognate intuitive) when characterising these technologies (e.g. Bhuiyan and Picking, 2009; Varona, Jaume-i-Capó, Gonzàlez, and Perales, 2008; Corradini, 2001; Pavlovic, Sharma and Huang, 1997; Baudel and Beaudouin-Lafon, 1993; Stern, Wachs and Edan, 2008; de la Barré, Pastoor, Conomis, Przewozny, Renault, Stachel, Duckstein, and Schenke, 2005; Wexelblat, 1995; Garg, Aggarwal and Sofat, 2009; Wu and Huang, 1999; Cipolle and Pentland, 1998; O’Hagan, Zelinsky and Rougeaux, 2002; Sánchez-Nielsen, Antón-Canalís, and Hernández-Tejera, 2003). Indeed, in a review of 40 years of literature on gesture-based interaction, Karem and Schraefel (2011) cite naturalness as one of the key motivations underlying much of the work in this area. As they say: “much of the research on gesture based interactions claim that gestures can provide a more natural form of interacting with computers.”

There are a number of concerns with this treatment to be highlighted here. First of all, gestural interactions are not a homogeneous entity. As various authors have articulated, gestural interactions may refer to very different kinds of activities (Quek et al, 2002; Karem and Schraefel, 2011). Based on the work of Quek et al (2002), Karem and Schraefel (2011) identify different forms of gestural action. These include deictic gestures for pointing, manipulative gestures that are used to control an object or entity, semaphoric gestures that symbolise an object or action with communicative intent, language gestures (e.g. sign language), and gesticulation or co-verbal gestures that accompany speech. These gestural types of course have different properties, but at times this is glossed over in the literature, either through conceptual homogenisation or where motivations for particular gestural interactions of one type are justified with reference to another type. The notion of natural has also been deployed in a rather loose and unquestioning fashion to mean, variously, intuitive, easy to use or easy to learn - these characteristics arising, it is argued, through either mimicking aspects of the real world or drawing on our existing tendencies in the areas of communicative, gesticulative, deictic and manipulative behaviours and actions (see Wigdor and Wixon 2011 for a commentary). At times, it is unclear which of these characteristics, or whether all of them, are being alluded to in any particular deployment of the word natural, and on what foundations the deployment is made. Aside from this lack of specificity being an important concern in itself, many of these basic claims are also being called into question, a notable example being Norman’s (2010) critique of the naturalness of gestural interfaces in terms of their claimed intuitiveness, usability, learnability and ergonomics.

Norman’s critique is indicative of the issue that while using the word natural might have become natural, it is coming at a cost. In other words, precisely because the notion of naturalness has become so commonplace in the scientific lexicon of HCI, it is becoming increasingly important, it seems to us, that there is a critical examination of the conceptual work being performed when it is used. There is a need, we contend, to understand the key assumptions implicit within it and how these frame approaches to design and engineering in particular ways. In our view, a close examination of these assumptions will show how they can constrain as much as enable; nuance is required when thinking about naturalness, and this can help refine how touchless gesture and movement-based applications are used to innovate. In doing this, we want to adopt a somewhat different tack to Norman’s. So while we would agree with Norman’s counter-arguments to the various claims of intuitiveness, usability and learnability that have been applied to gestural interfaces, there is also a sense that such a critique is still operating on the same playing field (albeit on the opposite side) as the proponents of these naturalness claims. That is, attention remains focused on the interface as the potential source of explanation for naturalness, usability, intuitiveness and learnability (or the lack of them). In taking this focus, though, it is our contention that opportunities for better understanding of what can be done with these technologies are sometimes being lost. Broadly speaking, the argument we want to make here is that by situating the locus of naturalness in the gestural interface alone, naturalness is simply being treated as a representational concern. But in doing this, attention is perhaps less focused on the in situ and embodied aspects of interaction with such technologies.
What we want to argue here is that such interactional concerns need to be a more fundamental feature of our discourse and understanding of naturalness and that, by making them so, we can better understand the opportunities and constraints for the innovation and adoption of these technologies.

2. Representation vs Interaction

The arguments we construct draw from a number of areas. These include the so-called situated interaction literature, going back to the ethnomethodological turn in CSCW (represented in the works of Bannon & Schmidt (1989), for example, as well as in Suchman & Wynn (1984) and many others. For an overview see Schmidt, 2011). This work largely derives from Garfinkel (1967) and the social theoretical implications of the later Wittgenstein (1952) (see Button, 1991). This perspective draws attention to the publicly available, demonstrative and ‘accountable’ features of human action. Our arguments also draw on phenomenological approaches, represented most famously by Flores et al (1988) and subsequently in the so-called Post-Phenomenological work of Ihde (2002) and others. This places an emphasis on the body as the source of experiential awareness and subjectivity, and on how, through action or praxis, engagement with the world comes to be known (Lave & Wenger, 1991; for a commentary see Dourish, 2001). The combination of these views can be contrasted with those that tend to be deployed in Human Factors and Ergonomics research, which treats the functioning of the brain and the body as specifiable, particularly as this functioning intersects with machinery (see, for instance, Moray, 1998). This view is sometimes called a ‘positivistic’ perspective on action. In similar ways to how these two broad camps have been used to discuss different notions of context in ubiquitous computing (Dourish, 2004) and notions of affect in affective computing (Boehner, DePaula, Dourish, and Sengers, 2005), we apply the same contrast to thinking about notions of naturalness in relation to touchless and body-based interaction.

We begin first with a look at the predominant form of NUI narrative which can, in our view, be considered as grounded in the positivist account of action. In this perspective, the aim of natural interfaces is to leverage and “draw strength from” pre-existing actions that are used in everyday life by people to communicate and to manipulate objects in the world (e.g. Jacob, 2008). The defining idea behind these interfaces, within this perspective, is to make computer interactions through them “more like interacting with the real non-digital world” (Jacob, 2008). Similarly, as Abowd (2004) argues, “it is the goal of natural interfaces to support common forms of human expression… Humans speak, gesture and use writing utensils to communicate with humans and alter physical artefacts. These natural actions can and should be used as explicit or implicit input to ubicomp systems” (Abowd, 2004).

This perspective, then, assumes that existing communicative gestures and actions are pointers toward, and sometimes exact incarnations of, common or even universal ‘natural interactions’. These interactions are seen as having an ideal, static and definable state and, though they are not always completely clear or exactly represented in any particular instance, they are something that can be, with sufficient understanding and scientific research, represented and modelled. Such representations and models can, ultimately, form the basis for defining interfaces to the digital world that will, broadly speaking, mimic their “real-world” counterparts. The naturalness of these interactions is something that is taken as purely a problem of representation – ensuring that they are correctly represented in the interaction mechanism itself. In this sense, natural interactions are something detached from the social context in which they might be deployed; they are not constituted by the context, but brought to it.

In characterising this perspective, our intention is not to critique in a dismissive fashion. Indeed, it is important to acknowledge that such an approach has led to some important successes in terms of interface innovation. The suggestion that there are such essential and transituational phenomena has been a cornerstone of much ergonomics, for example, and this manifests itself in the design of all sorts of contemporary technologies, from kettles to large scale organisational systems, from cars to aeroplanes. It also formed the basis of the original HCI work behind the Xerox Star system (Smith et al, 1982). It is also central to much contemporary analytic philosophy, particularly the philosophy of mind, deriving as it does from the causalism avowed by Donald Davidson (1963). This is showing itself in current manifestations of the theory of embodied cognition, represented in books like Clark’s Natural-Born Cyborgs (2003). This is also articulated in HCI, though often without the philosophical auspices being made clear (see Hornecker, 2005; Hornecker and Buur, 2006; Larssen et al, 2007). Rather, our intention is to highlight how such a perspective leads interface design in particular directions, and that this comes at the expense of not taking other directions. Our suggestion is that these other directions (or paths of inquiry) can lead to significant and insightful ways of understanding what human-machine interaction can entail, and that these in turn can drive innovation around touchless gestural and body-based interfaces for computer systems.

As we say, this positivist perspective can be contrasted with the situated and phenomenological approaches. Of significance in this general view is a distinction between the objective body and the lived body (e.g. Merleau-Ponty, 1962, 1968). The objective body can be characterised in terms of how bodily actions might be described from a third person’s point of view – an abstracted description of muscular performance that can be defined and represented. The lived body view, by contrast, concerns the way that people experience and perceive the world through bodily actions. In this perspective, the lived body is in constant rapport with the situated circumstances and it is through actions on the world that those circumstances and the role or function of the embodied actor are made meaningful. The conscious experience of the world and the way it is understood are inseparable from the process of acting in that world. This view emphasizes the subjective construction of meaning through praxis. This subjectivity is, however, publicly available not solely through Husserl’s technique of introspection, but, as Merleau-Ponty wanted to point out, through everyday practices, such as discourse.

A second significant element of this perspective comes from Wittgenstein (1967), and his claim that, through action, people create shared meanings with others, and these shared meanings are the essential common ground that enables individual perception to be cohered into socially organized, understood and coordinated experiences. This draws attention to how actions come to be treated as somehow rational and accountable, as demonstrably about a known-in-common purpose. Garfinkel developed this point and highlighted how talk, situated talk, or as he put it, reflexive talk, is central to how activities come to be understood. Where Merleau-Ponty emphasized the individual subject and their bodily praxis, Wittgenstein (and hence Garfinkel) emphasized the social basis of the individual’s experience, and this pointed towards language and its use in context, to how people act together through talk and other reflexive activities.

Though it would be true to say that there are important distinctions between these two philosophers, as there are indeed in the work within HCI that has derived from them, there is nevertheless a common perspective, particularly when it comes to understanding naturalness and natural gestures or acts. From this view, naturalness is not something to be represented but is rather an ‘occasioned property’ of action, something that is actively produced and managed together by people in particular places - particular occasions, hence the phrase. Of significance here is that these occasioned properties are not just linked to space, to locations of various sorts, but also to the set of persons who occupy those spaces and render them suited for particular actions. Lave & Wenger (1991), along with Brown & Duguid (1991), call these groupings ‘communities of practice’, by which they mean to highlight how communities cultivate and embody particular sets of skills and know-how, much of which is not articulated through verbal or documented forms but is shared through bodily proximity. Communities make space in this sense or rather make space come to represent and enable embodied learning.

In this respect, the naturalness of how a technology might be interacted with lies not in the physical form of that technology, nor in any predefined interface (natural or otherwise) but in how that form and the interface in question melds with the practices of the community that uses it. This is what is constitutive of ‘natural use’. It is not technology itself that is natural, but the ways that people can make the actions they perform with technology ‘apposite’, ‘appropriate’, or ‘fitting’ to the particular social setting and their particular community. It is in this way that it becomes sensible to say that use is natural.

In adopting this perspective, we should be clear that our intention is explicitly not to use it as a means of justifying why certain types of interaction are more natural than others. Indeed, we would argue that such justification has been a somewhat unfortunate consequence of how a certain interpretation of the phrase natural interaction has been mobilised in the literature. Instead of being used to help understand how to create and explore more natural-like interfaces, the emphasis on the embodied aspects of action has led some to justify, as a case in point, why tangible computing and body-based interactions ‘work better’ for people because they are ‘more natural’ for people when compared with other forms of interaction. Or, to put this another way, it is sometimes proposed that the success of these systems is because they make better use of users’ kinaesthetic and proprioceptive awareness of their bodies – the systems are thus more natural (e.g. Jacob et al, 2008).

If we engage with the Wittgensteinian/Merleau-Ponty perspective, we need to accept that all action is embodied, irrespective of any interaction mechanism or artefact that we may come to use; but we also need to understand that it is through praxis that understanding comes. What is important is both the claim about the centrality of the body as the vehicle for understanding and the potential for action (Larssen et al, 2007) that the deployment of the body enables; it is through these actions that the construction of meaning, sense, and so forth is achieved. This, then, is more than simply a question of material, spatial and technological determinism whereby our actions are shaped by the material structure of the physical world. Rather, understanding and meaning of the world are made through the actions being performed.

Articulating these different perspectives is not simply a question of philosophical musing or semantic quibbling. Rather, it serves the very practical purpose of drawing our attention to different ways of understanding touchless and body-based interaction technologies. The positivistic view helps specify what might be designed for – those gestures constitutive of natural behaviour. This view makes investigation of everyday gestures seem like a tractable problem, one that has limits: engineers simply need to build for the vocabulary of known movements. But just as this view makes the engineering seem tractable, so it also tends to close down what might be enabled by natural interaction. It does so because it elides the possibility that what is natural is much more diverse and creatively produced than is suggested by the common use of the phrase natural; different contexts and different communities of practice not only need different forms of NUI, they also sometimes make new forms of ‘the natural’. In other words, the Wittgenstein/Merleau-Ponty view draws attention to the potential for action enabled by various properties of touchless interaction, and the different communities of practice and settings in which actions are given meaning.

Because so much attention has been given to the positivistic approach to the natural, we turn to discuss how to understand the potential for innovation in this area by looking at the problem from the other view. To do this we shall explore, first of all, the kinds of touchless interactions that one might want to appropriate. We will do this by making a contrast with touch-based systems. We then look at how communities develop and cultivate different needs, and thus come to create contexts for the natural. We then explore how communities and the properties of touchless come to manifest themselves in different real world contexts, which we illustrate in a series of fieldwork examples.

Properties of Touchlessness

By starting with properties, it might seem that we are going back on our claim that the naturalness or otherwise of technology is to be understood by reference to a technology’s use, rather than being intrinsic to the technology. The properties we want to start with, however, are rather more prosaic features that can be brought to bear in different contexts; nevertheless, one can characterise them without recourse to context. To help articulate them, we set them out as a series of contrast points with the properties of touch-based interaction (see Table 1). This list is not intended to be exhaustive but rather is indicative of the kind of properties we can attend to (cf. de la Barre et al, 2009). There are undoubtedly numerous others, but what is important is the subsequent ways in which these properties are then considered with respect to the different communities of practice and settings.

Touch-based interaction            Touchless interaction

co-proximate with surface          distant from surface
transfer of matter                 no transfer of matter
pressure on surface                no pressure on surface
momentum of object                 no momentum
attrition and wear of surface      no attrition or wear
movement constrained by surface    freedom of movement
haptic feedback                    no haptic feedback

Table 1. Contrasting characteristics of touch vs. touchless interaction.

Let us consider some of these further. The first point of contrast concerns the proxemic consequences of touch-based versus touchless interactions (cf. O’Hara et al 2011, Mentis et al, 2012). When we interact by touching a system we are required to be co-proximate with the surface we are touching – it has to be accessible and open to touch and it has to be in reach. With touchless interaction, by contrast, we can interact at a range of different proximities from the surface of the system. The exact distance from a surface at which touchless interaction can take place depends on the particular sensing technology in question, ranging from a few centimetres to several metres.

The second property we highlight concerns the transfer of matter. With touch-based interactions, because of the necessity of contact, there is a transfer of matter from the person touching to the device and vice versa from the device to the person touching. Touchless interaction, by contrast, avoids contact and therefore any transfer of matter to or from the system.

Thirdly, in touching something, there is always a certain amount of momentum and pressure applied to the surface being touched. This may cause movement, damage, erosion and attrition. In touchless interaction, by contrast, there is no application of pressure or momentum to the surface in question and therefore no potential for movement, damage or erosion.

The fourth property concerns constraints on movement. With touch-based interactions, movement is bound and constrained by the shape and properties of the surface being touched. With touchless interaction technologies, by contrast, movement is free and unconstrained by the technology’s surfaces.

Finally, we consider the property of haptic feedback. With touch-based interactions, the contact with the surface can provide a rich source of haptic feedback through which manipulations can be finely tuned and refined on a moment-by-moment basis. With touchless interactions, there is an absence of haptic feedback and with that a diminished resource for fine tuning and refining manipulations in the moment.

For the purposes of simplicity in our argument, we have specified these properties at a particularly high level. Each of them can be articulated at much finer levels of granularity (cf. Rogers and Muller, 2006). Ultimately, the exact level of detail at which we articulate them is determined with reference to the potential for action we are orienting to and the significance of this to certain communities of practice in particular settings.

Communities of practice

We turn now to consider Wenger’s (1998) notion of Communities of Practice. What is significant in Wenger’s notion of practice is the coming together of meaning and action. The practices of a particular community are the ways that they experience the world through action and how it is made meaningful. Different properties of an artefact, and the potential for action they entail, are seen, interpreted and made meaningful in different ways by different communities through the ways that they are enacted in their practices. Let us consider, for example, the issue of transfer of matter that takes place due to the contact necessity of touch-based interactions but not for touchless interactions. As Mary Douglas (1966) has eloquently argued, albeit avant la lettre with respect to the term ‘communities of practice’, people’s orientation towards matter as “clean” or “dirty” is not an inherent, fixed or absolute classification, but only makes sense with reference to a particular community of practice and the activities in question. Take, for example, scientists and engineers in “clean room” environments and their need to orient to dust particles and other matter in very different ways from other groups. For these scientists, the presence of even the tiniest particle of matter can be sufficient to interfere with carefully planned experiments and manufacturing processes. The meaning of contaminating matter, then, is very different for this community from what might be considered a contaminant in more everyday behaviours and practices. This meaning in turn affects the ways that this community of practice orients towards notions of touch and touchless in the organisation of its action. Many of its practices are organised to avoid direct contact with surfaces in ways that risk the transfer of matter. Indeed, the organisation of these actions in this way is entirely natural for this group given the particular significance of contamination for them.
In this respect, the non-contact property of touchless interaction has a very different meaning-making potential for this community than it does for others. Through this property, an evolving set of practices would be enabled that allows this community to experience, interpret, and engage with their world in new ways.

Similar arguments can be applied to the other properties mentioned. Let us consider the issues of pressure and momentum that arise through touch-based interaction but are not present in touchless interaction. Again, if we consider scientists and engineers working in clean room environments, we can see some very particular ways that this community would orient to such concerns. Scientists and engineers in these clean room environments, who are working at the nano scale, need to orient to movement and vibration in very particular ways. Even tiny vibrations, of a kind that other groups simply would not be concerned with, might disrupt experiments and manufacturing processes for these operators. As before, the particular meaning of pressure, momentum and vibration for these groups affects the way that their activities are organised in relation to touching and not touching. This kind of range in forms of concern – almost a kind of relativity - can be seen in more everyday group concerns with respect to movement and pressure sensitivity of touch. A good example here can be seen in the use of multi-touch phones. When held in the hand, the pressure property of touch is not really a worry. But when the same phone is placed on a speaker dock system, the same pressure of touch necessary to control the device puts pressure on the docking socket that, with sustained use, can result in damage both to the phone and to the docking device. Accordingly, actions are adjusted in such circumstances to avoid potential damage.

Thirdly, the interactional perspective on the naturalness of touchless interaction draws our attention to the settings in which particular communities and groups perform their activities. These settings consist, in part, of the physical environment, the architectural arrangement and the whole ecology of artefacts within which a piece of interactive technology might be situated. They consist also of a set of other social actors. The physical and spatial structure of these environments, then, both enables and constrains how action and practices are organised with respect to information artefacts and other people in the system (e.g. Kendon, 2010; Hornecker, 2005; O’Hara et al, 2010; Hall, 1966; Marshall et al, 2011; Bardram and Bossen, 2005). The features of these settings can then be related to particular properties of touchless interaction.

For example, let us consider the notion of interaction proxemics (O’Hara et al, 2010). This concept labels the spatial consequences of particular interaction mechanisms. For touch-based technologies, the spatial need to be co-proximate with the system has consequences for how action can be organised and the particular ways the system’s information can be incorporated into the broader practices within these settings. With touchless interaction, the requirement for proximity to information displays is not there. This different spatial relationship with the information has consequences for when, where and how this information can be incorporated into the practices in these settings. It changes the relationship between actors and the information and creates different potential for action and meaning making through these interactions. Similarly, if we consider the property of freedom of movement of touchless interaction, it is clear that particular settings may facilitate or constrain certain types of gesture and body movement. That is, freedom of movement can be physically hindered by the dimensions of a space, the presence of other artefacts, the presence of other people, or the need to concurrently interact with other tools. Again, this affects the potential for how we meaningfully configure action in relationship with these settings.

Also of significance in these settings are particular collaboration and coordination concerns and how the configuration of these activities is achieved in the context of particular interaction possibilities. One might ask how actions are made visible, accountable and meaningful to the other actors in these settings and how the properties of touchless interaction can be brought to bear in meaningful ways. It will be important to recognise that this will not be a one-way relationship. It is not simply a question of how certain types of interactive gesture or body movements are visible or not to other actors in these settings but also how other features of the coordination and collaborative activities relate to the potential for touchless interaction.

For example, if we consider the need to work in close physical proximity to others in these settings, this may impact on the technical capabilities of the system to track the movements of an individual actor. Different settings too will have particular norms and expectations of appropriate behaviour that can be enacted here. The need to attend to these norms and expectations within the social context of these settings imposes important boundaries and constraints on how particular communities orient to specific properties of touchlessness in terms of the movements and actions they perform. It may be entirely appropriate to jump around and wave your arms in the comfort of your own home, but such behaviour may be less appropriate for other settings such as the workplace.

Naturalness in Situ

Taking these things together, then, what emerges is a different perspective on how we conceive the notion of naturalness in relation to touchless interaction. Naturalness in this perspective is not something that is bound up in a representation of our gesture and body movement; it is not about the ability to infer intent through these representations. It is not simply the exchange of information between man and machine in order to elicit some form of system response. It is not something that can be bound up and packaged solely within the interaction mechanism itself. What is significant about the embodied interaction perspective is how touchless technologies are able to reconfigure our relationship with the material and social world. Naturalness of interactions, in this sense, arises from the potential for action enabled by various properties of touchless interaction and how these properties come to be made meaningful in the practices of specific communities in particular social settings. In designing natural touchless interactions, then, our concerns cannot simply be with ever more enhanced representation and modelling of gesture, movement and domain physics. These systems should not be judged in terms of how well they approximate or fall short of the characteristics of human-human communication. Rather, we need to approach the design of these systems in terms of how they might allow a beneficial reconfiguration of practices and how we experience the world in new ways accordingly.

In order to illustrate these points in a more concrete fashion, we present some fieldwork examples from settings for which we are designing or have deployed touchless interaction technology. The chosen settings are very different in nature, affording us the opportunity to highlight and contrast the occasioned ‘naturalness’ of touchless interaction. It takes different forms, in other words, depending upon context.

The first example concerns practices around medical images in surgical settings and opportunities for touchless interaction (e.g. Johnson et al, 2011; Mentis et al, 2012; Wachs et al, 2006, 2007; Stern et al, 2008; Graetzel et al, 2004). In the second example, we consider practices around an interactive game on a large public screen display (e.g. O’Hara et al, 2008; O’Shea, 2009, 2010).

Touchless interaction in surgical settings

Our discussion here draws on fieldwork conducted in operating theatres in two large hospitals in the UK. The observations we undertook covered a variety of different procedures in Interventional Radiology, Neurosurgery and Vascular Surgery. Within the theatres, a wide range of medical imaging equipment and displays is used. These allow access to pre-operatively captured images such as CT scans and MRI scans, as well as images captured during the course of the procedures such as real-time fluoroscopy and angiographic image sequences. These images are used variously for reference, diagnosis, planning interventions and for real-time navigation and guidance of equipment on the otherwise hidden inside of the body. The ways in which the images need to be viewed, interacted with and even manipulated are contingent on the particulars of the procedures in question. Currently within these hospitals, the interactions with these images are achieved through traditional touch-based interaction techniques, primarily keyboard and mouse, but also some use of touchscreens. The purpose of the fieldwork is to understand how work practices in these settings are currently organised with respect to touch-based technologies, with a view to considering opportunities and implications for touchless interaction technology.

One of the key factors to which people orient in the organisation of work in these settings is the boundary between sterile and non-sterile features of the environment. Within these settings, there are areas demarcated as sterile and those which are non-sterile. For the members of the surgical team who are scrubbed up (consultant surgeons, radiologists and scrub nurses), action is organised to avoid contact between sterile and non-sterile surfaces. The interaction technologies used to control the imaging systems in these settings are considered to be non-sterile and therefore not to be touched by the surgeon and others who are scrubbed. Here we see an orientation to the transfer of matter that is particular to this group and setting. The transfer of contaminants through touch in this setting means something significantly different to these actors than the ways we might orient to these issues in more everyday circumstances: the notion of what is “dirty” and “clean” is specific to this group. To touch here is a matter of risk to the current and future patients as well as staff – it is literally a matter of life and death. This then places restrictions on the surgeon’s interaction with the images. Let us consider an example of how this orientation to the transfer of matter is manifest in practice.