CHI 2000 Workshop on Natural Language Interfaces
The Hague, The Netherlands, April 3, 2000
Language in Multi-media
Common Ground framework for investigating the role of natural language interfaces in computer-mediated communication (CMC)
By Duska Rosenberg
Royal Holloway, University of London
(Special thanks to Renate Fruchter, CIFE and Stanley Peters, CSLI, Stanford University for support, feedback and guidance)
The work presented in this paper is 'work in progress', with two distinct but related objectives. One is to apply the Common Ground approach to the study of language use in real-world workplaces. The other is to examine the relevance of such studies to the conceptual design of natural language interfaces in media and virtual spaces.
The concept of the Common Ground - 'a type of information sharing' - has a long history in linguistics and cognitive psychology, including the work of Karttunen and Peters (1975), Stalnaker (1978), Clark (1992, 1996) and others (cf. Endnote 1). Conversation is regarded as the primary mode of communication and most of the studies of common ground are focused on face-to-face conversation where speakers and listeners collaboratively establish the common perspective on the communicative situation.
In contrast, in a real-life workplace, such as an open-plan office, a construction site or a manufacturing shop floor, conversation is only one of the means by which people communicate. Information obtained from conversations often has to be integrated with information obtained from other sources, such as documents, databases and various activities in the workplace. Increasingly, information from all these sources is obtained through facilities provided by multi-media technology.
Several key observations from workplace studies (cf. Endnote 2) guide the work on the application of the common ground framework in this context:
One of the implications of these observations for the common ground framework is that we need to take into account the ways in which people engaged in conversations make use of information obtained from other channels and other resources. These channels and resources provide information that is as important for creating the common ground as the information obtained from the conversation itself (Rogers, 1993). In technology-mediated conversations, ways must be found to make such information accessible to speakers and listeners, as well as to overhearers and absentees. Since much of this information will have to be displayed on interface screens, these requirements will have to be taken into account in user interface design. For example, it will be necessary to design computer screens to show not only speakers/listeners and their talk, but also the documents, drawings and objects they use as shared artefacts to support their talk.
Another implication concerns different degrees of individual involvement in conversations or meetings. As Clark (1992, p. 197) points out, 'listeners who participate in a conversational interaction go about understanding very differently from those who are excluded from it'. In the design of user interfaces we need to account for the different degrees of commitment and responsibility that are required of individuals, and to identify the appropriateness conditions for changes in their involvement. There seem to be significant differences between the informational resources required by 'listeners' and 'overhearers' and those required by 'absentees', who will pick up information about the conversation indirectly, through records or other shared artefacts.
Theoretical issues that arise in this context include:
One of the key issues for the analysis of interaction in this context concerns the ways in which different degrees of participation are made publicly visible. The signals for joining and leaving a given conversation make an essential contribution to this visibility. Such mechanisms form an integral part of the implicit communication that constitutes and makes use of the common ground. In technology-mediated spaces communication tends to be overly explicit, and work is often interrupted by the need for 'interaction management'. This leads to breakdowns and cumulative misunderstandings, and may result in sheer exasperation on the part of the participants (cf. Hindmarsh et al., 1998).
In a study of computer-mediated communication (Fruchter, 1998; Rosenberg, 2000b), a group of designers was observed over a period of three months as they worked collaboratively on a shared task. Three group members were co-located and one (KC) was at a remote location, linked with the others through state-of-the-art CMC technology. One of the dominant features of CMC conversations was the 'KC are you there?' problem. KC's presence was very hard to establish. Making contact with him took considerably longer than normal, although it improved as the participants learnt to trust the technology or found effective methods for repairing the breakdowns. Contact was often maintained explicitly, through direct questions: 'KC are you there?', 'KC can you see us?', 'KC, do you have the PowerPoint file open?' Much of the implicit common ground shared by the co-located participants was not, or was not assumed to be, shared by KC. Verbal commentaries, such as 'KC, Kate just went to fetch the documents', frequently interrupted the flow of work.
Perhaps the most serious problems concerned the 'situational ambiguity' of KC's circumstances. If there was no movement on the screen, the others would assume that the technology link had broken down, or that KC had gone away and the picture on the screen was several minutes old, or that KC was thinking about the problem. The appropriate repair strategies based on each of these assumptions would be quite different, and a wrong assumption would often cause further misunderstandings.
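The ambiguity described here can be made concrete with a minimal sketch in Python. It is only an illustration under stated assumptions, not something used in the study: the two input signals, the threshold values and the repair suggestions are all hypothetical. The point is simply that a motionless screen is compatible with three different interpretations, each calling for a different repair, so an interface would need some additional signal to tell them apart.

```python
from dataclasses import dataclass
from enum import Enum


class Hypothesis(Enum):
    """The three readings of a motionless screen named in the text."""
    LINK_DOWN = "the technology link has broken down"
    AWAY = "the participant has gone away and the picture is stale"
    THINKING = "the participant is present and thinking"


@dataclass
class RemoteSignals:
    # Both fields are hypothetical signals an interface might expose.
    seconds_since_heartbeat: float   # time since the last network keep-alive
    seconds_since_user_input: float  # time since the last input at the remote site


# Illustrative repair suggestions only; the study does not prescribe these.
REPAIR = {
    Hypothesis.LINK_DOWN: "re-establish the connection",
    Hypothesis.AWAY: "wait, or contact the participant by another channel",
    Hypothesis.THINKING: "do nothing; allow time to respond",
}


def interpret_stillness(signals, link_timeout=10.0, idle_timeout=120.0):
    """Pick one hypothesis for a static picture (thresholds are assumptions)."""
    if signals.seconds_since_heartbeat > link_timeout:
        return Hypothesis.LINK_DOWN
    if signals.seconds_since_user_input > idle_timeout:
        return Hypothesis.AWAY
    return Hypothesis.THINKING


if __name__ == "__main__":
    still_screen = RemoteSignals(seconds_since_heartbeat=2.0,
                                 seconds_since_user_input=300.0)
    guess = interpret_stillness(still_screen)
    print(guess.value, "->", REPAIR[guess])
```

Without such signals, the co-located participants in the study had to resolve the ambiguity verbally, which is where questions like 'KC are you there?' come from.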
Linguistic analysis of conversations under these circumstances was based on the assumption that the dominant use of language was 'diagnostic', indicating that a communication breakdown had occurred or was suspected. Shared artefacts that provided the visual channels for the display of diagrams, web sites and other resources were often used opportunistically to augment the linguistic channels and resources ('Open my CAD file on drive T and you'll see the kind of trajectory I'm talking about.' or 'OK, check the spelling of the file name in my email.'). Various channels were sometimes used interchangeably to support the communicative effectiveness of the media space.
However, the linguistic channel does have a distinct contribution to make that is not interchangeable with that of any other channel. It is essential for expressing how an individual participant views a particular communicative situation and how participants jointly establish 'the reciprocity of perspectives'. This can best be illustrated by an observation of a conversation in a design office in the centre of London. The office was in a smoke-free building and the smokers were 'banished' to the fire escape stairs at the back. On a cold February morning two smokers stood on a narrow landing, shivering, smoking and talking. A third saw them from his desk, finished what he was doing, took a packet of cigarettes out of his pocket and waved it at them, knocked at the door, opened it slightly and asked 'Can I come in?'. 'Of course', they said unanimously and made room for him on the landing.
The argument presented in this paper is that media and virtual spaces should be designed so as to facilitate and support language use in the ways that are so natural when all concerned are in the same place at the same time. The design of natural language interfaces must therefore take place in the context of designing other channels, so that we can focus on the interaction between the language channel and the others provided in a multi-media information environment. In other words, we must concentrate in the first instance on trying to understand empirically the unique contribution of linguistic channels to the creation of the common ground. In the workshop, I would like to propose a method for explicating the role of language in multi-media, based on a distinction between observable actions, changes in shared artefacts and the communicative effect on speakers/listeners, overhearers and absentees.
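As a purely illustrative aid, the three-way distinction proposed here could be recorded in a simple coding scheme such as the Python sketch below. The class and field names are assumptions introduced for this example and are not part of the proposed method; the sample entry reuses an utterance from the study, but the coded effect attached to it is hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Role(Enum):
    """Degrees of participation distinguished in the text."""
    SPEAKER_LISTENER = "speaker/listener"
    OVERHEARER = "overhearer"
    ABSENTEE = "absentee"


@dataclass
class Observation:
    # One coded event; all field names are hypothetical.
    action: str                            # observable action, e.g. an utterance
    artefact_change: Optional[str] = None  # change in a shared artefact, if any
    effects: Dict[Role, str] = field(default_factory=dict)  # effect per role


def roles_without_effect(obs: Observation) -> List[Role]:
    """Roles for which no communicative effect has been recorded."""
    return [role for role in Role if role not in obs.effects]


if __name__ == "__main__":
    obs = Observation(
        action="'KC, do you have the PowerPoint file open?'",
        artefact_change=None,
        effects={Role.SPEAKER_LISTENER: "contact with KC checked explicitly"},
    )
    for role in roles_without_effect(obs):
        print("no recorded effect for:", role.value)
```

Listing the roles with no recorded effect is one way of making visible what overhearers and absentees would miss if only the talk itself were captured.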
References
Clark, H., 1992: Arenas of Language Use. The University of Chicago Press & Center for the Study of Language and Information.
Clark, H., 1996: Using Language. Cambridge University Press.
Fruchter, R., 1998: Roles of computing in PBL: problem-, project-, product-, process-, and people based learning. Artificial Intelligence for Engineering Design and Manufacturing, 12, 65-67.
Hindmarsh, J., Fraser, M., Heath, C., Benford, S. and Greenhalgh, C., 1998: Fragmented Interaction: Establishing mutual orientation in virtual environments. In Proceedings of CSCW98, ACM, Seattle, November 14-18, 217-226.
Karttunen, L. and Peters, S., 1975: Conventional implicature in Montague grammar. Berkeley Linguistics Society, 1, 266-278.
Rogers, Y., 1993: Co-ordinating Computer-Mediated Work. CSCW, 1, 295-315. Kluwer Academic Publishers, The Netherlands.
Rosenberg, D., 2000a: Three Steps to Ethnography - A Discussion of Inter-disciplinary Contributions. In L. Pemberton (Ed.): Communication in Design, AI&Society, Vol. 15, No.1.
Rosenberg, D., 2000b: Online Information Environments. In Proceedings of ICCBEE-VIII, August 2000, Stanford University.
Stalnaker, R.C., 1978: Assertion. In P. Cole (Ed.), Syntax and Semantics, vol. 9: Pragmatics. New York: Academic, 315-332.
Endnotes
1. Clark (1992, 1996) provides a comprehensive survey of related literature.
2. The workplace studies were carried out as part of the EU-funded project AC017, September 1995 - March 1999. See also Rosenberg, 2000a for details of these and related empirical studies of collaboration and communication in manufacturing and construction.