CHI 2000 Workshop on Natural Language Interfaces
Motivating the cross-fertilization between HCI and Natural Language Processing
Cécile Paris and Nadine Ozkan
CSIRO/MIS
Locked bag 17, North Ryde
email: {firstname.lastname@cmis.csiro.au}
1. Introduction
Over the past few years, our research group has brought
together researchers from both the Human-Computer Interaction (HCI) and the
Natural Language Processing (NLP) communities, and we have thus been exploring how
the two communities can benefit each other. In this position paper, we present
our views on this topic, as well as some examples of how the two disciplines
meet in specific projects. Our expertise in NLP is mostly in a sub-area, namely
Natural Language Generation (NLG), in which one is concerned with producing
text. Our discussion will thus focus on the relationships that can exist
between HCI and NLG.
HCI and NLG have different methodologies and
theories. For example, one important
methodology in HCI is the user-centered design approach, in which a user
requirement analysis and potentially a task modeling analysis are done through
observations, focus groups, story-boarding, etc. In contrast, the most accepted
methodology in NLG research is corpus analysis, where a set of representative
human-authored texts is collected and analysed in depth. Corpus analysis is
used to identify the characteristics of the target texts in terms of textual,
semantic, lexical and syntactic attributes. It seems to us that both
communities would benefit from using each other’s methodology in specific
contexts.
2.1 Complementing corpus
analysis with a user-centered design approach
NLG is concerned with developing systems that produce texts (in a broad sense: written or verbal, self-contained or as part of a human-machine dialogue). These systems are mostly designed by NLG experts and in the absence of HCI expertise. Consequently, the design method used tends to be solely corpus analysis, which provides the target text the system should emulate, and questions of interaction proper are often disregarded. Sometimes, a corpus of the probable input to the system is also collected, and the problem becomes characterizing the mapping that must take place. This still ignores some of the interaction issues that ought to be addressed to design a system. We thus argue that, while corpus analysis does have its utility, it also has limitations for the design of a system, because it is only concerned with the system’s output. A design perspective based solely on the required output is insufficient for good usability. We explore these limitations more fully below.
First, the corpus is often a collection of texts which, depending on how they are obtained and annotated, may be removed from their broader context of use (e.g., the environment in which the text was produced and understood). It is precisely this broad context of use that is being addressed in a user-centered approach to design, and it would be beneficial to add this dimension for the design of a generation system.
The second limitation concerns the fact that the texts in the corpus are human-authored. A generation system then tries to emulate these texts. Yet, the texts generated by a machine will not be like those written by authors. We can expect that, given a machine-generated text, the reader’s expectations will be different from the expectations regarding human-authored texts. Thus, the texts in the corpus should not be the only objects of study for the design of the NLG system. The corpus study should be complemented by a study centered on users and their interaction with the potential generated text.
More generally, researchers in NLP take human-human communication as their model. Yet, the introduction of the computer into this communication, or the replacement of one of the participants by a computer, changes the interaction, and potentially also changes user expectations and behavior. It is indeed not clear that a user would “talk to”, or interact with, a computer in the same way as he or she does with another human being, at least not given the current technology. Therefore, researchers designing NLG systems may benefit from elucidating users’ expectations regarding generated text, thus complementing their approach with the types of analysis done in HCI.
Finally, an NLG system, just like any other system, should be designed with the user in mind. Issues often studied in HCI are important in that context, and they are not necessarily studied by looking at the language to be produced. Examples in the context of a speech system include:
· ensuring that the user’s expectations regarding the scope of the system are set appropriately at the beginning of an interaction;
· determining whether switching from a pre-recorded set of utterances – e.g., the introductory message – to an utterance produced from a synthesizer is likely to confuse the hearer.
While the texts produced are of course an important aspect of an NLG system, they neither appear in isolation nor form the whole system. The system’s context of use and the users of that system are also important.
2.2 Using a corpus analysis
as part of an HCI study
While corpus
analysis has the limitations we have just outlined, it can be a valuable tool
for HCI practitioners. After all, one of HCI’s strong principles is to speak
the user’s language. Corpus analysis can refine the knowledge that current HCI
methods yield, so as to capture not only individual words and idioms used by
the user population, but also the syntax, level of language, etc. Another area of HCI where corpus analysis
would be useful is in the design of human-computer dialogue systems in the broad
sense (i.e., including hypertext, form filling, etc.), to ensure that these
dialogues are natural, or at least
coherent and efficient. For example, it is not uncommon to see web sites
asking users to go through a series of questions, some of which seem to be
irrelevant to the end-result, or to occur at unnatural points in time. These dialogues are awkward because they are
far from the dialogues one would have with a human being. A corpus study of how two people interact in
similar situations might have informed the designer as to what users are likely
to expect.
2.3 Decoupling semantics and
realization
HCI could also benefit from the general approach taken
in NLG, for example in the design of multimedia systems. NLG separates the semantic
information to be expressed in a text from its linguistic realization, and
attempts to characterize clearly the mapping from semantics to text. This mapping is then done “on the fly”. Similarly, it could be fruitful for the
design of multimedia systems to represent, at the design phase, the concepts
relevant to the system and their relationships in abstract ways (ways which are
independent of the medium). This would help inform decisions about the medium
to be used and the level of redundancy which is desirable among the various
media. The separation between these two levels could be useful as a design tool
(a priori) in HCI. Note that this type
of separation is currently attempted in Coutaz’s recent work (Coutaz, 1998;
Thevenin & Coutaz, 1999), in the context of designing interfaces for the
same application but for different delivery channels/environments (e.g.,
palmtop, PC, mobile phone, etc.). In that
work, Coutaz attempts to identify attributes of the target environment that
play a role in the design of an interface -- such as size of display. These
attributes can then be decoupled from the representation of the task enabled by
the application.
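The separation discussed above can be sketched in a few lines of Python. This is only an illustrative sketch under our own assumptions: the names Message, realize_sentence, realize_compact and render, and the channel labels "pc" and "phone", are hypothetical and are not taken from any of the systems cited. The point is that the content is represented once, independently of the medium, and that the mapping to a particular medium is a separate step performed on the fly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Medium-independent content: who does what to which object."""
    actor: str
    action: str
    obj: str


def realize_sentence(msg: Message) -> str:
    """Full-sentence realization, e.g. for a large display."""
    return f"{msg.actor.capitalize()} {msg.action}s the {msg.obj}."


def realize_compact(msg: Message, max_chars: int = 20) -> str:
    """Telegraphic realization for a small display, cut to max_chars."""
    text = f"{msg.action.capitalize()} {msg.obj}"
    return text[:max_chars]


# Hypothetical channel names; each maps to its own realization function.
REALIZERS = {"pc": realize_sentence, "phone": realize_compact}


def render(msg: Message, channel: str) -> str:
    """Map the same abstract content to the chosen delivery channel."""
    return REALIZERS[channel](msg)


msg = Message(actor="the user", action="save", obj="document")
print(render(msg, "pc"))
print(render(msg, "phone"))
```

Because the display size is a parameter of the realization step rather than of the content, adding a new delivery channel in this sketch means adding one entry to REALIZERS, leaving the Message representation untouched.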
2.4 Exploiting theories from
both language and HCI to study effective communication
At the theoretical
level, once again, the two communities base their foundations in different
theories. Theories of language try to
explain (among other phenomena) what makes a discourse (or dialogue) coherent,
socially acceptable, and appropriate for the reader/listener at hand, and how
various communicative functions are realized in language at the various levels
of the linguistic systems (e.g., semantic, lexical and syntactic). In HCI, the two main influences
are cognitive psychology, which focuses on human perceptual and problem-solving
capabilities, and sociology, in its attempts to understand how people make
sense of their environment (with the world and with each other). HCI has
heavily borrowed and adapted methodologies from both disciplines.
Yet, both HCI and NLG are concerned with the effectiveness of communication, and we can see parallels between their various concerns. HCI design practitioners are concerned with such issues as information grouping and differentiation, consistency with the ways users perform their tasks, and clear specification of the purpose of each interface element. This is analogous to ensuring in NLG that a chunk of text is coherent and achieves one or more specific communicative goals the user can recognize, and that a sequence of such chunks (or moves in a dialogue) is also coherent.
Although not explicitly mentioned in these terms, this view is essentially that being exploited by NLG researchers working on generating “flexible” or “adaptive” hypertext (e.g., Brusilovsky, 1996, Fink et al., 1997 and Dale et al., 1998). In these contexts, web interaction is modeled as a dialogue between the user and the computer, and what is presented in a succession of screens follows that dialogue model. It may be interesting to study the analogy further, with theories from both HCI and NLP/linguistics shedding light on the problem.
2.5 Addressing the evaluation problem in NLG
Finally, evaluation is often a problem for NLG researchers (e.g., Dale & Mellish, 1998). System evaluation is often done through evaluation of the text produced, either against characteristics of the corpus studied (e.g., Vander Linden & Martin, 1995), or against human judgement of readability and coherence as compared to corresponding human-authored texts (e.g., Coch, 1996 and Lester & Porter, 1997). But when designing a system for real use, this type of evaluation is not sufficient, and may not even be appropriate. There are a number of methodologies for evaluation in HCI which should be applied (and adapted) for NLG.
3. Where HCI and
NLP meet
3.1 Language
interfaces
Of course, HCI and NLP should meet in one obvious place: the natural language interface. The main paradigm in HCI design today is direct manipulation. However, natural language interfaces have several advantages over direct manipulation: they allow references to objects that are not directly visible and to events that have occurred in the past or will occur in the future (Cohen & Oviatt, 1994). In addition, with the increasing number of small displays (e.g., mobile phones) and mobile devices, vocal interaction between user and on-line services will probably become more prominent. This is an obvious instance where NLG and HCI experts should collaborate.
Speech interfaces are not the
only point of contact between HCI and NLG, though. Another type of interface
where the two disciplines meet is one in which documents act as interface. This is the case, for example, for web
pages, or any form of hypertext. There,
interaction occurs within the document/text. While issues related to language
and dialogue are important here, so are other interactional issues. An example
of these issues is the trade-off between the number of hypertext links the user
must traverse to arrive at the appropriate information and the amount of text
to be presented at each point. Another
example concerns the positioning of new windows and whether the old window
disappears or not. A third example concerns
the way a hypertext anchor is specified, and if and how information about the
target page should be provided. These
issues relate to the interface proper, and the interaction between the user and
the computer.
3.2 Computer Supported Collaborative Work (CSCW) or GroupWare
CSCW or GroupWare systems address computer-mediated human communication. The design of these systems can benefit from the extensive studies in NLG, and more specifically: (1) from typologies of communicative intentions; (2) from existing templates for specific types of messages; and (3) from the way formal relationships among the communication partners affect the communication format and language. In addition, NLG studies can inform the design of CSCW systems which propose guidance to users as to message structure, tone, and phrasing.
3.3 Exploiting the
output of HCI as the input for NLG
An important problem in NLG is obtaining and representing in the system the knowledge required to produce texts. This includes the knowledge from which texts are generated (i.e., the information underlying the texts) and the linguistic knowledge required to produce the texts. In some cases, information produced by HCI researchers (or practitioners) can be exploited in this way for NLG systems. For example, task models can be exploited to generate documentation and on-line help (e.g., Paris et al., 1998): they can both provide the information to be included in the texts and guide the structure of the texts to be generated. Another example (although not explicitly presented as exploiting an HCI artifact) can be found in (Grosz & Sidner, 1986), where the structure of the dialogue rests on simple task descriptions. An interesting research direction might be to explore whether sophisticated task formalisms which have been developed by the HCI community (e.g., task formalisms which can represent complex task hierarchies, with relationships such as boolean operations, strict and loose precedence, iteration, etc.) can support other types of dialogue. These are two examples of models and tools used by one community that can be exploited by the other.
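To make the idea of a task model as NLG input concrete, here is a minimal sketch in Python. It is our own illustration, not the representation or generator used in any of the systems cited: the Task class, the generate_help function and the example task are all hypothetical, and only a strict sequence of one level of subtasks is handled (no boolean operations, loose precedence or iteration). The task model supplies both the content of the help text (the goals) and its structure (the order of the steps).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """A node in a simple task model: a goal and its ordered subtasks."""
    goal: str
    subtasks: List["Task"] = field(default_factory=list)


def generate_help(task: Task) -> str:
    """Produce procedural help: a heading from the goal, one step per subtask."""
    lines = [f"To {task.goal}:"]
    for number, step in enumerate(task.subtasks, start=1):
        # Capitalize only the first letter so names such as "Save As" survive.
        sentence = step.goal[0].upper() + step.goal[1:]
        lines.append(f"{number}. {sentence}.")
    return "\n".join(lines)


# Hypothetical task model, of the kind an HCI task analysis might yield.
save_task = Task(
    goal="save a document",
    subtasks=[
        Task("open the File menu"),
        Task("choose Save As"),
        Task("type a file name"),
        Task("click OK"),
    ],
)

print(generate_help(save_task))
```

A richer task formalism would add relationship types to Task, and the generator would then need a corresponding linguistic choice for each of them (for example, "either ... or" for alternatives).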
4. Some
examples of the cross-fertilization as experienced in specific projects
We will briefly review in this section two projects in which we have been
involved, and show how our joint HCI and NLG expertise benefited them.
4.1 The Isolde Project
The Isolde[1]
project is concerned with the design and development of a tool to support the
production of hypertext-based on-line help for software systems, using language
technology (Paris et al., 1998). The project’s
emphasis was to try to address some of the limitations of current language
technology that prevent its use in realistic settings. In particular, our
concern was with the knowledge acquisition issue: how to obtain the knowledge
required for the generation of on-line help. While the project started as an
NLG project, we quickly found that both our HCI and NLG expertise were required
if we wanted to design and develop a realistic system:
(1) We needed to find a representation for the information from which the text was to
be generated. We analysed a corpus of on-line help, and observed that the part
of the documentation that may be generated automatically would be the
procedural on-line help. It is only for
this aspect of the documentation that it may be possible to obtain the
information underlying the text. We
further noted that task models like the ones produced by HCI experts during
system design would provide much of that information. This step required both our NLG and HCI expertise.
(2) We studied the task of the prospective system’s end users (the technical writers)
to understand better the place of a system like the one we were proposing to
develop. This was crucial to understanding
the scope of the system’s functionality, the potential sources of its input,
etc. This was done through a user
requirements analysis and a task analysis, requiring our HCI expertise.
(3) Once again working with the potential end users, we designed the possible interface
for creating and manipulating the task models that would provide the input to
the NLG module. We also ensured the readability and usability of the formalism
chosen to represent the task models. This required HCI expertise.
(4) We studied the output required of our system, including the nature of the
hypertext required. We used
methodologies from both NLP and
HCI here. We would like to be able to answer several questions: Can we get criteria for good on-line help,
good documentation, and good hypertext-based help? Can we characterize the
differences between manuals and on-line help? Can we characterize how hypertext
is best presented? (As mentioned before, we believe there are some interaction
issues not related only to language, and these issues are often ignored, at
least in NLG systems.) We believe our joint expertise is required to answer
these questions.
(5) If the resulting system is to be realistic, the end-users need to be able to
augment or change its linguistic resources. We thus need to understand (a)
which aspects of the linguistic resources could be tuned (NLG expertise),
and (b) how the user is to interact
with them (HCI expertise).
(6) Finally, we need to evaluate the system. This should not be an evaluation of only the
generated on-line help, but also of the usability of the system as a whole (HCI
and NLG expertise).
4.2 Tailored Delivery on the
Web
We are also involved in a new internal project
concerned with the tailored delivery of information on the web. This project’s
emphasis is to combine information retrieval, language technology and user
modeling techniques for tailored delivery.
Once again, we find that our HCI expertise complements our language
expertise, in two specific areas:
(1) In the design of the interface, and, in particular, in the user interaction
required to build the user model that supports tailored delivery.
(2) In the design of the system, as we want to ensure the system can produce a
tailored presentation on a variety of media (e.g., written text, web-based
hypertext, display on a Palm Pilot, etc.). We are applying both kinds of expertise
to study how to represent the constraints of the delivery medium and provide a
mapping between the information to be presented and the medium.
We believe strongly
that the HCI and NLP communities should work together on a wide variety of
problems. We have outlined above where we think cross-fertilization can occur,
and why the combination of the two types of expertise could be beneficial. We
have also presented some specific examples from projects in which we are
involved, pointing out where it has been necessary to apply our joint
expertise.
Bibliography
Brusilovsky, P. (1996). Methods and techniques of adaptive hypermedia. User Modeling and User Adapted Interaction, 6(2-3).
Coch, J. (1996). Evaluating and comparing three text production techniques. In Proceedings of the Sixteenth International Conference on Computational Linguistics (COLING’96).
Cohen, P. R. & Oviatt, S. L. (1994). The Role of Voice in Human-Machine Communication. In Voice Communication Between Humans and Machines, pp. 34-75. National Academy Press.
Coutaz, J. (1998). Portability of User Interfaces: “Writing it once” is not enough. Invited talk at HCI 98, Sheffield, UK.
Dale, R. & Mellish, C. (1998). Towards evaluation in natural language generation. In Proceedings of the First International Conference on Language Resources and Evaluation, Granada, Spain.
Dale, R., Oberlander, J., Milosavljevic, M. & Knott, A. (1998). Integrating Natural Language Generation and Hypertext to Produce Dynamic Documents. Interacting with Computers, 11 (2), December.
Fink, J., Kobsa, A. A. & Schreck, J. (1997). Personalised Hypermedia Information through Adaptive and Adaptable System Features: User Modeling, Privacy and Security Issues. In Mullery, A., Besson, M., Campolargo, R. and Reed, R. (eds), Intelligence in Services and Networks: Technology for Cooperative Competition, pp. 459-467. Berlin Heidelberg: Springer.
Grosz, B. & Sidner, C. (1986). Attention, intention and the structure of discourse. Computational Linguistics, 12:175-206.
Lester, J. & Porter, B. (1997). Developing and empirically evaluating robust explanation generators: The KNIGHT experiments. Computational Linguistics, 23 (1): 65-101.
Thevenin, D. & Coutaz, J. (1999). Presentations for Mobile Users Adaptation and Plasticity of User Interfaces. Presented to i3-spring99 Workshop on Adaptive Design of Interactive Multimedia.
Vander Linden, K. & Martin, J. (1995). Expressing rhetorical relations in instructional texts: A case study of the purpose relation. Computational Linguistics, 21 (1): 29-57.
[1] Isolde stands for an Integrated Software and On-Line Documentation Environment. The Isolde project is partially supported by the Office of Naval Research (ONR) – Grant N00014096-1-0465.