European Institute of Cognitive Sciences and Engineering
Center for Spoken Language Understanding
In this paper we review the unsettled definitions of initiative and mixed-initiative, discuss a set of dialogue fragments or situations that raise questions about the role of initiative (as traditionally defined), and introduce some ideas leading toward a theory of initiative as a composition of more fundamental factors such as choice of speaker, task and outcome.
Mixed-initiative interaction seems to be one of those things that people think they can recognize when they see it even if they can't define it. The earliest work on mixed-initiative didn't define the term explicitly (cf. Carbonell, 1970). Initiative is typically expressed as "taking the conversational lead" (Walker & Whittaker, 1990) or "driving the task" (Smith, 1994).
What these expressions mean is a much cloudier question. It is often held (e.g., Smith, 1994) that initiative does not include turn-taking but some (e.g., Novick, 1988; van Lier, 1988) have used "mixed initiative" to describe the case where the turns are negotiated rather than rotely determined by a single party or the modality of interaction. Another definition involves characterizing varying initiative in terms of multiple, interacting speaker-specific plans versus a single joint plan (Kitano & Van Ess-Dykema, 1991). Other definitions distinguish whether information was independently contributed to the conversation or, in contrast, was produced to meet some sort of dialogue obligation (e.g., Fischer, 1990; Walker & Whittaker, 1990).
Models of mixed initiative also differ with respect to whether they can be characterized by (1) a single axis running from one conversant to the other (e.g., Novick, 1988; Walker & Whittaker, 1990; Smith, 1994; Guinn, 1995) or (2) multiple independent axes (e.g., van Lier, 1988; Kinginger, 1994).
From the tangle of definitions, it is possible to identify some individual threads. That is, there seem to be a number of factors that compose the general notion of initiative. We suggest in Section 3 that it is, in fact, appropriate to view initiative as a multi-factor rather than uni-factor concept.
We explore the issue of factors of initiative through presentation and analysis of a series of "scenarios," which are fictitious exchanges between a spoken-dialogue system (S) and a user (U). Utterances labeled with letters (a, b, c) are alternatives for purposes of the analysis. The domain of the scenarios involves scheduling a telephone service appointment. This domain is discussed in more detail by Fanty et al. (1995).
1.1a S: I can schedule an appointment for you. What day?
1.1b S: I can schedule an appointment for you. Available days and times include: Monday at 9:30 a.m., 10:15 a.m., 2:15 p.m., and 4:45 p.m.; Tuesday at 8:00 a.m., 8:15 a.m. ...
In Scenario 1, conversant S's choice of approach determines who gets to speak and when. In turn, this affects who introduces possible times for the appointment. From U's perspective, utterance 1.1b is less desirable, although this merely suggests that it is not a good idea to hog the turn if the other party has the dialogue's key information. In a multi-party interaction, the turn may be allocated to a specific individual or addressed to everyone. In both dyadic and multi-party interactions, any participant may choose to interrupt the speaker, thereby self-allocating the floor.
2.1 S: I can schedule an appointment for you. What day?
2.2a U: Tuesday.
2.2b U: Tuesday at 11:30 a.m. would work for me.
2.2c U: I need an appointment in the evening.
2.2d U: So I have to wait to get this service?
Scenario 2 shows how different responses by conversant U indicate different actions with respect to control of the topic. Utterance 2.2a conforms to the topic set by conversant S by providing the requested information. In utterance 2.2b, U (helpfully?) tries to "speed up" the conversation. She does so by providing additional information that extends the topic that conversant S had explicitly set. This action can be regarded as taking control of topic by U. This is a fairly minor change of topic (cf. utterances 2.2c and 2.2d); nevertheless, U's action has gone beyond the "day" topic to a new, "time" topic that was prefigured but implicitly excluded by S's utterance.
This sort of control, which we can generally refer to as that of "task," can be viewed as a contest--more or less polite--between conversants about the boundaries of relevance within Grice's maxim of relation (Grice, 1989). Utterances 2.2c and 2.2d challenge assumptions thought by U to underlie S's utterance 2.1. In 2.2c, this assumption is that "day" refers to "during the day" rather than to "a date". In 2.2d, this assumption is that an appointment is necessary at all. Utterance 2.2b also tests Grice's maxim of quantity, except that here U is not challenging an assumption about S's intentions; instead, U is relying on this assumption--that an appointment has both a day and a time. Although one could characterize these utterances as uncooperative, if we assume that conversant U is being sincere, 2.2b and 2.2c probably should both be characterized as cooperative. This suggests that changes of control of task do not depend on whether the assumptions underlying the preceding utterance are being challenged. Rather, we can see initiative as a capability to set task based on the cooperative effort of a conversant to achieve her goals.
3.1 S: Would you like to have an appointment?
3.2 U: Yes.
3.3a S: Okay, how about Friday at 2:30 p.m.
3.3b S: Okay, on what day?
In Scenario 3, the question of interest centers on determination of task outcome. Note that in this conversation, there's agreement as to topic but that utterances 3.3a and 3.3b differ with respect to who will set the time of the appointment. In 3.3b, S appears to keep control of the conversation with respect to topic but has transferred to U control with respect to outcome. Note that 3.3a is a felicitous dialogue act if, for example, this was the only appointment available; one can imagine the burden on U if S were overly deferential and invited U to "choose" the one available slot through a lengthy sequence of suggestions.
4.1 S: I can schedule an appointment for you. What day?
4.2 U: Tuesday.
4.3 S: What time on Tuesday?
4.4 U: In the morning.
4.5a S: What time on Tuesday morning?
4.5b S: What about 9:30 a.m. Tuesday morning?
Initiative with respect to outcome can be much more subtle than that displayed in Scenario 3. For example, Scenario 4 demonstrates a case where we believe it would be infelicitous for S not to take control of outcome. According to Grice's maxim of quantity, the fact that in utterance 4.4, U chooses to suggest a time range rather than select a specific time implies that she has some flexibility. If S then declines to take control of the outcome, as in 4.5a, the result is a repetitive and rather unconstructive set of exchanges.
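The contrast between 4.5a and 4.5b can be made concrete with a minimal sketch. The Python fragment below is an illustration only and does not describe any implemented system; the names `Slot`, `format_time`, and `next_system_move`, the sample slots, and the decision rule itself are all assumptions introduced here. The rule is simply that S proposes a specific time once U has narrowed the range (or when only one slot exists), and otherwise leaves the choice with U.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Slot:
    """A hypothetical open appointment slot."""
    day: str
    start: time


def format_time(t: time) -> str:
    """Render a time as, e.g., '9:30 a.m.' without platform-specific strftime flags."""
    hour = t.hour % 12 or 12
    suffix = "a.m." if t.hour < 12 else "p.m."
    return f"{hour}:{t.minute:02d} {suffix}"


def next_system_move(day: str,
                     time_range: Optional[Tuple[time, time]],
                     open_slots: List[Slot]) -> str:
    """Illustrative policy for whether S takes control of outcome.

    time_range is the (start, end) window U has offered, or None if U
    has so far named only a day.
    """
    candidates = [s for s in open_slots if s.day == day]
    if time_range is not None:
        lo, hi = time_range
        candidates = [s for s in candidates if lo <= s.start < hi]

    if not candidates:
        # Nothing fits: hand the choice back to U.
        return f"Nothing is open then on {day}. What other time would work?"

    if time_range is None and len(candidates) > 1:
        # U has not constrained the time yet, so S leaves outcome with U.
        return f"What time on {day}?"

    # U signalled flexibility by giving a range, or only one slot exists:
    # S takes control of outcome by proposing the earliest candidate.
    first = min(candidates, key=lambda s: s.start)
    return f"What about {format_time(first.start)} on {day}?"


# Assumed sample data, loosely following the times mentioned in Scenario 1.
slots = [
    Slot("Monday", time(9, 30)),
    Slot("Tuesday", time(9, 30)),
    Slot("Tuesday", time(14, 15)),
]
morning = (time(8, 0), time(12, 0))
print(next_system_move("Tuesday", None, slots))     # asks U for a time
print(next_system_move("Tuesday", morning, slots))  # proposes a specific time
```

Under this assumed rule, the reply to "In the morning" is a proposal in the manner of 4.5b rather than the repeated question of 4.5a, and a day with a single open slot yields a proposal in the manner of 3.3a.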
From the elements discussed with respect to the scenarios, we suggest that it is possible to extract three factors that help untangle aspects of previous models. These factors are choice of task, choice of speaker, and choice of outcome:
Choice of Task. One factor is choice of task. The initiative models of Grosz and Sidner (1986), Walker and Whittaker (1990), Carbonell (1970), Smith (1994), and Guinn (1995) can all be viewed as involving determination of what the conversation is about. For example, Guinn describes initiative in terms of determining which goal decomposition will be followed in the conversation. In Smith's (1994) model, the "directive" level of control is characterized by the computer's recommending a task goal for completion.
Choice of Speaker. Another factor is control over choice of speaker. The initiative models of Novick (1988), van Lier (1988), Kinginger (1994) and Burke (1994) all explicitly consider turn-taking as a component or indicator of initiative. Initiative, in this sense, has consequences for the structure of the dialogue. Burke (p. 99) noted that the "initiatory turn of a sequence thus controls not only the floor at the time of the turn itself, but also controls other turns in the sequence and, therefore, constrains subsequent turns to conform to the rules of the sequence." This is one of the factors that differentiates, for example, an automated teller machine from a human teller.
Choice of Outcome. A third factor involves choice of outcome. This is a logical extension of control of task, in the sense that "to complete the goal" (Smith, 1994) means producing a result intended by one or more of the conversants. The choice of outcome then follows the determination of task and involves allocating the decision or action necessary to achieve the task.
It might also be argued that whether information is "volunteered" in the course of a conversation is a factor in initiative. Models of initiative that to some extent incorporate this concept include those of Walker and Whittaker (1990) ("Conversational partners ... feel free to volunteer information that is not requested..."), Fischer (1990) ("volunteering information"), and Smith (1994) ("the computer is free to mention relevant, though not required, facts as a response"). This idea is an appealing one, if only because it touches on the intuitive feeling that conversational initiative involves the freedom to do what one wants. In our view, though, the idea of volunteering is an effect rather than a factor. That is, communicative behaviors determined by factors of choice of task, speaker and outcome show up as instances of volunteering. Thus an attempt to change the task (or the "topic") is implemented via an utterance that appears to be volunteering (see, e.g., utterance 2.2b).
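One way to see why a multi-factor view differs from a single-axis view is to record the three factors separately for each utterance. The following Python sketch is offered only as an illustration under stated assumptions: the names `Party`, `InitiativeState`, and `single_axis` are hypothetical, and the factor values assigned to utterances 3.3a and 3.3b are one reading of Scenario 3, not an annotation scheme taken from the models cited above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Party(Enum):
    SYSTEM = "S"
    USER = "U"


@dataclass(frozen=True)
class InitiativeState:
    """Which conversant holds each of the three factors after an utterance."""
    task: Party      # who determines what the conversation is about
    speaker: Party   # who determines who speaks next
    outcome: Party   # who makes the decision that completes the task


def single_axis(state: InitiativeState) -> Optional[Party]:
    """Collapse the three factors onto one axis, if that is possible.

    Returns the conversant who holds all three factors, or None when the
    factors are split between conversants and no single-axis label fits.
    """
    holders = {state.task, state.speaker, state.outcome}
    return next(iter(holders)) if len(holders) == 1 else None


# Assumed reading of Scenario 3: in both alternatives S sets the topic and
# hands the turn to U; only the holder of outcome differs.
after_3_3a = InitiativeState(task=Party.SYSTEM, speaker=Party.SYSTEM,
                             outcome=Party.SYSTEM)
after_3_3b = InitiativeState(task=Party.SYSTEM, speaker=Party.SYSTEM,
                             outcome=Party.USER)

print(single_axis(after_3_3a))
print(single_axis(after_3_3b))
```

In this representation, 3.3a can be summarized by naming a single conversant, whereas 3.3b cannot, because task and outcome are held by different conversants.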
It is clear from the literature that no consensus has yet been reached as to formal definitions of the terms "initiative" and "mixed-initiative." We have summarized the leading views and opinions that make up the tangle of definitions in the literature. We have explored the role of initiative by examining a series of scenarios that illustrate aspects of dialogue control that we believe are fundamental to a theory of initiative. These factors include choice of task, choice of speaker and choice of outcome.
The authors thank Brian Hansen and Karen Ward for their discussions and analyses. This research was supported by the National Science Foundation, the Advanced Research Projects Agency, and the member organizations of the Center for Spoken Language Understanding.
Burke, P. (1994). Segmentation and control of a dissertation defense. In Grimshaw, A. (ed.), What's going on here? Complementary studies of talk. Norwood, NJ: Ablex. 95-124.
Carbonell, J. R. (1970). AI in CAI: An artificial intelligence approach to computer-assisted instruction. IEEE Transactions on Man-Machine Systems, 11(4), 190-202.
Fanty, M., Sutton, S., Novick, D., and Cole, R. (1995). Automated appointment scheduling. ESCA Workshop on Spoken Dialogue Systems, Vigso, Denmark, May 1995, 144-147.
Fischer, G. (1990). Communication requirement for cooperative problem solving systems. Information Systems, 15(1).
Grice, P. (1989). Studies in the way of words. Cambridge, MA: Harvard University Press.
Grosz, B., and Sidner, C. (1986). Attention, intentions, and the structures of discourse. Computational Linguistics, 12(3), 175-204.
Guinn, C. (1995). The role of computer-computer dialogues in human-computer dialogue system development. Empirical Methods in Discourse Interpretation and Generation, AAAI Spring Symposium Series, March, 1995, 47-52.
Jones, J. (1993). Explanation in mixed-initiative systems. Proceedings of the Conference on Artificial Intelligence Applications, 456.
Kinginger, C. (1994). Learner initiative in conversation management: An application of van Lier's pilot coding scheme. The Modern Language Journal, 78(1), 29-40.
Kitano, H., and Van Ess-Dykema, C. (1991). Toward a plan-based understanding model for mixed-initiative dialogues. Proceedings of the 29th Meeting of the ACL, 25-32.
Novick, D. G. (1988). Control of mixed-initiative discourse through meta-locutionary acts: a computational model. Technical Report CIS-TR-88-18, Department of Computer and Information Science, University of Oregon.
Smith, R. (1994). Spoken variable-initiative dialogue: An adaptable natural-language interface. IEEE Expert, Feb., 1994, 45-50.
Smith, R., Hipp, R., and Biermann, A. (1995). An architecture for voice dialogue systems based on Prolog-style theorem proving. Computational Linguistics, 21(3), 281-320.
van Lier, L. (1988). The classroom and the language learner: Ethnography and second-language classroom research. Essex, England: Longman.
Walker, M., and Whittaker, S. (1990). Mixed initiative in dialogue: An investigation into discourse segmentation. Proceedings of the 28th Meeting of the ACL, 70-78.
Published as Novick, D., and Sutton, S. (1997). What is mixed-initiative interaction?, Papers from the 1997 AAAI Spring Symposium on Computational Models for Mixed Initiative Interaction, Stanford University, March 24-26, 1997, Technical Report SS-97-04, AAAI Press.