EURISCO
4, avenue Edouard Belin
31400 Toulouse, France
The purpose of this paper is to
examine a study of user needs that was inherently tied to domain elements, and
to distinguish areas of possible cross-domain knowledge and methods from areas
that are likely to remain domain-specific. The paper will present a study of
changes to user manuals by operators of a complex interface, in this case for a
commercial aircraft. From an examination of this study, I will review the
usefulness and limitations of cross-domain application with respect to methods,
taxonomic classifications, and use of manuals as a source for user-oriented
design knowledge.
The use of complex interfaces in
safety-critical domains can be characterized, in large part, through
accommodations to domain-specific (and even user-specific) factors. One way of
assessing the effects of domain factors on use is to examine changes in use
made by classes of users. In the case of aircraft cockpit systems, for example,
uses of the aircraft's interfaces are codified in an operating manual (OM),
which is published by the manufacturer. And, of great value to the study of use
of interfaces, some airlines publish, in whole or in part, a revised OM that
details uses specific to the airline. Because of the interface-procedure
duality (Boy, 1998; Novick, in press), user-initiated changes to user-interface
procedures reveal user attitudes and preferences about the underlying
interface. So in the aviation domain, researchers have available a written
record of the changes in use that result from user or situational
characteristics. These changes are reflected in differences in procedures for a
particular aircraft.
To understand the domain-, user-
and situation-specific reasons for changes in aircraft operating procedures by
airlines, I undertook, as part of a project funded by Aerospatiale Airbus, a
review and analysis of the rationale of airlines for representative changes to
their OMs. The study was motivated by the promise of helping developers of
procedures for new interfaces take advantage of industry experience and
meet customers' needs more directly. This paper will report on the
methods used, present a taxonomic analysis of OM change rationale, and evaluate
the validity of the taxonomy.
The study gathered data on OM change
rationale from three airlines, which I will call AX, AY and AZ. Airline AX is a
North American carrier, airline AY is an Asian carrier, and airline AZ is a
European carrier. All three airlines have world-wide routes.
The study used two approaches in
identifying OM change rationales. The first approach ("closed-ended")
involved selection of known OM changes. Selections were made on the basis of
obtaining a variety of kinds of change and possible variety of change
rationales. In particular, the criteria revolved around apparent changes to the
substance of a procedure or some other significant difference in content
presented to the crew. In this approach, I reviewed prior research reports in
which OM changes had been identified for airline AZ. Queries were sent to AZ
about the reasons for these changes, and the airline responded with an
exhaustive account of the change rationales for the items identified.
The second approach
("open-ended") involved asking key airlines to identify and explain
changes to their OM. Queries were sent to airlines AX and AY. Airline AY
responded with written materials. Airline AX provided written materials and
generously sent two airline employees to EURISCO for interviews.
Based on the materials generated
in the airlines' written responses and through the AX interview, the rationales
for specific changes in the OMs of airlines AX, AY and AZ were identified and
listed. The identified rationales were then classified in terms of their
component factors. The results of this analysis are reported below.
The data provide 68 instances of
OM change rationale. From these, categories and super-categories of
rationale were developed. In the terms of Lampert and Ervin-Tripp (1993), the
taxonomic analysis was empirically derived rather than based on an a priori
theory. The approach used was an iterative alternation of bottom-up and
top-down analysis, which was intended to generate the analysis's implicit
theory from the data, with refinement of the theory based on knowledge of the
field. The multiple bottom-up and top-down passes led to the creation of new
categories, recategorization of instances, grouping of sub-categories,
formation of super-categories, combining categories, and renaming categories.
Overall, the change rationale can
be classified in terms of five taxonomic categories: safety, explicitness,
efficiency, regulatory requirements, and economy. The safety category includes
seven sub-categories: minimize the effects of error, improve crew
coordination, use spatially based routines, leave safety systems armed, keep
the most critical pages of system information displayed, operations, and
consistency. The explicitness category includes four sub-categories: add
specific procedure items for items otherwise left unclear or implicit; add
extra material providing rationale, other background, or local specifics;
clarify unclear terms; and inform crew early about operational limits. The
efficiency category includes three sub-categories: avoid delays, omit
unnecessary items, and omit items covered in other ways.
Here is the final taxonomy, with
the rationale instances identified by numbers associated with the airline
identifier. That is, AZ 3 refers to the third instance of change rationale from
airline AZ. In some cases, rationale instances have been classified in more
than one category. The details of the change rationale instances were used in
the classification but are beyond the scope of this paper.
1. Safety
   1.1 Minimize effects of error (AZ 1; AX 49)
   1.2 Improve crew coordination (AY 1; AX 15)
       1.2.1 Do not use checklists in the air (AX 2)
       1.2.2 Use calls for abnormal rather than normal conditions (AX 36)
       1.2.3 Eliminate or move items out of high-load times to keep crew in the loop (AX 38, 39, 44)
   1.3 Use spatially based routines (AX 14)
   1.4 Leave safety systems armed (AX 32)
   1.5 Keep most critical pages of system information displayed (AX 44)
   1.6 Operations (e.g., avoid flameouts from ingesting snow) (AX 45)
   1.7 Consistency
       1.7.1 Consistency between OM and cockpit labels (AX 13, 23)
       1.7.2 Consistency among OM procedures (AX 28, 51)
       1.7.3 Fleet commonality/company standard (AZ 8; AY 2; AX 6, 9, 10, 19, 25, 32, 34, 47, 54)
2. Explicitness
   2.1 Add specific procedure items for items otherwise left unclear or implicit, including calls and their wording (AZ 7; AY 4; AX 12, 17, 24, 25, 42)
   2.2 Add extra material providing rationale, other background, or local specifics (AY 3, 5; AX 47)
   2.3 Clarify unclear terms (AX 29, 34, 51)
   2.4 Inform crew early about operational limits (AZ 2)
3. Efficiency
   3.1 Avoid delays (AZ 4; AX 4, 16, 20)
   3.2 Omit unnecessary items (AZ 6; AX 11, 22, 33, 53)
   3.3 Omit items covered in other ways
       3.3.1 Items covered by other personnel (e.g., dispatch office, cabin crew) (AZ 5; AX 3, 8, 21, 30)
       3.3.2 Items covered elsewhere in OM or operations manual (AZ 3; AX 26, 41, 48)
       3.3.3 Crew's prior knowledge
             3.3.3.1 Knowledge of system (AZ 3; AX 1)
             3.3.3.2 Knowledge of airmanship (AX 27, 28, 31)
4. Regulatory requirement (AY 6; AX 5, 37, 43, 46)
5. Economy
   5.1 Save fuel (AX 7, 50)
   5.2 Use packs rather than auxiliary power unit (AX 40)
Assessing the taxonomic categories
against four factors indicates that the categories have reasonable prima
facie validity.
Taxonomic analysis, such as
categorization of items like activity or rationale, is often simply a one- or
two-step process (see, e.g., Kirwan & Ainsworth, 1992). In taxonomic
analyses involving repeated application of the categorization to large sets of
data, it is possible to validate the taxonomy through measures of interrater
reliability, such as Cohen's Kappa (Lampert & Ervin-Tripp, 1993; Carletta,
1996). Where the data are limited or the research is exploratory, however,
single-analyst categorization is accepted (see, e.g., Levinson, 1992; Wood,
1996). In such cases, the validity of the taxonomy depends on its inherent
plausibility rather than external assessment; the reader as much as the analyst
is entitled to assess independently the correctness and usefulness of the
taxonomy, although some prima facie indices of validity can be
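applied. These indices, adapted from Lampert and Ervin-Tripp (1993), include:
  - whether the categories relate directly to the primary concerns of the domain;
  - whether the categories are clearly defined; and
  - whether the categories are exhaustive, with no miscellaneous categories.
To these indices, I can add a
fourth:
  - whether each category is supported by multiple instances.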
Applying the four indices of
validity to the rationale taxonomy suggests a reasonably high level of a
priori validity. First, the categories relate directly to primary concerns
of aircraft operators and OM authors. Safety, efficiency, and regulatory
requirements clearly cut across both kinds of users. The category of
explicitness is less obvious in its interest, but the category stems directly
from the language of the airlines themselves, and has a clear theoretical basis
in the safe operation of aircraft (see, e.g., Novick & Juillet, 1998).
Airline AX's emphasis on explicitness may be attributable to the airline's
multicultural environment.
Second, the categories are, for
the most part, clearly defined. The cases in which I am least confident involve
categories 2.2 ("Add extra material providing rationale, other background,
or local specifics") and 5.2 ("Use packs rather than APU"). In
the former, this is partly due to the relatively general nature of some of
airline AX's explanations. In the latter, this is due to my limited
understanding of the economics of the relationship between the use of packs and
the APU. Otherwise, the categories are reasonably self-explanatory.
Third, the categories are
exhaustive for this data set. There are no miscellaneous categories.
Fourth, all but eight of the 32
categories are supported by multiple instances of change rationale. Of these
eight, five have a high degree of face validity:
1.2.1 Do not use checklists in the air
1.2.2 Use calls for abnormal rather than normal conditions
1.3 Use spatially based routines
The first three of these cases
consist of single instances of rationale that have, according to the airline,
been applied across many sections of the OM. They represent explicit company
policy.
1.6 Operations (e.g., avoid flameouts from ingesting snow)
This case appears to be
representative of a class of specific changes resulting from experience with
operating conditions.
2.4 Inform crew early about operational limits
This case arises from the
explicit comment of one of the airlines. It is possible that the study did not
find additional instances because the study looked at change rationale rather
than authoring rationale in general. That is, this justification may exist for
many unchanged elements in the original OM, where the text would not have had
to be changed by the airline.
I have less confidence in the
remaining three categories:
1.4 Leave safety systems armed
As a category, this case is
plausible. However, it may represent an incorrect generalization of the
underlying rationale. After all, not all safety systems are always armed.
1.5 Keep most critical pages of system information displayed
5.2 Use packs rather than auxiliary power unit
These are cases where there is
likely to be a more general category (such as "Display salient
information") but where the data did not contain other like instances. As
it is, the categories seem over-specific. It could be the case that the
instance supporting category 5.2 might fall in the "save fuel"
category if, for instance, it turned out that use of the packs was less
expensive than the fuel consumed by the auxiliary power unit.
Overall, then, application of the
four validity factors suggests that the taxonomic categories, particularly for
the top level, are reasonably good.
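As an illustration only, and not part of the study's method, the fourth index can be checked mechanically. The following sketch assumes that the taxonomy is transcribed into a dictionary keyed by category number; the names TAXONOMY, support, and singly_supported are hypothetical. It also assumes that a parent category counts as supported by the instances of its sub-categories, which the text does not state explicitly.

```python
# Category numbers and instance identifiers transcribed from the taxonomy.
# Parent categories with no instances of their own map to an empty list.
TAXONOMY = {
    "1": [],
    "1.1": ["AZ 1", "AX 49"],
    "1.2": ["AY 1", "AX 15"],
    "1.2.1": ["AX 2"],
    "1.2.2": ["AX 36"],
    "1.2.3": ["AX 38", "AX 39", "AX 44"],
    "1.3": ["AX 14"],
    "1.4": ["AX 32"],
    "1.5": ["AX 44"],
    "1.6": ["AX 45"],
    "1.7": [],
    "1.7.1": ["AX 13", "AX 23"],
    "1.7.2": ["AX 28", "AX 51"],
    "1.7.3": ["AZ 8", "AY 2", "AX 6", "AX 9", "AX 10", "AX 19",
              "AX 25", "AX 32", "AX 34", "AX 47", "AX 54"],
    "2": [],
    "2.1": ["AZ 7", "AY 4", "AX 12", "AX 17", "AX 24", "AX 25", "AX 42"],
    "2.2": ["AY 3", "AY 5", "AX 47"],
    "2.3": ["AX 29", "AX 34", "AX 51"],
    "2.4": ["AZ 2"],
    "3": [],
    "3.1": ["AZ 4", "AX 4", "AX 16", "AX 20"],
    "3.2": ["AZ 6", "AX 11", "AX 22", "AX 33", "AX 53"],
    "3.3": [],
    "3.3.1": ["AZ 5", "AX 3", "AX 8", "AX 21", "AX 30"],
    "3.3.2": ["AZ 3", "AX 26", "AX 41", "AX 48"],
    "3.3.3": [],
    "3.3.3.1": ["AZ 3", "AX 1"],
    "3.3.3.2": ["AX 27", "AX 28", "AX 31"],
    "4": ["AY 6", "AX 5", "AX 37", "AX 43", "AX 46"],
    "5": [],
    "5.1": ["AX 7", "AX 50"],
    "5.2": ["AX 40"],
}


def support(category):
    """Return the distinct instances in a category and all its sub-categories."""
    prefix = category + "."
    instances = set()
    for key, items in TAXONOMY.items():
        if key == category or key.startswith(prefix):
            instances.update(items)
    return instances


# Categories supported by exactly one instance of change rationale.
singly_supported = sorted(c for c in TAXONOMY if len(support(c)) == 1)

print(len(TAXONOMY), "categories;", len(singly_supported), "with a single instance")
print(singly_supported)
```

Under these assumptions, the sketch reports 32 categories, of which eight rest on a single instance: the five with high face validity and the three discussed with less confidence.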
Having, at some length, presented
a domain-based example of taxonomy development for understanding requirements
for human-system interfaces, I now turn to an analysis of the common ground and
key differences across domains arising from the change-rationale study. I will
discuss the study's implications for cross-domain application with respect to
methods, taxonomic classifications, and, briefly, use of manuals as a source
for user-oriented design knowledge.
The methods for taxonomy
development and evaluation are not inherently tied to the aviation domain. In
fact, the antecedents for these methods came from other, unrelated domains.
The taxonomy-development method
is domain-independent with respect to its bottom-up/top-down approach, but
requires extended domain knowledge for the groupings and classifications. In
other words, the method is domain-neutral but relies on domain knowledge.
The taxonomy-evaluation method is
also nominally domain neutral, but some criteria are more linked to domain
knowledge than others. Thus the third and fourth criteria (exhaustiveness and
support by multiple instances) are truly domain neutral, because
they measure the absence of "miscellaneous" categories and the number
of items classified into each category. I note, though, that whether an item should
be classified in a particular category depends on domain knowledge. Consider
the study's analysis of the categories supported by only a single example of
change rationale. The analysis uses concepts such as face validity and
experience, and discusses the effects of limitations of the analyst's domain
knowledge on the outcome of the classification.
The other two criteria (relevance to primary concerns and clarity of definition)
can be applied to virtually any
domain but, again, depend on domain knowledge. The extent of this dependence
can be judged, in part, by reviewing the study's analysis, which uses concepts
such as (a) judgments about whether categories relate directly to primary
concerns of aircraft operators, (b) independently obtained knowledge of an
airline's operating environment or culture, and (c) the limitations of the
analyst's domain knowledge.
I suggest that these methods are
thus usable across domains but that confidence in their results depends in
large part on the strength of the analyst's knowledge of the domain.
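Where a second analyst and a larger data set are available, the interrater-reliability measure mentioned earlier, Cohen's Kappa, offers one domain-neutral check on a taxonomy. The following is a minimal sketch under stated assumptions: the function name cohens_kappa is hypothetical, and the two label lists are invented for illustration and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one category per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same category by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: computed from each rater's marginal category frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used one and the same category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Invented labels from two hypothetical analysts (not data from the study).
analyst_1 = ["safety", "safety", "efficiency", "explicitness",
             "economy", "safety", "efficiency", "regulatory"]
analyst_2 = ["safety", "efficiency", "efficiency", "explicitness",
             "economy", "safety", "efficiency", "safety"]

kappa = cohens_kappa(analyst_1, analyst_2)
print(round(kappa, 2))  # about 0.66 for these invented labels
```

The calculation itself requires no domain knowledge. The category assignments that feed it still depend on each analyst's understanding of the domain.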
To what extent did the
rationale-change study produce domain-independent results? By this, I mean to
ask whether the top-level (or even some of the other-level) categories are
sufficiently broad or domain neutral that they could serve a useful
analytical role in the assessment of interface requirements in domains other
than commercial aircraft cockpits.
Some top-level categories do seem
relatively domain independent. These categories include efficiency and,
perhaps, economy. Other categories are still general but could easily not apply
to many domains. These categories include safety and regulatory requirements.
Finally, the category of explicitness seems especially linked to the domain of
manuals for safety-critical systems.
Looking at the sub-categories,
again some appear more domain-independent than others. For example, categories
like "Clarify unclear terms" and "Omit unnecessary items"
would appear to have broad application across many domains. I also can identify
some elements that have a clear, domain-specific nature. Even an apparently
application-independent factor such as consistency across fleets is a
characteristic of the aviation domain. Such characteristics, thus, are likely
to be shared with other domains in which heterogeneity of interfaces and
systems plays a role.
Other elements are so
domain-specific that they appear to be of limited use for purposes of
generalization across domains. Operational conditions such as avoiding
ingestion of snow appear to be in this category. However, this sort of example
may be useful for interface and procedure designers in that it points out that
variations in conditions of use may be so extreme as to lead to systematic
changes in manners of use.
This discussion suggests that
there are further levels of abstraction possible from the change-rationale
taxonomy's top-level categories. Moreover, the differences in domain-dependence
among categories both at the top level and in sub-categories suggest that the
taxonomy contains categories that are related at possibly inconsistent levels.
That is, a better taxonomy might ensure that categories at a given level of
abstraction have similar levels of domain dependence and domain independence.
One other kind of domain
dependence involves the sources for the user requirements that were the focus
of the study: changes to user manuals. At first blush, one might think that the
case of having customers customize their user manuals was so rare that, as a
technique, analysis of change rationale would be limited to a very small set of
domains. While it is true that publishing customized manuals may be infrequent,
there are analogous sources of information in many domains. In particular,
study of individuals' copies of manuals is likely to disclose notes by
the user that indicate things like key items, needs for clarification, and
awkward procedures. Even for a walk-up-and-use kiosk, consider looking at a
kiosk in use to see if anyone has written graffiti that reflects use of or
reaction to the user interface!
This research was funded by
Aérospatiale Aéronautique. I also thank Guy Boy for his helpful comments on
this paper and the airlines contributing to the study for their valuable
assistance.
Boy, G. (1998). Cognitive function
analysis for human-centered automation of safety-critical systems. Proceedings
of CHI 98, Los Angeles, CA, April, 1998, 265-272.
Carletta, J. (1996). Assessing
agreement on classification tasks: The Kappa statistic. Computational
Linguistics 22, 2, 249-254.
Kirwan, B., and Ainsworth, L.
(1992). A guide to task analysis. London: Taylor & Francis.
Lampert, M., and Ervin-Tripp, S.
(1993). Structured coding for the study of language. In Edwards, J., and
Lampert, M. (eds.), Talking data: Transcription and coding in discourse
research. Hillsdale, NJ: Lawrence Erlbaum Associates.
Levinson, S. (1992). Activity
types and language. In Drew, P., and Heritage, J. (eds.), Talk at work.
Cambridge: Cambridge University Press.
Novick, D., and Juillet, J.
(1998). Documentation integrity for safety-critical applications: The COHERE
project. Proceedings of SIGDOC 98, Quebec, September, 1998, 51-57.
Novick, D. (in press). Using the
cognitive walkthrough for operating procedures. Interactions.
Wood, L. (1996). The ethnographic
interview in user-centered work/task analysis. In Wixon, D., and Ramey, J.
(eds.), Field methods casebook for software design. New York: John
Wiley & Sons.