Most models of computer programming explain the programmer's behaviour
by a single design strategy. This paper presents a cognitive architecture
that uses cue-based search to model multiple design strategies,
including procedural, functional, means-end or focal, and opportunistic
design. The model has been implemented in an AI system that generates
Pascal programs from English specifications.
Knowledge is represented as nodes that reside in internal or external
memory, where a node encodes an action that may range from a line
of code or a routine in size. A program is built by linking nodes
through a search cue of the form <role, goal, object>. The
cue is broadcast to memory, and any matching node is returned;
the cue provides a question to ask, and the return provides the
answer. A cue on the newly linked node is then selected as a new
focus, and the search process repeated. Each design strategy defines
a specific node visiting order that traverses the program structure
through its links.
All comments should be sent to R. S. Rist
School of Computing Sciences
University of Technology, Sydney
PO Box 123 Broadway,
Sydney, N. S. W. 2007
Australia
email: rist@socs.uts.edu.au
My thanks to Clayton Lewis for pieces of the puzzle, and to the reviewers for their excellent comments. The model owes a great debt to the work of Donald Norman, Roger Schank, and Bertrand Meyer. This research was supported by the Australian Research Council, grant number AC9130654.
Table of contents1. Procedural structure 22. Plan structure 52.1
Representation 62.2 Algorithm 72.3 Linear structure 83. Cognitive
structure 93.1 Cues, nodes, and search 103.2 Goal 113.3 Object
123.4 Role 144. Cognitive architecture 154.1 Search 154.2 Search
command 164.3 Memory 174.4 Control structure 195. Model simulation
236. Protocol analysis 257. Discussion 307.1 Related work 327.2
Applications 337.3 Summary 34References 36Appendix A: CLOS slot
structure used in Zippy 39A1: Client structure 39A2: Inheritance
structure 39 This paper models program design as a process of
search through the program structure. Some of this structure is
captured in the final program code, but much of the fine structure
is implicit in the design process and is not apparent in the final
artifact. Viewed from a procedural perspective, a program consists
of a set of roles connected by linear order; this is the way a
serial computer views a program. Viewed from a functional perspective,
a program consists of a set of actions connected by data and control
flow dependencies, so they define a non-linear plan that can be
used by a parallel machine. A person sees a program as a set of
nodes connected through their data flow, control flow, or linear
order. Nodes occur at many levels of description, from the most
basic concrete node that corresponds to a line of code, up to
an abstract node that may consists of one or more routines of
arbitrary size.
A node is described or indexed by a tuple of the form <role,
goal, object>; this is the cue for the node. Program design
occurs when a cue is selected, broadcast to memory, and the returned
node linked to the search cue. The search mechanism used to dynamically
link nodes is based on Norman and Bobrow's (1979) model of description,
cue or content based search. Once the data and control flow links
between a set of nodes have been defined, a minimal linear order
can be defined by constraint propagation on the links. Nodes provide
the static structure that is dynamically linked by cues, so program
design is implemented by memory search, and a solution state is
defined by the nodes used in the solution and the cues that link
these nodes.
A design strategy is defined by a start cue, a level, a link type,
and a direction to search. The abstract cue of <find, average,
rainfall> and a direction of backward search through the data
structure, for example, defines the question "Is there a
plan that produces the average rainfall?". The search cue
is broadcast to memory, and a matching target cue is returned.
The search typically returns a node that is linked to the search
cue, and provides a set of new cues, such as <find, sum, rainfall>
and <find, count, rainfall>. A new cue is then selected
and used to search memory, so design proceeds by asking a series
of questions, where the current question depends on the last answer.
If the search fails, then each slot of the cue is used as a separate
search cue to find a matching node structure, and the three returned
structures are merged to define a program node. Knowledge is thus
stored as plan schemas, objects, and roles, combined dynamically
when needed, and the combination stored for later retrieval.
The choice of start cue, level, link type, and direction defines
a design strategy and creates a specific visiting order that traverses
the program structure. Each application of a design strategy creates
a design episode, that finds and connects pieces of the program
structure until there are no places left to search from the initial
point in the given direction in the given dimension; at that point,
a new location, dimension, or direction has to be chosen to continue
design. Procedural design is forward and role based, functional
design is backward and goal based, and object oriented design
is object based. Top-down and bottom-up design traverse the levels
of abstraction in the cues in different directions. This unifying
framework is a gross simplification of the actual design process,
due to the interaction between a search strategy and opportunistic
design, plan creation and retrieval, working memory limitations,
and the structure of the specification and the program.
The process of design can be captured by specifying the type of
questions that are asked, and the knowledge that supplies the
answers. Knowledge can be stored in both internal and external
memory. Internal memory consists of semantic and episodic memory,
and external memory consists of the problem specification, scrap
notes made during design, and the program itself. When a question
is asked, these data sources are searched in turn until an answer
is found, and the answer provides a new piece that is linked to
the existing program structure. The new piece may then be used
to ask the next question, or some other part of the solution may
be chosen as a focus for design. Search continues until there
are no more questions to ask and the solution is complete, or
until an answer cannot be found and the solution cannot be completed.
There is no overall guidance or control; because a design choice
is defined by the question "Where do I go from here?",
all planning is local or situated at the current focus of attention,
and the moving focus of attention defines a trajectory through
the linked structure that is a program. This model of design is
massively redundant, because design continues as needed until
all the vacant slots in the program structure have been filled.
The model has been implemented in an AI system called Zippy, that
allows a user to specify the design strategies to be followed,
the order in which the internal and external memories are searched,
and the content of each memory. The system then generates code
in the order defined by the strategy and knowledge, so the generation
order implied by any particular strategy is completely defined.
The code generation order seen in a human design protocol is easily
matched to the system's generation order, and the system can be
set to reproduce that order. The variability that is characteristic
of human design means that each protocol requires a different
set of parameters, so only a single protocol is described in this
paper.
This paper presents a single cognitive architecture that gives operational definitions of and executable mechanisms for the main program design strategies described in the literature. The first three sections describe the structure of a program: first the procedural or linear structure, then the functional or plan structure, and then the cognitive or node structure of a program. The next three sections show how this structure is used to design a program: a general architecture for program design is presented, simulated, and applied to explain a human design protocol. The last section compares the model to related work in memory search and planning and explores the power of the model.
The procedural or linear structure of a program is defined by
its control flow. More formally, the procedural structure is built
from blocks of code, where each block or unit has a single entry
and a single exit point in the control flow. A single control
flow unit is called a proper (Hartman, 1991), and corresponds
to a unit that encapsulates a sequence, a selection, or an iteration.
The three unit types form the basis of structured programming,
and any program may be built from these three types of blocks.
Most formal analysis of program structure in computer science
(Dijkstra, 1976, Dromey, 1989) has used the control flow to define
a basic unit of analysis.
The standard method for describing the structure of a software
system is a top-down calling chart, that shows the modules in
the system and the control flow between modules. In such a chart,
the control flow links define the order in which modules are called
and executed, so the control flow is explicit in the calling graph.
The modules are executed in a depth-first, left to right order
in the chart, so data flow is implicitly left to right but is
not explicitly shown. Top-down design has been the dominant prescribed
design strategy in the literature for the last twenty years (Dijkstra,
1968; Naur, 1972; Sommerville, 1989; Wirth, 1971; Yourdon &
Constantine, 1979) and has often been observed as a common expert
design strategy (Adelson & Soloway, 1985; Jeffries, Turner,
Polson & Atwood, 1981). Top-down design starts with an abstract
solution that is decomposed and refined until the basic actions
can be implemented in code. Top-down design is reflected in the
calling chart, where the top-level abstract solution corresponds
to the top-level control module, and each level in the chart corresponds
to a more specific version of the solution.
To illustrate the different types of structure used in this paper,
I will use a variant of the classic Noah problem. The problem
statement is: "Write a program that shows the average and
maximum rainfall for a period. The end of the period is signalled
by an input value of -99". A solution program is shown in
Table 1.
1. program Noah (input, output);
2. const endValue = -99;
3. var sum, rain, day, max: integer;
4. average: real;
5. begin
6. sum := 0;
7. max := 0;
8. day := 0;
9. write ('Enter rainfall for day ', (day + 1), ': ');
10. readln (rain);
11. while rain <> endValue do begin
12. day := day + 1;
13. sum := sum + rain;
14. if rain > max
15. then max := rain;
16. write ('Enter rainfall for day ', (day + 1), ': ');
17. readln (rain);
18. end;
19. writeln ('The largest rainfall was ', max);
20. average := 0.0;
21. if day > 0
22. then average := sum/day;
23. writeln ('The average rainfall was ', average);
24. end.
The procedural structure of this program consists of three main
units: the sequence before, within, and after the loop. Each of
these three proper units is easily decomposed in turn, and the
process can be continued until the level of program code is reached.
A unit in a calling chart usually corresponds to a single routine,
so a link corresponds to a routine call and the semantics of the
chart are clean and simple. A chart for the Noah program is shown
in Figure 1, where a link corresponds to a procedural decomposition;
the bottom level of the chart has been omitted for simplicity.
A procedure is easily wrapped around the code in each unit, to
transform the decomposition chart into a strict calling chart.
The order in which actions are executed is directly captured in
the left to right order at any level in the chart. This order
is retained by both the conceptual decomposition (abstract to
concrete) and the physical decomposition (routine calls), so it
is more general to speak of the order of actions, instead of the
calling order; in the code shown in Table 1, for example, there
is a strict order and sequence of propers, but no routines are
called. Each chunk in the chart is based on the role of the code
within that chunk; the example reflects a common procedural decomposition
into the three roles of input, calculate, and output. The procedural
structure of a program is derived from the role structure of the
code, where the roles provide the basic transform for each line
of code. The set of concrete roles for Pascal is shown to the
left of Table 2, and more abstract versions are shown to the right;
there are many more abstract roles than are listed here. In the
Programmer's Apprentice project (Rich & Waters, 1990), a role
is a slot to be filled by some type of data. The usage of the
word "role" in this paper is more based on the role
in a play (Goffman, 1974; Schank & Abelson, 1977) and refers
to a type of slot filler, rather than the actual individual that
fills the slot.
Concrete Abstract
Data flow read input initialize
assign calculate accumulate
write output find
Control flow if test choose
else select
case filter
while loop
repeat
for
Examination of the code in Table 1 and the structure of Figure
1 shows that the structure does not directly correspond to the
set of propers in the code. Looking at the code before the loop,
for example, the proper decomposes into five unit sequences, not
into the initialisations and the read. The abstract roles are
useful descriptive terms for the decomposition, but they do not
correpond exactly to an algorithmic analysis of the code into
propers. A complete analysis into propers is easily done, but
then it becomes very difficult to give labels to each box in the
chart. The reason for this difficulty is that an analysis into
propers does not capture the conceptual chunks in the program;
it captures the executable chunks that reflect the way that the
computer views the code.
The cognitive basis for top-down design is that it allows a problem
to be broken down into parts, so that each of these sub-problems
can be solved in relative isolation. Indeed, top-down design is
often called top-down functional decomposition, to indicate that
the problem is divided into functional units. In the Noah problem
specification, for example, the problem clearly breaks down into
two main sub-problems, to find the average rainfall and to find
the maximum rainfall. Each of these sub-problems can be broken
down in turn, to define a top-down functional decomposition chart;
to find the average, for example, we need to read in a stream
of data, find the sum and count of the data values, and calculate
the average. A partial top-down functional decomposition chart
for the Noah problem is shown in Figure 2.
This chart shows the main conceptual chunks into which the problem
decomposes, but the semantics of the chart are problematic. First,
the lines in the chart no longer indicate control flow, but functional
decomposition. Second, the order of the boxes does not reflect
the linear order of the actions in the code. Third, the functional
chunks do not correspond to the execution chunks (propers) in
the structure chart; indeed, the elements of a functional chunk
may be spread out in the program, as is visible for the sum
chunk. Finally, the order of elements is in part arbitrary; the
sum and count have to be calculated, but either the sum or the
count may be calculated first. The advantages of the structure
chart notation - simple control flow and linear order - are lost
in the top-down functional decomposition. The semantics of the
chart are a mix of functional decomposition and linear order,
an extremely useful combination that has a complex semantics.
The functional decomposition chart captures the rough linear order
of the major pieces within each functional chunk, but loses the
control flow and the total linear order.
Top-down functional decomposition may provide a basis for expert design, but it does not explain novice design. Anderson, Farrell and Sauers (1984) based their GRAPES simulation of novice design on top-down, depth-first flow of control for a set of simple problems. For these problems, analogical mapping from retrieved solutions were successfully used to guide the design of a solution. Kessler and Anderson (1986) presented novice programmers with a solution to one problem, and asked them to modify it to solve a similar problem. The unsuccessful programmers used the linear structure of the solution as a guide, and tried to modify each line of code in turn, from the first to the last, by analogical mapping from the surface structure of the code. Adelson (1984) noted that novices do not possess the abstract knowledge needed to build a top-down solution structure, so novice design in general cannot be explained as search through a (non-existant) top-down chart. These results indicate that the procedural structure of the code, by itself, is not a good representation to support design.
There is considerable evidence in the empirical study of programming
that the plan is the basic cognitive chunk used in program design
and understanding. Exactly what is meant by a program plan, however,
has varied considerably between authors. Two main usages may be
observed in the literature, that may be called top-down and bottom-up
plans. Early accounts of program design emphasized top-down planning,
but attention has now shifted to more bottom-up components of
design. The shift in emphasis and usage is similar to the shift
from scripts (Schank & Abelson, 1977) to memory organization
packets (Schank, 1982). Scripts are large, monolithic structures
that are difficult to modify and index, where MOPs are much smaller
pieces of knowledge that are flexibly combined to form the sequence
of behaviors defined by a script. In the same way, simple plans
are combined to form the long sequence of actions that make up
a serial program.
Top-down plan analysis in programs was pioneered by Soloway, Bonar,
and Ehrlich (1983) who showed that a small change in the code
was enough to change a program from "plan-like" to not
"plan-like". Soloway, Ehrlich, Bonar, and Greenspan
(1984) defined three levels of planning called strategic, tactical,
and implementation plans, where a strategic plan defines a global
strategy, a tactical plan specifies a local strategy, and an implementation
plan specifies the language dependent techniques used to implement
strategic and tactical plans. Spohrer, Soloway and Pope (1985)
analysed the structure of program code into a goal and plan (GAP)
tree by defining a set of abstract sequential or top-down goals
for their sample programs. Using this serial structure, they then
showed that program variation could be described as the choice
of different plans for the same goal.
Bottom-up plan analysis separates the solution structure from
the problem structure. Kant and Newell (1984; Kant, 1985) showed
that design can occur in two main spaces, characterised as algorithm
design space (the solution structure, reflected in the linear
structure of the program) and application domain space (the problem
structure, reflected in the non-linear plan structure of the program).
Design often began with kernel idea in the problem space, that
was then developed by stepwise refinement in the solution space.
Development of the initial idea was uneven when the designer could
not retrieve a known solution, and had to design that part in
more detail. Rist (1986) used a code sorting task to show that
experts group 'related' code on the basis of its non-linear or
plan structure, but novices focus more on the control structure
and on syntactic cues. Letovsky (Letovsky &Soloway, 1986)
showed that programs in which the plans were delocalised were
difficult to understand; a delocalised plan is one in which a
plan's actions are widely separated in the program structure.
Code is often added to a program one plan chunk at a time (Detienne,
1995; Rist, 1990, 1991). Novices are forced to do this because
they code each line as soon as it is designed, so the actions
in the plan appear one at a time, backward from the goal. Experts
can retrieve a complete plan schema as a single chunk, or plan
a solution mentally before writing any code; in these cases, the
code appears in linear order from the first to the last action
(Davies, 1991; Rist, 1989, 1991). For languages in which the plan
structure is clear such as Pascal, code tends to be added in plan
chunks, but not for languages such as Basic where the plan links
are not apparent (Davies, 1990). When a programmer learns a new
language, the plan structure of plans in the first language is
transferred to the new language (Scholtz & Wiedenbeck, 1993),
whether it is appropriate or not.
This section formally and algorithmically defines the plan structure of a program as a set of non-linear plans. Once the plan structure has been defined, the actions in the plans can be placed in a minimal linear order by simple constraint propagation, grouped into chunks that share a role, goal, or object (Rist, 1990), and enclosed in routines. The key to program design is thus the plan structure and not the linear order; plan structure is defined in this section, and the remainder of the paper shows how plan structure is generated during design.
Plan structure shows the backward data and control flow links
between lines of code, that are often defined before the execution
structure of the plan is known. These links define a set of piecewise
dependencies between individual lines of code; when the links
are traced backward from the goal, they define a backward slice
through the program (Horwitz, Prins, and Reps, 1989; Weiser, 1982,
1984; Weiser & Lyle, 1986); the algorithm used in this paper
is given in Rist (1994). The backward dependency structure is
a non-linear plan (Sacerdoti, 1975). Given this plan structure,
it is a simple exercise to order the lines of code by constraint
propagation (Stefik, 1981), so the program can be executed on
a serial computer.
A line of code may be linked to other lines of code in four ways; it may use data from one or more lines, make data for one or more lines, obey another line, or control one or more lines. The external structure of a node is shown in Figure 3.
A line of code is a concrete node in the plan structure, with up to four sets of links and an internal action that generates the program code. A node has four ports, called :use, :make, :obey, and :control; a port is labelled with the CLOS slot notation (:use) to differentiate the port from the general concept (use). The content of each port is the set of nodes that are linked to the current node. Data is passed between nodes through variables, so the :use and :make ports have an entry for each variable used or made by the node. A line can use no, one or more variables, and it can make no, one, or more data values. A line can obey no, or one line; it can control no, one, or many other lines.
The goal of a program is given in the problem specification as
an output, to the user, to a file, or to another progam. The function
of the program is to produce this output, so the functional structure
of the program is defined by the dependency or plan structure
backward from the goal. A complete plan structure for a program
is a backward slice through the program code from an output. The
PARE (Plan Analysis by Reverse Engineering) system produces plan
structure from Pascal code (Rist and Bevemyr, 1991; Rist, 1994);
it is described here to show that "program plan" has
a clear and algorithmic definition that is easily related to the
literature on plan schemas.
From each output, PARE traces back through the data then the control
flow links for that output until it finds a set of terminals:
constants, read statements, or an assignment that uses only literals.
In this way, links are found and stored from each line of code
to the lines that produce data for it, or control it, for every
line that is used in the program. When a routine is found, both
the plan structure in the calling module and the structure within
the routine must be built. The plan structure in the calling module
depends on the type of parameter passed. A call by value uses
a value, but makes nothing; it has no output data flow. A call
by reference uses a value, and makes a value; it has data flow
both into and out of the calling parameters. The plan structure
for a routine is built by tracing back from its outputs, then
the routine is unfolded (Burstall & Darlington, 1977; Feather,
1982) and its plan structure connected to the caller's structure.
Recursive routines are unfolded once.
The product of the tracing process is a directed graph. Joins
are common, when two lines obey the same line, or use the same
data value. When a join occurs, the tracing process terminates
at that node, since all further links from that node are already
defined, and recursively resumes at the last missing link. If
a program has several outputs, the tracing process is repeated
for each goal. The complete plan structure is a directed cyclic
graph with multiple roots.
The Noah program introduced in the previous section has two goals, two values that must be output: the average and maximum rainfall values. The rest of the program is required purely to create and display these two values. The lines of code that define the plan for each goal are found by tracing back from the outputs along first the data, and then the control flow links. The resulting plan structures are shown in Figures 4 (average) and 5 (maximum); in the figure, single arrows indicate data flow, and double arrows indicate control flow. Plan tracing begins with the last write statement, in this case for the average value, and continues back through first the data flow, and then the control flow links. Joins to the plan structure (both to other lines and self-reference) are omitted in the figures for clarity.
The plan structure for the maximum plan is shown in Figure 5. The maximum plan accepts one data flow (a stream of rainfall) and produces one data flow (the maximum rainfall). Because much of the loop control and data structure was already built when tracing the average plan, the maximum plan structure does not include them; the joins from the data and control flow are indicated in the figure. The plan structure separates the plan cleanly and automatically from its context.
The PARE algorithm defines a specific order of link traversal (data then control, recursively), so the links shown in the figure are precisely those created from the algorithm, minus the joins. Readers familiar with the program schema literature will recognise the count, sum, max, and read loop plans. Each plan is a single branch of the plan structure, and the plan actions are spread out or delocalised in the execution structure. The non-linear nature of program plans, and their difference from the execution structure of the program, may be seen by examining the three initialisations (lines 6-8). These lines are separate in the plan structure, and even come from different plans, but are adjacent in the serial execution structure. Plan structure defines the essential dependencies and is used in design, but is hidden in the final execution structure.
A plan is converted to executable code by placing the plan actions
in a serial execution order by a process of constraint propagation
on the dependency structure. Plan dependencies impose a necessary
order on the plan actions, but they usually do not impose a single
order; because the dependencies are piecewise, there are often
many possible orders that obey the piecewise constraints.
There are two constraint rules for Pascal programs, each with
a single exception. For data flow, a producer precedes the user
of the data. The exception is a while loop with two producers,
one before the loop and one in the loop; in that case, the producer
inside the loop is placed after all its users. For control flow,
a line that obeys another is placed after and in the scope of
its master. The exception is the repeat loop, where the
control condition is at the bottom of the loop, below the lines
it controls. One general, additional rule is required when a data
value is destroyed: the value must be used before it is destroyed.
The number of ordering rules may increase as more Pascal constructs
and plans are tested, but these three rules have been sufficient
to order the code in programs built from input, output, assignment,
if, for, while, repeat, simple data types (integer, real, character,
and boolean), and arrays, records and files. Sets and pointers
are more complex data structures, but they obey the same data
and control flow order constraints.
Plan structure defines the functional structure of a program. Given the non-linear plan structure, the code can be ordered by constraint propagation to define a serial order. The exact serial order is determined by the necessary order of the code, and an additional set of decisions that impose an organisation on the 'free' code. When code is added to an evolving solution, the plan structure is often generated first, and then the actions are ordered and executed. The process of code generation is complex, however, because many different design strategies may be used within a single solution attempt. A model that generates plan structure by linking nodes is described in the remainder of the paper; design within a node is discussed in Rist (1989).
A cognitive structure for program design is outlined in this section,
based on the idea of description, content, or cue-based search
(Norman, 1981; Norman & Bobrow, 1979; Williams & Hollan,
1981). In cue-based search, a description or cue is formed and
used to search for information that matches the cue. An external
source of information may be searched with the cue, such as the
specification or the program, or an internal source may be searched,
such as working or long term memory. A description may be more
or less precise. If the cue finds its target, that information
is returned by the search and can be used immediately. If not,
the cue is modified (Kolodner, 1984; Williams, 1984) and the new,
refined cue is broadcast. The process is repeated until the target
information is found.
An executable model of cue-based search through nodes has been
implemented in an AI system called Zippy (Griffith, 1980). In
the system, a search cue of the form <role, goal, object>
is broadcast and matched to a target cue on the appropriate port
of a node; the match is called the return from the search.
A search cue defines a question to be asked and the return provides
the answer. A node is returned from a search, linked to the search
cue, and cues on the ports of the new node can be used to continue
search. The addition of a node to the solution creates a new state
of the solution and the order that nodes are added defines a path
through a problem behaviour graph (Waterman & Newell, 1971).
Design can thus be viewed as search through a state space, where
a state is a partial solution composed from nodes and cues. Nodes
define the static structure or knowledge used during design, and
cues provide the dynamic mechanism used to link nodes and generate
a complete program.
Because the first return for a cue is accepted and used immediately,
there is no provision in the current model for generating and
comparing different choices; a single solution is gradually built
up during design, by adding new nodes or new information on existing
nodes. Variability in the current model is caused by different
knowledge and design strategies between runs, not by choosing
between several competing returns; Zippy is a satisficing (Simon,
1969), not an optimising model of design. Cues and nodes are implemented
in Zippy as CLOS objects. The basic design process is slot-filling
with a three-valued logic: a slot value is initially unknown,
then it is filled by an object or by the special value none.
Design continues until all of the needed slots have been filled,
so the basic design process is massively redundant; search continues
in any order until the slots have been filled. The exact order
in which slots are filled depends on the expertise and search
strategy of the designer, plus the form and content of the program
specification.
This section presents the cognitive structures that underlie and generate program code. The basic representation of cues and nodes is defined in the first section, then the goal, object, plan, and role structures are illustrated.
A cue is a tuple of three keys of the form <role, goal,
object>, that forms an index to memory. A role cue has
only the role slot defined, such as <read, -, ->, <find,
-, ->, or <test, -, ->. A goal cue has only the
goal slot defined, such as <-, average, ->, <-, maximum,
->, or <-, volume, ->; similarly, an object cue
has only the object slot defined, such as <-, -, rain>,
<-, -, password>, or <-, -, house>. A plan cue
has both the goal and object slots filled, such as <-, average,
rain>, or <-, volume, house>. When a cue is used as a
variable in the program, it is given a name and a type.
Zippy follows the usual heuristic used by programmers to derive
the variable name from the goal key ("volume"), the
object key ("prism"), or a combination of the two ("volPrism").
The cue and node representation is shown in Figure 6 for a single
level of abstraction; the full CLOS class and slot structure used
in Zippy is shown in Appendix A. A role cue is used to search
for a role structure, a goal cue is used to search for a goal
structure, and an object cue is used to find and link an object
structure; the goal and object structures define general knowledge
about goals and objects. A plan cue is used to find and link specific
knowledge about a goal and object combined, such as the formula
to find the volume of a sphere. The program structure is built
by linking the node returned from plan search, or by merging the
nodes returned from goal and object search to define the plan
node, and then adding role structure.
Compound objects such as <-, maximum, <-, count, rain>>
are decomposed through their goal and object slots during design.
The slots in a cue can be combined to define compound cues; the
three slots define seven possible indices. Compound objects and
compound cues can be easily decomposed into simpler parts, so
problem decomposition is automatic and simple; it is solution
recomposition that is hard. Each type of cue defines a dimension
of search (Rist, 1994): search through the role, goal, and object
structures. The goal and object keys generate the plan structure
of a program, and provide the basis for functional design. The
role key is used to identify the mechanism used in part of a plan,
and provides the basis for procedural design.
A link consists of a set of six cues that point to the
role, goal, object, plan, all, and episodic nodes. A cue is a
multi-level structure, so each cue points to any cues that are
more or less specific descriptions (:up and :down), any nodes
on either side of it in the plan structure (:forward and :backward),
and any nodes on either side of it in the linear structure (:before
and :after). A multi-level link on a data port corresponds to
a program variable; Figure 6 shows a link for a single variable,
at a single level of abstraction, for a single link type. In the
plan structure, the node above a cue provides the definition for
that cue, where the node below it links the cue to where it is
used. Goal, object, and plan structure connect the nodes into
a set of chains, while the role structure is simply applied to
the current node. The final program code corresponds to the episodic
links between nodes.
A node has four ports, that capture the data flow (:use,
:make) and the control flow (:obey, :control) of that node, and
a port contains one or more cues. The :make port contains
the cue that indexes the node during plan retrieval. An abstract
node has the cues on its ports linked to internal data and
control flow, and a list of the nodes within its boundaries in
linear order. A concrete node has its ports, an operator,
and an internal expression defined. The operator and a
numeric (data) or boolean (control) expression are used to produce
the text of the program code, although several nodes may be merged
into a single line of code. A node is most simply equated with
a single line of code, but it can equally describe an arbitrarily
large chunk of code. This flexibility is made possible by the
explicit separation of the plan from the linear structure of code,
and separating the external from the internal structure of a node.
One powerful result of this chunking is that an abstract node
can be used as a unit in planning even when the planner has little
idea about how to implement its internal details.
There are four main types of search that connect cues and nodes. The simplest form of search fills traverses all the cues in a link to create a complete link at a single level. The second type of search examines one type of structure at a time, so the goal, object or plan structure may be searched separately, or a single part of each structure may be combined through the links. The third type of search links nodes through their data flow, control flow, or linear order. These three varieties of search all create structure at a single level of abstraction; the fourth type of search changes the level of description up or down a level, to use an abstract cue or add a new level of detail to the existing structure.
The goal structure stores knowledge about how to achieve the goal,
abstracted over the set of possible objects. A plan schema
is a plan in which the object has been abstracted, so it is a
goal structure. More formally, a plan schema is an abstract node
indexed by a goal cue, with an internal plan structure and a stored
linear order. The terminals of the schema are the cues in the
abstract enclosing node, so the schema can be retrieved by its
goal, the :use cues located, and the schema actions coded in forward
linear order.
The goal structure for the average plan is shown in Figure 7,
as two abstract and one concrete levels of structure; since this
is a goal structure, the object key in each cue is vacant. The
first level of abstraction consists of the read loop and average
nodes. For the average, the second level of abstraction consists
of three nodes, the sum and count schemas, plus a schema for the
two cases when the count equals zero and when it does not. The
internal, concrete nodes for each of these abstract nodes are
shown to the right of the figure, first the read loop, then the
sum and count schemas, and then the guarded average calculation
schema.
The focus of the sum plan is the node that adds a new data value
to the existing sum. There are three inputs to this node, the
object to add, the initial value of the sum, and the accumulated
value of the sum. Only a single cue for the sum values is shown
at the top of the sum node, because the two values have the same
goal and object but different roles; the initialisation is <init,
sum, ->, and the accumulation is <acc, sum, ->; roles
have been omitted from the figure for simplicity. The use of the
initial and accumulated values are internal to the sum plan, so
those cues do not appear on the abstract node; a similar situation
occurs for the count plan.
In this example, there are three levels of abstraction for the average goal structure. The object structure is trivial, because the object is a real number such as the daily rainfall. When the complete average plan schema is returned and merged with the object structure, the plan structure is defined at all levels of abstraction, so no further search is required and the program code can immediately be written. If a plan schema is not known, then further search at a more concrete level is required to locate and link the complete plan structure.
An object cue is broadcast, matched to a target cue, and the returned
node defines the object structure. The goal and object structures
are merged to generate the program plan structure, and the plan
and role structures are merged to define the program structure.
Two examples of object structure are discussed in this section,
one to show how an object is defined at many levels of abstraction,
and the other to show how an embedded object is handled in the
model.
Consider a specification that asks the programmer to find the
volume of a house, where the house consists of a single room with
an attic on top; the room is a box shape and the attic is a triangular
prism. The program has to read in the length, height, and depth
of the objects, calculate the two volumes, and add them together.
The volume calculations are simple. The object structure is quite
complex, however, due to the fact that the house splits into two
objects so that the length and depth of the house, room and attic
are the same, but their heights are different.
The goal, object, and plan structures to find the volume for a prism (the attic) are shown in Table 3; the role key has been omitted for simplicity, because all roles are "calculate". Small circles represent a cue, and the number in each circle indicates the cue's link. The goal structure provides a set of cues that are matched through their links to the corresponding cues in the object. At the abstract level, the plan requires a height, length, and depth that are exactly matched to the object cues. At the concrete level, the "area" goal requires a "front" surface, that is matched to the concrete object "triangle" in the object structure; the goal cues essentially filter the object cues to define the plan structure. The plan structure may be found by search if it is indexed by the plan cue, or it may be derived from the goal and object structures by unifying the cues and merging the nodes. If the area of a triangle was not known, then it would be defined (incorrectly) by the general formula supplied in the goal structure.
The goal structure represents the general knowledge that a volume
is calculated from the dimensions of the object, and more specifically
that a volume is the product of the area and the depth, and the
area is the product of the height and length. The object structure
represents the general knowledge that a prism is a three dimensional
object, and more specifically it is a triangle with depth. The
derived plan structure is created by merging the nodes and cues
in the goal and object structures. In this case the derived structure
can be used to generate code (that may or may not be correct),
but in other cases a plan search is needed to find a formula.
Internal cues within a returned node, such as <area, ->
and <- , triangle> may have their goal, object and plan
structures already stored with the cue, in which case no further
search is needed within each structure, and the structures can
be merged at each level. Alternately, the cues may not be linked
to a node; in this case, further search is needed to fill the
slots in the goal and object structures. The knowledge structures
are easily tailored to define different knowledge, and different
degrees of cohesion (stored links) of this knowledge.
A larger view of the plan structure is shown in Figure 8, where
the object structure splits the object "house"
into two parts, the room and the attic; the concrete node that
merges the volumes is shown at the bottom of node 1 in the figure.
The room is a box, so search using that object key returns the
object structure for a box, that is merged with the known goal
structure for the volume; the detailed story for the prism was
given above. Each object has an internal structure, so all the
boundary cues for the box, rectangle, prism, triangle, and house
must be unified in the final structure. As always, this knowledge
may be already stored in the node returned from semantic memory,
or found by a process of search and unification.
The length is described at different levels of abstraction as
the length of the house, the box, and the rectangle. It is also
the length of the prism and the triangle. The room is a box, and
the attic is a prism. These part-of and synonym relations are
captured by the multiple levels of cue, where each level may have
a different key, and synonyms are stored at each level of abstraction;
each cue points to the node at that level of abstraction. The
data flow is a single data value when the program is run, that
is described and used in planning at three levels of abstraction.
A compound cue may have embedded goals and objects, in which case a search through the object structure may be required. Consider the specification goal to find the longest sunny spell in a period of days, where a sunny day is one with zero rainfall. The cue derived from the specification has the content <longest, sunny spell>. A sunny spell is a count of sunny days, so the object slot has to be decomposed into the cue <count, sunny day>. A sunny day is one with no rainfall, so this cue in turn has to be decomposed into the cue <sunny, day>. This cue can finally be matched to the definition given in the specification. This process of decomposition and search through the object structure is shown in Figure 9.
Decomposition through the object structure allows a programmer to identify the basic objects that are needed, with no analysis of the goal or plan structure. It identifies the basic data that is needed to achieve the goal of the problem, without requiring the designer to have a plan for how that data is used in the solution. Once the input data has been defined, it can be used as a focus for detailed, forward design.
The role provides the operator for a program node, so it is most
simply thought of as an attribute of the node. However, it can
exist in a search cue and be used to search for the matching type
of node, so the role key exists independent of and before a node
has been linked. The role and its operator define the type of
data or control transform for the node, and the transform in turn
creates the behaviour for the node.
A role may be combined with a goal or object structure; a read
loop, for example, is indexed by the cue <read, stream, ->.
The code, node structure, and cue contents for a read loop are
shown in Figure 10 at two levels of abstraction, one for the read
loop schema and one for the concrete nodes. In the cues shown
at the right of the figure, both abstract (first, next) and concrete
(read) roles are shown. In this example, the role and goal form
the index for the read loop schema, so the object slot is vacant
in all the cues.
Roles cut across the plan structure, because the same role can
occur in many plans. For this reason, they are very general; essentially,
a role is a construct that is independent of, and can thus be
be applied to, any plan and object. The role structure seems to
have been the main focus of interest in automatic programming,
with its emphasis on mechanism, procedural structure, and behaviour.
The Programmer's Apprentice project, for example, is concerned
with the interaction between roles and objects and their effect
on behaviour. In this paper, search through the plan structure
is the focus of interest; roles are terminals in the node structure,
and can be considered after the plan structure has been built.
The complexity in a plan structure may come from the role, goal or object structure. For an average, the object is simple but the goal structure is complex; for a volume, the goal structure is simple but the object is complex; for a count, the plan and object are simple but the roles are complex. Role, goal and object keys index general knowledge, that is retrieved and combined as needed. A plan cue combines the goal and object keys, to locate a specific plan for this combination. If no specific plan is returned, then the general knowledge is used to decompose the task and create a plan.
The basic mechanism of design is to start at a cue and link nodes from there, in the order defined by the design strategy. A design strategy defines a starting cue, a direction, a level, and a link type to explore from the current focus, that determines which slot to fill next, and thereby defines the next state of the solution. Any decision about the next link to pursue is local to the current node; there is no remote knowledge and no supervising controller. Search continues until there is no next step from the current state, when a new focus or strategy has to be chosen. When all the desired slots are filled, the solution is complete. The design strategies currently implemented in Zippy are procedural, functional, focal, and opportunistic design.
The three keys in a cue define eight possible search cues or link
types, shown in Table 4. The node structure is built by linking
or merging the returned nodes. The link structure of the cue fills
up as more complex structures are built, from one to two to three
keys. After a link is returned or built, it is merged with the
existing program structure, so the program structure is the final
repository of all the merged node structures; it is a single,
multi-level structure. In the current implementation of the model,
only the role, goal, object, plan, and complete links are used.
cue example name return
<-, -, -> <-, -, -> node
<r, -, -> <read, -, -> role cue role structure
<-, g, -> <-, average, -> goal cue goal structure
<-, -, o> <-, -, house> object cue object structure
<-, g, o> <-, vol, house> plan cue plan structure
<r, g, -> <read, stream, -> node
<r, -, o> <read, -, rain> node
<r, g, o> <init, count, rain> complete cue node
In the model, search is conceptually parallel, self-terminating, and based on activation levels (Anderson, 1983; Sternberg, 1966). This parallel search is approximated by linear search, where each target cue is tested in turn, until an exact match has been found or the memory has been searched. Human memory search can fail, in which case a great deal of time and energy has to be spent finding an index that will match; this is the exception, however, not the rule. In the model, backward search always succeeds eventually; if it fails, then a solution cannot be found. Forward and linear search can, and often does, fail.
Search is controlled by the current cue and a search command.
The current cue defines the current location in the node structure,
and the search command is applied to the current cue to define
the next current cue. Ths process repeats until there are no new
nodes to add and search terminates. The same strategy or command
may be used throughout design, or a series of different strategies
may be used.
A search command is built from two parts, that can be called
the plan and act stages: the plan part defines what is meant by
"next", and the act part defines what is done to the
next node. A search command specifies a design strategy by listing
the direction to search, the type of link to pursue, and the level
of search. Search may proceed from a cue through the goal, object,
or plan structure, or up or down the abstraction structure. Search
may proceed from a node through the data flow, the control flow,
or the linear order. The planning slots in a search command and
their possible values are shown in Table 5, and the action slots
are shown in Table 6.
Direction Flow Link Level
forward data role top
backward control goal down
linear object up
plan bottom
Search is directed from the current focus through the choice of
the next slot to fill. A port is selected by the search direction
and flow, and the first vacant link at that level in a cue in
that port becomes the next focus of the search. The search command
is then applied to find and link a new node, and define a new
focus. Search continues along the chain of cues until a set of
filled cues or terminals are found; a chain of searches with the
same start cue and command is called a search episode.
At the end of an episode, the designer has come to the end of
a chain of thought and must start a new chain. The same command
and focus can then be applied to generate another branch of the
plan. For a single program goal and search command, the complete
plan structure can be generated by a series of episodes. When
there is nothing left to explore in the current plan, then a new
beginning must be made to start a new search. The current focus
may be used with a new command, the same command may be applied
from a new focus, or both focus and command may be changed; the
most common choice is to select a new focus and retain the same
design strategy. To find a new focus, the data sources can be
scanned in some order; when there is no new focus at all, the
program is complete.
The overall behaviour seen during search can thus be broken into
three stages. First and simplest, a chain of nodes are linked
until the end of the chain is reached, so there is a cycle through
the nodes in the chain. Second, another branch of the plan structure
is chosen and chained, so there is a cycle through branches of
the graph. Third, complete plan graphs within a node are added,
then a new focus is selected and a new node added, so there is
a cycle through goals in the program.
Object Action
cue write
node code
The action is either a note written on scrap paper, or code that is added to the program. The name of a cue can be written on scrap paper, or coded as a variable declaration; in the same way an expression in a concrete node may be written on paper, or coded.
Both external and internal sources of data may be searched with
a cue. External data sources are the specification, notes written
on scrap paper, or the program itself; these are implemented in
Zippy as specification, scrap, and program windows, where the
specification is a read-only window. Internal data sources are
working memory, episodic memory, and semantic memory. Internal
memory consists of cues and nodes, and is indexed directly by
a cue; external memory (Norman, 1986) provides a key, such as
a word in the specification or a variable name in the program,
that has to be matched to a cue before it can be used. Usually,
the specification contains the major goals and the input values
and semantic memory provides plans for the cues mentioned in the
specification.
Memory can be searched to find a target, or scanned to find a
search cue; the search and scan orders are shown in Table 7. If
there is a search cue, then the various memories are searched
for a target; this occurs during the first two stages of design
to generate a chain and a graph from an existing focus. Backward
search returns a plan, so semantic memory is the major data source
for initial planning. Forward search to find a use for the current
cue does not examine semantic memory, because there are lots of
ways to use something, nearly all of which are irrelevant to the
current goal. If there is no search cue, then the memories are
scanned to find a search cue, that is returned; this occurs when
a node is complete and a new focus is sought.
search order :make: working, episodic, semantic, scrap, spec, program
search order :use: working, episodic, scrap, spec, program
scan order: working, scrap, spec, program
A data source is scanned in linear order, either forward or backward,
from the start, the end, or the current focus. The number of data
sources can be easily extended to include program listings, so
a programmer may recall that a solution exists in a listing, find
the listing, and then use the coded solution as a plan for the
current problem (Visser, 1987).
When the specification is read, each phrase is translated by a
simple process of keyword matching to identify the roal, goal,
object, or expression in the phrase, and create a node from the
pieces. Because the output representation is simple and the specifications
are written in simple, active, and declarative phrases, a sophisticated
NLP system has not yet been required. The specification could
be completely read and then one node selected as an initial focus
for search (the current strategy), or a node from the specification
can be used as a search cue as soon as it is created.
When a search cue is selected, it is placed at the front of episodic
memory and any returned nodes are attached to that current search
cue; working memory is the first seven nodes of episodic memory.
As each new cue is added to working memory, an old cue drops off
the end of working memory. If the intermediate results of an extended
search are not dumped to scrap paper or the program code, they
will be lost from working memory (Anderson and Jeffries, 1985).
A node is a persistent object that is built in memory during design,
so it can be indexed and retrieved from episodic memory at any
time. Episodic memory is implemented as a link between the current
and the next search cue, so there is no separate type of memory,
just another way of linking nodes.
A cue is placed at the front of episodic memory when it is broadcast,
and any matching target is placed at the front of semantic memory
when it is matched (activated). A serial search of semantic memory
thus roughly corresponds to a search based on the level of activation
(Anderson, 1983). It is interesting to speculate that the apparent
extended working memory shown by experts (Chase and Ericsson,
1981) can be explained by the larger schemas possessed by experts.
Since working memory is used only for search and not for merging,
the effective memory limit increases as the size of the returned
nodes increases, with no change in the number of search cues stored
on working memory.
Because nodes and cues can be stored in external memory, they
have pointers to the relevant text in the specification, scrap,
or code windows. These pointers provide a location in the window,
so a search may be directed forward or backward from that location.
If there is no current search cue, then the text can be used to
generate or index a node in internal memory. This part of the
model implements the parsing-gnisrap model of coding proposed
by Green, Bellamy, and Parker (1987), where external memory is
used to store the results of planning. Unlike that model, the
text is used here as an index to the plan structure already built
in memory, so the plan is retrieved instead of rebuilt.
A search strategy is defined by choosing the next slot in the next cue. The cue is broadcast, and the search returns a node that is linked to the search cue, and provides a new set of cues on its ports. A new cue is chosen, and search continues until nothing is returned from a search, at which point a new focus has to be found. The links between cues and nodes are shown in Figure 11. Given the current cue, search may occur through the goal, object, or plan structure at that level, or a new level may be chosen. The next cue may be chosen in the current node within the same port, in the current node in another port, or in the next node in plan or linear order. The notion of location is fundamental to Zippy, because the location defines the knowledge that can be accessed from there. The location is defined by the current cue, and search from the current cue defines the behaviour of Zippy: from here, there are lots of places to go next, and the choice about where to go next defines the new location.
The search strategy defines the order in which nodes are added to the evolving solution. Three main search or design strategies are reported in the literature and captured in the model, called procedural, functional, and focal design. Procedural design is based on the role and linear structure of the program, and design starts at the inputs and adds new nodes forward and down in linear order; the linear order is defined by the specification or by constraint propagation. Functional design is based on the plan structure of the program, and design starts with the goals and adds new nodes backward and up in plan structure order. Focal design is based on the separate goal and object structure of the program, and design starts with the most basic goal and object and adds nodes before and after this focus.
The power of the control structure comes from the ability to specify
and change the values in the search command. There are two main
components to this power. First, a single command slot can be
changed and the others retained, so a new strategy is easy to
define. Second, a series of search commands can capture the distinction
between batched and interleaved processing.
Batch processing means "do this one thing to all";
hold the slot, and vary over nodes. Interleaved processing
means "do everything to this one"; hold the node, and
vary over slots. These are actually the extremes of a continuum,
because any number of slots can be attempted within a single node.
In batch processing, design occurs in a series of passes over
the nodes; an example is to declare all the variables, then write
all the code. In interleaved processing, design occurs in a single
pass in which all the slots are filled in each node; an example
is to code one line and declare its variables, then process the
next line, and so on. A slot in a search command may contain many
slot values, and a design strategy may use many search commands.
If multiple slot values are specified, then each slot is filled
in turn for the current node; this is how interleaved processing
is implemented. A series of search commands can be listed that
differ in a single slot value; this is how batch processing is
implemented. Switching to a new focus starts the list from the
first command.
The key to understanding how this control structure generates
behaviour during design is to look at the interaction between
forward and backward search at different levels of abstraction.
Given a command to do abstract backward search, Zippy moves the
current focus backward through the abstract plan structure. Each
node provides a new cue, so the current cue moves from the end
to the front of the plan structure. Backward search always succeeds,
so at some point a terminal will be reached, and design shifts
to forward search down one level. From the current (terminal)
cue, the cue moves down level and then forward search occurs from
the :use port within that node. If forward search succeeds, then
that node can be coded and its :make port defines the new focus.
If forward search fails, then backward search is resumed from
the :make port of the abstract node.
Switching from forward to backward at the next level down creates
a fractal pattern of expansion, as each level is traced backward,
the first internal node is selected, and that node is then searched
forward or expanded. If code is written when the concrete structure
is generated, then the observable behaviour this strategy generates
is overall forward coding in plan chunks, but backward coding
within a chunk: design is forward globally, but everywhere backward
locally! If a schema is retrieved, then it can be coded forward
through the stored linear structure, and design will be generally
forward with some chunks showing forward, and some backward, generation
order; this was the pattern of coding reported in Rist (1989,
1991).
The level of search can be specified as the top level, the bottom
level, or up or down from a current level. If a cue has no current
level, then a command of "up" starts at the first
up level, the concrete level; similarly, the first command "down"
from any empty cue starts search at the most abstract level. The
nicest thing about this convention is that the basic search strategies
are literally forward top down and backward bottom up. Several
search strategies are outlined in Table 8, first with the fractal
pattern discussed above, then the typical novice and expert design
strategies, and then focal design. In the fractal example, details
about the link types and the action have been omitted so the main
pattern is more easily seen. In the table, slot values that are
interleaved are shown across the page, and batched slot values
are shown down the page. Interleaved values are abbreviated in
the table; the reader should examine Tables 6 and 7 for the full
names. An initial empty slot in the table means that slot is not
used, and a subsequent empty slot retains the previous value.
The typical novice design strategy is to code forward from the
first action to the last by tracing through first the data, and
then the control links. Each variable is declared as soon as it
is encountered in a line of code, then the line of code is added
to the program. This strategy fails when a concrete node cannot
be found that uses the current cue, so the first enclosed node
in the plan structure is then traced backward from its :make port.
Novices seem to focus on the role structure when designing, in
two ways. First, novices use the role as a search cue; they try
to read everything, then calculate everything, and then output
everything (Rist, 1986). Second, they try to fill the role slot
in a cue immediately, so they search for the matching node that
is indexed by all three keys. This is the smallest and most concrete
node in the program structure, because the more general role,
goal, object, and plan structures are not considered.
Direction Flow Link Level Object Action
Fractal: backward top
forward down
Novice: forward l, d, c r, g, o, p bottom c, n code
backward top
Expert: backward l, d, c g, o, p, r top
forward down node write
forward linear cue code
node
Focal: backward data g, o down cue
forward d, c p node code
The most common expert design strategy described in the literature
is forward, top-down, breadth-first design, followed by a single
pass to code the complete solution. The linear structure allows
the internal nodes to be listed in linear order and expanded in
that order; for this to happen, the linear order must be available.
There is evidence (Guindon, Krasner and Curtis, 1987; Parnas and
Clements, 1986) that this strategy is reported more often than
it occurs. An interesting result of the model is that the expert
search commands will create markedly different behaviours for
novice and expert programmers. Experts will show forward, breadth-first
design because their retrieved knowledge (node) has an internal
structure, but forward design often fails for novices because
they cannot retrieve schemas.
The idea of a program focus recurs in programming. In program
design, it is called a key (Kant and Newell, 1984) or focus (Rist,
1989), in program understanding it is called a beacon (Brooks,
1977) or key (Harandi & Ning, 1990). Expertise is needed to
define the focus of a plan a priori, but experts select the same
line as the focus of a program (Wiedenbeck, 1986) when asked to
choose an action. Focal design (Rist, 1989) occurs when a problem
is decomposed into the simplest and most basic action and object
that defines the focus of the solution, and then the rest of the
solution is built around the focus. Essentially, the focus is
where you break out of theory into action, out of the abstract
into the concrete level of design. This strategy is implemented
by search through the goal structure to find the operator that
achieves the simplest goal, followed by search through the object
structure to define the objects for that operator. The focal line
of a program is the line that contains the focal operator and
objects. For the average plan, the focal operator is divide;
for the sum and count plans, it is add; for the maximum
plan, it is the greater than operator. Focal design is
similar to means-ends analysis (Newell and Simon, 1972) in which
the most important operator is chosen first during planning; both
methods begin design with a key operator that is located in the
middle of the linear plan, and build the plan outward from the
focus.
Care must be taken to separate four concepts that define the "most
important" line in a plan: the basic action, the basic object,
the basic result, and the beacon. The lines that correspond to
each of these definitions of important are shown in Table 9 for
a set of examples; in the table, the focal line contains the focal
action and object.
plan action object focus result beacon
average / sum sum / count sum / count sum / count
sum + sum sum + num sum + num sum + num
maximum > max if this > max max := this ?
bubble sort > next if this > next next := this swap
insertion sort > new if this > new here := this move down
selection sort > max if this > max max := this ?
The focal line is the line that contains the focal operator and
object. The focal or basic operator implements the goal and uses
the focal or basic object. A sort routine places the elements
in a list in sorted order, where each element is greater than
(or equal) to the previous element, so for a sort the focal operator
is the test. The objects used by this operator define different
types of sort routine. If the objects are adjacent elements, then
a bubble sort is defined. If the objects are the largest element
and an element of the list, then a selection sort is defined.
If the objects are a new element, and an element of the list,
then an insertion sort is defined. Looking at the insertion sort,
the line of code that places the new number in its correct place
is the key line in the final plan structure, because all the other
lines are added to support it; however, it is not the focal line
in the design of the plan.
A beacon is used to identify a plan, such as the swap in a bubble
sort. The beacon is thus a "most differentiating" line,
that separates out one plan from its competitors, and is thus
central in recognition. In a particular solution, of course, these
four concepts may all map onto the same line, but this happy coincidence
is not necessary.
Opportunistic design is implemented in the model as a deviation
from the dominant search strategy (Hayes-Roth & Hayes-Roth,
1979) caused by an unexpected vacant slot, when the vacant slot
becomes a new focus for design. A common opportunistic pattern
is forward design through the plan or linear structure, that is
interrupted when a vacant :use slot is encountered. At that point,
forward design is replaced by backward design to find a plan that
:makes the vacant cue. The missing slot is often a plan link,
but opportunistic design can also occur within the goal and object
links.
Situated design is inherent in the structure of the model, because
the notions of "current" and "next" create
the model's behaviour. Search from the current location is seen
at five levels of detail, each further from the current cue. Within
a cue, the levels of abstraction provide a set of nodes to explore.
Within a link, a goal structure may supply an abstract cue (such
as "front") that is used to search the linked object
(and find "triangle"). Within a node, a use port examines
the make port of its node for an object definition, before it
searches further afield. Within a data source, search proceeds
from the current location in the source until an answer is found
to the current question (a target is returned that matches the
search cue). Finally, a data source can be scanned to find a new
question to ask and thereby start a design episode. Data from
internal and external memory are combined to form a plan, with
little distinction between the sources of the data; a plan is
built up from data of various types as they are encountered. This
mechanism captures the opportunistic, exploratory, and flexible
behaviour required both of Simon's (1969) ant and of Suchman's
(1987, 1993) situated planner. For Suchman, design is situated
within the external environment and cues from the environment
can suggest steps to add to a plan. Zippy is situated both at
a location within the brain and then within the environment, extending
Suchman's notion of situated design.
A designer chooses where to start search and how to search, and
the model executes the details of this choice. The basic model
says nothing about informed choice, because that is the hallmark
of an expert. Informed choice is used to decide on the focus of
a plan and start design from that point; once this focus has been
set as the initial goal, the rest of the design can be executed
in the model. Expert design can be implemented as an informed
decision about the next cue to choose at each node boundary, but
the expertise required for this task is beyond the current paper.
Observed expert design strategies have been based on linear, plan
and focal strategies, depth-first and breadth-first expansion,
easiest and hardest first (Ball, 1990), as well as the cognitive
cost of pursuing a design choice (Visser, 1990).
Procedural, functional, focal, opportunistic, and situated design in Zippy have been described in this section. The model can be used to explore many other design strategies; essentially, the control structure defines a language that is used to program the model.
An extension of the Noah program is used to illustrate the model
of design in this section. The initial state of the system is
given by the specification and the content of memory. The specification
is "Write a program that shows the average rainfall for a
period. Read in the rainfall each day. The end of the period is
signalled by an input value of -99. Also find the maximum rainfall,
and the longest sunny spell in the period. A sunny day is one
with no rainfall.". Memory is loaded with the count and sum
plan schemas, and isolated nodes for the other knowledge. The
strategy is forward, concrete design with instant coding, as shown
in Table 9 for the typical novice strategy.
The initial state of the solution is the cues and nodes created when the specification is read. The declarative representation created by reading the specification is shown at the top of Figure 12; only the read role has been shown in the figure, for simplicity. The roles, goals, and objects in the specification are translated into cues, and the values are stored as expressions. The nodes are linked when they are created to define a linear order for the program, shown at the bottom of the figure.
One node makes a single rainfall value, and two other nodes use
this value, so these three nodes are easily connected. Two nodes
(average and maximum) use a stream of rainfall, so a new node
that makes a stream of rainfall has to be created. The read node
makes a single rainfall, so it can be used to make a stream by
adding a loop to control it; the loop uses the sentinel in its
test. The longest sunny spell is translated from <longest,
sunny spell> to <maximum, count sunny day>; this cue
is then used to retrieve the maximum schema, and the new node
is linked to the sunny day producer.
For simplicity, only the generation of the average plan is simulated
here; the order in which cues are linked for the average plan
is shown in Figure 13. In the scenario, abstract backward search
from the cue <average, rain> is used to identify the input
data, then the plan to get this data is generated and the read
loop is coded. Attention then returns to the average cue, and
the abstract plan structure is expanded. Each part of the retrieved
plan is then coded in forward order.
The first two cues in the figure are generated by finding [1],
and then doing backward search on, the average schema. The schema
uses a stream of data, so this becomes the current goal [2]. The
specification states that the rainfall must be read [3] over a
period, so the designer adds a loop [4] to control the read node.
This loop node in turn requires an initial data input [5] and
a flag to stop the loop [6]; the stop value was given in the specification.
No read loop schema is known is known in this scenario, so the
abstract read loop node shown at the left of the figure has no
filled cues.
Forward search selects the links below the cue <read, stream,
rain> to pursue, and looks for a concrete node that uses that
cue. None is found, so control transfers to backward search from
the <average, rain> cue [1], down a level of detail [7].
Because there is no guarded average schema, the cue points to
a concrete node that uses the sum and count. The first :use cue
is selected [8], and broadcast. The cue matches the sum schema,
so the schema is returned, bound to the object [9, 10], and its
internal nodes are filled in linear order [11-14]. The next open
cue is the count [15], and the count schema is returned, bound
[16] and filled [17-19]. At this point, the essential plan structure
to produce the average is complete because all the cues are linked
and there is a concrete node for each cue.
In this scenario, the need to guard the average is realised after
the rest of the solution has been designed and coded. The guard
controls the average calculation [20], and consists of a test
that the count is positive [21]. The guard is defined here as
an if-else construct, so a node is required to deal with the else
case when the count is zero; when the count is zero [22, 23],
then the average is assigned to zero [24]. The complete plan is
now linked and coded.
The generation orders of the cues, nodes, and code in this scenario
are shown in Table 10. The first column of the table shows the
order in which each cue is defined, so this is also the order
of the declarations. Only concrete nodes are shown in the table.
The begin-end block is omitted from the program ontology, because
it is not a node; it is a procedural syntactic marker translated
from the node control structure. The code for the average plan
was completed in four design episodes, for the read loop backward,
the sum schema forward, the count schema forward, and the average
calculation both backward and forward. Code generation in this
scenario was generally forward, based on abstract backward search
and forward schema instantiation.
cue node code linear action
3 3 1 read (rain);
5 4 2 sum := 0;
7 6 3 count := 0;
2 2 4 while rain <> -99 do
begin
3 6 5 5 sum := sum + rain;
4 8 7 6 count := count + 1;
2 1 1 7 read (rain);
end;
1 4 8 8 average := sum / count;
The order of abstract and concrete node generation is shown in Figure 14 in the form of a top-down functional decomposition tree. In the tree, a number refers to a node, not a cue as in Figure 13. All nodes are abstract, except for those at the bottom of the tree. Child nodes are listed across the page in linear order within the parent; a total linear order for the program is obtained by constraint propagation.
The numbers in the top-down tree indicate the order in which nodes are generated; nodes without a number indicate the hierarchy, but do not exist in the design scenario. The generation order can be roughly described as forward and top-down, but there are many deviations from this strategy, both in terms of the direction (backward generation) and level (missing levels). For a top-down account, these deviations are problematic. For a model of design based on alternating forward and backward search at different levels, these deviations from the forward and top-down design are expected.
The theoretical model of design generates a characteristic appearance
order of program code as each plan is generated and added to the
program. It can be used to model human design by setting the knowledge,
focus, and search commands used by an individual. The protocol
analysis in this section shows both the power and limitations
of the model to explain human behaviour. The claim of the model
is that a program can be built from roles, goals, objects, and
their expressions; the support for this claim is demonstrated
in the analysis. However, most of the difficulty encountered by
the subject was in the definition of the roles, goals, and objects,
particularly in the roles. This level of problem solving is below
the grain size of the model, though evident in the protocols.
Once the role, goal and object keys are defined, the model can
build a program from the filled cues in the same way the human
does.
The problem and protocol are taken from a study of student programmers
described in Rist (1989). The problem specification is "Write
a program that shows the number of sunny days (days in which no
rain fell), the longest sunny spell, and the average rainfall
for a month. Read in the amount of rainfall each day, measured
in centimetres.". The problems in the study were presented
as soon as they could be solved by the student; this problem introduced
the program loop. Subjects talked aloud and wrote their notes
and code on paper.
The subject read through the specification, then started to organise
the problem. The protocol for this first cut at solving the problem
is given below, followed by a translation into cues and nodes.
The growing node structure is shown in Figure 15. The numbers
in the protocol and the figure indicate the order in which the
cues are added to the solution.
"OK, sunny days [1], longest sunny spell [2],
average [3]. See what I'm doing ... sunny days [4], OK, when rain
is zero [5] ... longest sunny spell [6]... how many days it was
sunny [7] and ... average rainfall [8]. What is the period [9]?
One month, so make it 30 days [9]. How to read [10] 30 days? Input
30 times. Make some kind of loop [11], do a loop to make the input.
So now I need a constant [12] and some variables ..."
The problem goals were listed in specification order [1-3]. The
first goal was selected for expansion [4], and the data for this
goal located [5]. The next problem goal was then mentioned [6],
but attention returned to the first goal and the fact that the
goal was a count of the sunny days [7]. The third problem goal
was then mentioned [8]. Attention again returned to the first
goal, and that goal was explored. The count of sunny days requires
a stream, so the subject tried to define this and assumed a period
of 30 days [9]. A plan to read a single day was known but no plan
to read a stream of data was retrieved, so attention moved to
the control for the read [10]. The loop role was then defined
[11] and a constant defined for the end test [12].
The next part of the verbal protocol echoed the code protocol,
so the verbal protocol is omitted and the generation order of
the progam code is shown in Table 11. The read loop was coded
as a chunk, then the sunny day count was added as a chunk. The
code for each of these plans is spread out in the execution order,
as shown in the table.
program Noah (input, ouptut);
1 const n = 30;
2 var rain: integer
7 sunDay: integer;
11 sunSpell: integer;
13 sum: integer;
3 begin
8 sunDay := 0;
12 sunSpell := 0;
(14) sum := 0;
4 for i := 1 to n do
17 begin
6 writeln ('Enter amount of rain: ');
5 readln (rain);
9 if rain > 0 then
10 sunDay := sunDay + 1;
15 sum := sum + rain
16 end;
18 average := sum / n;
19 writeln ('Average rainfall is ', average);
20 writeln ('Number of sunny days is ', sunDay);
The constant and the rainfall were declared [1-2], then the loop,
its read, and the prompt for the read [4-6]. The sunny day count
was then coded in schema or linear order [7-10], not in plan structure
order. The node to test for a sunny day is shown as a data node
in the figure, and was used as a data node during planning; its
role was not mentioned. Its role is a test, so the node controls
the count in the final program structure and in the code, so the
addition of the role to the cue changed it from a :use port to
an :obey port. From the protocols, it appears that the role is
the last part of the cue to be defined and used. This is a good
design heuristic, because most of the plan structure can be generated
without roles, then roles can be added later to define the detailed
control structure.
The longest sunny spell goal was the next goal in specification
order. The verbal protocol and the translation into cues and nodes
is shown in Figure 16. Design started with a mention of the problem
goal [1]. The object structure was then pursued to define a sunny
spell [2]. The subject realised that this was a count [3], but
could not work out what to count. From the role and goal information
in the count schema, the variable could be declared and initialised,
but the count could not be coded because the object structure
was unknown; the object was undefined. The subject then realized
that there were two things to consider, and at this point the
protocol is elusive; there are two candidates for a "thing".
"Now I have to ... longest sunny spell [1],
how to count [2] it? Let me try sunny spell [codes line 10], will
be zero at the beginning [3; lines 11 and 12], how to count it?
Ah, two things. If rain was zero, then add one to sunny days ...
if sunny days I have to count it ... if zero then ... Now, let's
see for average ..."
One interpretation is that a thing is a count, so the subject
realises there are two counts, for the sunny days and for a sunny
spell; both are counts that test for zero rainfall, but the spell
is reset to zero every rainy day. The second interpretation is
that there are two things in the sunny spell count, the count
and the reset. In either interpretation, the outcome is that a
separate sunny spell count is needed; the second interpretation
adds a reset. When the longest sunny spell plan was later coded,
the reset was coded in the correct place without comment, so the
reset was added at some point with no verbal evidence.
A generic count has two concrete nodes, the initialisation and
accumulate. A count of sunny days adds another node, to control
the accumulate. A count of a sunny spell adds another data flow
that requires two nodes, to reset the count to zero when rain
falls. The ontogeny from count, to count of sunny days, to count
of a sunny spell, is shown in Figure 17.
The average was the next goal selected, and the sum plan coded
in schema order, first the declaration (line 12 in Table 11) and
then the calculation (line 14); the sum was not initialised (line
13). The parts of the plans inside the loop are now complete,
so a begin-end was wrapped around the loop code (lines 15, 16).
The average was calculated (line 17), and the output of the average
(line 18) and sunny day count (line 19) plans were coded. The
node structure is easily inferred from the average plan schema
described earlier in the paper.
The longest sunny spell goal was then resumed and generated a
long chain of problem solving. The difficult part of the design
of the maximum plan is what to do with a larger value; the operator
">" requires two values to be compared, but the test
does not indicate what should then be done with the larger value.
The decision to store the largest value so far is the key to the
design of the plan; this corresponds to finding the correct role
for the larger value. The verbal protocol and the cognitive representation
for the first attempt at the longest sunny spell plan are shown
in Figure 18.
"Now I think about sunny spell ... sunny spell is a number [1, writes sunSpell = k1], after this there will be two [2, 3, writes sunSpell = k2], but how can I compare this? There are some numbers I have to keep, ah, maybe only two numbers, sunny spell 1 and 2 [4, 5]... If sunny spell 2 is bigger [6], then sunny spell 1 goes out, OK. So sunny spell is zero (codes line 20), then sunny spell (codes line 21) ..."
The focus of attention was the plan to find the longest sunny
spell. The goal supplied an operator [">"] that takes
two objects, and the rest of the problem solving was spent trying
to define the correct roles for the two objects. The objects were
first defined as "a value" [1], then as the first [2]
and second [3] values. The subject realised that only two values
were needed [4, 5] for the comparison [6], because the second
value could be stored if it was larger [7]. At this point the
comparison and store operations on the sunny spell values have
been defined, but only for the two specific values that are first
and second.
The remainder of the verbal protocol and the code generation order
is shown in Table 12. The subject first coded the input to the
plan, the sunny spell; the coding order was the initialisation
(line 20), the count (line 21) and the test (line 22). The input
to the sunny spell plan (the rainfall) was then considered, but
no additional code was needed because the plan already existed
in the program.
20 8 sunSpell := 0;
22 10 if rain = 0 then
21 9 sunSpell := sunSpell + 1;
"Now if rain (11) is in the loop ... maybe I need a new loop. No, I can do it all in the same loop"
23 12 if rain > 0 then
24 13 store := sunSpell;
"So now I've stored it, but how can I compare what I had before? I have sunny spell stored, and now I want to compare it, to take in or put out (crosses 24 out)"
25 begin
26 14 if sunSpell > store then
27 15 store := sunSpell;
28 16 sunSpell := 0;
29 end;
30 17 write ('Longest sunny spell is ', store);
31 end.
The corresponding cognitive structure for the protocol is shown
in Figure 19. The node at the top left of the figure shows the
node structure for the first part of the sunny spell plan, that
was coded by tracing the data and control links, not the linear
order (8, 9, 10, 11). The remaining two sets of nodes show the
two attempts at coding the maximum plan. The first attempt (lines
23, 24) was based on the control flow, because it specified that
the sunny spell count should be stored whenever it rained (rain
> 0). This located the maximum plan in the procedural structure,
but the data flow was incorrect; the subject realised that storing
the count by itself was not sufficient. The second attempt (lines
25-31) expanded the maximum plan by adding the test to compare
the stored to the new value, under the control of the rainfall
test. The count was then reset to zero after being used in the
maximum plan, then an output for the maximum plan was coded.
Code generation for this solution was generally forward, starting
with the input, then the sunny day count, the average, the sunny
spell count, and the maximum. Within three of the four plans,
however, backward verbal and code design was visible in the protocols;
only the average plan showed complete forward generation. The
code was mainly written by tracing through the plan structure
instead of the linear structure, so lines of code appeared for
adjacent nodes in the plan structure, creating both forward and
backward code generation. Focal design was visible in the decomposition
of the longest sunny spell cue into its goal and object, and the
attempt to define the object first. When the plan was attempted
a second time, the maximum plan was partially developed verbally,
then its input was coded, then the plan was coded in two attempts.
Lines of code far from the focus of the plan, such as initialisations
and declarations, were never coded.
The role, goal, and object representation was able to explain the protocol at a very fine level of detail. The problems in the protocol can be described as problems about how to define the slots in a cue: goals were the simplest slot to define, then objects, then roles. The plan structure was built first through the goal and object links, then roles were added to define the code at the concrete level. The model does not consider design within the cue, so the model's behaviour is simpler and more regular than that seen in the protocol.
Cue-based search provides a powerful and general search mechanism.
Search using the role, goal, and object keys in a cue can retrieve
more or less specific forms of knowledge. As the returned structures
are merged, new and more specific knowledge is created and stored,
so learning is automatic. Because the current location defines
the questions that can be asked from there and the answers are
returned and linked to that cue, all knowledge is local. There
is no global supervisor: the focus of attention moves from one
cue to another during design. Partial knowledge within a cue is
captured by the three-valued logic in which a slot is unknown,
known to contain nothing, or known to be filled with a useful
value. If a link is vacant, then it can be filled by searching
for the desired knowledge, or by building the knowledge from simpler
pieces. Partial knowledge within a node or plan is simply the
current, incomplete state of the solution.
The process of design is explained as a choice about how to traverse
the program structure. A design strategy is defined as a choice
of direction, link, and level. The main design strategies are
forward and top-down, and backward and bottom-up, through the
plan or linear structure. Focal design links cues through their
object and goal structures, then the plan and program structure
are added later. Opportunistic design to fill an unexpected vacant
slot causes a deviation from the current search command. The design
process is massively redundant and scale invariant, because design
continues until all the needed slots have been defined and filled
at any level of description.
A new behaviour has been reported and simulated in this paper:
the distinction between batch and interleaved processing. Batch
processing holds the slot constant and cycles over nodes, interleaved
processing holds the node constant and cycles over slots. At a
larger scale, traditional waterfall design can be seen as batch
processing, and prototyping can be seen as interleaved processing.
Problem decomposition is simple and basic to the model. People
have no problem decomposing a solution into parts; the difficult
part of design is recomposing the pieces to find some correct
way to put the pieces together. Because all planning is local
in the model, there is no lookahead to find the best way to decompose
the problem; decomposition occurs simply through the goal and
object cues. While it is difficult to cleanly separate analysis
from design (Swartout & Balzer, 1982), the model suggests
one key distinction: analysis makes no decisions. In terms of
the model, analysis proceeds through the goal and object links,
to define the high-level plan structure, and the data flow in
the detailed plan structure. The objects and goals can be composed
in many ways, so a choice of how to combine parts is considered
to be part of design. Any detailed mechanism is then added by
specifying the role in a cue and by choosing a specific linear
order. Applying this distinction, Zippy is more a model of analysis
than design, because it says little about how to constrain the
form of the final solution.
Variability in design is explained as a choice of which question
to ask next, and the answer that is returned from the search.
Variation within a solution can be explained as different choices
about the order to search and to combine the role, goal, and object
structure. This order is important, because different approaches
to the problem can generate radically different solutions (Ratcliff
& Siddiqui, 1985; Rist, 1989). A choice made during generation
determines the appearance order and organisation of the code,
and this order is retained in the final solution, unless the code
is re-organised. Because human working memory is quite small,
the results of design must be dumped to external memory or lost;
when a part of the solution is placed in external memory, then
the design is "frozen" and difficult to change (Green,
1989). An existing design thus constrains future choices, because
the existing pieces are often used in a new part of the solution,
even if they are not the best pieces to use.
Different programmers will generate a wide range of solutions,
due to their different knowledge and their different search strategies;
the model captures variability within a population, not within
an individual. Plan selection is not an important issue in this
model, because this is a satisficing (Simon, 1969) and not an
optimizing model. The basic approach may be called the "grab
and run" model of design, in which a programmer chooses a
cue as a focus, searches for a node that matches the cue, and
repeats the process until a solution is complete. Variability
in choosing a focus for design applies the same mechanism used
by Schank (p. 87, 1986) to generate explanation questions: "Part
of the variability in the explanation process comes from the great
deal of choice we have in picking features to focus on ... that
will determine the questions that are posed and which, in turn,
will determine ... possible answers.".
Expertise in programming should reduce variability in three ways:
by defining the best way to approach the design task, by supplying
a standard set of schemas to answer a question, and by constraining
the choices about execution structure to the "best"
solutions. Expert design occurs through the more abstract levels
of search, particularly the goal, object, and plan structures.
Both top-down and bottom-up design occur mainly through the plan
links, and focal design occurs through the goal and object links
separately. The plan structure of a program can be built without
roles, so the structure and dependencies can be mapped out before
committing to a specific mechanism. In the protocol of design,
the plan structure was usually derived first, then the roles were
added as a final step. Novice design focuses on the mechanism
so it emphasises the role structure; this makes novice design
both concrete and complex, because roles do not link nodes in
the way that plan structure does.
The major omission in the present model is informed choice. The comparison of alternative plans for the same goal is a characteristic of experts and not of novices, so it is not part of a basic model of design. A novice does not make a choice between solutions; each novice generates a different solution. The architecture can provide the data to be used in an informed choice about what to do next, and a location for the choice to be made, but the definition of the expertise required to evaluate the choices is a large project. Such a decision involves an evaluation of plan complexity and difficulty, the comparison of alternate choices, and a knowledge about the resources available to the designer.
Zippy combines memory search and planning, so it has antecedents
in both areas. In the area of memory search, it uses a description
as a search cue in a manner similar to the RABBIT system (Williams,
1984). Zippy is probably closest to Kolodner's (1984) model of
memory retrieval, CYRUS. In that model, a discrimination net was
traversed to answer queries and to reformulate a query when the
answer could not be found. When data was added to the model, it
was added at the location that was closest to the data so that
a minimal change in the net was made. These two processes are
similar to the search of a plan or chain, and the construction
and addition of a new node at the end of a chain in Zippy. The
construction of nodes from the words in the English specifications
resembles the process of text understanding and direct memory
access parsing seen in Schank's work (Schank and Riesbeck, 1981;
Riesbeck and Martin, 1985).
The model complements case-based reasoning (Hammond, 1990; Kolodner,
1993). It creates cases from more basic knowledge and stores them
as nodes and program schemas. Once a schema has been stored, the
model is a simple case-based reasoning system, in which a case
is retrieved by a cue and applied. Case adaptation has not yet
been implemented, but should not be difficult. A plan to be changed
is a single chunk in the plan structure and a single node in memory,
so a new plan can simply replace the inappropriate plan in the
plan structure, then constraint propagation is applied to define
a total linear order of the actions.
The links between cues and nodes define a form of semantic net,
in which the cues on each node define a set of attributes or predicates
for that node, so the underlying architecture may be seen as a
type of semantic net. A node is a production that can be traversed
in either direction, at any level of detail, so it should support
the production mechanisms discussed by Anderson (1983): composition,
proceduralisation, activation, strength, and so on. I think of
a node as a neuron and the cues as axons and dendrites that link
neurons, so search is parallel, the current focus of attention
is the most active part of the semantic net, and attention moves
from one location to another in the net as nodes are linked.
Within the planning area Zippy is a non-linear planner, that can
do linear planning if the linear order is stored in a node or
generated by constraint propagation. By using the multi-level
structure of a cue, planning may occur at any level of abstraction
in the non-linear plan structure or in the linear structure, so
Zippy overcomes the tyranny of detail seen in many non-linear
planners. Kambhampati's PRIAR system (1989) makes a similar attempt
to chunk behaviour at several levels of description, but his system
uses predicate logic in the blocks world to do hierarchical planning,
so the basic mechanism is very different from cue-based search.
Zippy may also be seen as a form of belief maintenance system,
in which the activation of a cue in episodic memory corresponds
to ruling a fact "in"; semantic memory contains far
more knowledge than is used in any single planning episode.
As an automatic programmer, Zippy uses an extension of the producer-consumer
relation (Barstow, 1982), and is perhaps closest to the DAROS
system built by Kant (1991). In her system, a set of modules were
indexed by various features, and planning occurred backward at
the module level when a module asked to be given a certain feature.
Kant's model does not have the scale invariance supplied by Zippy,
and it was very much concerned with the efficiency of the implemetation,
so the concerns of the two models are very different. Most models
of automatic programming are based on some form of predicate calculus
and are focussed on deriving provably correct programs, so they
are not directly comparable to a model of human program design;
people in general first design a program, then worry about proving
correctness.
The definition of an external interface separate from any internal
implementation provides much of the power seen in information
hiding and formal specification (Dijkstra, 1976; Meyer, 1988;
Parnas, 1972). A long-term goal of automatic programming has been
the definition of a wide spectrum language that can be used to
represent programs over a wide range of levels (Barstow, 1991).
Node structure is more than wide spectrum, it is scale invariant,
because cues and nodes are treated the same at each level of abstraction,
down to the base level. The semantics of specific goals and objects
have been defined only for the set of examples used in Zippy,
however, and the extension of these semantics to larger problems
and different domains is an open problem.
Most research on the psychology of program design has concentrated
on the mechanism of programming, so roles and behaviour have been
of central concern. The Programmer's Apprentice project (Rich
and Waters, 1990), for example, builds code forward and top-down
by connecting roles such as "loop", "accumulate"
and "filter" to complex data structures. The emphasis
on role, object, and behaviour fits into a view of design as forward,
top-down, and state-based, where a state is defined by the value
of the program variables. Most research on automatic programming
also has a top-down, forward bias based on variable state, where
the representation is some form of predicate calculus that is
used for forward theorem proving. This paper emphasises backward,
non-linear, and bottom-up planning where a state is a partial
solution. The non-linear data and control links can be traced
forward or backward, so program slices and plan structure are
formally isomorphic. However, the way that knowledge is chunked
and combined is very different, depending on whether the plan
or the mechanism is seen as the basic unit of planning.
Behaviour can be ignored if all the plans in a program are correctness-preserving.
If the plans are correct, then simulation or execution of the
code simply confirms the correctness of the plan. If a plan is
correct in isolation but incorrect when applied, the the cues
on the ports can be used to simulate or execute the chunk of code,
and the expected output can be checked against the actual output.
A formal program semantics based on roles can be used to define
the behaviour of a node, and the behaviour can be tested against
the assertions on each node to find any expectation violations.
If there is an expectation violation, then the plan structure
can be traced backward from the location of the incorrect output
to find the bug location. Once found and identified, the code
can then be extended or altered. In Norman's (1986) theory of
action, the model bridges the gulf of execution from goal to action,
but it does not yet bridge the gulf of evaluation from system
behaviour back to goal, because the behaviour of the plan is not
used.
Experts develop and simulate a partial solution at each level of detail in top-down design (Adelson and Soloway, 1985). In the present model, this is done by an abstract backward pass followed by a forward pass that adds the procedural structure at the next level of detail. Forward simulation tests the behaviour of the artifact, so simulation based on behaviour may be seen as a later step than the linking of nodes discussed here. Experts follow a balanced, breadth-first strategy so their solution state shows an orderly development. For novices, or for experts working on a hard problem, the solution state is characterised by the retrieval of partial solutions, at different levels of detail or abstraction, without a previous procedural decomposition. Such serendipitous design (Guindon, Krasner, and Curtis, 1987) neatly describes the pattern of development seen in the model simulation and the protocol, and seems characteristic of new design by the novice or by the expert.
An executable model of object-oriented design has not yet been
written, but the theory is clear. Plans and objects are orthogonal,
because one plan can use many objects and one object can take
part in many plans. The code that executes a plan will thus be
spread over several classes. A node is a routine in an Eiffel
system, because it contains the part of the plan that resides
in that class; more formally, a node is indexed by the pair <goal,
object>. A class is used to group the parts of each plan that
use the same object, so a class is defined by a set of cues that
hold the object constant, but vary the goal. Pre- and post-conditions
(Meyer, 1992) are easily stored with the cues on a node, so the
model notation matches the structure of an Eiffel system. The
executable system structure is generated by the interaction between
the goals and objects in the system (Rist, 1994; Rist & Terwilliger,
1995).
Plan structure has been applied in a program maintenance tool
called Tracer (Rist & McDermott, 1993) that takes the plan
structure of a Pacal program produced by PARE, and allows the
user to navigate through the data and control structure. The user
clicks on a line of code to define the focus and selects the direction
and type of link to pursue, then Tracer finds the connected code
and highlights it. Routines may be expanded, and the amount of
highlighted code can be controlled by setting terminals to stop
exploration of a branch. The first task of a maintenance programmer
is to find the relevant code to be understood when a change is
made (Koenemann & Robertson, 1991). This is a difficult, time-consuming
and error-prone task that is often effected by leafing through
pages of program code, looking for the link from one line of code
to another. In Tracer, the user sets the focus, link, and direction,
and the system finds and shows the relevant code; the person supplies
the intelligence and the machine supplies the accuracy. The text
is linked to the nodes and cues in the program, so finding the
relevant code from the specification (requirements tracking) is
trivial.
The representation has also been applied to automated program understanding. Plan structure abstracts over code order, code organisation, and role merging, but not over plan merging. The Plunder (Plan Understander) system (O'Keeffe and Rist, 1994) takes a "canonical" node structure and a target program, and tries to match them. The target is converted to node structure by PARE, and the two structures are then traversed through their data and control flow links. If there is a mismatch, nodes are merged or split, and comparison of the two solutions continues until the canon matches the target. The node split and merge operations are correctness-preserving, so at termination the two solutions are shown to be equivalent. Most of the initial matching information comes from the variable names, that are usually built from the goal, object, or plan keys and are thus easily matched to their cues. Analysis of the program structure for plan recognition has tended to focus on the abstract syntactic, procedural, or logical structure of the code, so the plan dependencies and variable names are often ignored. From the analysis presented here, the words used in the specification and the program are the keys to design and understanding.
This paper presents a new and powerful cognitive architecture
that has strong links to human cognition. The basic mechanism
in the model is to select a cue, broadcast it, and link the returned
node to the cue, so a solution is constructed by linking parts
of memory and design is implemented as a search through internal
and external memory. By linking one node to the next during design,
learning is automatic and the solution is a by-product of linking
nodes through cues. The words that are used to define the problem
are the keys to solving it, because they define the role, goal,
and object keys that index stored knowledge and are mapped directly
onto the internal representation.
This paper contributes four new pieces to the study of human program
design. First, the cognitive representation based on cues and
nodes is scale invariant, so it allows search at any level of
abstraction. Second, plan structure is built by merging the goal
and object structures in the model, and linear structure is derived
from plan structure, so the model explicitly separates and searches
the goal, object, plan, and linear structure of a program. Third,
the cognitive architecture can be programmed to capture the common
design strategies described in the literature; essentially, the
representation defines a language for program design, and the
architecture shows how to program in that language. Fourth, the
model explicitly separates the two main components of programming
expertise: knowledge of plan schemas and knowledge about how to
design.
The model provides four new pieces for the field of automatic
programming. First, it provides a wide spectrum language that
can specify a node at any level of abstraction. Second, it shows
how to represent and use knowledge that has an irregular structure,
as well as the regular structures that have received most attention
in previous automatic programmers. Third, it uses the non-linear
plan structure of the code as the basic representation instead
of the procedural or syntactic structure of the program. Fourth,
it shows how the mechanism of program execution, defined by the
role, is used within the non-linear planning framework defined
by the goal and object structures.
The architecture supports four programming paradigms: declarative,
procedural, functional, and object-oriented. The representation
of a problem is declarative, and search proceeds by a series of
unifications and failures; the model bears a strong resemblance
to Prolog, in fact. Procedural design is based on the role and
linear structure of the program, and a procedure is used to group
code that transforms a cue. Functional design is based on the
goal structure of the program, and a function is used to group
code that makes a new cue. Object-oriented design is based on
the object structure of a system, and a class is used to group
code that uses the same object.
From a single cue, there are many directions to search, and many data sources to search. The program grows as a cue is chosen to define a question, and a data source is chosen to provide an answer. A design strategy defines a trajectory through the program structure, and defines the context within which later searches are made. A program grows as nodes are connected, with little overall control. The only planning mechanism is the shift of attention from one focus to another in internal and external memory. Cue-based search through the node structure provides a massively redundant model of design and simulates human design by supporting multiple design strategies, opportunistic design, and situated cognition within a single architecture.
Adelson, B. (1984). When novices surpass experts: The difficulty of a task may increase with expertise. JEP: Learning, Memory and Cognition, 10, 483-495.
Adelson, B. & Soloway, E. (1985). The role of domain experience in software design. IEEE Transactions on Software Engineering, 11, 1351-1360.
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Anderson, J. R., Farrell, R. & Sauers, R. (1984). Learning to program in LISP. Cognitive Science, 8, 87-129.
Anderson, J. R., & Jeffries, R. (1985). Novice LISP errors: Undetected losses of information from working memory. Human-Computer Interaction, 1, 107-132.
Ball, L. J. (1990). Cognitive processes in engineering design. Unpublished Ph. D. thesis, Department of Psychology, Polytechnic South West (Plymouth), U. K.
Barstow, D. R. (1982). The roles of knowledge and deduction in algorithm design. In Hayes,
J. E., Michie, D. and Pao, Y. H. (Eds.), Machine Intelligence 10. New York: John Wiley & Sons.
Barstow, D. (1991). Automatic programming for device control software. In M. R. Lowry and R. D. McCartney (Eds.), Automating Software Design, pp. 123-140. Menlo Park, CA: AAAI Press.
Brooks, R. (1977). Towards a theory of the comprehension of computer programs. Int. J. Man-Machine Studies, 18, 543-554.
Burstall, R. M., & Darlington, J. (1977). Some transformations for developing recursive programs. Journal of the ACM, 24, 44-67.
Chase, W. G. & Ericsson, K. A. (1981). Skilled memory. In John R. Anderson (Ed.), Cognitive skills and their acquisition, pp. 141-190. Hillsdale, NJ: Lawrence Erlbaum.
Davies, S. P. (1990). The nature and development of programming plans. Int. J. Man-Machine Studies, 32, 461-481.
Davies, S. P. (1991). The role of notation and knowledge representation in the determination of programming strategy. Cognitive Science, 15, 547-572.
Detienne, F. (1995). Design strategies and knowledge in object-oriented programming: Effects of expertise. Human-Computer Interaction, 10.
Dijkstra, E. W. (1968). A constructive approach to the problem of program correctness. BIT, 8, 174-186.
Dijkstra, E. W. (1976). A discipline of programming. Englewood Cliffs, NJ: Prentice-Hall.
Dromey, G. (1989). Program derivation. New York: Addison-Wesley.
Feather, M. S. (1982). A system for assisting program transformation. ACM Transactions on Programming Languages and Systems, 4, 1-20.
Goffman, E. (1974). Frame analysis. Penguin: Harmondsworth, Middlesex.
Green, T. R. G. (1989). Cognitive dimensions of notations. In A. Sutcliffe and L. Macauley (Eds.), People and Computers V. Cambridge: Cambridge University Press.
Green, T. R. G., Bellamy, R. K. E., & Parker, M. (1987). Parsing and gnisrap: a model of device use. In G. M. Olson, S. Sheppard, and E. Soloway (Eds.), Empirical Studies of Programmers: Second Workshop, pp. 132-146. Norwood, NJ: Ablex.
Griffith, B. (1980) E unum pinhead. San Francisco: Last Gasp Comics.
Guindon, R., Krasner, H. & Curtis, B. (1987). Breakdowns and processes during the early activities of software design by professionals. In G. M. Olson, S. Sheppard & E. Soloway (Eds.), Empirical studies of programmers: Second workshop (pp. 65-82). Norwood, NJ: Ablex.
Hammond, K. J. (1990). Case-based reasoning: A framework for planning from experience. Cognitive Science, 14, 385-444.
Harandi, M. T., & Ning, J. Q. (1990). Knowledge-based program analysis. IEEE Software, 7, 74-81.
Hartman, J. (1991). Automatic control understanding for natural programs. Ph. D. thesis AI91-161, Artificial Intelligence Laboratory, Department of Computer Sciences, University of Texas at Austin: Austin, Texas.
Hayes-Roth, B. & Hayes-Roth, F. (1979). A cognitive model of planning. Cognitive Science, 3, 275-310.
Horwitz, S., Prins, J., & Reps, T. (1989). Integrating noninterfering versions of programs. ACM trans. on Programming Languages and Systems, 11, 345-387.
Jeffries, R., Turner, A. A., Polson, P. G. & Atwood, M. E. (1981). The processes involved in designing software. In J. R. Anderson (Ed.), Cognitive skills and their acquisition (pp. 285-283). Hillsdale, NJ: Erlbaum.
Kambhampati, S. (1989). Flexible reuse and modification in hierarchical planning. (Tech. Rep. CAR-TR-469). Center for Automation Research, University of Maryland, MD.
Kant, E. (1985). Understanding and automating algorithm design. IEEE Transactions on Software Engineering, 11, 1361-1374.
Kant, E. (1991). Data relationships and software design. In Lowry, M. R. and McCartney, R. D. (Eds.), Automating Software Design. (pp. 141-168). Cambridge, MA: The MIT Press.
Kant, E. & Newell, A. (1984). Problem solving techniques for the design of algorithms. Information Processing and Management, 28, 97-118.
Kessler, C. M. & Anderson, J. R. (1986). Learning flow of control: Recursive and iterative procedures. Human-Computer Interaction, 2, 135-166.
Koenemann, J. & Robertson, S. P. (1991). Expert problem solving strategies for program comprehension. In S. P. Robertson, G. M. Olson, and J. S. Olson (Eds.), Proceedings of CHI '91 (pp. 125-130). ACM Press: New York.
Kolodner, J. L. (1984). Retrieval and organizational strategies in conceptual memory: A computer model. Hillsdale, NJ: Erlbaum.
Kolodner, J. (1993). Case-based reasoning. San Mateo, CA: Morgan Kaufmann.
Letovsky, S. & Soloway, E. (1986). Delocalized plans and program comprehension. IEEE Software, 3, 41-49.
Meyer, B. (1988). Object-oriented software construction. New York: Prentice Hall.
Meyer, B. (1992). Eiffel: The language. New York: Prentice Hall.
Naur, P. (1972). An experiment on program development. BIT, 12, 347-365.
Newell, A. & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
Norman, D. A. (1981). Categorization of action slips. Psychological Review, 88, 1-15.
Norman, D. A. (1986). Cognitive engineering. In D. A. Norman and S. W. Draper (Eds.), User centered system design, pp. 31-61.
Norman, D. A. & Bobrow, D. G. (1979). Descriptions: An intermediate stage in memory retrieval. Cognitive Psychology, 11, 107-123.
O'Keeffe, J, & Rist, R (1994). A model of plan recognition. Unpublished manuscript.
Parnas, D. (1972). On the criteria to be used in decomposing systems into modules. Comm. ACM, 15, 1053-1058.
Parnas, D. L. & Clements, P. C. (1986). A rational design process: How and why to fake it. IEEE Transactions on Software Engineering, SE-12, 2, 251- 257.
Ratcliff, B. & Siddiqui, J. I. A. (1985). An empirical investigation into problem decomposition strategies used in program design. International Journal of Man-Machine Studies, 22, 77-90.
Rich, C., & Waters, R. C. (1990). The Programmer's Apprentice. New York: ACM Press.
Riesbeck, C. K. & Martin, C. E. (1985). Direct memory access parsing. Research Report 354, Department of Computer Science, Yale University: New Haven.
Rist, R. S. (1986). Plans in programming: Definition, demonstration and development. In E. Soloway and R. Iyengar (Eds.), Empirical studies of programmers (pp. 28-47). New York: Ablex.
Ris