Artificial Intelligence
The Expert Model
The Student Model
The Pedagogical Model
The Knowledge Base
The Interface
Hypertext and Command Menus
The structure of semantic memory
Knowledge Representation
Knowledge Structuring:
Web Learning:

Artificial Intelligence

This volume proposes that Artificial Intelligence approaches to the application of advanced technologies for instruction, particularly as these ideas have been refined in Intelligent Tutoring Systems (ITS) and Environments, provide the theoretical framework and intellectual stimulus for the best of computer assisted instruction. The best ITS work has not only produced the most advanced prototypes of what good computer-based instruction can do; it has also forced the elaboration of a theoretical framework that splits the task of instruction into four meaningful and manageable components. Each of these has become the subject of an important and deeply explored body of research.

The Expert Model

Expert systems are the most widely recognized application of artificial intelligence technology. It will therefore be useful to consider how the expert model in an ITS is and is not like an expert system (for a more detailed discussion, see Clancey, 1986).

Like an expert system, the expert model of an ITS contains both facts and procedures relevant to a domain. Both expert systems and ITS components should be capable of using these facts and procedures to solve problems that could be solved by a human expert. Moreover, both must be able to explain to a human why or how they reached their conclusions.

Expert systems and the expert model of an ITS differ mainly in the degree to which their procedures resemble those used by human experts. Useful expert systems simply organize their knowledge base into an arbitrary, but very efficient, collection of "if ... then" rules. In contrast, an expert system for training must organize its rules into cause and effect sequences or hierarchical structures that more closely resemble the organization of concepts by expert humans. There are many reasons for this, but the primary one is that this structure lets the other components fulfill their functions by engaging in a communicative dialogue with the expert model.

Significantly, the role of cognitive science and AI tools and environments has been largely to reify the abstract connectivity of conceptual structures to make them amenable to public inspection. This is also the focus of hypertext systems (as will be discussed in greater detail later). Work on reifying declarative conceptual structures is proceeding apace and can in some sense be seen as the special focus of the IDE hypertext environment (Russell, Moran and Jordan, 1988; see also Lenat et al., 1985). As this paper will try to clarify, this is one of the fundamental strengths of computational grammars and the tools connected to them: they can make the grammatical relations among words visible and manipulable. The growth of these perspectives on understanding has emboldened a vision of profoundly influencing the mental skills of thinking and problem solving in much the same way as athletic and music coaches have influenced the growth of performance skills.


Technologies that will be described in greater detail later, Prolog and Unification Grammars, make it possible to create expert systems that can parse and to some extent understand natural languages, especially related languages like English and German. Their degree of expertise varies considerably with the amount of time and effort expended on their development, but for beginning learners of foreign languages, these systems are surprisingly competent. To summarize for the moment, computational systems can formulate some of the understanding skills of human language users by creating symbolic expressions of the grammatical rules of syntax and lexicon. These rule systems provide an objective and detailed description of language skills. How cognitively valid they are is a matter of some debate, but there is every reason to believe that it is worthwhile to examine the question experimentally; and even if it turns out that these representation systems are not good cognitive models of human skills, they can nevertheless be used to create learning environments of considerable power (Yazdani, 1987).


The Student Model

Student models are built in real time during the trainee's interaction with the ITS tutor. Each trainee's model is dynamic in that it is modified by the most recent trainee-computer interaction. Often it uses as its source a library of bugs, typical errors, and inconsistent models of the world that correspond to common misconceptions about the causal structure of the world held by many trainees. The structure of these resources depends heavily on explicit knowledge acquisition research in the best tradition of cognitive science. The student model guides the tutor in pacing and sequencing the instruction. Thus the ability to build an accurate trainee model is a critical component of any ITS.

Sleeman & Brown (1982) have discussed three approaches to the representation of trainee knowledge. The overlay approach represents the trainee's knowledge as a subset of the expert model. The differential approach "abstracts how the trainee's behavior is critically different from that of an expert". The coach approach minimizes interaction and focuses on a prearranged group of issues that are individualized to a trainee's particular stage of development. Other approaches such as perturbation and mal-rules attempt to represent the trainee's knowledge as misconceptions and deviations in the procedural structure associated with correct skills. It should be evident that the student model is not simply a reduced subset of the expert model. Nevertheless, an accurate analysis of the expert model is usually an excellent foundation and precondition for developing an approach to the student model.
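The contrast between these representations can be made concrete with a minimal sketch. It assumes, purely for illustration, that the expert model and a bug library can each be reduced to a set of named skills or mal-rules; the skill names, `EXPERT_SKILLS`, `BUG_LIBRARY`, and the three functions are hypothetical and are not taken from Sleeman & Brown.

```python
# Hypothetical expert model for beginning German: a set of named skills.
EXPERT_SKILLS = {
    "verb_final_in_relative_clause",
    "preposition_governs_case",
    "noun_gender",
}

# Hypothetical bug library: mal-rules that trainees are known to apply.
BUG_LIBRARY = {
    "english_word_order_in_relative_clause",
    "ignore_case_after_preposition",
}


def overlay(demonstrated):
    """Overlay view: the part of the expert model the trainee has shown."""
    return EXPERT_SKILLS & set(demonstrated)


def differential(demonstrated):
    """Differential view: expert skills the trainee has not yet shown."""
    return EXPERT_SKILLS - set(demonstrated)


def mal_rules(demonstrated):
    """Perturbation view: observed rules that match a known misconception."""
    return BUG_LIBRARY & set(demonstrated)


# Rules inferred from one trainee's answers (illustrative data only).
observed = {"noun_gender", "ignore_case_after_preposition"}
print(overlay(observed))
print(differential(observed))
print(mal_rules(observed))
```

Note that the mal-rule found for this trainee lies outside the expert set altogether, which is why a pure overlay cannot capture it.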


The computational techniques used to create parsers and grammars also provide a way of describing students' language skills in the form of rules and data. Lytinen and Moon (1988) present a thoughtful analysis that provides a provocative case for the argument we are proposing here. They described a system (IMMIGRANT) that analyzed real teachers' instructions to students of second languages in English and wrote new rules for German grammar based on these instructions. For instance, an example instruction might be: "In German, verbs come at the end of relative clauses." or, "In German, the case of a prepositional object must match the case required by the preposition." Obviously, these are simple rules for relatively novice students.

For each instruction, the system analyzed the terms in the instruction and created a proposed set of rules to add to the English rule base, e.g.:

PP
(1) = PREP
(2) = NP
(1 RIBORD) = (2 LEFBORD)
(LEFBORD) = (1 LEFBORD)
(RIBORD) = (2 RIBORD)
(HEAD) = (1 HEAD)
(HEAD PREP-OBJ) = (2 HEAD)

When unified with the total English grammar database, IMMIGRANT adds:

(1 CASE) = (HEAD PREP-OBJ)

This then defines how the English grammar is inadequate as a model of German grammar. One can also think of this as a technique that can be applied to the analysis of any particular student's skills to define the inadequacies of the student's grammar.
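The effect of the added case constraint can be illustrated with a minimal sketch of feature-structure unification. This is a simplified illustration and not IMMIGRANT's implementation: feature structures are assumed to be nested Python dictionaries with atomic leaves, and the names `unify`, `pp_is_well_formed`, and the lexical entries are hypothetical.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts with atomic leaves).

    Returns the merged structure, or None when two atomic values clash.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, b_value in b.items():
            if key in result:
                merged = unify(result[key], b_value)
                if merged is None:
                    return None
                result[key] = merged
            else:
                result[key] = b_value
        return result
    return a if a == b else None


def pp_is_well_formed(prep, noun_phrase):
    """Check the added constraint: the case the preposition requires
    must unify with the case carried by its object noun phrase."""
    required = {"case": prep["case"]}
    return unify(required, noun_phrase) is not None


# Illustrative lexical entries (assumed, simplified).
mit = {"lex": "mit", "case": "dative"}
dem_mann = {"lex": "Mann", "case": "dative"}
den_mann = {"lex": "Mann", "case": "accusative"}

print(pp_is_well_formed(mit, dem_mann))  # "mit dem Mann"
print(pp_is_well_formed(mit, den_mann))  # "mit den Mann"
```

Without the case constraint, both phrases would be accepted, which is the sense in which the unmodified English rule is an inadequate model of German.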

Imagine, furthermore, a systematic set of unifications to generate a more or less complete German grammar as derived from the English. How different would this look from a German grammar created explicitly for German? How different would the actual sequence of linguistic utterances be when generated by these two hypothetical systems, one derived from English and the other created specifically for German? Also, could one use the difficulty of transforming particular English rules to predict the difficulty that English speakers have learning those differences? These are questions that have no answer yet, but await a detailed computational analysis of these sorts of student models.



For the first time, we are able to make specific predictions about second language production by novice language learners on the basis of their first language using this explicit theoretical framework. For now, we must expect limited success from such a development, since these techniques are only beginning to develop. There is much room for improvement, but a bold new enterprise lies before us.



The Pedagogical Model

The pedagogical model is the curriculum part of an ITS. This component generates instruction based upon the trainee's instructional history. The ability to generate instruction is best understood by way of contrast to more traditional forms of computer-based training (CBT). In traditional CBT, a limited number of instructional sequences are built into the design of the lesson. Most CBT presents the trainee with a fairly well-defined, hierarchical path from start to finish. Additional prestored sequences may be provided to trainees who need extra help on a point, but the trainee is quickly shuttled back to the main sequence. Feedback for the trainee's answers is also prestored and is typically based only upon the trainee's immediate answer to the question. The trainee's history of prior successes and failures seldom affects this feedback. In contrast, in an ITS both the instructional sequence and the feedback for answers are generated online, dynamically, by the tutor in light of the trainee's unique instructional history. ITS thus have the potential to provide each trainee with an individualized instructional sequence.

The pedagogical model is not just more complex; it is principled in ways that earlier forms of CBT could not be. The pedagogical model is rule-based and able to adopt both strategic and tactical approaches to instruction. Given a student model that may be a huge semantic network of the appropriate knowledge, the pedagogical model may analyze an individual trainee's knowledge network into specific trees or grammars to decide which nodes are ripe to be used to attach new information, so that the structure grows in a predefined way. Similarly, the model may use this structure to decide which resources in the knowledge base to use to have the greatest effect.
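One way to read "nodes that are ripe" is as concepts the trainee does not yet know but whose prerequisites are all in place. The following is a minimal sketch under that assumption; the prerequisite table and the function name `ripe_concepts` are hypothetical, and a real pedagogical model would apply much richer strategic and tactical rules.

```python
# Hypothetical prerequisite links: concept -> concepts it builds on.
PREREQUISITES = {
    "cell": [],
    "address": ["cell"],
    "range": ["address"],
    "copy": ["range"],
    "formula": ["address"],
}


def ripe_concepts(known):
    """Return concepts not yet known whose prerequisites are all known."""
    known = set(known)
    return sorted(
        concept
        for concept, needs in PREREQUISITES.items()
        if concept not in known and all(need in known for need in needs)
    )


# The candidate set changes as the student model changes.
print(ripe_concepts(set()))
print(ripe_concepts({"cell"}))
print(ripe_concepts({"cell", "address"}))
```

Because the candidates are recomputed from the current student model, two trainees with different histories receive different next steps, in contrast to the prestored sequences of traditional CBT.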

The Knowledge Base

The knowledge base is potentially the most important and extensive component. Knowledge of every kind needs to be made available for instructional purposes, and it needs to be in the form most effective for training. The knowledge base constitutes more than the curriculum. It also needs to be the environment where training takes place. It is usually an enormous set of resources (graphics, simulations, text, and animations) that must be carefully structured and organized so that it can be properly presented through the human interface. Convenient knowledge representation techniques include graphic simulations, graphic browsers, qualitative models, and hypertext.



Too little work has been aimed at developing new representations for information and relationships. There are several important categories: representing justifications, consequences, and causal mechanisms. A representation is a structure of objects, relationships, and operations together with a mapping that places this structure in some explicit relationship (correspondence, transformation). Of most obvious importance is the development of structures that encode and make explicit causal connections, and decompose systems into simpler causal structures where functional relationships are more apparent.

Hypertext systems provide the most general and powerful solution to these problems. Graphic browsers are the best developed of these representation systems and underlie most useful hypertext systems. Qualitative models, viewed generically, provide additional interactive and programmatic possibilities. The graphics packages used in MACH-III (developed out of the STEAMER projects) make inspections of the causal relationships among components much more direct than any physical device can allow.

Qualitative simulations provide convenient vehicles for creating systems with both meaningful structural and functional components. Instead of a textual description of terminology and its interrelationships, the structure is described visually. Functional relations can also be described by using animation, color, arrows, and textual descriptions. However, visual descriptions of both structure and function are notoriously concrete: it is difficult to obtain the right level of abstraction without the use of conceptual descriptions in a textual and hierarchical form. Furthermore, it is difficult to create visual descriptions at different levels of abstraction in a truly hierarchical format. Functional and structural relations must be compressed or eliminated in arbitrary ways that defy accurate concrete representations. So, it is only when conceptual (textual) and concrete (visual) representations complement each other that adequately faithful models can be created. These problems must be resolved at the point where tutor and trainee meet, the human interface.


The Interface

The resources of the knowledge base are all employed in the course of instruction. Access to this large database can often be unwieldy, time consuming, and cumbersome. A natural language interface would resolve many problems, but it remains a dream quite a distance beyond our grasp for most applications. Because of the complexity of the materials available, unique interfaces have been designed to manage the potential explosion of possibilities. The most important recent development has been the exploration of hypertext structures.

Hypertext provides a systematic approach to structuring and delivering knowledge organized in complex networks and presented in multi-media formats. It is uniquely suited to training design and development. Some common and important features of hypertext systems that can be exploited for training will be described in the following sections.

Vannevar Bush (1945) is generally credited with the basic ideas of hypertext systems. However, it is only recently that computers with high bandwidth graphical displays have been able to implement these ideas in an acceptable way (Forsdick, 1985; Halasz et al., 1985; McCracken and Akscyn, 1984; McMath, Tamaru, and Rada, 1987).

The central feature of a hypertext, multimedia system appears to be the hotspot: a region on the screen, text, or graphic surface that can be activated to bring up further selections or the materials that have been linked to it. The rich interlinkages of graphic and textual materials that can be created with these mechanisms are only beginning to be explored. The unique facilities for linking text and simulations in very flexible but controlled ways are the hallmark of hypertext systems.


Hypertext and Command Menus

Hypertext systems are integrally linked to command menu systems, such as those popularized so dramatically in modern electronic spreadsheets. Texts written in hypertext act like implicit command menus. However, command menus are usually simple trees or hierarchies. When one is at the bottom of the tree, it is usually necessary to follow the tree back up to the top before taking another track to get to a different part of the tree. Hypertext opens the real possibility of converting these trees into networks. This raises the very real question of how to structure the networks. The suggestion has been made by many researchers that the proper structure to use can be determined by comparison with the structure of semantic memory.

The structure of semantic memory


The structure of semantic memory has been analyzed in various terms as a system of nodes and links, much like a hypertext system (Anderson, 1983; Collins & Quillian, 1969; Rosch & Mervis, 1975; Smith & Medin, 1981). Within this framework the structure of concepts and the links between them are dependent on the relationship of features and default values stored at each concept node. In order to elucidate this structure we have carried out a series of experiments on the differences between novices' and experts' understanding of LOTUS commands (Mutter et al., in press).

Information about the novice-to-expert transition is critical for developing training for the use of command languages. Our study examined the evolution in knowledge structures as the trainees pass from a novice to an intermediate level of expertise. These data enabled us to develop a student model for the acquisition of expertise in using command languages.

Our task involved multi-trial cued recall (MTCR) of words representing command language concepts (e.g. cell, column, address) or functions (e.g. edit, copy, move, save). This test yields a hierarchical tree-like structure of the trainee's organization of these words. By comparing the tree-like structures from various points during the ten-week period, we expected to obtain a picture of how the organization of these command language terms changes over time. Some preliminary data from this task are presented below.

Figure 1: Before and After knowledge structures for a person learning LOTUS.


After three weeks of training, trainees show evidence of an organization that is becoming meaningful for LOTUS. Not only are the graphs describing their knowledge structures more like those of the experts, but other measures also show increasing similarity. There is also evidence that these structures change with different contexts and goals. What these studies imply is that simple trees can be used for particular contexts, plans, goals, and purposes. Changing the context requires a change in the command structure, to form a new hierarchy or tree. Since these new trees have many nodes in common with the old tree, this implies an overall structure that is a large network of concepts. Such a complex system can be fully described in a hypertext network.
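The step from context-specific trees to a single network can be sketched as follows. The two trees are invented for illustration and are not the MTCR data; `tree_edges` and `merge_trees` are hypothetical helper names. Each tree is assumed to be a nested dictionary mapping a node to its subtree.

```python
def tree_edges(tree, parent=None):
    """Yield (parent, child) pairs from a nested-dict tree."""
    for node, children in tree.items():
        if parent is not None:
            yield (parent, node)
        yield from tree_edges(children, node)


def merge_trees(*trees):
    """Combine several context-specific trees into one undirected network,
    stored as a mapping from each node to the set of its neighbours."""
    network = {}
    for tree in trees:
        for parent, child in tree_edges(tree):
            network.setdefault(parent, set()).add(child)
            network.setdefault(child, set()).add(parent)
    return network


# Two illustrative trees for different goals (assumed, not measured).
editing_tree = {"edit": {"cell": {}, "copy": {"range": {}}}}
filing_tree = {"file": {"save": {}, "range": {"cell": {}}}}

network = merge_trees(editing_tree, filing_tree)
for node in sorted(network):
    print(node, sorted(network[node]))
```

In the merged structure, "cell" is reachable from both "edit" and "range", so it no longer has a single parent: the shared nodes are what turn the separate trees into a network.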

Knowledge Representation


Knowledge Structuring:



The most significant problem in getting started is deciding how to structure the hypertext. The answer to this question depends upon how the hypertext will be used. The various applications of hypertext in the application section require different access and information structures. Approaches may include:
- Conceptual structures include pre-determined content relationships, such as taxonomies.
- Task related structures are those that resemble or facilitate the completion of a task. Primary tasks include retrieving information, such as in information retrieval systems, and learning from instructional systems.
- Knowledge related structures are those that are based upon the knowledge structures of the expert or the learner.
- Problem related structures simulate problems or decision making.

The first method is to develop a cognitive or semantic map of the expert's knowledge using quantitative methods (Diekhoff & Diekhoff, 1982). This method requires the expert to complete word associations of all of the related concepts in the content domain. The intercorrelations are statistically analyzed using multi-dimensional scaling or principal components to generate a structural map.

Diekhoff, G.M. & Diekhoff, K.B. (1982). Cognitive maps as a tool in communicating structural knowledge. Educational Technology, 22(4), 28-30.

Fisher, K.M., Faletti, J., Thornton, R., Patterson, H., Lipson, J. & Spring, C. (1988). Computer based knowledge representation as a tool for students and teachers. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA, April, 1988.

Kozma, R.B. (1987). The implications of cognitive psychology for computer-based learning tools. Educational Technology, 28(11), 20-25.

McAleese, R. (1985). Some problems of knowledge representation in an authoring environment: Exteriorization, anomalous state meta-cognition and self-confrontation. Programmed Learning & Educational Technology, 22(4), 299-306.



Ted Nelson, who coined the term hypertext, believed that the author's structuring of knowledge may be arbitrary and therefore counterproductive to individual readers. Since each individual's knowledge structure is unique, based upon his/her set of unique experiences and capacities, the ways that each would prefer to access, interact with, and interrelate knowledge are also distinct. Linking new information to their knowledge structure is an inherently individualistic process. So, in order to accommodate the text to the learner, rather than the learner to the text structure, the text, its structure, and the sequence in which it is decoded by the learner should be malleable, rather than set. Therefore readers should be encouraged by the hypertext to jump around and even alter the text in order to make it more personally meaningful.

A schema for an object, event, or idea is comprised of a set of attributes. Attributes are the associations that an individual forms around an idea. Our schema for "fire truck," for instance, is comprised of attributes such as red, hoses, ladders, large truck, sirens as well as other associations such as firemen, dalmatian dogs, fire hydrants (a dog's schema for this one would be totally different from ours), insurance, etc. Our knowledge structure consists of schemas for all of these other attributes, so that schemas are said to embed within each other. We generate concepts for ideas based upon our associations with that concept. Each schema that we construct represents a mini-framework in which to interrelate elements or attributes of information about a topic into a single conceptual unit (Norman, Gentner, & Stevens, 1976). These concepts are all arranged in a network of interrelated concepts known as our semantic network.

Perhaps the most universally accepted model of semantic networks is active structural networks (Quillian, 1968). These networks are structures composed of nodes and ordered, labelled relationships connecting them (Norman, Gentner, and Stevens, 1976). The nodes are instances of propositional structures or (more easily conceived of as) token instances of concepts (Norman, 1976). The relationships or links (lines) between the nodes describe the propositional connection between the nodes, that is, the type of action that one exerts on the other. These structural networks may be used to represent what a learner may already know. They may also be used to represent the semantic relationships that describe the information to be learned.
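A network of nodes joined by ordered, labelled links can be sketched as a small data structure. This is an illustrative assumption about one possible encoding, not the representation used in the cited work; the class `SemanticNetwork`, its methods, and the relation labels are hypothetical, and the example reuses the "fire truck" schema described earlier.

```python
from collections import defaultdict


class SemanticNetwork:
    """Nodes connected by directed, labelled links."""

    def __init__(self):
        # source node -> set of (relation label, target node) pairs
        self._links = defaultdict(set)

    def relate(self, source, relation, target):
        """Add a labelled link from source to target."""
        self._links[source].add((relation, target))

    def related(self, source, relation):
        """Return, in sorted order, all targets linked from source by relation."""
        links = self._links.get(source, set())
        return sorted(target for label, target in links if label == relation)


net = SemanticNetwork()
net.relate("fire truck", "is_a", "truck")
net.relate("fire truck", "has_part", "hose")
net.relate("fire truck", "has_part", "ladder")
net.relate("fire truck", "has_color", "red")
net.relate("fireman", "operates", "fire truck")

print(net.related("fire truck", "has_part"))
print(net.related("fireman", "operates"))
```

Because links are ordered, "fireman operates fire truck" does not imply the reverse, which is the distinction the labelled, directed relationships are meant to preserve.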

Web Learning:


Learning is a matter of acquiring new concepts with all their rich interlinkages by constructing new nodes and interrelating them with each other and with existing nodes (Norman, 1976). It stands to reason then that the more links that can be established between new nodes and the learner's existing network, the better comprehended information will be and therefore the easier it will be to learn or acquire new knowledge.

A learner's semantic network of concepts is often diagrammed and represented spatially as webs of information. This is a stylized depiction of a web of concepts into which new constructs may be integrated. Web learning principles assume that information, when learned, is integrated into prior knowledge by means of a web structure rather than in a linear fashion. New material is intertwined in the web at nodes that are related to it. The web grows as learners acquire more detailed information.

According to web teaching principles, the effective teacher presents material in a way that allows learners the opportunity to develop some framework for relating materials to each other (Norman, 1973). In order to connect new information to the learner's existing structure, the teacher can construct and present the supporting web structure first, adding detail later. The teacher can begin with a coarse web of information, outlining topics to be discussed and then giving a general overview, followed by detailed overviews, and finally detailed sub-structures.
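The coarse-to-fine ordering described here resembles a breadth-first walk over the topic web: every topic at one level is presented before any of its details. The sketch below assumes the web can be approximated by a mapping from each topic to its subtopics; the outline and the function name `web_teaching_order` are hypothetical.

```python
from collections import deque


def web_teaching_order(outline, root):
    """Return topics in breadth-first order: overview first, details later."""
    order = []
    seen = {root}
    queue = deque([root])
    while queue:
        topic = queue.popleft()
        order.append(topic)
        for subtopic in outline.get(topic, []):
            if subtopic not in seen:  # a topic linked twice is presented once
                seen.add(subtopic)
                queue.append(subtopic)
    return order


# Illustrative outline for a spreadsheet course (assumed).
outline = {
    "spreadsheet": ["cells", "commands"],
    "cells": ["address", "range"],
    "commands": ["copy", "save"],
    "copy": ["range"],
}

print(web_teaching_order(outline, "spreadsheet"))
```

A depth-first walk would instead exhaust the details of "cells" before "commands" was even mentioned, which is closer to a linear presentation than to the supporting web this passage recommends building first.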

Hypertext displays seem naturally adapted to web teaching methods. A good hypertext can begin by giving an advance organizer (Ausubel, 1978; Dansereau, 1990) showing a map of the whole text. This map represents the web of interrelated concepts contained in the hypertext. An introductory set of screens would then lead into the hypertext, which contains the detailed concepts. Fully applying a web learning system with appropriate teaching principles involves matching the network structure of the subject matter with the semantic network of the learner. In doing so, the teacher needs to:
- assess the knowledge of the learner;
- construct a model of the student's knowledge;
- compare the knowledge of the student with the expert knowledge structure, noting differences;
- present new material needed to fill the gaps, based upon web teaching strategies (Norman, 1976; Halff, 1988).

Hypertext is a natural complement to AI technologies for instructing in these ways.



The heart of this system, then, uses specially structured knowledge bases for efficient retrieval of information when it is needed. The topics of databases, knowledge bases, and information retrieval will be introduced next.
