the plan to recognize critical features for goal-attainment. The plan must be able to classify each situation it encounters according to the goal-directed transformation that should be applied to it. The long-term problem is to determine whether the set of detectors is adequate to this task. Important shortcomings are indicated when identical transformations, applied to situations that the detectors class as equivalent, yield situations with critically different evaluations. When this happens, the detectors have clearly failed to distinguish some feature that makes a critical difference as far as the transformations are concerned. The object, then, is to generate a detector that gives different readings for the previously indistinguishable situations. Among the obvious candidates are modifications of the detectors that made the distinction after the transformations were applied. Usually simple modifications will enable such detectors to make the distinction before the transformation as well as after.
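The adequacy test described above can be illustrated with a minimal sketch. Everything in it is a hypothetical toy chosen for illustration, not part of the original formulation: situations are the integers 0 to 5, the single detector `parity` and the single transformation `halve` are invented, and `evaluate` and `threshold` stand in for whatever evaluation the plan actually uses. The sketch groups situations by their detector readings, applies the same transformation to each pair within a group, and reports pairs whose results are evaluated very differently.

```python
from collections import defaultdict
from itertools import combinations

def find_conflicts(situations, detectors, transformations, evaluate, threshold):
    """Return (s1, s2, name) triples: the detectors read identically on s1
    and s2, yet transformation `name` yields critically different evaluations."""
    # Group situations into classes that the detectors cannot tell apart.
    classes = defaultdict(list)
    for s in situations:
        classes[tuple(d(s) for d in detectors)].append(s)
    conflicts = []
    for members in classes.values():
        for s1, s2 in combinations(members, 2):
            for name, t in transformations.items():
                if abs(evaluate(t(s1)) - evaluate(t(s2))) > threshold:
                    conflicts.append((s1, s2, name))
    return conflicts

# Hypothetical toy setup (all names below are assumptions for illustration).
situations = list(range(6))
parity = lambda s: s % 2                      # the only detector at first
transformations = {"halve": lambda s: s // 2}
evaluate = lambda s: 1.0 if s == 2 else 0.0   # situation 2 plays the goal

before = find_conflicts(situations, [parity], transformations, evaluate, 0.5)

# Candidate new detector: the distinction that was visible only AFTER the
# transformation (result == 2), modified so it can be read BEFORE it.
reaches_goal = lambda s: (s // 2) == 2
after = find_conflicts(situations, [parity, reaches_goal],
                       transformations, evaluate, 0.5)
```

In this toy case `before` lists the pairs that `parity` wrongly treats as equivalent, and `after` is empty once the modified detector is added. Composing the post-transformation test with the transformation is only one way to build such a detector; it is used here because it mirrors the "simple modification" the text mentions.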
We can look at this whole problem in another way, a way which makes contact with standard definitions in the theory of probability. Assume that the search plan assigns to each transformation h a probability dependent upon the observed situation. That is, if Sa is the current situation, then each situation Sb in the set of situations can be assigned a conditional probability of occurrence Pab, where Pab is simply the sum of the probabilities of all transformations leading from Sa to Sb. (It may, of course, be that there are no transformations of Sa to Sb, in which case Pab = 0.) A sequence of trials performed according to the probabilities Pab is a Markov chain, the outcome of each trial being a random variable (dependent upon the outcome of prior trials). The sample space underlying this random variable is the set of situations. Let us assign a measure of utility or relevance to each of these situations. (For example, goals could be assigned utility 1 and all other situations utility 0, or some more complicated assignment ranking goals and intermediate situations could be used.) Then, formally, the function W making this assignment is also a random variable. Accordingly, we can assign an expected utility to the random variable representing the outcome of each trial in the Markov chain. In these terms, the plan continually redefines the Markov chain (by changing the transformation probabilities). It attempts in this way to increase the average (over time) of the expected values of the sequence of random variables corresponding to its trials.
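A small numerical sketch may make the construction of Pab and the expected utility concrete. The problem below is entirely hypothetical: the four integer situations, the transformations `stay`, `step` and `jump`, and the probabilities returned by `plan` are assumptions chosen for illustration. Only the two rules come from the text: Pab is the sum of the probabilities of the transformations leading from Sa to Sb, and W assigns utility 1 to goals and 0 elsewhere.

```python
from collections import defaultdict

# Hypothetical toy problem: situations are 0..3 and situation 3 is the goal.
GOAL = 3

# Hypothetical transformations, each mapping a situation to a situation.
TRANSFORMS = {
    "stay": lambda s: s,
    "step": lambda s: min(s + 1, GOAL),
    "jump": lambda s: min(s + 2, GOAL),
}

def plan(s):
    """Assumed search plan: transformation probabilities that depend on
    the observed situation s."""
    if s == 2:
        return {"stay": 0.2, "step": 0.4, "jump": 0.4}
    return {"stay": 0.5, "step": 0.25, "jump": 0.25}

def transition_row(a):
    """Pab for fixed a: sum the probabilities of every transformation
    that leads from Sa to Sb. Situations that are absent have Pab = 0."""
    row = defaultdict(float)
    for name, prob in plan(a).items():
        row[TRANSFORMS[name](a)] += prob
    return dict(row)

def utility(s):
    """W: goals receive utility 1, all other situations utility 0."""
    return 1.0 if s == GOAL else 0.0

def expected_utility(a):
    """Expected utility of the outcome of one trial started from Sa."""
    return sum(p * utility(b) for b, p in transition_row(a).items())
```

From situation 2, both `step` and `jump` lead to the goal, so their probabilities add in the corresponding Pab. Changing the numbers returned by `plan` redefines the chain, which is the sense in which the plan tries to raise the expected utility of its trials over time.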
The role of detectors here is, as already suggested, reduction of the size of the sample space and simplification of the search. More formally, consider a set of n detectors (not necessarily all those available), H = {d1, ..., dn}, where H is arbitrarily ordered. The detectors in H assign to each situation an n-tuple of readings (v1, ..., vn) belonging to the direct product