< previous page page_69 next page >

Page 69
the subset designated by x, is also a random variable, A Îx being tried with probability ( P(A))/( SA'ÎxP(A')) and yielding payoff µ(A). In what follows x will be used to designate both an element of X and the corresponding random variable with sample space x, the particular usage being specified when it is not clear from context. As a random variable, x has a well-defined average µx (and variance C0085-01.gif) where, intuitively, µx is the payoff expected when an element of x is randomly selected under the marginal distribution ( P(A))/(SA'Îx P(A')).
Clearly, when C0085-03.gif, instances of x (i.e., A Îx) are likely to exhibit performance better than the current average C0084-05.gif. This suggests a simple procedure (a bit too simple as it turns out) for exploiting combinations of attributes associated with better-than-average performance while further searching C0021-03.gif: (i) try instances of various schemata until at least one schema x is located which exhibits a sample average C0085-02.gif; (ii) generate new instances of the (observed) above-average schema x, returning to step (i) from time to time (particularly when the increasing overall average C0085-05.gif comes close to C0085-04.gif) to locate new schemata {x'} for which C0085-02.gif. In effect, then, credit is apportioned to a combination of attributes in accord with the observed average performance of its instances. This procedure has some immediate advantages over a fixed random (or enumerative) search of C0021-03.gif: it generates improvements with high probability while gathering new information by trying new C0021-02.gif; furthermore, the new trials of the above-average schema x increase confidence that the observed average C0085-06.gif closely approximates µx. It is oversimple because each instance C0021-02.gif tried yields information about a great many schemata other than xinformation which is not used.
Given l detectors, a single structure C0021-02.gif is an instance of 2l distinct schemata, as can be easily affirmed by noting that A is an instance of any schema x defined by substituting "C0084-01.gif"s for one or more of the l attribute values in A's representation. Thus a single trial A constitutes a trial of 2l distinct random variables, yielding information about the expected payoff µx of each. (If l is only 20 this is still information about a million schemata!) Any procedure which uses even a fraction of this information to locate x for which C0085-07.gif has a substantial advantage over the one-at-a-time procedure just proposed.
Exploiting this tremendous flow of information poses a much more clearly defined challenge than the one which started the chapter. Schemata have advanced our understanding, in this sense, but the new problem is difficult. The amount of storage required quickly exceeds all feasible bounds if one attempts to record the average payoff of the observed instances of each schema sampled. Moreover, the information will be employed effectively only if it is used to generate new C0021-02.gif which, individually, test as many above-average schema as possible. The adaptive

 
< previous page page_69 next page >

If you like this book, buy it!