< previous page page_132 next page >

Page 132
a system which is simple and artificial, while the other is to a system which is complex and natural.
3. Robustness Vis-À-Vis A Simple Artificial Adaptive System
The first application concerns game-playing algorithms. The game-playing illustration (section 3.3) begins by pointing out that the outcome of a 2-person game without chance moves (a strictly determined game) is fixed once each player has selected a pure strategy. Assume, for present purposes, that the opponent has adopted the best pure strategy available to him (the minimax strategy) for use in all plays of the game. Then any pure strategy selected by the adaptive plan will lead to a unique outcome and a unique payoff (again, as pointed out in section 3.3see Figure 4). Thus, the function µ which assigns payoff to outcomes can be extended to the strategies C0039-11.gif employed by the adaptive plan, assigning to each strategy the unique payoff it achieves against the opponent's fixed strategy. (It is helpful, though not necessary, to think of these payoffs as wide rangingnumerical equivalents of "close win," "loss by a wide margin," etc., rather than just 1, 0, - 1 for "win,'' "draw," "loss.") The strategies available to the adaptive plan will be limited to a set of strategies fundamentally little different from the threshold pattern recognition devices of section 1.3. These strategies are based on the recognition and evaluation of positions (configurations) in the game tree and are substantially the same as those employed by Samuel in his 1959 checkers-player. Each strategy in C0039-11.gif is defined by a linear form C0148-03.gif where: (i) C0148-04.gif evaluates each configuration C0148-05.gif for a property relevant to winning the game (e.g., in checkers, d1 might assign to each configuration the difference in the number of kings on each side, d2 might count the number of pieces advanced beyond the centerline, etc.); (ii) C0148-06.gif weights the property according to its estimated importance in the play of the game. The linear form determines a move by assigning a rank
C0148-01.gif
is the set of configurations legally attainable on the yth move; then that move is chosen which leads to a configuration C0148-09.gif of maximal rank, i.e., C0148-07.gifC0148-08.gif.
The objective now is to find an adaptive plan which searches the set of strategies C0039-11.gif so that performance improves rapidly. To keep the example simple only the special case of correction of weights at the end of each play of the game

 
< previous page page_132 next page >

If you like this book, buy it!