|
|
|
|
|
|
|
a system which is simple and artificial, while the other is to a system which is complex and natural. |
|
|
|
|
|
|
|
|
3. Robustness Vis-À-Vis A Simple Artificial Adaptive System |
|
|
|
|
|
|
|
|
The first application concerns game-playing algorithms. The game-playing illustration (section 3.3) begins by pointing out that the outcome of a 2-person game without chance moves (a strictly determined game) is fixed once each player has selected a pure strategy. Assume, for present purposes, that the opponent has adopted the best pure strategy available to him (the minimax strategy) for use in all plays of the game. Then any pure strategy selected by the adaptive plan will lead to a unique outcome and a unique payoff (again, as pointed out in section 3.3see Figure 4). Thus, the function µ which assigns payoff to outcomes can be extended to the strategies employed by the adaptive plan, assigning to each strategy the unique payoff it achieves against the opponent's fixed strategy. (It is helpful, though not necessary, to think of these payoffs as wide rangingnumerical equivalents of "close win," "loss by a wide margin," etc., rather than just 1, 0, - 1 for "win,'' "draw," "loss.") The strategies available to the adaptive plan will be limited to a set of strategies fundamentally little different from the threshold pattern recognition devices of section 1.3. These strategies are based on the recognition and evaluation of positions (configurations) in the game tree and are substantially the same as those employed by Samuel in his 1959 checkers-player. Each strategy in is defined by a linear form where: (i) evaluates each configuration for a property relevant to winning the game (e.g., in checkers, d1 might assign to each configuration the difference in the number of kings on each side, d2 might count the number of pieces advanced beyond the centerline, etc.); (ii) weights the property according to its estimated importance in the play of the game. The linear form determines a move by assigning a rank |
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
|
|
|
is the set of configurations legally attainable on the yth move; then that move is chosen which leads to a configuration of maximal rank, i.e.,  . |
|
|
|
|
|
|
|
|
The objective now is to find an adaptive plan which searches the set of strategies so that performance improves rapidly. To keep the example simple only the special case of correction of weights at the end of each play of the game |
|
|
|
|
|