|
|
|
|
 |
|
|
Lacking such knowledge [of machine-learning techniques], it is necessary to specify methods of problem solution in minute and exact detail, a time-consuming and costly procedure. Programming computers to learn from experience should eventually eliminate the need for much of this detailed programming effort. Samuel in "Some Studies in Machine Learning Using the Game of Checkers" IBM J. Res. Dev. 3 (p. 211) |
|
|
|
|
|
|
|
|
Most competitive games played by man (board games, card games, etc.) can be presented in terms of a tree of moves where each vertex (point, node) of the tree corresponds to a possible game configuration and each directed edge (arrow) leading from a given vertex represents a legal move of the game. The edge points to a new vertex corresponding to a configuration which can be attained from the given one in one move (turn, action); the options open to a player from a given configuration are thus indicated by the edges leading from the corresponding vertex. The tree has a single distinguished vertex with no edges leading into it, the initial vertex, and there are terminal vertices, having no edges leading from them, which designate outcomes of the game. In a typical two-person game which does not involve chance, the first player selects one of the options leading from the initial configuration; then the second player selects one of the options leading from the resulting configuration; the play of the game proceeds with the two players alternately selecting options. The result is a path from the initial vertex to some terminal vertex. The outcomes are ranked, usually by a payoff function which assigns a value to each terminal vertex. |
|
|
|
|
|
|
|
|
In these terms, a pure strategy for a given player is an algorithm (program, procedure) which, for each nonterminal configuration, selects a particular option leading therefrom. Once each player chooses a pure strategy, the outcome of the game is completely determined, although in practice it is usually possible to determine this outcome only by actually playing the game. Thus, in a strictly determined (non-chance) two-person game, each pair of pure strategies (one for each player) can be assigned a unique payoff. The object of either player, then, is to find a strategy which does as well as possible against the opponent as measured by the expected payoff. This informal object ramifies into a whole series of cases, depending upon the initial information about the opponent and the form of the game. |
|
|
|
|
|
|
|
|
One of the simplest cases occurs when it is known that the opponent, say the second player, has selected a single pure strategy for all future plays of the game. |
|
|
|
|
|