
limited resources. To see this, consider the following idealized situation. There are two one-armed bandits: bandit x1 pays 1 unit with probability p1 on each trial, and bandit x2 pays 1 unit with probability p2 < p1. There are also M players. The casino is so organized that the bandits are continuously (and simultaneously) operated, so that at any time t, for a modest fee, a player may elect to receive the payoff (possibly zero) of one of the two bandits. The manager has, however, introduced a gimmick. If M1 players elect to play bandit x1 at time t, they must share the unit of payoff if the outcome is successful. That is, on that particular trial, each of the M1 players will receive a payoff of 1/M1 with probability p1, for an expected payoff of p1/M1. Now, let us assume that the M players must participate for a period of T consecutive trials. If there is but one player (M = 1), clearly he will maximize his income (or minimize his losses) by playing bandit x1 at all times. However, if there are M > 1 players the situation changes. There will be stable queues, where no player can improve his payoff by shifting from one bandit to another. These occur when the expected payoffs per player are equal, p1/M1 = p2/M2, that is, when the players distribute themselves in the ratio M1/M2 = p1/p2 (at least as closely as allowed by the requirement that M1 and M2 be integers summing to M). For example, if p1 = 1/2, p2 = 1/8, and M = 10, there will be 8 players queued in front of bandit x1 and 2 players in front of bandit x2, each with an expected payoff of 1/16 per trial. We see that with limited resources (in the numerical example, a maximum of 2 units payoff per trial and an expectation of 5/8 unit) the population of players must divide into two subpopulations in order to optimize individual shares of the resources (the "bandit x1 players" and the "bandit x2 players"). Similar considerations apply when there are r > 2 bandits.
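The stable queues in the numerical example can be checked with a short sketch. It assumes a simple seating rule that is not in the text: players arrive one at a time and each joins the bandit with the largest expected share p/(queue length + 1). The function names `stable_queues` and `is_stable` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def stable_queues(probs, num_players):
    """Seat players one at a time at the bandit with the best expected share.

    Assumption (not from the text): a newcomer at bandit i expects
    p_i / (M_i + 1), and picks the bandit where this is largest.
    """
    counts = [0] * len(probs)
    for _ in range(num_players):
        best = max(range(len(probs)), key=lambda i: probs[i] / (counts[i] + 1))
        counts[best] += 1
    return counts

def is_stable(probs, counts):
    """Return True if no single player gains by shifting to another bandit."""
    for i, m_i in enumerate(counts):
        if m_i == 0:
            continue
        current = probs[i] / m_i  # expected share where the player is now
        for j, m_j in enumerate(counts):
            # Expected share after moving to bandit j (queue grows by one).
            if j != i and probs[j] / (m_j + 1) > current:
                return False
    return True

# Values from the example in the text: p1 = 1/2, p2 = 1/8, M = 10.
probs = [Fraction(1, 2), Fraction(1, 8)]
counts = stable_queues(probs, 10)
print(counts, is_stable(probs, counts))  # [8, 2] True
```

Exact fractions are used so that ties between the two bandits are compared exactly rather than through floating-point rounding. The same functions accept r > 2 probabilities.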
We have here a rough analogy to the differentiation of individuals (the subpopulations) to exploit environmental niches (the bandits). The analogy can be made more precise by recasting it in terms of schemata. Let us consider a population of $M$ individuals and the set of $2^{l_0}$ schemata defined on a given set of $l_0$ positions. Assume that schema $x_i$, $i = 1, \ldots, 2^{l_0}$, exploits a unique "environmental niche" which produces a total of $Q_i$ units of payoff per time-step. ($Q_i$ corresponds to the renewal rate of a critical, volatile resource exploited by $x_i$.) If the population contains $M_i$ instances of $x_i$, the $Q_i$ units are shared among them so that each instance of $x_i$ receives a payoff of $Q_i/M_i$. Rank the schemata in order of decreasing payoff, $Q_{(1)} > Q_{(2)} > \cdots > Q_{(2^{l_0})}$, so that schema $x_{(1)}$ is associated with the most productive niche, $x_{(2)}$ with the second most productive niche, etc. Clearly when $M_{(1)}$ is large enough that $Q_{(1)}/M_{(1)} < Q_{(2)}$, an instance of $x_{(2)}$ will be at a reproductive advantage. Following the same line of argument as in the case of the two one-armed bandits, we get as a stable distribution the obvious generalization:
$$\frac{M_{(i)}}{M_{(j)}} = \frac{Q_{(i)}}{Q_{(j)}} \quad \text{for all } i, j$$
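As a rough numerical illustration, take the stable distribution to be the one that equalizes the per-instance payoff Qi/Mi across niches, by analogy with the two-bandit ratio M1/M2 = p1/p2. The sketch below then computes the niche occupancies for a small case. The payoff rates and population size are hypothetical values chosen so the shares come out as integers. The helper name `proportional_shares` is likewise an assumption, and the integer constraint on the Mi is ignored.

```python
from fractions import Fraction

def proportional_shares(Q, M):
    """Split M individuals across niches in proportion to payoff rates Q.

    Assumption: the stable distribution equalizes Q_i / M_i, which gives
    M_i = M * Q_i / sum(Q). Shares are exact fractions and may be
    non-integer for other inputs.
    """
    total = sum(Q)
    return [Fraction(M) * q / total for q in Q]

# Hypothetical payoff rates, already ranked Q(1) > Q(2) > Q(3).
Q = [Fraction(6), Fraction(3), Fraction(1)]
M = 20

shares = proportional_shares(Q, M)
payoffs = [q / m for q, m in zip(Q, shares)]

print([int(s) for s in shares])   # [12, 6, 2]
print(payoffs)                    # every instance receives sum(Q) / M = 1/2
```

With these values each instance receives the same payoff, 1/2 unit per time-step, so under the stated assumption no instance is at a reproductive advantage over any other.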

 