< previous page page_76 next page >

Page 76
which use schemata to compare structures in C0021-03.gif. The objective of this chapter is to determine this criterion. In the process we will learn a good deal more about schemata and intrinsic parallelism.
1. The 2-Armed Bandit
The simplest precise version of the optimal allocation problem arises when we restrict attention to two random variables, x and x', with only two possible payoffs, 0 or 1. A trial of x produces the payoff 1 with probability p1 and the payoff 0 with probability 1 - p1; similarly x' produces 1 with probability p2 and 0 with probability 1 - p2. (For example, such trials could be produced by flipping either of two unbalanced coins, one of which produces heads with probability p1, the other with probability p2.) One is allowed N trials on each of which either x or x' can be selected. The object is to maximize the total payoff(the cumulative number of heads). Clearly if we know that x produces payoff 1 with a higher probability then all N trials should be allocated to x with a resulting expected accumulation p1·N. On the other hand if we know nothing initially about x and x' it would be unwise not to test both. How trials should be allocated to accomplish this is certainly not immediately obvious. (This is a version of the much-studied 2-armed bandit problem, a prototype of important decision problems. Bellman [1961] and Hellman and Cover [1970] give interesting discussions of the problem.)
If we allow the two random variables to be completely general (having probability distributions over an arbitrary number of outcomes), we get a slight generalization of the original problem which makes direct contact with our discussion of schemata. The outcome of a trial of either random variable is to be interpreted as a payoff (performance). The object once more is to discover a procedure for distributing an arbitrary number of trials, N, between x and x' so as to maximize the expected payoff over the N trials. As before, if we know for each xi the mean and variance C0092-03.gif of its distribution (actually the mean µi would suffice), the problem has a trivial solution (allocate all trials to the random variable with maximal mean). The conflict asserts itself, however, if we inject just a bit more uncertainty. Thus we can know the mean-variance pairs but not which variable is described by which pair; i.e., we know the pairs C0092-01.gif and C0092-02.gif but not which pair describes x.
If it could be determined through some small number of trials which of x and x' has the higher mean, then from that point on all trials could be allocated to that random variable. Unfortunately, unless the distributions are non-overlapping, no finite number of observations will establish with certainty which random variable has the higher mean. (E.g., given µx > µx' along with a probability p > 0 that a trial of x' will yield an outcome x > µx, there is still a probability pN after N

 
< previous page page_76 next page >

If you like this book, buy it!