5. Control and Function Optimization
The fact that we need time to determine the minimum of the [performance] functional or the optimal [control] vector c* is sad, but unavoidable; it is a cost that we have to pay in order to solve a complex problem in the presence of uncertainty. . . . adaptation and learning are characterized by a sequential gathering of information and the usage of current information to eliminate the uncertainty created by insufficient a priori information.
Tsypkin in Adaptation and Learning in Automatic Systems (p. 69)
In the usual version, a controlled process is defined in terms of a set of variables {x1, . . . , xk} which are to be controlled. (For example, a simple process of air conditioning may involve three critical variables: temperature, humidity, and air flow.) The set of states, or the phase space for the process, X, is the set of all possible combinations of values for these variables. (Thus, for an air conditioning process the phase space would be a 3-dimensional space of all triples of real numbers (x1, x2, x3), where the temperature x1 in degrees centigrade might have a range 0 ≤ x1 ≤ 50, etc.) Permissible changes or transitions in phase space are determined as a function of the state variable itself and a set of control parameters C. Typically X is a region in n-dimensional Euclidean space and the control parameters assume values in a region C of an m-dimensional space. Accordingly, the equation takes the form of a "law of motion" in the space X,
$$\frac{dX}{dt} = f(X(t), C(t)).$$
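To make the idea of a law of motion concrete, the following is a minimal sketch rather than anything taken from the text. It assumes a continuous-time law dX/dt = f(X, C) and approximates it with forward Euler steps. The function `air_law`, its relaxation rate of 0.5, and the setpoint values are all hypothetical, chosen only to give the air conditioning example three state variables (temperature, humidity, air flow) that respond to three control parameters.

```python
from typing import Callable, List, Sequence

State = List[float]
Control = Sequence[float]
Law = Callable[[Sequence[float], Control], State]


def euler_step(f: Law, x: Sequence[float], c: Control, dt: float) -> State:
    """Advance the state by one forward Euler step of dX/dt = f(X, C)."""
    dx = f(x, c)
    return [xi + dt * dxi for xi, dxi in zip(x, dx)]


def air_law(x: Sequence[float], c: Control) -> State:
    """Hypothetical law of motion: each variable relaxes toward its setpoint.

    x = (temperature, humidity, air flow); c holds one setpoint per variable.
    The rate 0.5 is an arbitrary illustrative constant.
    """
    rate = 0.5
    return [rate * (ci - xi) for xi, ci in zip(x, c)]


def simulate(f: Law, x0: Sequence[float],
             control: Callable[[float], Control],
             dt: float, steps: int) -> List[State]:
    """Return the trajectory [X(0), X(dt), ..., X(steps * dt)]."""
    x = list(x0)
    trajectory = [x]
    for n in range(steps):
        x = euler_step(f, x, control(n * dt), dt)
        trajectory.append(x)
    return trajectory


# Usage: hold the control parameters fixed and watch the state settle.
trajectory = simulate(air_law, [30.0, 80.0, 1.0],
                      lambda t: [22.0, 50.0, 2.0], dt=0.1, steps=200)
print(trajectory[-1])
```

Here the control is a function of time only, so the trajectory through the phase space is fully determined once the starting state and the control schedule are fixed.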
Often X will have several components X1, . . . , Xk following distinct laws f1, . . . , fk so that
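$$\frac{dX}{dt} = \left(\frac{dX_1}{dt}, \ldots, \frac{dX_k}{dt}\right) = \bigl(f_1(X_1(t), C(t)), \ldots, f_k(X_k(t), C(t))\bigr).$$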
For example, given a pursuit problem with a moving target having coordinates X2(t) at time t, f2(X2(t), C(t)) would be the law of motion of the target while f1(X1(t), C(t)) would determine the pursuit curve. If some component, say X3, represents time, then f3(X3(t), C(t)) = t and the law of motion becomes an explicit function of time.
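The pursuit example can be sketched in the same spirit. The code below is an illustration under stated assumptions, not the author's formulation: the state has two components, the pursuer X1 and the target X2, each with its own law (`f1`, `f2`). The control C is taken to be a unit heading vector, and the helper `heading_toward`, the two speeds, and the capture radius are all hypothetical choices. Integration again uses forward Euler steps on an assumed continuous-time law.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


def f1(x1: Point, c: Point) -> Point:
    """Pursuer's law: move at a fixed (hypothetical) speed along heading c."""
    speed = 2.0
    return (speed * c[0], speed * c[1])


def f2(x2: Point, c: Point) -> Point:
    """Target's law: constant velocity along +x, unaffected by the control."""
    return (1.0, 0.0)


def heading_toward(x1: Point, x2: Point) -> Point:
    """Hypothetical control rule: unit vector from pursuer to target."""
    dx, dy = x2[0] - x1[0], x2[1] - x1[1]
    d = math.hypot(dx, dy)
    return (dx / d, dy / d) if d > 0 else (0.0, 0.0)


def pursue(x1: Point, x2: Point, dt: float = 0.01, max_steps: int = 10_000,
           capture_radius: float = 0.05) -> Optional[float]:
    """Return the approximate capture time, or None if no capture occurs."""
    for step in range(max_steps):
        if math.dist(x1, x2) <= capture_radius:
            return step * dt
        c = heading_toward(x1, x2)
        v1, v2 = f1(x1, c), f2(x2, c)
        # Each component advances under its own law, using the shared control.
        x1 = (x1[0] + dt * v1[0], x1[1] + dt * v1[1])
        x2 = (x2[0] + dt * v2[0], x2[1] + dt * v2[1])
    return None


# Usage: a tail chase, with the target starting 5 units ahead of the pursuer.
print(pursue((0.0, 0.0), (5.0, 0.0)))
```

Note that the pursuer's law depends only on its own position and the control, so all information about the target reaches the pursuer through the choice of C at each step.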
When a rule or policy A is given for selecting elements of C as a function of time, a unique trajectory through X is determined by the law of motion f. The object is to select a policy A for minimizing a given function J which assigns a performance or cost to each possible trajectory. In practice, the function J is usually determined as the cumulation over time of some instanta-