为了正常的体验网站,请在浏览器设置里面开启Javascript功能!
首页 > 线性规划notes-MIT

线性规划notes-MIT

2013-12-10 40页 pdf 264KB 43阅读

用户头像

is_070125

暂无简介

举报
线性规划notes-MIT 18.415/6.854 Advanced Algorithms October 1994 Linear Programming Lecturer: Michel X. Goemans 1 An Introduction to Linear Programming Linear programming is a very important class of problems, both algorithmically and combinatorially. Linear programming has many ...
线性规划notes-MIT
18.415/6.854 Advanced Algorithms October 1994 Linear Programming Lecturer: Michel X. Goemans 1 An Introduction to Linear Programming Linear programming is a very important class of problems, both algorithmically and combinatorially. Linear programming has many applications. From an algorithmic point-of-view, the simplex was proposed in the forties (soon after the war, and was motivated by military applications) and, although it has performed very well in prac- tice, is known to run in exponential time in the worst-case. On the other hand, since the early seventies when the classes P and NP were defined, it was observed that linear programming is in NP∩ co-NP although no polynomial-time algorithm was known at that time. The first polynomial-time algorithm, the ellipsoid algorithm, was only dis- covered at the end of the seventies. Karmarkar’s algorithm in the mid-eighties lead to very active research in the area of interior-point methods for linear programming. We shall present one of the numerous variations of interior-point methods in class. From a combinatorial perspective, systems of linear inequalities were already studied at the end of the last century by Farkas and Minkovsky. Linear programming, and especially the notion of duality, is very important as a proof technique. We shall illustrate its power when discussing approximation algorithms. We shall also talk about network flow algorithms where linear programming plays a crucial role both algorithmically and combinatorially. For a more in-depth coverage of linear programming, we refer the reader to [1, 4, 7, 8, 5]. A linear program is the problem of optimizing a linear objective function in the decision variables, x1 . . . xn, subject to linear equality or inequality constraints on the xi’s. In standard form, it is expressed as: Min n∑ j=1 cjxj (objective function) subject to: n∑ j=1 aijxj = bi, i = 1 . . .m (constraints) xj ≥ 0, j = 1 . . . n (non-negativity constraints) where {aij, bi, cj} are given. A linear program is expressed more conveniently using matrices: min cTx subject to { Ax = b x ≥ 0 LP-1 where x =   x1 ... xn   ∈ Rn×1 b =   b1 ... bm   ∈ Rm×1 c =   c1 ... cn   ∈ Rn×1 A =   a11 . . . amn   ∈ Rm×n 2 Basic Terminology Definition 1 If x satisfies Ax = b, x ≥ 0, then x is feasible. Definition 2 A linear program (LP) is feasible if there exists a feasible solution, otherwise it is said to be infeasible. Definition 3 An optimal solution x∗ is a feasible solution s.t. cTx∗ = min{cTx : Ax = b, x ≥ 0}. Definition 4 LP is unbounded (from below) if ∀λ ∈ R, ∃ a feasible x∗ s.t. cTx∗ ≤ λ. 3 Equivalent Forms A linear program can take on several forms. We might be maximizing instead of minimizing. We might have a combination of equality and inequality contraints. Some variables may be restricted to be non-positive instead of non-negative, or be unrestricted in sign. Two forms are said to be equivalent if they have the same set of optimal solutions or are both infeasible or both unbounded. 1. A maximization problem can be expressed as a minimization problem. max cTx⇔ min−cTx 2. An equality can be represented as a pair of inequalities. aTi x = bi ⇔ { aTi x ≤ bi aTi x ≥ bi ⇔ { aTi x ≤ bi −aTi x ≤ −bi LP-2 3. By adding a slack variable, an inequality can be represented as a combination of equality and non-negativity constraints. aTi x ≤ bi ⇔ aTi x+ si = bi, si ≥ 0. 4. Non-positivity constraints can be expressed as non-negativity constraints. To express xj ≤ 0, replace xj everywhere with −yj and impose the condition yj ≥ 0. 5. x may be unrestricted in sign. If x is unrestricted in sign, i.e. non-positive or non-negative, everywhre replace xj by x + j − x−j , adding the constraints x+j , x−j ≥ 0. In general, an inequality can be represented using a combination of equality and non-negativity constraints, and vice versa. Using these rules, min { cTx s.t. Ax ≥ b} can be transformed into min { cTx+ − cTx− s.t. Ax+ −Ax− − Is = b, x+, x−, s ≥ 0}. The former LP is said to be in canonical form, the latter in standard form. Conversely, an LP in standard form may be written in canonical form. min { cTx s.t. Ax = b, x ≥ 0} is equivalent to min { cTx s.t. Ax ≥ b, −Ax ≥ −b, Ix ≥ 0}. This may be rewritten as A ′ x ≥ b′ , where A′ =   A-A I   and b′ =   b-b 0  . 4 Example Consider the following linear program: min x2 subject to   x1 ≥ 2 3x1 − x2 ≥ 0 x1 + x2 ≥ 6 −x1 + 2x2 ≥ 0 The optimal solution is (4, 2) of cost 2 (see Figure 1). If we were maximizing x2 instead of minimizing under the same feasible region, the resulting linear program would be unbounded since x2 can increase arbitrarily. From this picture, the reader should be convinced that, for any objective function for which the linear program is bounded, there exists an optimal solution which is a “corner” of the feasible region. We shall formalize this notion in the next section. LP-3 (2,1) (2,4) x1 x2 3x − x > 021 1 1 1 x + x > 6 x > 2 2 2−x + 2x > 0 Figure 1: Graph representing primal in example. An example of an infeasible linear program can be obtained by reversing some of the inequalities of the above LP: x1 ≤ 2 3x1 − x2 ≥ 0 x1 + x2 ≥ 6 −x1 + 2x2 ≤ 0. 5 The Geometry of LP Let P = {x : Ax = b, x ≥ 0} ⊆ Rn. Definition 5 x is a vertex of P if 6 ∃y 6= 0 s.t. x+ y, x− y ∈ P . Theorem 1 Assume min{cTx : x ∈ P} is finite, then ∀x ∈ P, ∃ a vertex x′ such that cTx ′ ≤ cTx. Proof: If x is a vertex, then take x ′ = x. If x is not a vertex, then, by definition, ∃y 6= 0 s.t. x + y, x − y ∈ P . Since A(x+ y) = b and A(x− y) = b, Ay = 0. LP-4 WLOG, assume cTy ≤ 0 (take either y or −y). If cTy = 0, choose y such that ∃j s.t. yj < 0. Since y 6= 0 and cTy = cT (−y) = 0, this must be true for either y or −y. Consider x + λy, λ > 0. cT (x + λy) = cTx + λcTy ≤ cTx, since cTy is assumed non-positive. Case 1 ∃j such that yj < 0 As λ increases, component j decreases until x+ λy is no longer feasible. Choose λ = min{j:yj<0}{xj/−yj} = xk/−yk. This is the largest λ such that x + λy ≥ 0. Since Ay = 0, A(x + λy) = Ax + λAy = Ax = b. So x + λy ∈ P , and moreover x+ λy has one more zero component, (x+ λy)k, than x. Replace x by x+ λy. Case 2 yj ≥ 0 ∀j By assumption, cTy < 0 and x+ λy is feasible for all λ ≥ 0, since A(x+ λy) = Ax+λAy = Ax = b, and x+λy ≥ x ≥ 0. But cT (x+λy) = cTx+λcTy → −∞ as λ→∞, implying LP is unbounded, a contradiction. Case 1 can happen at most n times, since x has n components. By induction on the number of non-zero components of x, we obtain a vertex x ′ . ✷ Remark: The theorem was described in terms of the polyhedral set P = {x : Ax = b : x ≥ 0}. Strictly speaking, the theorem is not true for P = {x : Ax ≥ b}. Indeed, such a set P might not have any vertex. For example, consider P = {(x1, x2) : 0 ≤ x2 ≤ 1} (see Figure 2). This polyhedron has no vertex, since for any x ∈ P , we have x + y, x − y ∈ P , where y = (1, 0). It can be shown that P has a vertex iff Rank(A) = n. Note that, if we transform a program in canonical form into standard form, the non-negativity constraints imply that the resulting matrix A has full column rank, since Rank   A-A I   = n. Corollary 2 If min{cTx : Ax = b, x ≥ 0} is finite, There exists an optimal solution, x∗, which is a vertex. Proof: Suppose not. Take an optimal solution. By Theorem 1 there exists a vertex costing no more and this vertex must be optimal as well. ✷ Corollary 3 If P = {x : Ax = b, x ≥ 0} 6= ∅, then P has a vertex. LP-5 (0,1) (0,0) x x 1 2 Figure 2: A polyhedron with no vertex. Theorem 4 Let P = {x : Ax = b, x ≥ 0}. For x ∈ P , let Ax be a submatrix of A corresponding to j s.t. xj > 0. Then x is a vertex iff Ax has linearly independent columns. (i.e. Ax has full column rank.) Example A =   2 1 3 07 3 2 1 0 0 0 5   x =   2 0 1 0   Ax =   2 37 2 0 0  , and x is a vertex. Proof: Show ¬i→ ¬ii. Assume x is not a vertex. Then, by definition, ∃y 6= 0 s.t. x + y, x − y ∈ P . Let Ay be submatrix corresponding to non-zero components of y. As in the proof of Theorem 1, Ax + Ay = b Ax − Ay = b } ⇒ Ay = 0. Therefore, Ay has dependent columns since y 6= 0. Moreover, x + y ≥ 0 x − y ≥ 0 } ⇒ yj = 0 whenever xj = 0. Therefore Ay is a submatrix of Ax. Since Ay is a submatrix of Ax, Ax has linearly dependent columns. LP-6 Show ¬ii→ ¬i. Suppose Ax has linearly dependent columns. Then ∃y s.t. Axy = 0, y 6= 0. Extend y to Rn by adding 0 components. Then ∃y ∈ Rn s.t. Ay = 0, y 6= 0 and yj = 0 wherever xj = 0. Consider y ′ = λy for small λ ≥ 0. Claim that x+ y′, x− y′ ∈ P , by argument analogous to that in Case 1 of the proof of Theorem 1, above. Hence, x is not a vertex. ✷ 6 Bases Let x be a vertex of P = {x : Ax = b, x ≥ 0}. Suppose first that |{j : xj > 0}| = m (where A is m × n). In this case we denote B = {j : xj > 0}. Also let AB = Ax; we use this notation not only for A and B, but also for x and for other sets of indices. Then AB is a square matrix whose columns are linearly independent (by Theorem 4), so it is non-singular. Therefore we can express x as xj = 0 if j 6∈ B, and since ABxB = b, it follows that xB = A −1 B b. The variables corresponding to B will be called basic. The others will be referred to as nonbasic. The set of indices corresponding to nonbasic variables is denoted by N = {1, . . . , n} − B. Thus, we can write the above as xB = A −1 B b and xN = 0. Without loss of generality we will assume that A has full row rank, rank(A) = m. Otherwise either there is a redundant constraint in the system Ax = b (and we can remove it), or the system has no solution at all. If |{j : xj > 0}| < m, we can augment Ax with additional linearly independent columns, until it is an m×m submatrix of A of full rank, which we will denote AB. In other words, although there may be less than m positive components in x, it is convenient to always have a basis B such that |B| = m and AB is non-singular. This enables us to always express x as we did before, xN = 0, xB = A −1 B b. Summary x is a vertex of P iff there is B ⊆ {1, . . . , n} such that |B| = m and 1. xN = 0 for N = {1, . . . , n} −B 2. AB is non-singular 3. xB = A −1 B b ≥ 0 In this case we say that x is a basic feasible solution. Note that a vertex can have several basic feasible solution corresponding to it (by augmenting {j : xj > 0} in different ways). A basis might not lead to any basic feasible solution since A−1B b is not necessarily nonnegative. LP-7 Example: x1 + x2 + x3 = 5 2x1 − x2 + 2x3 = 1 x1, x2, x3 ≥ 0 We can select as a basis B = {1, 2}. Thus, N = {3} and AB = ( 1 1 2 −1 ) A−1B = ( 1 3 1 3 2 3 −1 3 ) A−1B b = ( 2 3 ) x = (2, 3, 0) Remark. A crude upper bound on the number of vertices of P is ( n m ) . This number is exponential (it is upper bounded by nm). We can come up with a tighter approx- imation of ( n−m 2 m 2 ) , though this is still exponential. The reason why the number is much smaller is that most basic solutions to the system Ax = b (which we counted) are not feasible, that is, they do not satisfy x ≥ 0. 7 The Simplex Method The Simplex algorithm [Dantzig,1947] [2] solves linear programming problems by focusing on basic feasible solutions. The basic idea is to start from some vertex v and look at the adjacent vertices. If an improvement in cost is possible by moving to one of the adjacent vertices, then we do so. Thus, we will start with a bfs corresponding to a basis B and, at each iteration, try to improve the cost of the solution by removing one variable from the basis and replacing it by another. We begin the Simplex algorithm by first rewriting our LP in the form: min cBxB + cNxN s.t. ABxB + ANxN = b xB, xN ≥ 0 Here B is the basis corresponding to the bfs we are starting from. Note that, for any solution x, xB = A −1 B b − A−1B ANxN and that its total cost, cTx can be specified as follows: LP-8 cTx = cBxB + cNxN = cB(A −1 B b− A−1B ANxN ) + cNxN = cBA −1 B b+ (cN − cBA−1B AN)xN We denote the reduced cost of the non-basic variables by c˜N , c˜N = cN −cBA−1B AN , i.e. the quantity which is the coefficient of xN above. If there is a j ∈ N such that c˜j < 0, then by increasing xj (up from zero) we will decrease the cost (the value of the objective function). Of course xB depends on xN , and we can increase xj only as long as all the components of xB remain positive. So in a step of the Simplex method, we find a j ∈ N such that c˜j < 0, and increase it as much as possible while keeping xB ≥ 0. It is not possible any more to increase xj , when one of the components of xB is zero. What happened is that a non-basic variable is now positive and we include it in the basis, and one variable which was basic is now zero, so we remove it from the basis. If, on the other hand, there is no j ∈ N such that c˜j < 0, then we stop, and the current basic feasible solution is an optimal solution. This follows from the new expression for cTx since xN is nonnegative. Remarks: 1. Note that some of the basic variables may be zero to begin with, and in this case it is possible that we cannot increase xj at all. In this case we can replace say j by k in the basis, but without moving from the vertex corresponding to the basis. In the next step we might replace k by j, and be stuck in a loop. Thus, we need to specify a “pivoting rule” to determine which index should enter the basis, and which index should be removed from the basis. 2. While many pivoting rules (including those that are used in practice) can lead to infinite loops, there is a pivoting rule which will not (known as the minimal index rule - choose the minimal j and k possible [Bland, 1977]). This fact was discovered by Bland in 1977. There are other methods of “breaking ties” which eliminate infinite loops. 3. There is no known pivoting rule for which the number of pivots in the worst case is better than exponential. 4. The question of the complexity of the Simplex algorithm and the last remark leads to the question of what is the length of the shortest path between two vertices of a convex polyhedron, where the path is along edges, and the length of the path in measured in terms of the number of vertices visited. LP-9 Hirsch Conjecture: For m hyperplanes in d dimensions the length of the shortest path between any two vertices of the arrangement is at most m− d. This is a very open question — there is not even a polynomial bound proven on this length. On the other hand, one should note that even if the Hirsch Conjecture is true, it doesn’t say much about the Simplex Algorithm, because Simplex generates paths which are monotone with respect to the objective function, whereas the shortest path need not be monotone. Recently, Kalai (and others) has considered a randomized pivoting rule. The idea is to randomly permute the index columns of A and to apply the Simplex method, always choosing the smallest j possible. In this way, it is possible to show a subexponential bound on the expected number of pivots. This leads to a subexponential bound for the diameter of any convex polytope defined by m hyperplanes in a d dimension space. The question of the existence of a polynomial pivoting scheme is still open though. We will see later a completely different algorithm which is polynomial, although not strongly polynomial (the existence of a strongly polynomial algo- rithm for linear programming is also open). That algorithm will not move from one vertex of the feasible domain to another like the Simplex, but will confine its interest to points in the interior of the feasible domain. A visualization of the geometry of the Simplex algorithm can be obtained from considering the algorithm in 3 dimensions (see Figure 3). For a problem in the form min{cTx : Ax ≤ b} the feasible domain is a polyhedron in R3, and the algorithm moves from vertex to vertex in each step (or does not move at all). 8 When is a Linear Program Feasible ? We now turn to another question which will lead us to important properties of linear programming. Let us begin with some examples. We consider linear programs of the form Ax = b, x ≥ 0. As the objective function has no effect on the feasibility of the program, we ignore it. We first restrict our attention to systems of equations (i.e. we neglect the non- negativity constraints). Example: Consider the system of equations: x1 + x2 + x3 = 6 2x1 + 3x2 + x3 = 8 2x1 + x2 + 3x3 = 0 and the linear combination LP-10 Objective function Figure 3: Traversing the vertices of a convex body (here a polyhedron in R3). −4 × x1 + x2 + x3 = 6 1 × 2x1 + 3x2 + x3 = 8 1 × 2x1 + x2 + 3x3 = 0 The linear combination results in the equation 0x1 + 0x2 + 0x3 = −16 which means of course that the system of equations has no feasible solution. In fact, an elementary theorem of linear algebra says that if a system has no solution, there is always a vector y such as in our example (y = (−4, 1, 1)) which proves that the system has no solution. Theorem 5 Exactly one of the following is true for the system Ax = b: 1. There is x such that Ax = b. 2. There is y such that ATy = 0 but yT b = 1. This is not quite enough for our purposes, because a system can be feasible, but still have no non-negative solutions x ≥ 0. Fortunately, the following lemma establishes the equivalent results for our system Ax = b, x ≥ 0. Theorem 6 (Farkas’ Lemma) Exactly one of the following is true for the system Ax = b, x ≥ 0: LP-11 1. There is x such that Ax = b, x ≥ 0. 2. There is y such that ATy ≥ 0 but bTy < 0. Proof: We will first show that the two conditions cannot happen together, and then than at least one of them must happen. Suppose we do have both x and y as in the statement of the theorem. Ax = b =⇒ yTAx = yT b =⇒ xTATy = yT b but this is a contradiction, because yT b < 0, and since x ≥ 0 and ATy ≥ 0, so xTAT y ≥ 0. The other direction is less trivial, and usually shown using properties of the Sim- plex algorithm, mainly duality. We will use another tool, and later use Farkas’ Lemma to prove properties about duality in linear programming. The tool we shall use is the Projection theorem, which we state without proof: Theorem 7 (Projection Theorem) Let K be a closed convex (see Figure 4) non- empty set in Rn, and let b be any point in Rn. The projection of b onto K is a point p ∈ K that minimizes the Euclidean distance ‖b − p‖. Then p has the property that for all z ∈ K, (z − p)T (b− p) ≤ 0 (see Figure 5) non-empty set. not convex convex Figure 4: Convex and non-convex sets in R2. We are now ready to prove the other direction of Farkas’ Lemma. Assume that there is no x such that Ax = b, x ≥ 0; we will show that there is y such that ATy ≥ 0 but yT b < 0. Let K = {Ax : x ≥ 0} ⊆ Rm (A is an m× n matrix). K is a cone in Rm and it is convex, non-empty and closed. According to our assumption, Ax = b, x ≥ 0 has no solution, so b does not belong to K. Let p be the projection of b onto K. LP-12 b p z Figure 5: The Projection Theorem. Since p ∈ K, there is a w ≥ 0 such that Aw = p. According to the Projection Theorem, for all z ∈ K, (z−p)T (b−p) ≤ 0 That is, for all x ≥ 0 (Ax−p)T (b−p) ≤ 0 We define y = p−b, which implies (Ax−p)T y ≥ 0. Since Aw = p, (Ax−Aw)Ty ≥ 0. (x− w)T (ATy) ≥ 0 for all x ≥ 0 (remember that w was fixed by choosing b). Set x = w +   0 0 ... 1 ... 0   (w plus a unit vector with a 1 in the i-th row). Note that x is non-negative, because w ≥ 0. This will extract the i-th column of A, so we conclude that the i-th component of ATy is non-negative (ATy)i ≥ 0, and since this is true for all i, ATy ≥ 0. Now it only remains to show that yT b < 0. ytb = (p−y)Ty = pTy−yTy Since (Ax−p)Ty ≥ 0 for all x ≥ 0, taking x to be zero shows that pTy ≤ 0. Since b 6∈ K, y = p−b 6= 0, so yTy > 0. So yT b = pTy−yTy < 0. ✷ Using a very similar proof one can show th
/
本文档为【线性规划notes-MIT】,请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑, 图片更改请在作品中右键图片并更换,文字修改请直接点击文字进行修改,也可以新增和删除文档中的内容。
[版权声明] 本站所有资料为用户分享产生,若发现您的权利被侵害,请联系客服邮件isharekefu@iask.cn,我们尽快处理。 本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权,请谨慎使用。 网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传,仅限个人学习分享使用,禁止用于任何广告和商用目的。

历史搜索

    清空历史搜索