Convex Optimization - Quick Guide
Convex Optimization - Introduction
This course is useful for students who want to solve non-linear optimization problems that arise in various engineering and scientific applications. It starts with the basic theory of linear programming, then introduces convex sets and functions and the related terminology needed to explain the various theorems required to solve non-linear programming problems, and finally presents several algorithms used to solve such problems. Problems of this type arise in many applications, including machine learning and optimization problems in electrical engineering. Students are expected to have prior knowledge of high school mathematics and calculus.
In this course, the students will learn to solve optimization problems like $\min f\left ( x \right )$ subject to some constraints.
These problems are easy to solve if the function $f\left ( x \right )$ is linear and the constraints are linear; such a problem is called a linear programming problem (LPP). If the constraints are non-linear, the problem becomes much harder to solve. Plotting the functions on a graph and analysing the plot is one possible approach, but a function cannot be plotted beyond three dimensions. Hence the techniques of non-linear programming, or convex programming, are used to solve such problems. In this tutorial, we will focus on learning such techniques and, in the end, a few algorithms to solve such problems. First we introduce the notion of convex sets, which is the basis of convex programming problems. Then, with the introduction of convex functions, we present some important theorems used to solve these problems and some algorithms based on these theorems.
Terminologies
The space $\mathbb{R}^n$ − It is the set of all n-dimensional vectors with real components, defined as follows − $\mathbb{R}^n=\left \{ \left ( x_1,x_2,...,x_n \right )^{T}:x_1,x_2,....,x_n \in \mathbb{R} \right \}$
The space $\mathbb{R}^{m \times n}$ − It is the set of all real-valued matrices of order $m \times n$.
Convex Optimization - Linear Programming
Methodology
Linear Programming, also called Linear Optimization, is a technique used to solve mathematical problems in which the relationships are linear in nature. The basic aim of Linear Programming is to maximize or minimize an objective function subject to some constraints. The objective function is a linear function obtained from the mathematical model of the problem. The constraints are the conditions imposed on the model and are also linear.
From the given question, find the objective function.
Find the constraints.
Draw the constraints on a graph.
Find the feasible region, which is formed by the intersection of all the constraints.
Find the vertices of the feasible region.
Find the value of the objective function at these vertices.
The vertex which maximizes or minimizes the objective function (according to the question) is the answer.
Examples
Example 1 − Maximize $5x+3y$ subject to
$x+y\leq 2$,
$3x+y\leq 3$,
$x\geq 0 \:and \:y\geq 0$
Solution −
The first step is to find the feasible region on a graph.
Clearly from the graph, the vertices of the feasible region are
$\left ( 0, 0 \right ),\left ( 0, 2 \right ),\left ( 1, 0 \right ),\left ( \frac{1}{2}, \frac{3}{2} \right )$
Let $f\left ( x, y \right )=5x+3y$
Putting these values in the objective function, we get −
$f\left ( 0, 0 \right )=0$
$f\left ( 0, 2 \right )=6$
$f\left ( 1, 0 \right )=5$
$f\left ( \frac{1}{2}, \frac{3}{2} \right )=7$
Therefore, the function is maximized at $\left ( \frac{1}{2}, \frac{3}{2} \right )$
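The same answer can be checked numerically. Below is a minimal sketch using SciPy's `linprog` (assuming SciPy is available; it is not part of the original example). Since `linprog` minimizes, we negate the objective to maximize $5x+3y$.

```python
# Verify Example 1 numerically: maximize 5x + 3y  ==  minimize -5x - 3y.
from scipy.optimize import linprog

c = [-5, -3]                       # negated objective coefficients
A_ub = [[1, 1],                    # x + y  <= 2
        [3, 1]]                    # 3x + y <= 3
b_ub = [2, 3]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)             # [0.5 1.5] 7.0, matching the vertex method
```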
Example 2 − A watch company produces a digital and a mechanical watch. Long-term projections indicate an expected demand of at least 100 digital and 80 mechanical watches each day. Because of limitations on production capacity, no more than 200 digital and 170 mechanical watches can be made daily. To satisfy a shipping contract, a total of at least 200 watches must be shipped each day.
If each digital watch sold results in a $\$2$ loss, but each mechanical watch produces a $\$5$ profit, how many of each type should be made daily to maximize net profits?
Solution −
Let $x$ be the number of digital watches produced
$y$ be the number of mechanical watches produced
According to the question, at least 100 digital watches are to be made daily and a maximum of 200 digital watches can be made.
$\Rightarrow 100 \leq \:x\leq 200$
Similarly, at least 80 mechanical watches are to be made daily and a maximum of 170 mechanical watches can be made.
$\Rightarrow 80 \leq \:y\leq 170$
Since at least 200 watches are to be produced each day,
$\Rightarrow x +y\geq 200$
Since each digital watch sold results in a $\$2$ loss, but each mechanical watch produces a $\$5$ profit,
Total profit can be calculated as
$Profit =-2x + 5y$
And we have to maximize the profit. Therefore, the question can be formulated as −
Maximize $-2x + 5y$ subject to
$100 \:\leq x\:\leq 200$
$80 \:\leq y\:\leq 170$
$x+y\:\geq 200$
Plotting the above constraints on a graph, we get the feasible region.
The vertices of the feasible region are
$\left ( 100, 170 \right ),\left ( 200, 170 \right ),\left ( 200, 80 \right ),\left ( 120, 80 \right )$ and $\left ( 100, 100 \right )$
The maximum value of the objective function is obtained at $\left ( 100, 170 \right )$. Thus, to maximize the net profits, 100 units of digital watches and 170 units of mechanical watches should be produced.
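As a quick numerical check of this example, the sketch below simply evaluates the profit at every vertex listed above (plain NumPy, no solver assumed):

```python
# Evaluate profit = -2x + 5y at the vertices of the feasible region.
import numpy as np

vertices = np.array([[100, 170], [200, 170], [200, 80], [120, 80], [100, 100]])
profit = -2 * vertices[:, 0] + 5 * vertices[:, 1]
print(vertices[np.argmax(profit)], profit.max())   # [100 170] 650
```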
Convex Optimization - Norm
A norm is a function that assigns a non-negative value to a vector or a variable, and a strictly positive value to every non-zero vector.
Norm is a function $f:\mathbb{R}^n \rightarrow \mathbb{R}$
The basic characteristics of a norm are −
Let $X$ be a vector space such that $X\subseteq \mathbb{R}^n$
$\left | x \right |\geq 0$
$\left | x \right |= 0 \Leftrightarrow x= 0,\forall x \in X$
$\left |\alpha x \right |=\left | \alpha \right |\left | x \right |,\forall \:x \in X$ and $\alpha$ is a scalar
$\left | x+y \right |\leq \left | x \right |+\left | y \right |, \forall x,y \in X$
$\left | x-y \right |\geq \left | \left | x \right |-\left | y \right | \right |$
By definition, norms are calculated as follows −
$\left | x \right |_1=\displaystyle\sum\limits_{i=1}^n\left | x_i \right |$
$\left | x \right |_2=\left ( \displaystyle\sum\limits_{i=1}^n\left | x_i \right |^2 \right )^{\frac{1}{2}}$
$\left | x \right |_p=\left ( \displaystyle\sum\limits_{i=1}^n\left | x_i \right |^p \right )^{\frac{1}{p}},1 \leq p \leq \infty$
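These norms are directly available in NumPy; a minimal sketch (assuming NumPy) −

```python
# The 1-, 2- and infinity-norms of a vector, as defined above.
import numpy as np

x = np.array([3.0, -4.0, 0.0])
print(np.linalg.norm(x, 1))        # |3| + |-4| + |0| = 7.0
print(np.linalg.norm(x, 2))        # sqrt(9 + 16)     = 5.0
print(np.linalg.norm(x, np.inf))   # max |x_i|        = 4.0 (the limit p -> infinity)
```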
Norm is a continuous function.
Proof
By definition, if $x_n \rightarrow x$ in $X\Rightarrow f\left ( x_n \right ) \rightarrow f\left ( x \right )$, then $f\left ( x \right )$ is a continuous function.
Let $f\left ( x \right )=\left | x \right |$
Therefore, $\left | f\left ( x_n \right )-f\left ( x \right ) \right |=\left | \left | x_n \right | -\left | x \right | \right |\leq \left | x_n-x \right |$
Since $x_n \rightarrow x$ thus, $\left | x_n-x \right | \rightarrow 0$
Therefore $\left | f\left ( x_n \right )-f\left ( x \right ) \right | \rightarrow 0\Rightarrow f\left ( x_n \right ) \rightarrow f\left ( x \right )$
Hence, norm is a continuous function.
Convex Optimization - Inner Product
Inner product is a function which assigns a scalar to a pair of vectors.
Inner Product − $f:\mathbb{R}^n \times \mathbb{R}^n \rightarrow \kappa$ where $\kappa$ is a scalar.
The basic characteristics of the inner product are as follows −
Let $X\subseteq \mathbb{R}^n$
$\left \langle x,x \right \rangle\geq 0, \forall x \in X$
$\left \langle x,x \right \rangle=0\Leftrightarrow x=0, \forall x \in X$
$\left \langle \alpha x,y \right \rangle=\alpha \left \langle x,y \right \rangle,\forall \alpha \in \kappa \: and\: \forall x,y \in X$
$\left \langle x+y,z \right \rangle =\left \langle x,z \right \rangle +\left \langle y,z \right \rangle, \forall x,y,z \in X$
$\left \langle y,x \right \rangle=\overline{\left \langle x,y \right \rangle}, \forall x, y \in X$
Note −
Relationship between norm and inner product: $\left | x \right |=\sqrt{\left \langle x,x \right \rangle}$
$\forall x,y \in \mathbb{R}^n,\left \langle x,y \right \rangle=x_1y_1+x_2y_2+...+x_ny_n$
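A short NumPy sketch of this formula and of the norm relation above (NumPy assumed) −

```python
# Inner product on R^n and its relation to the Euclidean norm.
import numpy as np

x = np.array([1.0, 2.0, 1.0])
y = np.array([3.0, -1.0, 3.0])

print(np.inner(x, y))              # x1*y1 + x2*y2 + x3*y3 = 4.0
assert np.isclose(np.sqrt(np.inner(x, x)), np.linalg.norm(x))
```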
Examples
1. Find the inner product of $x=\left ( 1,2,1 \right )\: and \: y=\left ( 3,-1,3 \right )$
Solution
$\left \langle x,y \right \rangle =x_1y_1+x_2y_2+x_3y_3$
$\left \langle x,y \right \rangle=\left ( 1 \times 3 \right )+\left ( 2 \times -1 \right )+\left ( 1 \times 3 \right )$
$\left \langle x,y \right \rangle=3+\left ( -2 \right )+3$
$\left \langle x,y \right \rangle=4$
2. If $x=\left ( 4,9,1 \right ),y=\left ( -3,5,1 \right )$ and $z=\left ( 2,4,1 \right )$, find $\left \langle x+y,z \right \rangle$
Solution
As we know, $\left \langle x+y,z \right \rangle=\left \langle x,z \right \rangle+\left \langle y,z \right \rangle$
$\left \langle x+y,z \right \rangle=\left ( x_1z_1+x_2z_2+x_3z_3 \right )+\left ( y_1z_1+y_2z_2+y_3z_3 \right )$
$\left \langle x+y,z \right \rangle=\left \{ \left ( 4 \times 2 \right )+\left ( 9 \times 4 \right )+\left ( 1 \times 1 \right ) \right \}+\left \{ \left ( -3 \times 2 \right )+\left ( 5 \times 4 \right )+\left ( 1 \times 1 \right ) \right \}$
$\left \langle x+y,z \right \rangle=\left ( 8+36+1 \right )+\left ( -6+20+1 \right )$
$\left \langle x+y,z \right \rangle=45+15$
$\left \langle x+y,z \right \rangle=60$
Convex Optimization - Minima and Maxima
Local Minima or Minimizer
$\bar{x}\in \:S$ is said to be a local minimum of a function $f$ if $f\left ( \bar{x} \right )\leq f\left ( x \right ),\forall x \in N_\varepsilon \left ( \bar{x} \right )$ where $N_\varepsilon \left ( \bar{x} \right )$ means a neighbourhood of $\bar{x}$, i.e., $N_\varepsilon \left ( \bar{x} \right )$ means $\left | x-\bar{x} \right |< \varepsilon$
Local Maxima or Maximizer
$\bar{x}\in \:S$ is said to be a local maximum of a function $f$ if $f\left ( \bar{x} \right )\geq f\left ( x \right ), \forall x \in N_\varepsilon \left ( \bar{x} \right )$ where $N_\varepsilon \left ( \bar{x} \right )$ means a neighbourhood of $\bar{x}$, i.e., $N_\varepsilon \left ( \bar{x} \right )$ means $\left | x-\bar{x} \right |< \varepsilon$
Global minima
$\bar{x}\in \:S$ is said to be a global minimum of a function $f$ if $f\left ( \bar{x} \right )\leq f\left ( x \right ), \forall x \in S$
Global maxima
$\bar{x}\in \:S$ is said to be a global maximum of a function $f$ if $f\left ( \bar{x} \right )\geq f\left ( x \right ), \forall x \in S$
Examples
Example 1 − Find the local minima and maxima of $f\left ( x \right )=\left | x^2-4 \right |$
Solution −
From the graph of the above function, it is clear that the local minima occur at $x= \pm 2$ and the local maximum at $x = 0$
Example 2 − Find the global minimum of the function $f\left (x \right )=\left | 4x^3-3x^2+7 \right |$
Solution −
From the graph of the above function, it is clear that the global minimum occurs at $x=-1$.
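A rough numerical confirmation, scanning a fine grid with NumPy (an illustrative sketch, not part of the original solution) −

```python
# Locate the global minimum of f(x) = |4x^3 - 3x^2 + 7| on a fine grid.
import numpy as np

xs = np.linspace(-3, 3, 600001)
fx = np.abs(4 * xs**3 - 3 * xs**2 + 7)
print(xs[np.argmin(fx)], fx.min())   # approximately -1.0 and 0.0
```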
Convex Optimization - Convex Set
Let $S\subseteq \mathbb{R}^n$. A set $S$ is said to be convex if the line segment joining any two points of the set $S$ also belongs to $S$, i.e., if $x_1,x_2 \in S$, then $\lambda x_1+\left ( 1-\lambda \right )x_2 \in S$ where $\lambda \in\left ( 0,1 \right )$.
Note −
The union of two convex sets may or may not be convex.
The intersection of two convex sets is always convex.
Proof
Let $S_1$ and $S_2$ be two convex sets.
Let $S_3=S_1 \cap S_2$
Let $x_1,x_2 \in S_3$
Since $S_3=S_1 \cap S_2$, thus $x_1,x_2 \in S_1$ and $x_1,x_2 \in S_2$
Since $S_i$ is a convex set, $\forall$ $i \in 1,2,$
Thus $\lambda x_1+\left ( 1-\lambda \right )x_2 \in S_i$ where $\lambda \in \left ( 0,1 \right )$
Therefore, $\lambda x_1+\left ( 1-\lambda \right )x_2 \in S_1\cap S_2$
$\Rightarrow \lambda x_1+\left ( 1-\lambda \right )x_2 \in S_3$
Hence, $S_3$ is a convex set.
A weighted average of the form $\displaystyle\sum\limits_{i=1}^k \lambda_ix_i$, where $\displaystyle\sum\limits_{i=1}^k \lambda_i=1$ and $\lambda_i\geq 0,\forall i \in \left [ 1,k \right ]$, is called a convex combination of $x_1,x_2,....,x_k.$
A weighted average of the form $\displaystyle\sum\limits_{i=1}^k \lambda_ix_i$, where $\displaystyle\sum\limits_{i=1}^k \lambda_i=1$, is called an affine combination of $x_1,x_2,....,x_k.$
A weighted average of the form $\displaystyle\sum\limits_{i=1}^k \lambda_ix_i$ is called a linear combination of $x_1,x_2,....,x_k.$
Examples
Example 1 − Prove that the set $S=\left \{ x \in \mathbb{R}^n:Cx\leq \alpha \right \}$ is a convex set.
Solution
Let $x_1$ and $x_2 \in S$
$\Rightarrow Cx_1\leq \alpha$ and $Cx_2\leq \alpha$
To show − $y=\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\in S, \:\forall \:\lambda \in\left ( 0,1 \right )$
$Cy=C\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )=\lambda Cx_1+\left ( 1-\lambda \right )Cx_2$
$\Rightarrow Cy\leq \lambda \alpha+\left ( 1-\lambda \right )\alpha$
$\Rightarrow Cy\leq \alpha$
$\Rightarrow y\in S$
Therefore, $S$ is a convex set.
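The sketch below spot-checks this proof numerically for a randomly chosen $C$ and $\alpha$ (hypothetical data, NumPy assumed): convex combinations of points of $S$ stay in $S$.

```python
# Numerically verify that S = {x : Cx <= alpha} is closed under convex combinations.
import numpy as np

rng = np.random.default_rng(0)
C = rng.normal(size=(3, 4))                  # an arbitrary matrix C
alpha = np.ones(3)                           # an arbitrary right-hand side

def in_S(x):
    return np.all(C @ x <= alpha + 1e-12)

# rejection-sample two points of S, then test points on the segment between them
while True:
    x1, x2 = rng.normal(size=(2, 4))
    if in_S(x1) and in_S(x2):
        break
for lam in np.linspace(0, 1, 51):
    assert in_S(lam * x1 + (1 - lam) * x2)
```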
Example 2 − Prove that the set $S=\left \{ \left ( x_1,x_2 \right )\in \mathbb{R}^2:x_{1}^{2}\leq 8x_2 \right \}$ is a convex set.
Solution
Let $x,y \in S$
Let $x=\left ( x_1,x_2 \right )$ and $y=\left ( y_1,y_2 \right )$
$\Rightarrow x_{1}^{2}\leq 8x_2$ and $y_{1}^{2}\leq 8y_2$
To show − $\lambda x+\left ( 1-\lambda \right )y\in S$, i.e., $\left [ \lambda x_1+\left ( 1-\lambda \right )y_1 \right ]^2\leq 8\left [ \lambda x_2+\left ( 1-\lambda \right )y_2 \right ]$
Now, $\left [\lambda x_1+\left ( 1-\lambda \right )y_1 \right ]^{2}=\lambda ^2x_{1}^{2}+\left ( 1-\lambda \right )^2y_{1}^{2}+2 \lambda\left ( 1-\lambda \right )x_1y_1$
But $2x_1y_1\leq x_{1}^{2}+y_{1}^{2}$
Therefore,
$\left [ \lambda x_1 +\left ( 1-\lambda \right )y_1 \right ]^{2}\leq \lambda ^2x_{1}^{2}+\left ( 1- \lambda \right )^2y_{1}^{2}+ \lambda\left ( 1- \lambda \right )\left ( x_{1}^{2}+y_{1}^{2} \right )$
$\Rightarrow \left [ \lambda x_1+\left ( 1-\lambda \right )y_1 \right ]^{2}\leq \lambda x_{1}^{2}+\left ( 1- \lambda \right )y_{1}^{2}$
$\Rightarrow \left [ \lambda x_1+\left ( 1-\lambda \right )y_1 \right ]^{2}\leq 8\lambda x_2+8\left ( 1- \lambda \right )y_2$
$\Rightarrow \left [ \lambda x_1+\left ( 1-\lambda \right )y_1 \right ]^{2}\leq 8\left [\lambda x_2+\left ( 1- \lambda \right )y_2 \right ]$
$\Rightarrow \lambda x+\left ( 1- \lambda \right )y \in S$
Example 3 − Show that a set $S \subseteq \mathbb{R}^n$ is convex if and only if for each integer $k$, every convex combination of any $k$ points of $S$ is in $S$.
Solution
Let $S$ be a convex set. Then, to show −
$c_1x_1+c_2x_2+.....+c_kx_k \in S, \displaystyle\sum\limits_{i=1}^k c_i=1,c_i\geq 0, \forall i \in 1,2,....,k$
Proof by induction
For $k=1,x_1 \in S, c_1=1 \Rightarrow c_1x_1 \in S$
For $k=2,x_1,x_2 \in S, c_1+c_2=1$ and since $S$ is a convex set,
$\Rightarrow c_1x_1+c_2x_2 \in S.$
Let every convex combination of $m$ points of $S$ be in $S$, i.e.,
$c_1x_1+c_2x_2+...+c_mx_m \in S,\displaystyle\sum\limits_{i=1}^m c_i=1 ,c_i \geq 0, \forall i \in 1,2,...,m$
Now, let $x_1,x_2,....,x_m,x_{m+1} \in S$
Let $x=\mu_1x_1+\mu_2x_2+...+\mu_mx_m+\mu_{m+1}x_{m+1}$ where $\displaystyle\sum\limits_{i=1}^{m+1} \mu_i=1, \mu_i \geq 0$
Then $x=\left ( \mu_1+\mu_2+...+\mu_m \right )\frac{\mu_1x_1+\mu_2x_2+...+\mu_mx_m}{\mu_1+\mu_2+.........+\mu_m}+\mu_{m+1}x_{m+1}$
Let $y=\frac{\mu_1x_1+\mu_2x_2+...+\mu_mx_m}{\mu_1+\mu_2+.........+\mu_m}$
$\Rightarrow x=\left ( \mu_1+\mu_2+...+\mu_m \right )y+\mu_{m+1}x_{m+1}$
Now $y \in S$ because it is a convex combination of $m$ points of $S$ (its coefficients sum to 1).
$\Rightarrow x \in S$ since $S$ is a convex set, $y,x_{m+1} \in S$ and $\left ( \mu_1+\mu_2+...+\mu_m \right )+\mu_{m+1}=1$
Hence proved by induction.
Convex Optimization - Affine Set
A set $A$ is said to be an affine set if for any two distinct points, the line passing through these points lies in the set $A$.
Note −
$S$ is an affine set if and only if it contains every affine combination of its points.
Empty and singleton sets are both affine and convex sets.
For example, the solution set of a linear equation is an affine set.
Proof
Let $S$ be the solution set of a linear equation.
By definition, $S=\left \{ x \in \mathbb{R}^n:Ax=b \right \}$
Let $x_1,x_2 \in S\Rightarrow Ax_1=b$ and $Ax_2=b$
To prove − $A\left [ \theta x_1+\left ( 1-\theta \right )x_2 \right ]=b, \forall \theta \in \mathbb{R}$
$A\left [ \theta x_1+\left ( 1-\theta \right )x_2 \right ]= \theta Ax_1+\left ( 1-\theta \right )Ax_2= \theta b+\left ( 1-\theta \right )b=b$
Thus $S$ is an affine set.
Theorem
If $C$ is an affine set and $x_0 \in C$, then the set $V= C-x_0=\left \{ x-x_0:x \in C \right \}$ is a subspace of $\mathbb{R}^n$.
Proof
Let $x_1,x_2 \in V$
To show − $\alpha x_1+\beta x_2 \in V$ for any $\alpha,\beta \in \mathbb{R}$
Now, $x_1+x_0 \in C$ and $x_2+x_0 \in C$ by definition of $V$
Now, $\alpha x_1+\beta x_2+x_0=\alpha \left ( x_1+x_0 \right )+\beta \left ( x_2+x_0 \right )+\left ( 1-\alpha -\beta \right )x_0$
But $\alpha \left ( x_1+x_0 \right )+\beta \left ( x_2+x_0 \right )+\left ( 1-\alpha -\beta \right )x_0 \in C$ because $C$ is an affine set and the coefficients sum to 1.
Therefore, $\alpha x_1+\beta x_2 \in V$
Hence proved.
Convex Optimization - Hull
The convex hull of a set of points in $S$ is the boundary of the smallest convex region that contains all the points of $S$ inside it or on its boundary.
OR
Let $S\subseteq \mathbb{R}^n$. The convex hull of $S$, denoted by $Co\left ( S \right )$, is the collection of all convex combinations of points of $S$, i.e., $x \in Co\left ( S \right )$ if and only if $x = \displaystyle\sum\limits_{i=1}^n \lambda_ix_i$, where $\displaystyle\sum\limits_{i=1}^n \lambda_i=1$ and $\lambda_i \geq 0, \forall x_i \in S$
Remark − The convex hull of a set of points of $S$ in the plane defines a convex polygon, and the points of $S$ on the boundary of the polygon define the vertices of the polygon.
Theorem $Co\left ( S \right )= \left \{ x:x=\displaystyle\sum\limits_{i=1}^n \lambda_ix_i,x_i \in S, \displaystyle\sum\limits_{i=1}^n \lambda_i=1,\lambda_i \geq 0 \right \}$. Show that a convex hull is a convex set.
Proof
Let $x_1,x_2 \in Co\left ( S \right )$, then $x_1=\displaystyle\sum\limits_{i=1}^n \lambda_ix_i$ and $x_2=\displaystyle\sum\limits_{i=1}^n \gamma_ix_i$ where $\displaystyle\sum\limits_{i=1}^n \lambda_i=1, \lambda_i\geq 0$ and $\displaystyle\sum\limits_{i=1}^n \gamma_i=1,\gamma_i\geq0$
For $\theta \in \left ( 0,1 \right ), \theta x_1+\left ( 1-\theta \right )x_2= \theta \displaystyle\sum\limits_{i=1}^n \lambda_ix_i+\left ( 1-\theta \right )\displaystyle\sum\limits_{i=1}^n \gamma_ix_i$
$\theta x_1+\left ( 1-\theta \right )x_2=\displaystyle\sum\limits_{i=1}^n \lambda_i \theta x_i+\displaystyle\sum\limits_{i=1}^n \gamma_i\left ( 1-\theta \right )x_i$
$\theta x_1+\left ( 1-\theta \right )x_2=\displaystyle\sum\limits_{i=1}^n\left [ \lambda_i \theta +\gamma_i\left ( 1-\theta \right ) \right ]x_i$
Considering the coefficients,
$\displaystyle\sum\limits_{i=1}^n\left [ \lambda_i \theta +\gamma_i\left ( 1-\theta \right ) \right ]= \theta \displaystyle\sum\limits_{i=1}^n \lambda_i+\left ( 1-\theta \right )\displaystyle\sum\limits_{i=1}^n\gamma_i= \theta +\left ( 1-\theta \right )=1$
Hence, $\theta x_1+\left ( 1-\theta \right )x_2 \in Co\left ( S \right )$
Thus, a convex hull is a convex set.
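In the plane, the hull of a finite point set can be computed directly; a brief sketch with SciPy (assumed available), whose `ConvexHull` returns the polygon's vertices −

```python
# The convex hull of planar points; hull.vertices indexes the extreme points.
import numpy as np
from scipy.spatial import ConvexHull

S = np.array([[0, 0], [2, 0], [1, 1], [0, 2], [2, 2], [1, 0.5]])
hull = ConvexHull(S)
print(S[hull.vertices])    # corners of Co(S); interior points are dropped
```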
Caratheodory Theorem
Let $S$ be an arbitrary set in $\mathbb{R}^n$. If $x \in Co\left ( S \right )$, then $x \in Co\left ( x_1,x_2,....,x_n,x_{n+1} \right )$ for some points $x_1,x_2,....,x_{n+1} \in S$.
Proof
Since $x \in Co\left ( S \right )$, then $x$ is represented by a convex combination of a finite number of points in $S$, i.e.,
$x=\displaystyle\sum\limits_{j=1}^k \lambda_jx_j,\displaystyle\sum\limits_{j=1}^k \lambda_j=1, \lambda_j \geq 0$ and $x_j \in S, \forall j \in \left ( 1,k \right )$
If $k \leq n+1$, the result obtained is obviously true.
If $k \geq n+2$, then $\left ( x_2-x_1 \right ),\left ( x_3-x_1 \right ),....., \left ( x_k-x_1 \right )$ are linearly dependent.
$\Rightarrow \exists \mu _j \in \mathbb{R}, 2\leq j\leq k$ (not all zero) such that $\displaystyle\sum\limits_{j=2}^k \mu _j\left ( x_j-x_1 \right )=0$
Define $\mu_1=-\displaystyle\sum\limits_{j=2}^k \mu _j$, then $\displaystyle\sum\limits_{j=1}^k \mu_j x_j=0, \displaystyle\sum\limits_{j=1}^k \mu_j=0$
where not all $\mu_j$'s are equal to zero. Since $\displaystyle\sum\limits_{j=1}^k \mu_j=0$, at least one of the $\mu_j > 0,1 \leq j \leq k$
Then, $x=\displaystyle\sum\limits_{j=1}^k \lambda_j x_j+0$
$x=\displaystyle\sum\limits_{j=1}^k \lambda_j x_j- \alpha \displaystyle\sum\limits_{j=1}^k \mu_j x_j$
$x=\displaystyle\sum\limits_{j=1}^k\left ( \lambda_j- \alpha\mu_j \right )x_j$
Choose $\alpha$ such that $\alpha=\min\left \{ \frac{\lambda_j}{\mu_j}: \mu_j> 0 \right \}=\frac{\lambda_i}{\mu _i}$ for some $i=1,2,...,k$
If $\mu_j\leq 0, \lambda_j-\alpha \mu_j\geq 0$
If $\mu_j> 0$, then $\frac{\lambda _j}{\mu_j}\geq \frac{\lambda_i}{\mu _i}=\alpha \Rightarrow \lambda_j-\alpha \mu_j\geq 0, j=1,2,...,k$
In particular, $\lambda_i-\alpha \mu_i=0$, by definition of $\alpha$
$x=\displaystyle\sum\limits_{j=1}^k \left ( \lambda_j- \alpha\mu_j \right )x_j$, where
$\lambda_j- \alpha\mu_j\geq0$ and $\displaystyle\sum\limits_{j=1}^k\left ( \lambda_j- \alpha\mu_j \right )=1$ and $\lambda_i- \alpha\mu_i=0$
Thus, $x$ can be represented as a convex combination of at most $(k-1)$ points.
This reduction process can be repeated until $x$ is represented as a convex combination of $(n+1)$ elements.
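The reduction step of this proof translates directly into code. Below is a minimal sketch (the helper `caratheodory_reduce` is hypothetical, and NumPy/SciPy are assumed) that repeatedly finds the $\mu_j$'s via a null-space computation and drops one point per pass.

```python
# Caratheodory reduction: re-express x = lam @ X using at most n+1 of the points.
import numpy as np
from scipy.linalg import null_space

def caratheodory_reduce(X, lam, tol=1e-12):
    X, lam = np.asarray(X, float), np.asarray(lam, float)
    k, n = X.shape
    while k > n + 1:
        # find mu with sum_j mu_j x_j = 0 and sum_j mu_j = 0, as in the proof
        A = np.vstack([X.T, np.ones((1, k))])    # (n+1) x k with k >= n+2
        mu = null_space(A)[:, 0]
        if mu.max() <= 0:
            mu = -mu                             # ensure some mu_j > 0
        pos = mu > tol
        alpha = np.min(lam[pos] / mu[pos])       # alpha = min lambda_j / mu_j over mu_j > 0
        lam = lam - alpha * mu                   # still sums to 1, stays >= 0
        keep = lam > tol                         # at least one weight vanished
        X, lam, k = X[keep], lam[keep], int(keep.sum())
    return X, lam

# usage: 5 points in R^2 reduce to at most 3 representing the same x
X = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.], [.5, .5]])
lam = np.full(5, 0.2)
X2, lam2 = caratheodory_reduce(X, lam)
assert np.allclose(lam2 @ X2, lam @ X) and len(lam2) <= 3
```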
Convex Optimization - Weierstrass Theorem
Let $S$ be a non-empty, closed and bounded set (also called a compact set) in $\mathbb{R}^n$ and let $f:S \rightarrow \mathbb{R}$ be a continuous function on $S$. Then the problem $\min \left \{ f\left ( x \right ):x \in S \right \}$ attains its minimum.
Proof
Since $S$ is non-empty and bounded, there exists an infimum.
$\alpha =\inf\left \{ f\left ( x \right ):x \in S \right \}$
Now let $S_j=\left \{ x \in S:\alpha \leq f\left ( x \right ) \leq \alpha +\delta ^j \right \}, \forall j=1,2,...$ and $\delta \in \left ( 0,1 \right )$
By the definition of infimum, $S_j$ is non-empty, for each $j$.
Choose some $x_j \in S_j$ to get a sequence $\left \{ x_j \right \}$ for $j=1,2,...$
Since $S$ is bounded, the sequence is also bounded and there is a convergent subsequence $\left \{ y_j \right \}$, which converges to $\hat{x}$. Hence $\hat{x}$ is a limit point, and since $S$ is closed, $\hat{x} \in S$. Since $f$ is continuous, $f\left ( y_j \right ) \rightarrow f\left ( \hat{x} \right )$.
Since $\alpha \leq f\left ( y_j \right )\leq \alpha+\delta^j$, $\alpha=\displaystyle\lim_{j \rightarrow \infty}f\left ( y_j \right )=f\left ( \hat{x} \right )$
Thus, $\hat{x}$ is the minimizing solution.
Remarks
There are two important necessary conditions for the Weierstrass Theorem to hold. These are as follows −
Step 1 − The set $S$ should be a bounded set.
Consider the function $f\left ( x \right )=x$ on $S=\mathbb{R}$.
Here $S$ is an unbounded set and $f$ does not attain a minimum at any point of its domain.
Thus, for the minimum to be attained, $S$ should be bounded.
Step 2 − The set $S$ should be closed.
Consider the function $f\left ( x \right )=\frac{1}{x}$ in the domain $\left ( 0,1 \right )$.
The given domain is not closed, and the minimum of the function is not attained.
Hence, for the minimum to be attained, $S$ should be closed.
Convex Optimization - Closest Point Theorem
Let $S$ be a non-empty closed convex set in $\mathbb{R}^n$ and let $y \notin S$, then $\exists$ a point $\hat{x}\in S$ with minimum distance from $y$, i.e., $\left | y-\hat{x} \right | \leq \left | y-x \right |, \forall x \in S.$
Furthermore, $\hat{x}$ is the minimizing point if and only if $\left ( y-\hat{x} \right )^{T}\left ( x-\hat{x} \right )\leq 0$, i.e., $\left \langle y-\hat{x}, x-\hat{x} \right \rangle\leq 0, \forall x \in S$
Proof
Existence of closest point
Since $S \neq \phi$, there exists a point $x_0\in S$, so the minimum distance of $S$ from $y$ is less than or equal to $\left | y-x_0 \right |$.
Define $\hat{S}=S \cap \left \{ x:\left | y-x \right |\leq \left | y-x_0 \right | \right \}$
Since $\hat{S}$ is closed and bounded, and since norm is a continuous function, then by the Weierstrass theorem, there exists a minimum point $\hat{x} \in S$ such that $\left | y-\hat{x} \right |=\inf\left \{ \left | y-x \right |:x \in S \right \}$
Uniqueness
Suppose $\bar{x} \in S$ is such that $\left | y-\hat{x} \right |=\left | y-\bar{x} \right |= \alpha$
Since $S$ is convex, $\frac{\hat{x}+\bar{x}}{2} \in S$
But $\left | y-\frac{\hat{x}+\bar{x}}{2} \right |\leq \frac{1}{2}\left | y-\hat{x} \right |+\frac{1}{2}\left | y-\bar{x} \right |=\alpha$
It cannot be a strict inequality because $\hat{x}$ is closest to $y$.
Therefore, equality holds in the triangle inequality, so $\left ( y-\hat{x} \right )=\mu \left ( y-\bar{x} \right )$ for some $\mu$
Now $\left | \mu \right |=1.$ If $\mu=-1$, then $\left ( y-\hat{x} \right )=-\left ( y-\bar{x} \right )\Rightarrow y=\frac{\hat{x}+\bar{x}}{2} \in S$
But $y \notin S$. Hence contradiction. Thus $\mu=1 \Rightarrow \hat{x}=\bar{x}$
Thus, the minimizing point is unique.
For the second part of the proof, assume $\left ( y-\hat{x} \right )^{T}\left ( x-\hat{x} \right )\leq 0$ for all $x\in S$
Now,
$\left | y-x \right |^{2}=\left | y-\hat{x}+ \hat{x}-x \right |^{2}=\left | y-\hat{x} \right |^{2}+\left |\hat{x}-x \right |^{2}+2\left (\hat{x}-x \right )^{T}\left ( y-\hat{x} \right )$
$\Rightarrow \left | y-x \right |^{2}\geq \left | y-\hat{x} \right |^{2}$ because $\left | \hat{x}-x \right |^{2}\geq 0$ and $\left ( \hat{x}- x \right )^{T}\left ( y-\hat{x} \right )\geq 0$
Thus, $\hat{x}$ is the minimizing point.
Conversely, assume $\hat{x}$ is the minimizing point.
$\Rightarrow \left | y-x \right |^{2}\geq \left | y-\hat{x} \right |^2, \forall x \in S$
Since $S$ is a convex set,
$\Rightarrow \lambda x+\left ( 1-\lambda \right )\hat{x}=\hat{x}+\lambda\left ( x-\hat{x} \right ) \in S$ for $x \in S$ and $\lambda \in \left ( 0,1 \right )$
Now, $\left | y-\hat{x}-\lambda\left ( x-\hat{x} \right ) \right |^{2}\geq \left | y-\hat{x} \right |^2$
And
$\left | y-\hat{x}-\lambda\left ( x-\hat{x} \right ) \right |^{2}=\left | y-\hat{x} \right |^{2}+\lambda^2\left | x-\hat{x} \right |^{2}-2\lambda\left ( y-\hat{x} \right )^{T}\left ( x-\hat{x} \right )$
$\Rightarrow \left | y-\hat{x} \right |^{2}+\lambda^{2}\left | x-\hat{x} \right |^{2}-2 \lambda\left ( y-\hat{x} \right )^{T}\left ( x-\hat{x} \right )\geq \left | y-\hat{x} \right |^{2}$
$\Rightarrow 2 \lambda\left ( y-\hat{x} \right )^{T}\left ( x-\hat{x} \right )\leq \lambda^2\left | x-\hat{x} \right |^2$
Dividing by $\lambda>0$ and letting $\lambda \rightarrow 0^{+}$,
$\Rightarrow \left ( y-\hat{x} \right )^{T}\left ( x-\hat{x} \right )\leq 0$
Hence Proved.
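A small numerical sketch of this theorem, taking $S$ to be the closed convex box $[0,1]^2$ (an assumption made so that the projection is a simple coordinate-wise clip) −

```python
# Closest-point theorem on the box [0,1]^2: project y, then verify the
# characterization (y - x_hat)^T (x - x_hat) <= 0 for sampled x in S.
import numpy as np

y = np.array([2.0, -0.5])                  # a point outside the box
x_hat = np.clip(y, 0.0, 1.0)               # its closest point, here (1.0, 0.0)

rng = np.random.default_rng(0)
X = rng.random((1000, 2))                  # random points of S
assert np.all((X - x_hat) @ (y - x_hat) <= 1e-12)
```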
Fundamental Separation Theorem
Let $S$ be a non-empty closed, convex set in $\mathbb{R}^n$ and $y \notin S$. Then, there exists a non-zero vector $p$ and a scalar $\beta$ such that $p^T y>\beta$ and $p^T x \leq \beta$ for each $x \in S$
Proof
Since $S$ is a non-empty closed convex set and $y \notin S$, thus by the closest point theorem, there exists a unique minimizing point $\hat{x} \in S$ such that
$\left ( x-\hat{x} \right )^T\left ( y-\hat{x} \right )\leq 0, \forall x \in S$
Let $p=\left ( y-\hat{x} \right ) \neq 0$ and $\beta=\hat{x}^T\left ( y-\hat{x} \right )=p^T\hat{x}$.
Then $\left ( x-\hat{x} \right )^T\left ( y-\hat{x} \right )\leq 0$
$\Rightarrow \left ( y-\hat{x} \right )^T\left ( x-\hat{x} \right )\leq 0$
$\Rightarrow \left ( y-\hat{x} \right )^Tx\leq \left ( y-\hat{x} \right )^T \hat{x}=\hat{x}^T\left ( y-\hat{x} \right )$, i.e., $p^Tx \leq \beta$
Also, $p^Ty-\beta=\left ( y-\hat{x} \right )^Ty-\hat{x}^T \left ( y-\hat{x} \right )$
$=\left ( y-\hat{x} \right )^T \left ( y-\hat{x} \right )=\left | y-\hat{x} \right |^{2}>0$
$\Rightarrow p^Ty> \beta$
This theorem results in separating hyperplanes. The hyperplanes based on the above theorem can be defined as follows −
Let $S_1$ and $S_2$ be non-empty subsets of $\mathbb{R}^n$ and $H=\left \{ X:A^TX=b \right \}$ be a hyperplane.
The hyperplane $H$ is said to separate $S_1$ and $S_2$ if $A^TX \leq b, \forall X \in S_1$ and $A^TX \geq b, \forall X \in S_2$
The hyperplane $H$ is said to strictly separate $S_1$ and $S_2$ if $A^TX < b, \forall X \in S_1$ and $A^TX > b, \forall X \in S_2$
The hyperplane $H$ is said to strongly separate $S_1$ and $S_2$ if $A^TX \leq b, \forall X \in S_1$ and $A^TX \geq b+ \varepsilon, \forall X \in S_2$, where $\varepsilon$ is a positive scalar.
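Following the proof's recipe, a separating hyperplane can be computed explicitly once the closest point is known. A sketch, assuming $S$ is the unit ball so that the projection of $y$ is simply $y/\left | y \right |$ −

```python
# Build p = y - x_hat and beta = p^T x_hat, then check the separation.
import numpy as np

y = np.array([3.0, 4.0])                 # a point outside the unit ball
x_hat = y / np.linalg.norm(y)            # closest point of S to y
p = y - x_hat                            # hyperplane normal
beta = p @ x_hat                         # hyperplane offset

assert p @ y > beta                      # y lies strictly on one side
rng = np.random.default_rng(1)
V = rng.normal(size=(1000, 2))
V /= np.maximum(np.linalg.norm(V, axis=1, keepdims=True), 1.0)  # points of S
assert np.all(V @ p <= beta + 1e-9)      # S lies on the other side
```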
Convex Optimization - Cones
A non-empty set $C$ in $\mathbb{R}^n$ is said to be a cone with vertex 0 if $x \in C\Rightarrow \lambda x \in C, \forall \lambda \geq 0$.
A set $C$ is a convex cone if it is convex as well as a cone.
For example, $y=\left | x \right |$ is not a convex cone because it is not convex.
But $y \geq \left | x \right |$ is a convex cone because it is convex as well as a cone.
Note − A cone $C$ is convex if and only if for any $x,y \in C, x+y \in C$.
Proof
Since $C$ is a cone, for $x,y \in C \Rightarrow \lambda x \in C$ and $\mu y \in C, \:\forall \:\lambda, \mu \geq 0$
$C$ is convex if $\lambda x + \left ( 1-\lambda \right )y \in C, \: \forall \:\lambda \in \left ( 0, 1 \right )$
Since $C$ is a cone, $\lambda x \in C$ and $\left ( 1-\lambda \right )y \in C \Leftrightarrow x,y \in C$
Thus $C$ is convex if $x+y \in C$
In general, if $x_1,x_2 \in C$, then $\lambda_1x_1+\lambda_2x_2 \in C, \forall \lambda_1,\lambda_2 \geq 0$
Examples
The conic combination of an infinite set of vectors in $\mathbb{R}^n$ is a convex cone.
The empty set is a convex cone.
Any linear subspace is a convex cone.
Since a hyperplane through the origin is a linear subspace, it is also a convex cone.
Closed half spaces are also convex cones.
Note − The intersection of two convex cones is a convex cone but their union may or may not be a convex cone.
Convex Optimization - Polar Cone
Let $S$ be a non-empty set in $\mathbb{R}^n$. Then the polar cone of $S$, denoted by $S^*$, is given by $S^*=\left \{p \in \mathbb{R}^n: p^Tx \leq 0 \: \forall x \in S \right \}$.
Remark
The polar cone is always convex even if $S$ is not convex.
If $S$ is the empty set, $S^*=\mathbb{R}^n$.
Polarity may be seen as a generalisation of orthogonality.
Let $C\subseteq \mathbb{R}^n$. Then the orthogonal space of $C$ is denoted by $C^\perp =\left \{ y \in \mathbb{R}^n:\left \langle x,y \right \rangle=0, \forall x \in C \right \}$.
Lemma
Let $S,S_1$ and $S_2$ be non-empty sets in $\mathbb{R}^n$. Then the following statements are true −
$S^*$ is a closed convex cone.
$S \subseteq S^{**}$ where $S^{**}$ is the polar cone of $S^*$.
$S_1 \subseteq S_2 \Rightarrow S_{2}^{*} \subseteq S_{1}^{*}$.
Proof
Step 1 − $S^*=\left \{ p \in \mathbb{R}^n:p^Tx\leq 0 \: \forall \:x \in S \right \}$
Let $x_1,x_2 \in S^*\Rightarrow x_{1}^{T}x\leq 0$ and $x_{2}^{T}x \leq 0,\forall x \in S$
For $\lambda \in \left ( 0, 1 \right ),\left [ \lambda x_1+\left ( 1-\lambda \right )x_2 \right ]^Tx=\left [ \left ( \lambda x_1 \right )^T+ \left \{\left ( 1-\lambda \right )x_{2} \right \}^{T} \right ]x, \forall x \in S$
$=\left [ \lambda x_{1}^{T} +\left ( 1-\lambda \right )x_{2}^{T} \right ]x=\lambda x_{1}^{T}x+\left ( 1-\lambda \right )x_{2}^{T}x\leq 0$
Thus $\lambda x_1+\left ( 1-\lambda \right )x_{2} \in S^*$
Therefore $S^*$ is a convex set.
For $\lambda \geq 0$ and $p \in S^*$, we have $p^{T}x \leq 0, \forall \:x \in S$
Therefore, $\lambda p^T x \leq 0,$
$\Rightarrow \left ( \lambda p \right )^T x \leq 0$
$\Rightarrow \lambda p \in S^*$
Thus, $S^*$ is a cone.
To show $S^*$ is closed, i.e., to show if $p_n \rightarrow p$ as $n \rightarrow \infty$, then $p \in S^*$
$\forall x \in S, p_{n}^{T}x-p^T x=\left ( p_n-p \right )^T x$
As $p_n \rightarrow p$ as $n \rightarrow \infty$, $\left ( p_n - p \right ) \rightarrow 0$
Therefore $p_{n}^{T}x \rightarrow p^{T}x$. But $p_{n}^{T}x \leq 0, \: \forall x \in S$
Thus, $p^Tx \leq 0, \forall x \in S$
$\Rightarrow p \in S^*$
Hence, $S^*$ is closed.
Step 2 − $S^{**}=\left \{ q \in \mathbb{R}^n:q^T p \leq 0, \forall p \in S^* \right \}$
Let $x \in S$, then $\forall p \in S^*, p^T x \leq 0 \Rightarrow x^Tp \leq 0 \Rightarrow x \in S^{**}$
Thus, $S \subseteq S^{**}$
Step 3 − $S_2^*=\left \{ p \in \mathbb{R}^n:p^Tx\leq 0, \forall x \in S_2 \right \}$
Since $S_1 \subseteq S_2$, every $x \in S_1$ also satisfies $x \in S_2$
Therefore, if $\hat{p} \in S_2^*$, then $\hat{p}^Tx \leq 0,\forall x \in S_2$
$\Rightarrow \hat{p}^Tx\leq 0, \forall x \in S_1$
$\Rightarrow \hat{p} \in S_1^*$
$\Rightarrow S_2^* \subseteq S_1^*$
Theorem
Let $C$ be a non-empty closed convex cone, then $C=C^{**}$
Proof
$C \subseteq C^{**}$ by the previous lemma.
To prove − $C^{**} \subseteq C$
Let $x \in C^{**}$ and suppose $x \notin C$
Then by the fundamental separation theorem, there exists a vector $p \neq 0$ and a scalar $\alpha$ such that $p^Ty \leq \alpha, \forall y \in C$
and $p^Tx > \alpha$
But since $\left ( y=0 \right ) \in C$ and $p^Ty\leq \alpha, \forall y \in C \Rightarrow \alpha\geq 0$ and $p^Tx>0$
If $p \notin C^*$, then there exists some $\bar{y} \in C$ such that $p^T \bar{y}>0$ and $p^T\left ( \lambda \bar{y} \right )$ can be made arbitrarily large by taking $\lambda$ sufficiently large.
This contradicts the fact that $p^Ty \leq \alpha, \forall y \in C$
Therefore, $p \in C^*$
Since $x \in C^{**}=\left \{ q:q^Tp\leq 0, \forall p \in C^* \right \}$
Therefore, $x^Tp \leq 0 \Rightarrow p^Tx \leq 0$
But $p^Tx> \alpha \geq 0$
This is a contradiction.
Thus, $x \in C$
Hence $C=C^{**}$.
Convex Optimization - Conic Combination
A point of the form $\alpha_1x_1+\alpha_2x_2+....+\alpha_nx_n$ with $\alpha_1, \alpha_2,...,\alpha_n\geq 0$ is called a conic combination of $x_1, x_2,...,x_n.$
If $x_i$ are in a convex cone $C$, then every conic combination of the $x_i$ is also in $C$.
A set $C$ is a convex cone if it contains all the conic combinations of its elements.
Conic Hull
A conic hull is defined as the set of all conic combinations of a given set $S$ and is denoted by $coni(S)$.
Thus, $coni\left ( S \right )=\left \{ \displaystyle\sum\limits_{i=1}^k \lambda_ix_i:x_i \in S,\lambda_i\in \mathbb{R}, \lambda_i\geq 0,i=1,2,... \right \}$
The conic hull is a convex set.
The origin always belongs to the conic hull.
Convex Optimization - Polyhedral Set
A set in $\mathbb{R}^n$ is said to be polyhedral if it is the intersection of a finite number of closed half spaces, i.e.,
$S=\left \{ x \in \mathbb{R}^n:p_{i}^{T}x\leq \alpha_i, i=1,2,....,n \right \}$
For example,
$\left \{ x \in \mathbb{R}^n:AX=b \right \}$
$\left \{ x \in \mathbb{R}^n:AX\leq b \right \}$
$\left \{ x \in \mathbb{R}^n:AX\geq b \right \}$
Polyhedral Cone
A set in $\mathbb{R}^n$ is said to be a polyhedral cone if it is the intersection of a finite number of half spaces that contain the origin, i.e., $S=\left \{ x \in \mathbb{R}^n:p_{i}^{T}x\leq 0, i=1, 2,... \right \}$
Polytope
A polytope is a polyhedral set which is bounded.
Remarks
A polytope is a convex hull of a finite set of points.
A polyhedral cone is generated by a finite set of vectors.
A polyhedral set is a closed set.
A polyhedral set is a convex set.
Extreme point of a convex set
Let $S$ be a convex set in $\mathbb{R}^n$. A vector $x \in S$ is said to be an extreme point of $S$ if $x= \lambda x_1+\left ( 1-\lambda \right )x_2$ with $x_1, x_2 \in S$ and $\lambda \in\left ( 0, 1 \right )\Rightarrow x=x_1=x_2$.
Example
Example 1 − $S=\left \{ \left ( x_1,x_2 \right ) \in \mathbb{R}^2:x_{1}^{2}+x_{2}^{2}\leq 1 \right \}$
Extreme points, $E=\left \{ \left ( x_1, x_2 \right )\in \mathbb{R}^2:x_{1}^{2}+x_{2}^{2}= 1 \right \}$
Example 2 − $S=\left \{ \left ( x_1,x_2 \right )\in \mathbb{R}^2:x_1+x_2\leq 2, -x_1+2x_2\leq 2, x_1,x_2\geq 0 \right \}$
Extreme points, $E=\left \{ \left ( 0, 0 \right ), \left ( 2, 0 \right ), \left ( 0, 1 \right ), \left ( \frac{2}{3}, \frac{4}{3} \right ) \right \}$
Example 3 − $S$ is the polytope made by the points $\left \{ \left ( 0,0 \right ), \left ( 1,1 \right ), \left ( 1,3 \right ), \left ( -2,4 \right ),\left ( 0,2 \right ) \right \}$
Extreme points, $E=\left \{ \left ( 0,0 \right ), \left ( 1,1 \right ),\left ( 1,3 \right ),\left ( -2,4 \right ) \right \}$
Remarks
Any point of the convex set $S$ can be represented as a convex combination of its extreme points.
This is only true for closed and bounded sets in $\mathbb{R}^n$.
It may not be true for unbounded sets.
k extreme points
A point in a convex set is called k extreme if and only if it is an interior point of a k-dimensional convex set within $S$, and it is not an interior point of a (k+1)-dimensional convex set within $S$. Basically, for a convex set $S$, the k extreme points make up k-dimensional open faces.
Convex Optimization - Direction
Let $S$ be a closed convex set in $\mathbb{R}^n$. A non-zero vector $d \in \mathbb{R}^n$ is called a direction of $S$ if for each $x \in S, x+\lambda d \in S, \forall \lambda \geq 0.$
Two directions $d_1$ and $d_2$ of $S$ are called distinct if $d_1 \neq \alpha d_2$ for $\alpha>0$.
A direction $d$ of $S$ is said to be an extreme direction if it cannot be written as a positive linear combination of two distinct directions, i.e., if $d=\lambda _1d_1+\lambda _2d_2$ for $\lambda _1, \lambda _2>0$, then $d_1= \alpha d_2$ for some $\alpha$.
Any other direction can be expressed as a positive combination of extreme directions.
For a convex set $S$, a direction $d$ such that $x+\lambda d \in S$ for some $x \in S$ and all $\lambda \geq0$ is called recessive for $S$.
Let $E$ be the set of points where a certain function $f:S \rightarrow \mathbb{R}$ over a non-empty convex set $S$ in $\mathbb{R}^n$ attains its maximum; then $E$ is called an exposed face of $S$. The directions of exposed faces are called exposed directions.
A ray whose direction is an extreme direction is called an extreme ray.
Example
Consider the set $S=\left \{ \left ( x,y \right ) \in \mathbb{R}^2:y\geq \left | x \right | \right \}$, i.e., the region on or above the graph of $f\left ( x \right )=\left | x \right |$. Let $d$ be the unit vector along the $y$-axis.
Then $d$ is a direction of $S$, because for any point of $S$ and any $\lambda \geq 0$, adding $\lambda d$ keeps the point in $S$.
Convex and Concave Function
Let $f:S \rightarrow \mathbb{R}$, where $S$ is a non-empty convex set in $\mathbb{R}^n$. Then $f\left ( x \right )$ is said to be convex on $S$ if $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right ), \forall \lambda \in \left ( 0,1 \right )$.
On the other hand, let $f:S \rightarrow \mathbb{R}$, where $S$ is a non-empty convex set in $\mathbb{R}^n$. Then $f\left ( x \right )$ is said to be concave on $S$ if $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\geq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right ), \forall \lambda \in \left ( 0, 1 \right )$.
Let $f:S \rightarrow \mathbb{R}$, where $S$ is a non-empty convex set in $\mathbb{R}^n$. Then $f\left ( x \right )$ is said to be strictly convex on $S$ if $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )< \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right ), \forall \lambda \in \left ( 0, 1 \right )$.
Let $f:S \rightarrow \mathbb{R}$, where $S$ is a non-empty convex set in $\mathbb{R}^n$. Then $f\left ( x \right )$ is said to be strictly concave on $S$ if $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )> \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right ), \forall \lambda \in \left ( 0, 1 \right )$.
Examples
A linear function is both convex and concave.
$f\left ( x \right )=\left | x \right |$ is a convex function.
$f\left ( x \right )= \frac{1}{x}$ is a convex function on $\left ( 0, \infty \right )$.
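A quick numerical sketch of the defining inequality for $f\left ( x \right )=\left | x \right |$ (NumPy assumed) −

```python
# Spot-check the convexity inequality for f(x) = |x| at random points.
import numpy as np

f = np.abs
rng = np.random.default_rng(3)
for _ in range(1000):
    x1, x2 = rng.normal(size=2) * 10
    lam = rng.random()
    assert f(lam*x1 + (1-lam)*x2) <= lam*f(x1) + (1-lam)*f(x2) + 1e-12
```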
Theorem
Let $f_1,f_2,...,f_k:\mathbb{R}^n \rightarrow \mathbb{R}$ be convex functions. Consider the function $f\left ( x \right )=\displaystyle\sum\limits_{j=1}^k \alpha_jf_j\left ( x \right )$ where $\alpha_j>0,j=1, 2, ...,k.$ Then $f\left ( x \right )$ is a convex function.
Proof
Since $f_1,f_2,...,f_k$ are convex functions,
therefore, $f_i\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \lambda f_i\left ( x_1 \right )+\left ( 1-\lambda \right )f_i\left ( x_2 \right ), \forall \lambda \in \left ( 0, 1 \right )$ and $i=1, 2,....,k$
Consider the function $f\left ( x \right )$.
Therefore,
$f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )$
$=\displaystyle\sum\limits_{j=1}^k \alpha_jf_j\left ( \lambda x_1 +\left ( 1-\lambda \right )x_2 \right )\leq \displaystyle\sum\limits_{j=1}^k\alpha_j\left [ \lambda f_j\left ( x_1 \right )+\left ( 1-\lambda \right )f_j\left ( x_2 \right ) \right ]$
$\Rightarrow f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \lambda \left ( \displaystyle\sum\limits_{j=1}^k \alpha _jf_j\left ( x_1 \right ) \right )+\left ( 1-\lambda \right )\left ( \displaystyle\sum\limits_{j=1}^k \alpha _jf_j\left ( x_2 \right ) \right )$
$\Rightarrow f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right )$
Hence, $f\left ( x \right )$ is a convex function.
Theorem
Let $f\left ( x \right )$ be a convex function on a convex set $S\subset \mathbb{R}^n$. Then a local minimum of $f\left ( x \right )$ on $S$ is a global minimum.
Proof
Let $\hat{x}$ be a local minimum for $f\left ( x \right )$ and suppose $\hat{x}$ is not a global minimum.
Therefore, $\exists \bar{x} \in S$ such that $f\left ( \bar{x} \right )< f\left ( \hat{x} \right )$
Since $\hat{x}$ is a local minimum, there exists a neighbourhood $N_\varepsilon \left ( \hat{x} \right )$ such that $f\left ( \hat{x} \right )\leq f\left ( x \right ), \forall x \in N_\varepsilon \left ( \hat{x} \right )\cap S$
But $f\left ( x \right )$ is a convex function on $S$, therefore for $\lambda \in \left ( 0, 1 \right )$
we have $f\left (\lambda \hat{x}+\left ( 1-\lambda \right )\bar{x} \right )\leq \lambda f\left ( \hat{x} \right )+\left ( 1-\lambda \right )f\left ( \bar{x} \right )$
$\Rightarrow f\left (\lambda \hat{x}+\left ( 1-\lambda \right )\bar{x} \right )< \lambda f\left ( \hat{x} \right )+\left ( 1-\lambda \right )f\left (\hat{x} \right )$
$\Rightarrow f\left (\lambda \hat{x}+\left ( 1-\lambda \right )\bar{x} \right )< f\left (\hat{x} \right ), \forall \lambda \in \left ( 0,1 \right )$
But for some $\lambda<1$ but close to 1, we have
$\lambda \hat{x}+\left ( 1-\lambda \right )\bar{x} \in N_\varepsilon \left ( \hat{x} \right )\cap S$ and $f\left ( \lambda \hat{x}+\left ( 1-\lambda \right )\bar{x} \right )< f\left ( \hat{x} \right )$
which is a contradiction.
Hence, $\hat{x}$ is a global minimum.
Epigraph
Let $S$ be a non-empty subset of $\mathbb{R}^n$ and let $f:S \rightarrow \mathbb{R}$. Then the epigraph of $f$, denoted by $epi(f)$ or $E_f$, is a subset of $\mathbb{R}^{n+1}$ defined by $E_f=\left \{ \left ( x,\alpha \right ):x \in \mathbb{R}^n, \alpha \in \mathbb{R}, f\left ( x \right )\leq \alpha \right \}$
Hypograph
Let $S$ be a non-empty subset of $\mathbb{R}^n$ and let $f:S \rightarrow \mathbb{R}$. Then the hypograph of $f$, denoted by $hyp(f)$ or $H_f$, is defined by $H_f=\left \{ \left ( x, \alpha \right ):x \in \mathbb{R}^n, \alpha \in \mathbb{R}, f\left ( x \right )\geq \alpha \right \}$
Theorem
Let $S$ be a non-empty convex set in $\mathbb{R}^n$ and let $f:S \rightarrow \mathbb{R}$. Then $f$ is convex if and only if its epigraph $E_f$ is a convex set.
Proof
Let $f$ be a convex function.
To show $E_f$ is a convex set.
Let $\left ( x_1, \alpha_1 \right ),\left ( x_2, \alpha_2 \right ) \in E_f,\lambda \in\left ( 0, 1 \right )$
To show $\lambda \left ( x_1,\alpha_1 \right )+\left ( 1-\lambda \right )\left ( x_2, \alpha_2 \right ) \in E_f$
i.e., $\left [ \lambda x_1+\left ( 1-\lambda \right )x_2, \lambda \alpha_1+\left ( 1-\lambda \right )\alpha_2 \right ]\in E_f$
Now, $f\left ( x_1 \right )\leq \alpha _1$ and $f\left ( x_2 \right )\leq \alpha _2$
Therefore, $f\left (\lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f \left ( x_2 \right )$
$\Rightarrow f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \lambda \alpha_1+\left ( 1-\lambda \right )\alpha_2$, so $\left [ \lambda x_1+\left ( 1-\lambda \right )x_2, \lambda \alpha_1+\left ( 1-\lambda \right )\alpha_2 \right ]\in E_f$
Converse
Let $E_f$ be a convex set.
To show $f$ is convex,
i.e., to show that if $x_1, x_2 \in S,\lambda \in \left ( 0, 1 \right )$, then
$f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right )$
Let $x_1,x_2 \in S, \lambda \in \left ( 0, 1 \right )$, so $\left ( x_1, f\left ( x_1 \right ) \right ), \left ( x_2, f\left ( x_2 \right ) \right ) \in E_f$
Since $E_f$ is a convex set, $\left ( \lambda x_1+\left ( 1-\lambda \right )x_2, \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right ) \right )\in E_f$
Therefore, $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right )$
Convex Optimization - Jensen's Inequality
Let $S$ be a non-empty convex set in $\mathbb{R}^n$ and $f:S \rightarrow \mathbb{R}$. Then $f$ is convex if and only if for each integer $k>0$ and
$x_1,x_2,...,x_k \in S, \displaystyle\sum\limits_{i=1}^k \lambda_i=1, \lambda_i\geq 0, \forall i=1,2,...,k$, we have $f\left ( \displaystyle\sum\limits_{i=1}^k \lambda_ix_i \right )\leq \displaystyle\sum\limits_{i=1}^k \lambda _if\left ( x_i \right )$
Proof
By induction on k.
$k=1:x_1 \in S$. Therefore $f\left ( \lambda_1 x_1 \right ) \leq \lambda_1 f\left (x_1 \right )$ because $\lambda_1=1$.
$k=2:\lambda_1+\lambda_2=1$ and $x_1, x_2 \in S$
Therefore, $\lambda_1x_1+\lambda_2x_2 \in S$
Hence by definition, $f\left ( \lambda_1 x_1 +\lambda_2 x_2 \right )\leq \lambda _1f\left ( x_1 \right )+\lambda _2f\left ( x_2 \right )$
Let the statement be true for $k=n$,
i.e., $f\left ( \lambda_1 x_1+ \lambda_2 x_2+....+\lambda_n x_n \right )\leq \lambda_1 f\left (x_1 \right )+\lambda_2 f\left (x_2 \right )+...+\lambda_n f\left (x_n \right )$
$k=n+1:$ Let $x_1, x_2,....,x_n,x_{n+1} \in S$ and $\displaystyle\sum\limits_{i=1}^{n+1}\mu_i=1, \mu_i\geq 0$
Therefore $\mu_1x_1+\mu_2x_2+.......+\mu_nx_n+\mu_{n+1} x_{n+1} \in S$
Thus, $f\left (\mu_1x_1+\mu_2x_2+...+\mu_nx_n+\mu_{n+1} x_{n+1} \right )$
$=f\left ( \left ( \mu_1+\mu_2+...+\mu_n \right)\frac{\mu_1x_1+\mu_2x_2+...+\mu_nx_n}{\mu_1+\mu_2+...+\mu_n}+\mu_{n+1}x_{n+1} \right)$
$=f\left ( \mu y+\mu_{n+1}x_{n+1} \right )$ where $\mu=\mu_1+\mu_2+...+\mu_n$ and
$y=\frac{\mu_1x_1+\mu_2x_2+...+\mu_nx_n}{\mu_1+\mu_2+...+\mu_n}$ and also $\mu+\mu_{n+1}=1,y \in S$
$\Rightarrow f\left ( \mu_1x_1+\mu_2x_2+...+\mu_nx_n+\mu_{n+1}x_{n+1} \right ) \leq \mu f\left ( y \right )+\mu_{n+1} f\left ( x_{n+1} \right )$
$\Rightarrow f\left ( \mu_1x_1+\mu_2x_2+...+\mu_nx_n+\mu_{n+1}x_{n+1} \right ) \leq$
$\left ( \mu_1+\mu_2+...+\mu_n \right )f\left ( \frac{\mu_1x_1+\mu_2x_2+...+\mu_nx_n}{\mu_1+\mu_2+...+\mu_n} \right )+\mu_{n+1}f\left ( x_{n+1} \right )$
$\Rightarrow f\left ( \mu_1x_1+\mu_2x_2+...+\mu_nx_n +\mu_{n+1}x_{n+1} \right )\leq \left ( \mu_1+ \mu_2+ ...+\mu_n \right )$
$\left [ \frac{\mu_1}{\mu_1+ \mu_2+ ...+\mu_n}f\left ( x_1 \right )+...+\frac{\mu_n}{\mu_1+ \mu_2+ ...+\mu_n}f\left ( x_n \right ) \right ]+\mu_{n+1}f\left ( x_{n+1} \right )$
$\Rightarrow f\left ( \mu_1x_1+\mu_2x_2+...+\mu_nx_n+\mu_{n+1}x_{n+1} \right )\leq \mu_1f\left ( x_1 \right )+\mu_2f\left ( x_2 \right )+....+\mu_{n+1}f\left ( x_{n+1} \right )$
Hence Proved.
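A one-line numerical illustration of Jensen's inequality for the convex function $f\left ( x \right )=e^x$ (NumPy assumed) −

```python
# f(sum lambda_i x_i) <= sum lambda_i f(x_i) for convex f(x) = exp(x).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10)            # arbitrary points x_1 .. x_k
lam = rng.random(10)
lam /= lam.sum()                   # lambda_i >= 0, sum lambda_i = 1

assert np.exp(lam @ x) <= lam @ np.exp(x)
```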
Convex Optimization - Differentiable Function
Let $S$ be a non-empty open set in $\mathbb{R}^n$. Then $f:S \rightarrow \mathbb{R}$ is said to be differentiable at $\hat{x} \in S$ if there exist a vector $\bigtriangledown f\left ( \hat{x} \right )$, called the gradient vector, and a function $\alpha :\mathbb{R}^n \rightarrow \mathbb{R}$ such that
$f\left ( x \right )=f\left ( \hat{x} \right )+\bigtriangledown f\left ( \hat{x} \right )^T\left ( x-\hat{x} \right )+\left | x-\hat{x} \right |\alpha \left ( \hat{x}, x-\hat{x} \right ), \forall x \in S$ where
$\alpha \left (\hat{x}, x-\hat{x} \right ) \rightarrow 0$ as $x \rightarrow \hat{x}$, and $\bigtriangledown f\left ( \hat{x} \right )=\left [ \frac{\partial f}{\partial x_1},\frac{\partial f}{\partial x_2},...,\frac{\partial f}{\partial x_n} \right ]_{x=\hat{x}}^{T}$
Theorem
Let $S$ be a non-empty, open convex set in $\mathbb{R}^n$ and let $f:S \rightarrow \mathbb{R}$ be differentiable on $S$. Then, $f$ is convex if and only if for $x_1,x_2 \in S, \bigtriangledown f\left ( x_2 \right )^T \left ( x_1-x_2 \right ) \leq f\left ( x_1 \right )-f\left ( x_2 \right )$
Proof
Let $f$ be a convex function, i.e., for $x_1,x_2 \in S, \lambda \in \left ( 0, 1 \right )$
$f\left [ \lambda x_1+\left ( 1-\lambda \right )x_2 \right ]\leq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right )$
$\Rightarrow f\left [ x_2+\lambda \left ( x_1-x_2 \right ) \right ]-f\left ( x_2 \right )\leq \lambda \left ( f\left ( x_1 \right )-f\left ( x_2 \right ) \right )$
By differentiability of $f$ at $x_2$,
$f\left ( x_2+\lambda \left ( x_1-x_2 \right ) \right )-f\left ( x_2 \right )=\lambda \bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )+\lambda \left | x_1-x_2 \right |\alpha \left ( x_2,\lambda\left (x_1 - x_2 \right ) \right )$
$\Rightarrow \lambda \left ( f\left ( x_1 \right )-f\left ( x_2 \right ) \right )\geq \lambda \bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )+\lambda \left | x_1-x_2 \right |\alpha \left ( x_2,\lambda\left (x_1 - x_2 \right ) \right )$
where $\alpha\left ( x_2, \lambda\left (x_1 - x_2 \right ) \right ) \rightarrow 0$ as $\lambda \rightarrow 0$
Dividing by $\lambda$ on both sides and letting $\lambda \rightarrow 0$, we get −
$f\left ( x_1 \right )-f\left ( x_2 \right ) \geq \bigtriangledown f\left ( x_2 \right )^T \left ( x_1-x_2 \right )$
Converse
Let for $x_1,x_2 \in S, \bigtriangledown f\left ( x_2 \right )^T \left ( x_1-x_2 \right ) \leq f\left ( x_1 \right )-f \left ( x_2 \right )$
To show that $f$ is convex.
Since $S$ is convex, $x_3=\lambda x_1+\left (1-\lambda \right )x_2 \in S, \lambda \in \left ( 0, 1 \right )$
Since $x_1, x_3 \in S$, therefore
$f\left ( x_1 \right )-f \left ( x_3 \right ) \geq \bigtriangledown f\left ( x_3 \right )^T \left ( x_1 -x_3 \right )$
$\Rightarrow f\left ( x_1 \right )-f \left ( x_3 \right )\geq \bigtriangledown f\left ( x_3 \right )^T \left ( x_1 - \lambda x_1-\left (1-\lambda \right )x_2 \right )$
$\Rightarrow f\left ( x_1 \right )-f \left ( x_3 \right )\geq \left ( 1- \lambda \right )\bigtriangledown f\left ( x_3 \right )^T \left ( x_1 - x_2 \right )$
Since $x_2, x_3 \in S$, therefore
$f\left ( x_2 \right )-f\left ( x_3 \right )\geq \bigtriangledown f\left ( x_3 \right )^T\left ( x_2-x_3 \right )$
$\Rightarrow f\left ( x_2 \right )-f\left ( x_3 \right )\geq \bigtriangledown f\left ( x_3 \right )^T\left ( x_2-\lambda x_1-\left ( 1-\lambda \right )x_2 \right )$
$\Rightarrow f\left ( x_2 \right )-f\left ( x_3 \right )\geq \left ( -\lambda \right )\bigtriangledown f\left ( x_3 \right )^T\left ( x_1-x_2 \right )$
Thus, multiplying the first inequality by $\lambda$ and the second by $\left ( 1-\lambda \right )$ and adding them, we get −
$\lambda \left ( f\left ( x_1 \right )-f\left ( x_3 \right ) \right )+\left ( 1- \lambda \right )\left ( f\left ( x_2 \right )-f\left ( x_3 \right ) \right )\geq 0$
$\Rightarrow f\left ( x_3 \right )\leq \lambda f\left ( x_1 \right )+\left ( 1-\lambda \right )f\left ( x_2 \right )$
Theorem
Let $S$ be a non-empty open convex set in $\mathbb{R}^n$ and let $f:S \rightarrow \mathbb{R}$ be differentiable on $S$. Then $f$ is convex on $S$ if and only if for any $x_1,x_2 \in S,\left ( \bigtriangledown f \left ( x_2 \right )-\bigtriangledown f \left ( x_1 \right ) \right )^T \left ( x_2-x_1 \right ) \geq 0$
Proof
Let $f$ be a convex function. Then, using the previous theorem −
$\bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )\leq f\left ( x_1 \right )-f\left ( x_2 \right )$ and
$\bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )\leq f\left ( x_2 \right )-f\left ( x_1 \right )$
Adding the above two inequalities, we get −
$\bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )+\bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )\leq 0$
$\Rightarrow \left ( \bigtriangledown f\left ( x_2 \right )-\bigtriangledown f\left ( x_1 \right ) \right )^T\left ( x_1-x_2 \right )\leq 0$
$\Rightarrow \left ( \bigtriangledown f\left ( x_2 \right )-\bigtriangledown f\left ( x_1 \right ) \right )^T\left ( x_2-x_1 \right )\geq 0$
Converse
Let for any $x_1,x_2 \in S,\left (\bigtriangledown f \left ( x_2 \right )- \bigtriangledown f \left ( x_1 \right ) \right )^T \left ( x_2-x_1 \right )\geq 0$
To show that $f$ is convex.
Let $x_1,x_2 \in S$. Thus, by the mean value theorem, $f\left ( x_1 \right )- f\left ( x_2 \right )=\bigtriangledown f\left ( x \right )^T\left ( x_1-x_2 \right )$ for some $x$ on the segment joining $x_1$ and $x_2$, i.e., $x= \lambda x_1+\left ( 1-\lambda \right )x_2$ for some $\lambda \in \left ( 0,1 \right )$, because $S$ is a convex set.
For $x$ and $x_1$, we know −
$\left ( \bigtriangledown f\left ( x \right )-\bigtriangledown f\left ( x_1 \right ) \right )^T\left ( x-x_1 \right )\geq 0$
$\Rightarrow \left ( \bigtriangledown f\left ( x \right )-\bigtriangledown f\left ( x_1 \right ) \right )^T\left ( \lambda x_1+\left ( 1-\lambda \right )x_2-x_1 \right )\geq 0$
$\Rightarrow \left ( \bigtriangledown f\left ( x \right )- \bigtriangledown f\left ( x_1 \right ) \right )^T\left ( 1- \lambda \right )\left ( x_2-x_1 \right )\geq 0$
$\Rightarrow \bigtriangledown f\left ( x \right )^T\left ( x_2-x_1 \right )\geq \bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )$
Combining the above equations, we get −
$\Rightarrow \bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )\leq f\left ( x_2 \right )-f\left ( x_1 \right )$
Hence, using the first theorem above, $f$ is a convex function.
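This monotone-gradient condition is easy to spot-check numerically. A sketch assuming the convex function $f\left ( x \right )=\left | x \right |^2$ with gradient $2x$ −

```python
# Check (grad f(x2) - grad f(x1))^T (x2 - x1) >= 0 for f(x) = ||x||^2.
import numpy as np

grad_f = lambda x: 2.0 * x
rng = np.random.default_rng(2)
for _ in range(1000):
    x1, x2 = rng.normal(size=(2, 5))
    assert (grad_f(x2) - grad_f(x1)) @ (x2 - x1) >= 0
```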
Twice Differentiable function
Let $S$ be a non-empty subset of $\mathbb{R}^n$ and let $f:S \rightarrow \mathbb{R}$. Then $f$ is said to be twice differentiable at $\bar{x} \in S$ if there exist a vector $\bigtriangledown f\left (\bar{x} \right )$, an $n \times n$ matrix $H\left (\bar{x} \right )$ (called the Hessian matrix) and a function $\alpha:\mathbb{R}^n \rightarrow \mathbb{R}$ such that $f\left ( x \right )=f\left ( \bar{x} \right )+\bigtriangledown f\left ( \bar{x} \right )^T\left ( x-\bar{x} \right )+\frac{1}{2}\left ( x-\bar{x} \right )^TH\left ( \bar{x} \right )\left ( x-\bar{x} \right )+\left | x-\bar{x} \right |^2\alpha \left ( \bar{x}, x-\bar{x} \right )$
where $\alpha \left ( \bar{x}, x-\bar{x} \right ) \rightarrow 0$ as $x \rightarrow \bar{x}$
Sufficient & Necessary Conditions for Global Optima
Theorem
Let $f$ be a twice differentiable function. If $\bar{x}$ is a local minimum, then $\bigtriangledown f\left ( \bar{x} \right )=0$ and the Hessian matrix $H\left ( \bar{x} \right )$ is positive semidefinite.
Proof
Let $d \in \mathbb{R}^n$. Since $f$ is twice differentiable at $\bar{x}$,
therefore,
$f\left ( \bar{x} +\lambda d \right )=f\left ( \bar{x} \right )+\lambda \bigtriangledown f\left ( \bar{x} \right )^T d+\frac{\lambda^2}{2}d^TH\left ( \bar{x} \right )d+\lambda^2\left | d \right |^2\beta \left ( \bar{x}, \lambda d \right )$
But $\bigtriangledown f\left ( \bar{x} \right )=0$ and $\beta\left ( \bar{x}, \lambda d \right ) \rightarrow 0$ as $\lambda \rightarrow 0$
$\Rightarrow f\left ( \bar{x} +\lambda d \right )-f\left ( \bar{x} \right )=\frac{\lambda ^2}{2}d^TH\left ( \bar{x} \right )d+\lambda^2\left | d \right |^2\beta \left ( \bar{x}, \lambda d \right )$
Since $\bar{x}$ is a local minimum, there exists a $\delta > 0$ such that $f\left ( \bar{x} \right )\leq f\left ( \bar{x}+\lambda d \right ), \forall \lambda \in \left ( 0,\delta \right )$
Dividing by $\lambda^2$ and letting $\lambda \rightarrow 0$, we get $d^TH\left ( \bar{x} \right )d\geq 0$, i.e., $H\left ( \bar{x} \right )$ is positive semidefinite.
Theorem
Let $f:S \rightarrow \mathbb{R}$, where $S \subset \mathbb{R}^n$, be twice differentiable over $S$. If $\bigtriangledown f\left ( \bar{x} \right )=0$ and $H\left ( x \right )$ is positive semi-definite for all $x \in S$, then $\bar{x}$ is a global optimal solution.
Proof
Since $H\left ( x \right )$ is positive semi-definite for all $x \in S$, $f$ is a convex function over $S$. Since $f$ is differentiable and convex at $\bar{x}$,
$\bigtriangledown f\left ( \bar{x} \right )^T \left ( x-\bar{x} \right ) \leq f\left (x \right )-f\left (\bar{x} \right ),\forall x \in S$
Since $\bigtriangledown f\left ( \bar{x} \right )=0, f\left ( x \right )\geq f\left ( \bar{x} \right )$
Hence, $\bar{x}$ is a global optimum.
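A minimal sketch of this second-order check for the hypothetical example $f\left ( x_1,x_2 \right )=x_1^2+2x_2^2$ (not from the original text), whose gradient vanishes at the origin −

```python
# Verify the stationarity and positive-semidefiniteness conditions numerically.
import numpy as np

grad = lambda x: np.array([2*x[0], 4*x[1]])   # gradient of f
H = np.array([[2.0, 0.0],
              [0.0, 4.0]])                    # constant Hessian of f

x_bar = np.zeros(2)
assert np.allclose(grad(x_bar), 0)            # gradient vanishes at x_bar
assert np.all(np.linalg.eigvalsh(H) >= 0)     # Hessian is positive semidefinite
# both conditions hold everywhere, so x_bar is a global minimizer of this f
```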
Theorem
Suppose $\bar{x} \in S$ is a local optimal solution to the problem $\min \:f\left ( x \right )$, $x \in S$, where $f:S \rightarrow \mathbb{R}$ is a convex function and $S$ is a non-empty convex subset of $\mathbb{R}^n$.
Then:
$\bar{x}$ is a global optimal solution.
If either $\bar{x}$ is a strict local minimum or $f$ is a strictly convex function, then $\bar{x}$ is the unique global optimal solution and is also a strong local minimum.
Proof
Let $\hat{x}$ be another global optimal solution to the problem such that $\hat{x} \neq \bar{x}$ and $f\left ( \bar{x} \right )=f\left ( \hat{x} \right )$
Since $\hat{x},\bar{x} \in S$ and $S$ is convex, $\frac{\hat{x}+\bar{x}}{2} \in S$, and $f$ is strictly convex
$\Rightarrow f\left ( \frac{\hat{x}+\bar{x}}{2} \right )< \frac{1}{2} f\left (\bar{x} \right )+\frac{1}{2} f\left (\hat{x} \right )=f\left (\hat{x} \right )$
This is a contradiction.
Hence, $\bar{x}$ is the unique global optimal solution.
Corollary
Let $f:S \subset \mathbb{R}^n \rightarrow \mathbb{R}$ be a differentiable convex function where $\phi \neq S\subset \mathbb{R}^n$ is a convex set. Consider the problem $\min f\left (x \right ),x \in S$. Then $\bar{x}$ is an optimal solution if and only if $\bigtriangledown f\left (\bar{x} \right )^T\left (x-\bar{x} \right ) \geq 0,\forall x \in S.$
Proof
Let $\bar{x}$ be an optimal solution, i.e., $f\left (\bar{x} \right )\leq f\left (x \right ),\forall x \in S$
$\Rightarrow f\left (x \right )-f\left (\bar{x} \right )\geq 0, \forall x \in S$
By differentiability, $f\left (x \right )=f\left (\bar{x} \right )+\bigtriangledown f\left (\bar{x} \right )^T\left (x-\bar{x} \right )+\left | x-\bar{x} \right |\alpha \left ( \bar{x},x-\bar{x} \right )$
where $\alpha \left ( \bar{x},x-\bar{x} \right ) \rightarrow 0$ as $x \rightarrow \bar{x}$
Applying this at the points $\bar{x}+\lambda \left ( x-\bar{x} \right ) \in S, \lambda \in \left ( 0,1 \right )$, dividing by $\lambda$ and letting $\lambda \rightarrow 0$, we get $\bigtriangledown f\left (\bar{x} \right )^T\left (x-\bar{x} \right )\geq 0$
Corollary
Let $f$ be a differentiable convex function at $\bar{x}$. Then $\bar{x}$ is the global minimum iff $\bigtriangledown f\left (\bar{x} \right )=0$
Examples
$f\left (x \right )=\left (x^2-1 \right )^{3}, x \in \mathbb{R}$.
$\bigtriangledown f\left (x \right )=0 \Rightarrow x= -1,0,1$.
$\bigtriangledown^2f\left (\pm 1 \right )=0, \bigtriangledown^2 f\left (0 \right )=6>0$.
$f\left (\pm 1 \right )=0,f\left (0 \right )=-1$
Hence, $f\left (x \right ) \geq -1=f\left (0 \right )\Rightarrow f\left (0 \right ) \leq f \left (x \right ),\forall x \in \mathbb{R}$
$f\left (x \right )=x\log x$ defined on $S=\left \{ x \in \mathbb{R}: x> 0 \right \}$.
${f}'\left (x \right )=1+\log x$
${f}''\left (x \right )=\frac{1}{x}>0$
Thus, this function is strictly convex.
$f \left (x \right )=e^{x},x \in \mathbb{R}$ is strictly convex.
Quasiconvex and Quasiconcave functions
Let $f:S \rightarrow \mathbb{R}$ where $S \subset \mathbb{R}^n$ is a non-empty convex set. The function $f$ is said to be quasiconvex if for each $x_1,x_2 \in S$, we have $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \max\left \{ f\left ( x_1 \right ),f\left ( x_2 \right ) \right \},\lambda \in \left ( 0, 1 \right )$
For example, $f\left ( x \right )=x^{3}$
Let $f:S \rightarrow \mathbb{R}$ where $S\subset \mathbb{R}^n$ is a non-empty convex set. The function $f$ is said to be quasiconcave if for each $x_1, x_2 \in S$, we have $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\geq \min\left \{ f\left ( x_1 \right ),f\left ( x_2 \right ) \right \}, \lambda \in \left ( 0, 1 \right )$
Remarks
Every convex function is quasiconvex but the converse is not true.
A function which is both quasiconvex and quasiconcave is called quasimonotone.
Theorem
Let $f:S \rightarrow \mathbb{R}$ and $S$ be a non-empty convex set in $\mathbb{R}^n$. The function $f$ is quasiconvex if and only if $S_{\alpha} =\left \{ x \in S:f\left ( x \right )\leq \alpha \right \}$ is convex for each real number $\alpha$
Proof
Let $f$ be quasiconvex on $S$.
Let $x_1,x_2 \in S_{\alpha}$, therefore $x_1,x_2 \in S$ and $\max \left \{ f\left ( x_1 \right ),f\left ( x_2 \right ) \right \}\leq \alpha$
Let $\lambda \in \left (0, 1 \right )$ and let $x=\lambda x_1+\left ( 1-\lambda \right )x_2$. Since $S$ is convex, $x \in S$.
Thus, $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq \max\left \{ f\left ( x_1 \right ), f\left ( x_2 \right ) \right \}\leq \alpha$
$\Rightarrow x \in S_{\alpha}$. Therefore, $S_{\alpha}$ is convex.
Converse
Let $S_{\alpha}$ be convex for each $\alpha$
Let $x_1,x_2 \in S, \lambda \in \left ( 0,1 \right )$
Let $x=\lambda x_1+\left ( 1-\lambda \right )x_2$
For $x_1, x_2 \in S_{\alpha}$ with $\alpha= \max \left \{ f\left ( x_1 \right ), f\left ( x_2 \right ) \right \}$
$\Rightarrow \lambda x_1+\left (1-\lambda \right )x_2 \in S_{\alpha}$
$\Rightarrow f \left (\lambda x_1+\left (1-\lambda \right )x_2 \right )\leq \alpha$
Hence proved.
Theorem
Let $f:S \rightarrow \mathbb{R}$ and $S$ be a non-empty convex set in $\mathbb{R}^n$. The function $f$ is quasiconcave if and only if $S_{\alpha} =\left \{ x \in S:f\left ( x \right )\geq \alpha \right \}$ is convex for each real number $\alpha$.
Theorem
Let $f:S \rightarrow \mathbb{R}$ and $S$ be a non-empty convex set in $\mathbb{R}^n$. The function $f$ is quasimonotone if and only if $S_{\alpha} =\left \{ x \in S:f\left ( x \right )= \alpha \right \}$ is convex for each real number $\alpha$.
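The level-set characterization can be checked numerically on a grid. A small sketch for the quasiconvex $f\left ( x \right )=x^3$ on the line, where a convex sublevel set must be a contiguous interval −

```python
# Sampled sublevel sets {x : f(x) <= alpha} of f(x) = x^3 are intervals.
import numpy as np

f = lambda x: x**3
xs = np.linspace(-2, 2, 4001)
for alpha in [-1.0, 0.0, 1.0]:
    idx = np.flatnonzero(f(xs) <= alpha)   # grid points inside the sublevel set
    assert np.all(np.diff(idx) == 1)       # indices form one contiguous run
```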
Differentiable Quasiconvex Function
Theorem
Let $S$ be a non-empty convex set in $\mathbb{R}^n$ and $f:S \rightarrow \mathbb{R}$ be differentiable on $S$. Then $f$ is quasiconvex if and only if for any $x_1,x_2 \in S$ with $f\left ( x_1 \right )\leq f\left ( x_2 \right )$, we have $\bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )\leq 0$
Proof
Let $f$ be a quasiconvex function.
Let $x_1,x_2 \in S$ such that $f\left ( x_1 \right ) \leq f\left ( x_2 \right )$
By differentiability of $f$ at $x_2$, for $\lambda \in \left ( 0, 1 \right )$,
$f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )=f\left ( x_2+\lambda \left (x_1-x_2 \right ) \right )=f\left ( x_2 \right )+\lambda \bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )+\lambda \left | x_1-x_2 \right |\alpha \left ( x_2,\lambda \left ( x_1-x_2 \right ) \right )$
$\Rightarrow f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )-f\left ( x_2 \right )=\lambda \bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )+\lambda \left | x_1-x_2 \right |\alpha \left ( x_2, \lambda\left ( x_1-x_2 \right ) \right )$
But since $f$ is quasiconvex, $f \left ( \lambda x_1+ \left ( 1- \lambda \right )x_2 \right )\leq f \left (x_2 \right )$
$\Rightarrow \lambda \bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right )+\lambda \left | x_1-x_2 \right |\alpha \left ( x_2,\lambda \left ( x_1-x_2 \right ) \right )\leq 0$
But $\alpha \left ( x_2,\lambda \left ( x_1-x_2 \right ) \right ) \rightarrow 0$ as $\lambda \rightarrow 0$
Therefore, dividing by $\lambda$ and letting $\lambda \rightarrow 0$, $\bigtriangledown f\left ( x_2 \right )^T\left ( x_1-x_2 \right ) \leq 0$
Converse
Let for $x_1,x_2 \in S$ with $f\left ( x_1 \right )\leq f\left ( x_2 \right )$, $\bigtriangledown f\left ( x_2 \right )^T \left ( x_1-x_2 \right ) \leq 0$
To show that f is quasiconvex, i.e., $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\leq f\left ( x_2 \right )$
Proof by contradiction
Suppose there exists an $x_3= \lambda x_1+\left ( 1-\lambda \right )x_2$ such that $f\left ( x_2 \right )< f\left ( x_3 \right )$ for some $\lambda \in \left ( 0, 1 \right )$
For $x_2$ and $x_3$, $\bigtriangledown f\left ( x_3 \right )^T \left ( x_2-x_3 \right ) \leq 0$. Since $x_2-x_3=-\lambda \left ( x_1-x_2 \right )$,
$\Rightarrow -\lambda \bigtriangledown f\left ( x_3 \right )^T\left ( x_1-x_2 \right )\leq 0$
$\Rightarrow \bigtriangledown f\left ( x_3 \right )^T \left ( x_1-x_2 \right )\geq 0$
For $x_1$ and $x_3$, $\bigtriangledown f\left ( x_3 \right )^T \left ( x_1-x_3 \right ) \leq 0$. Since $x_1-x_3=\left ( 1-\lambda \right )\left ( x_1-x_2 \right )$,
$\Rightarrow \left ( 1- \lambda \right )\bigtriangledown f\left ( x_3 \right )^T\left ( x_1-x_2 \right )\leq 0$
$\Rightarrow \bigtriangledown f\left ( x_3 \right )^T \left ( x_1-x_2 \right )\leq 0$
Thus, from the above equations, $\bigtriangledown f\left ( x_3 \right )^T \left ( x_1-x_2 \right )=0$
Define $U=\left \{ x:f\left ( x \right )\leq f\left ( x_2 \right ), x=\mu x_2+\left ( 1-\mu \right )x_3, \mu \in \left ( 0,1 \right ) \right \}$
Thus we can find $x_0 \in U$, $x_0 = \mu_0 x_2+\left ( 1- \mu_0 \right )x_3$ for some $\mu_0 \in \left ( 0,1 \right )$, which is nearest to $x_3$, and by the mean value theorem there exists $\hat{x} \in \left ( x_0,x_3 \right )$ such that
$$f\left ( x_3 \right )=f\left ( x_0 \right )+\bigtriangledown f\left ( \hat{x} \right )^T\left ( x_3-x_0 \right )$$
$$\Rightarrow f\left ( x_3 \right )=f\left ( x_0 \right )+\mu_0 \lambda \bigtriangledown f\left ( \hat{x} \right )^T \left ( x_1-x_2 \right )$$
since $x_3-x_0=\mu_0 \left ( x_3-x_2 \right )=\mu_0 \lambda \left ( x_1-x_2 \right )$.
Since $\hat{x}$ is a convex combination of $x_1$ and $x_2$ and $f\left ( x_2 \right )< f\left ( \hat{x} \right )$,
by repeating the starting procedure, $\bigtriangledown f\left ( \hat{x} \right )^T \left ( x_1-x_2 \right )=0$
Thus, combining the above equations, we get:
$$f\left ( x_3 \right )=f\left ( x_0 \right ) \leq f\left ( x_2 \right )$$
which contradicts $f\left ( x_2 \right )< f\left ( x_3 \right )$. Hence f is quasiconvex.
Examples
Step 1 − $f\left ( x \right )=x^3$
Let $f\left ( x_1 \right )\leq f\left ( x_2 \right )$
$\Rightarrow x_{1}^{3}\leq x_{2}^{3}\Rightarrow x_1\leq x_2$
$\bigtriangledown f\left ( x_2 \right )\left ( x_1-x_2 \right )=3x_{2}^{2}\left ( x_1-x_2 \right )\leq 0$
Thus, $f\left ( x \right )$ is quasiconvex.
Step 2 − $f\left ( x \right )=x_{1}^{3}+x_{2}^{3}$
Let $\hat{x_1}=\left ( 2, -2 \right )$ and $\hat{x_2}=\left ( 1, 0 \right )$
Thus, $f\left ( \hat{x_1} \right )=0, f\left ( \hat{x_2} \right )=1 \Rightarrow f\left ( \hat{x_1} \right )< f\left ( \hat{x_2} \right )$
But $\bigtriangledown f\left ( \hat{x_2} \right )^T \left ( \hat{x_1}- \hat{x_2} \right )= \left ( 3, 0 \right )^T \left ( 1, -2 \right )=3 >0$
Hence $f\left ( x \right )$ is not quasiconvex.
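The gradient test of Step 2 can be replayed numerically. This Python/NumPy sketch (illustrative, not part of the original text) evaluates $\bigtriangledown f\left ( \hat{x_2} \right )^T\left ( \hat{x_1}-\hat{x_2} \right )$ and confirms it is positive:

```python
import numpy as np

def f(x):
    # f(x1, x2) = x1^3 + x2^3
    return x[0]**3 + x[1]**3

def grad(x):
    # gradient of f
    return np.array([3 * x[0]**2, 3 * x[1]**2])

x1 = np.array([2.0, -2.0])   # f(x1) = 0
x2 = np.array([1.0, 0.0])    # f(x2) = 1
assert f(x1) <= f(x2)
# Quasiconvexity would require grad(x2)^T (x1 - x2) <= 0; here it is 3 > 0,
# so f is not quasiconvex.
print(grad(x2) @ (x1 - x2))  # 3.0
```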
Strictly Quasiconvex Function
Let $f:S \rightarrow \mathbb{R}$ and S be a non-empty convex set in $\mathbb{R}^n$, then f is said to be a strictly quasiconvex function if for each $x_1,x_2 \in S$ with $f\left ( x_1 \right ) \neq f\left ( x_2 \right )$, we have $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )< \max \left \{ f\left ( x_1 \right ),f\left ( x_2 \right ) \right \}, \forall \lambda \in \left ( 0,1 \right )$
Remarks
Every strictly convex function is strictly quasiconvex.
A strictly quasiconvex function need not be quasiconvex.
A strictly quasiconvex function may not be strongly quasiconvex.
Every pseudoconvex function is a strictly quasiconvex function.
Theorem
Let $f:S \rightarrow \mathbb{R}$ be a strictly quasiconvex function and S be a non-empty convex set in $\mathbb{R}^n$. Consider the problem: $\min f\left ( x \right ), x \in S$. If $\hat{x}$ is a local optimal solution, then $\hat{x}$ is a global optimal solution.
Proof
Suppose not, i.e., there exists $\bar{x} \in S$ such that $f\left ( \bar{x} \right )< f\left ( \hat{x} \right )$
Since $\bar{x},\hat{x} \in S$ and S is a convex set, therefore,
$$\lambda \bar{x}+\left ( 1-\lambda \right )\hat{x}\in S, \forall \lambda \in \left ( 0,1 \right )$$
Since $\hat{x}$ is a local minimum, $f\left ( \hat{x} \right ) \leq f\left ( \lambda \bar{x}+\left ( 1-\lambda \right )\hat{x} \right ), \forall \lambda \in \left ( 0,\delta \right )$ for some $\delta \in \left ( 0,1 \right )$.
But since f is strictly quasiconvex and $f\left ( \bar{x} \right )\neq f\left ( \hat{x} \right )$,
$$f\left ( \lambda \bar{x}+\left ( 1-\lambda \right )\hat{x} \right )< \max \left \{ f\left ( \hat{x} \right ),f\left ( \bar{x} \right ) \right \}=f\left ( \hat{x} \right )$$
Hence, it is a contradiction.
Strictly quasiconcave function
Let $f:S \rightarrow \mathbb{R}$ and S be a non-empty convex set in $\mathbb{R}^n$, then f is said to be a strictly quasiconcave function if for each $x_1,x_2 \in S$ with $f\left ( x_1 \right ) \neq f\left ( x_2 \right )$, we have
$$f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )> \min \left \{ f\left ( x_1 \right ),f\left ( x_2 \right ) \right \}$$
Examples
$f\left ( x \right )=x^2-2$
It is a strictly quasiconvex function: for any two points $x_1,x_2$ in the domain with $f\left ( x_1 \right )\neq f\left ( x_2 \right )$, we have $f\left ( \lambda x_1+\left ( 1- \lambda \right )x_2 \right )< \max \left \{ f\left ( x_1 \right ),f\left ( x_2 \right ) \right \}$, because the function is decreasing on the negative x-axis and increasing on the positive x-axis (it is a parabola).
$f\left ( x \right )=-x^2$
It is not a strictly quasiconvex function: take $x_1=1$, $x_2=-2$ and $\lambda=\frac{2}{3}$, so that $f\left ( x_1 \right )=-1\neq f\left ( x_2 \right )=-4$, but $f\left ( \lambda x_1+\left ( 1- \lambda \right )x_2 \right )=f\left ( 0 \right )=0$, which is not less than $\max \left \{ f\left ( x_1 \right ),f\left ( x_2 \right ) \right \}=-1$. Therefore it does not satisfy the condition stated in the definition. But it is a quasiconcave function, because for any two points in the domain, $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )\geq \min \left \{ f\left ( x_1 \right ),f\left ( x_2 \right ) \right \}$; the function is increasing on the negative x-axis and decreasing on the positive x-axis.
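The counterexample for $f\left ( x \right )=-x^2$ can be verified directly; a minimal Python sketch (our own illustration):

```python
# Counterexample showing f(x) = -x^2 is not strictly quasiconvex.
f = lambda x: -x**2
x1, x2, lam = 1.0, -2.0, 2.0 / 3.0
xm = lam * x1 + (1 - lam) * x2   # = 0.0
print(f(x1), f(x2), f(xm))        # -1.0 -4.0 -0.0
# f(x1) != f(x2), yet f(xm) = 0 is NOT < max{f(x1), f(x2)} = -1
```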
Strongly Quasiconvex Function
Let $f:S \rightarrow \mathbb{R}$ and S be a non-empty convex set in $\mathbb{R}^n$, then f is a strongly quasiconvex function if for any $x_1,x_2 \in S$ with $x_1 \neq x_2$, we have $f\left ( \lambda x_1+\left ( 1-\lambda \right )x_2 \right )< \max \left \{ f\left ( x_1 \right ),f\left ( x_2 \right ) \right \}, \forall \lambda \in \left ( 0,1 \right )$
Theorem
A quasiconvex function $f:S \rightarrow \mathbb{R}$ on a non-empty convex set S in $\mathbb{R}^n$ is a strongly quasiconvex function if it is not constant on any line segment joining points of S.
Proof
Let f be a quasiconvex function that is not constant on any line segment joining points of S.
Suppose f is not a strongly quasiconvex function.
Then there exist $x_1,x_2 \in S$ with $x_1 \neq x_2$ and $\lambda \in \left ( 0,1 \right )$ such that, for $z= \lambda x_1+\left ( 1-\lambda \right )x_2$,
$$f\left ( z \right )\geq \max\left \{ f\left ( x_1 \right ), f\left ( x_2 \right ) \right \}$$
$\Rightarrow f\left ( x_1 \right )\leq f\left ( z \right )$ and $f\left ( x_2 \right )\leq f\left ( z \right )$
By quasiconvexity, $f\left ( w \right )\leq f\left ( z \right )$ for every w in $\left [ x_1,z \right ]$ and in $\left [ z,x_2 \right ]$. Since f is not constant on $\left [ x_1,z \right ]$ and $\left [ z,x_2 \right ]$,
there exist $u \in \left ( x_1,z \right )$ and $v \in \left ( z,x_2 \right )$, i.e.,
$$u= \mu_1x_1+\left ( 1-\mu_1 \right )z, v=\mu_2z+\left ( 1- \mu_2 \right )x_2$$
such that $f\left ( u \right )< f\left ( z \right )$ and $f\left ( v \right )< f\left ( z \right )$
$$\Rightarrow \max \left \{ f\left ( u \right ),f\left ( v \right ) \right \} < f\left ( z \right )$$
But z lies between u and v, so by quasiconvexity, $f\left ( z \right )\leq \max \left \{ f\left ( u \right ),f\left ( v \right ) \right \}$,
which is a contradiction.
Hence f is a strongly quasiconvex function.
Theorem
Let $f:S \rightarrow \mathbb{R}$ be a strongly quasiconvex function and S be a non-empty convex set in $\mathbb{R}^n$. If $\hat{x}$ is a local optimal solution of $\min f\left ( x \right ), x \in S$, then $\hat{x}$ is the unique global optimal solution.
Proof
Since a strongly quasiconvex function is also a strictly quasiconvex function, a local optimal solution is a global optimal solution.
Uniqueness − Let f attain its global optimal solution at two points $u,v \in S$ with $u \neq v$
$$\Rightarrow f\left ( u \right ) \leq f\left ( x \right ), \forall x \in S \:\: and \:\: f\left ( v \right ) \leq f\left ( x \right ), \forall x \in S$$
In particular, $f\left ( u \right )\leq f\left ( v \right )$ and $f\left ( v \right )\leq f\left ( u \right )\Rightarrow f\left ( u \right )=f\left ( v \right )$
By strong quasiconvexity, for $\lambda \in \left ( 0,1 \right )$,
$$f\left ( \lambda u+\left ( 1-\lambda \right )v \right )< \max \left \{ f\left ( u \right ),f\left ( v \right ) \right \}=f\left ( u \right )$$
which is a contradiction.
Hence there exists only one global optimal solution.
Remarks
A strongly quasiconvex function is also a strictly quasiconvex function.
A strictly quasiconvex function may or may not be strongly quasiconvex.
A differentiable strictly convex function is strongly quasiconvex.
Pseudoconvex Function
Let $f:S \rightarrow \mathbb{R}$ be a differentiable function and S be a non-empty convex set in $\mathbb{R}^n$, then f is said to be pseudoconvex if for each $x_1,x_2 \in S$ with $\bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )\geq 0$, we have $f\left ( x_2 \right )\geq f\left ( x_1 \right )$, or equivalently, if $f\left ( x_1 \right )>f\left ( x_2 \right )$ then $\bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )<0$
Pseudoconcave function
Let $f:S \rightarrow \mathbb{R}$ be a differentiable function and S be a non-empty convex set in $\mathbb{R}^n$, then f is said to be pseudoconcave if for each $x_1,x_2 \in S$ with $\bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )\leq 0$, we have $f\left ( x_2 \right )\leq f\left ( x_1 \right )$, or equivalently, if $f\left ( x_1 \right )<f\left ( x_2 \right )$ then $\bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )>0$
Remarks
If a function is both pseudoconvex and pseudoconcave, then it is called pseudolinear.
A differentiable convex function is also pseudoconvex.
A pseudoconvex function may not be convex. For example,
$f\left ( x \right )=x+x^3$ is not convex (its second derivative $6x$ is negative for $x<0$). But it is pseudoconvex: suppose $\bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )=\left ( 1+3x_{1}^{2} \right )\left ( x_2-x_1 \right ) \geq 0$. Since $1+3x_{1}^{2}>0$, this gives $x_2 \geq x_1$, hence $x_{2}^{3} \geq x_{1}^{3}$.
Thus, $f\left ( x_2 \right )-f\left ( x_1 \right )=\left ( x_2-x_1 \right )+\left ( x_{2}^{3}-x_{1}^{3} \right )\geq 0$
$\Rightarrow f\left ( x_2 \right )\geq f\left ( x_1 \right )$
Thus, it is pseudoconvex.
A pseudoconvex function is strictly quasiconvex. Thus, every local minimum of a pseudoconvex function is also a global minimum.
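The pseudoconvexity argument for $f\left ( x \right )=x+x^3$ can also be probed by sampling. In the following Python/NumPy sketch (our own illustration, not part of the text), no sampled pair violates the defining implication:

```python
import numpy as np

# Check pseudoconvexity of f(x) = x + x^3 on sampled pairs:
# whenever f'(x1) * (x2 - x1) >= 0, we must have f(x2) >= f(x1).
f = lambda x: x + x**3
df = lambda x: 1 + 3 * x**2

rng = np.random.default_rng(1)
ok = True
for _ in range(100000):
    x1, x2 = rng.uniform(-10, 10, size=2)
    if df(x1) * (x2 - x1) >= 0 and f(x2) < f(x1) - 1e-9:
        ok = False
        break
print(ok)  # True: no violation found
```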
Strictly pseudoconvex function
Let $f:S \rightarrow \mathbb{R}$ be a differentiable function and S be a non-empty convex set in $\mathbb{R}^n$, then f is said to be strictly pseudoconvex if for each distinct $x_1,x_2 \in S$ with $\bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )\geq 0$, we have $f\left ( x_2 \right )> f\left ( x_1 \right )$, or equivalently, if $f\left ( x_1 \right )\geq f\left ( x_2 \right )$ then $\bigtriangledown f\left ( x_1 \right )^T\left ( x_2-x_1 \right )<0$
Theorem
Let f be a pseudoconvex function and suppose $\bigtriangledown f\left ( \hat{x} \right )=0$ for some $\hat{x} \in S$, then $\hat{x}$ is a global optimal solution of f over S.
Proof
Let $\hat{x}$ be a critical point of f, i.e., $\bigtriangledown f\left ( \hat{x} \right )=0$
Since f is a pseudoconvex function, for $x \in S$ we have
$$\bigtriangledown f\left ( \hat{x} \right )^T\left ( x-\hat{x} \right )=0 \geq 0 \Rightarrow f\left ( x \right )\geq f\left ( \hat{x} \right ), \forall x \in S$$
Hence, $\hat{x}$ is a global optimal solution.
Remark
If f is a strictly pseudoconvex function, $\hat{x}$ is the unique global optimal solution.
Theorem
If f is a differentiable pseudoconvex function over S, then f is both a strictly quasiconvex and a quasiconvex function.
Remarks
The sum of two pseudoconvex functions defined on an open set S of $\mathbb{R}^n$ may not be pseudoconvex.
Let $f:S \rightarrow \mathbb{R}$ be a quasiconvex function and S be a non-empty convex subset of $\mathbb{R}^n$, then f is pseudoconvex if and only if every critical point is a global minimum of f over S.
Let S be a non-empty convex subset of $\mathbb{R}^n$ and $f:S \rightarrow \mathbb{R}$ be a function such that $\bigtriangledown f\left ( x \right ) \neq 0$ for every $x \in S$, then f is pseudoconvex if and only if it is a quasiconvex function.
Convex Optimization - Programming Problem
There are four types of convex programming problems −
Type 1 − $\min f\left ( x \right )$, where $x \in S$ and S is a non-empty convex set in $\mathbb{R}^n$ and $f\left ( x \right )$ is a convex function.
Type 2 − $\min f\left ( x \right ), x \in \mathbb{R}^n$ subject to
$g_i\left ( x \right ) \geq 0, 1 \leq i \leq m_1$ where $g_i\left ( x \right )$ is a concave function,
$g_i\left ( x \right ) \leq 0, m_1+1 \leq i \leq m_2$ where $g_i\left ( x \right )$ is a convex function, and
$g_i\left ( x \right ) = 0, m_2+1 \leq i \leq m$ where $g_i\left ( x \right )$ is a linear function,
where $f\left ( x \right )$ is a convex function.
Type 3 − $\max f\left ( x \right )$ where $x \in S$ and S is a non-empty convex set in $\mathbb{R}^n$ and $f\left ( x \right )$ is a concave function.
Type 4 − $\min f\left ( x \right )$, where $x \in \mathbb{R}^n$ subject to
$g_i\left ( x \right ) \geq 0, 1 \leq i \leq m_1$ where $g_i\left ( x \right )$ is a concave function,
$g_i\left ( x \right ) \leq 0, m_1+1 \leq i \leq m_2$ where $g_i\left ( x \right )$ is a convex function, and
$g_i\left ( x \right ) = 0, m_2+1 \leq i \leq m$ where $g_i\left ( x \right )$ is a linear function,
where $f\left ( x \right )$ is a concave function.
Cone of feasible direction
Let S be a non-empty set in $\mathbb{R}^n$ and let $\hat{x} \in closure\left ( S \right )$, then the cone of feasible directions of S at $\hat{x}$, denoted by D, is defined as $D=\left \{ d:d \neq 0, \hat{x}+\lambda d \in S, \lambda \in \left ( 0, \delta \right ), \delta > 0 \right \}$
Each non-zero vector $d \in D$ is called a feasible direction.
For a given function $f:\mathbb{R}^n \rightarrow \mathbb{R}$, the cone of improving directions at $\hat{x}$ is denoted by F and is given by
$$F=\left \{ d:f\left ( \hat{x}+\lambda d \right )< f\left ( \hat{x} \right ), \forall \lambda \in \left ( 0,\delta \right ), \delta >0 \right \}$$
Each direction $d \in F$ is called an improving direction or descent direction of f at $\hat{x}$
Theorem
Necessary Condition
Consider the problem $\min f\left ( x \right )$ such that $x \in S$ where S is a non-empty set in $\mathbb{R}^n$. Suppose f is differentiable at a point $\hat{x} \in S$. If $\hat{x}$ is a local optimal solution, then $F_0 \cap D= \phi$ where $F_0=\left \{ d:\bigtriangledown f\left ( \hat{x} \right )^T d < 0 \right \}$ and D is the cone of feasible directions.
Sufficient Condition
If $F_0 \cap D= \phi$, f is a pseudoconvex function at $\hat{x}$, and there exists a neighbourhood of $\hat{x}$, $N_\varepsilon \left ( \hat{x} \right ), \varepsilon > 0$ such that $d=x-\hat{x}\in D$ for any $x \in S \cap N_\varepsilon \left ( \hat{x} \right )$, then $\hat{x}$ is a local optimal solution.
Proof
Necessary Condition
Let $F_0 \cap D \neq \phi$, i.e., there exists a $d \in F_0 \cap D$ such that $d \in F_0$ and $d \in D$
Since $d \in D$, therefore there exists $\delta_1 >0$ such that $\hat{x}+\lambda d \in S, \lambda \in \left ( 0,\delta_1 \right )$.
Since $d \in F_0$, therefore $\bigtriangledown f\left ( \hat{x} \right )^T d <0$
Thus, there exists $\delta_2>0$ such that $f\left ( \hat{x}+\lambda d \right )< f\left ( \hat{x} \right ), \forall \lambda \in \left ( 0,\delta_2 \right )$
Let $\delta=\min \left \{ \delta_1,\delta_2 \right \}$
Then $\hat{x}+\lambda d \in S, f\left ( \hat{x}+\lambda d \right ) < f\left ( \hat{x} \right ), \forall \lambda \in \left ( 0,\delta \right )$
But $\hat{x}$ is a local optimal solution.
Hence it is a contradiction.
Thus $F_0\cap D=\phi$
Sufficient Condition
Let $F_0 \cap D= \phi$ and let f be a pseudoconvex function.
Let there exist a neighbourhood of $\hat{x}$, $N_{\varepsilon}\left ( \hat{x} \right )$ such that $d=x-\hat{x} \in D, \forall x \in S \cap N_\varepsilon\left ( \hat{x} \right )$
Suppose $\hat{x}$ is not a local optimal solution.
Thus, there exists $\bar{x} \in S \cap N_\varepsilon \left ( \hat{x} \right )$ such that $f\left ( \bar{x} \right )< f\left ( \hat{x} \right )$
By the assumption on $S \cap N_\varepsilon \left ( \hat{x} \right ), d=\left ( \bar{x}-\hat{x} \right )\in D$
By pseudoconvexity of f,
$$f\left ( \hat{x} \right )>f\left ( \bar{x} \right )\Rightarrow \bigtriangledown f\left ( \hat{x} \right )^T\left ( \bar{x}-\hat{x} \right )<0$$
$\Rightarrow \bigtriangledown f\left ( \hat{x} \right )^T d <0$
$\Rightarrow d \in F_0$
Hence $F_0\cap D \neq \phi$
which is a contradiction.
Hence, $\hat{x}$ is a local optimal solution.
Consider the following problem: $\min f\left ( x \right )$ where $x \in X$ such that $g_i\left ( x \right ) \leq 0, i=1,2,...,m$
where $f:X \rightarrow \mathbb{R}, g_i:X \rightarrow \mathbb{R}$ and X is an open set in $\mathbb{R}^n$
Let $S=\left \{ x:g_i\left ( x \right )\leq 0, \forall i \right \}$
Let $\hat{x} \in S$ and let $M=\left \{ 1,2,...,m \right \}$
Let $I=\left \{ i:g_i\left ( \hat{x} \right )=0, i \in M \right \}$ where I is called the index set of all active constraints at $\hat{x}$
Let $J=\left \{ i:g_i\left ( \hat{x} \right )<0, i \in M \right \}$ where J is called the index set of all inactive constraints at $\hat{x}$.
Let $F_0=\left \{ d \in \mathbb{R}^n:\bigtriangledown f\left ( \hat{x} \right )^T d <0 \right \}$
Let $G_0=\left \{ d \in \mathbb{R}^n:\bigtriangledown g_i\left ( \hat{x} \right )^T d <0, \forall i \in I \right \}$
Lemma
If $S=\left \{ x \in X:g_i\left ( x \right ) \leq 0, \forall i \in M \right \}$ and X is a non-empty open set in $\mathbb{R}^n$, let $\hat{x}\in S$, let $g_i, i \in I$ be differentiable at $\hat{x}$ and let $g_i, i \in J$ be continuous at $\hat{x}$, then $G_0 \subseteq D$.
Proof
Let $d \in G_0$
Since $\hat{x} \in X$ and X is an open set, there exists $\delta_1 >0$ such that $\hat{x}+\lambda d \in X$ for $\lambda \in \left ( 0, \delta_1 \right )$
Also, since $g_i\left ( \hat{x} \right )<0$ and $g_i$ are continuous at $\hat{x}, \forall i \in J$, there exists $\delta_2>0$ such that $g_i\left ( \hat{x}+\lambda d \right )<0, \lambda \in \left ( 0, \delta_2 \right ), i \in J$
Since $d \in G_0$, therefore $\bigtriangledown g_i\left ( \hat{x} \right )^T d <0, \forall i \in I$, thus there exists $\delta_3 >0$ such that $g_i\left ( \hat{x}+\lambda d \right )< g_i\left ( \hat{x} \right )=0$ for $\lambda \in \left ( 0, \delta_3 \right ), i \in I$
Let $\delta=\min\left \{ \delta_1, \delta_2, \delta_3 \right \}$
Therefore, $\hat{x}+\lambda d \in X, g_i\left ( \hat{x}+\lambda d \right )< 0, i \in M$
$\Rightarrow \hat{x}+\lambda d \in S$
$\Rightarrow d \in D$
$\Rightarrow G_0 \subseteq D$
Hence proved.
Theorem
Necessary Condition
Let f and $g_i, i \in I$ be differentiable at $\hat{x} \in S$ and let $g_i, i \in J$ be continuous at $\hat{x} \in S$. If $\hat{x}$ is a local minimum of the above problem, then $F_0 \cap G_0=\phi$.
Sufficient Condition
If $F_0 \cap G_0= \phi$, f is a pseudoconvex function at $\hat{x}$, and $g_i, i \in I$ are strictly pseudoconvex functions over some $\varepsilon$-neighbourhood of $\hat{x}$, then $\hat{x}$ is a local optimal solution.
Remarks
Let $\hat{x}$ be a feasible point such that $\bigtriangledown f\left ( \hat{x} \right )=0$, then $F_0 = \phi$. Thus, $F_0 \cap G_0= \phi$, but $\hat{x}$ need not be an optimal solution.
Similarly, if $\bigtriangledown g_i\left ( \hat{x} \right )=0$ for some $i \in I$, then $G_0=\phi$, and thus $F_0 \cap G_0= \phi$.
Consider the problem: $\min f\left ( x \right )$ such that $g\left ( x \right )=0$.
The constraint $g\left ( x \right )=0$ can be written as $g_1\left ( x \right )=g\left ( x \right )\leq 0$ and $g_2\left ( x \right )=-g\left ( x \right ) \leq 0$.
Let $\hat{x} \in S$, then $g_1\left ( \hat{x} \right )=0$ and $g_2\left ( \hat{x} \right )=0$.
But $\bigtriangledown g_1\left ( \hat{x} \right )= -\bigtriangledown g_2\left ( \hat{x} \right )$
Thus, $G_0= \phi$ and $F_0 \cap G_0= \phi$.
Convex Optimization - Fritz-John Conditions
Necessary Conditions
Theorem
Consider the problem − $\min f\left ( x \right )$ such that $x \in X$ where X is an open set in $\mathbb{R}^n$ and let $g_i\left ( x \right ) \leq 0, \forall i =1,2,....m$.
Let $f:X \rightarrow \mathbb{R}$ and $g_i:X \rightarrow \mathbb{R}$
Let $\hat{x}$ be a feasible solution and let f and $g_i, i \in I$ be differentiable at $\hat{x}$ and $g_i, i \in J$ be continuous at $\hat{x}$.
If $\hat{x}$ solves the above problem locally, then there exist $u_0, u_i \in \mathbb{R}, i \in I$ such that $u_0 \bigtriangledown f\left ( \hat{x} \right )+\displaystyle\sum\limits_{i\in I} u_i \bigtriangledown g_i\left ( \hat{x} \right )=0$
where $u_0,u_i \geq 0, i \in I$ and $\left ( u_0, u_I \right ) \neq \left ( 0,0 \right )$
Furthermore, if $g_i, i \in J$ are also differentiable at $\hat{x}$, then the above conditions can be written as −
$u_0 \bigtriangledown f\left ( \hat{x} \right )+\displaystyle\sum\limits_{i=1}^m u_i \bigtriangledown g_i\left ( \hat{x} \right )=0$
$u_ig_i\left ( \hat{x} \right )=0$
$u_0,u_i \geq 0, \forall i=1,2,....,m$
$\left ( u_0,u \right ) \neq \left ( 0,0 \right ), u=\left ( u_1,u_2,...,u_m \right ) \in \mathbb{R}^m$
Remarks
$u_i$ are called Lagrangian multipliers.
The condition that $\hat{x}$ be feasible to the given problem is called the primal feasibility condition.
The requirement $u_0 \bigtriangledown f\left ( \hat{x} \right )+\displaystyle\sum\limits_{i=1}^m u_i \bigtriangledown g_i\left ( \hat{x} \right )=0$ is called the dual feasibility condition.
The condition $u_ig_i\left ( \hat{x} \right )=0, i=1, 2, ...m$ is called the complementary slackness condition. This condition requires $u_i=0, i \in J$
Together, the primal feasibility condition, dual feasibility condition and complementary slackness are called the Fritz-John conditions.
Sufficient Conditions
Theorem
If there exists an $\varepsilon$-neighbourhood of $\hat{x}$, $N_\varepsilon \left ( \hat{x} \right ), \varepsilon >0$ such that f is pseudoconvex over $N_\varepsilon \left ( \hat{x} \right )\cap S$ and $g_i, i \in I$ are strictly pseudoconvex over $N_\varepsilon \left ( \hat{x} \right )\cap S$, then $\hat{x}$ is a local optimal solution to the problem described above. If f is pseudoconvex at $\hat{x}$ and if $g_i, i \in I$ are both strictly pseudoconvex and quasiconvex functions at $\hat{x}$, then $\hat{x}$ is a global optimal solution to the problem described above.
Example
$\min f\left ( x_1,x_2 \right )=\left ( x_1-3 \right )^2+\left ( x_2-2 \right )^2$
such that $x_{1}^{2}+x_{2}^{2} \leq 5, x_1+2x_2 \leq 4, x_1,x_2 \geq 0$ and $\hat{x}=\left ( 2,1 \right )$
Let $g_1\left ( x_1,x_2 \right )=x_{1}^{2}+x_{2}^{2}-5,$
$g_2\left ( x_1,x_2 \right )=x_1+2x_2-4,$
$g_3\left ( x_1,x_2 \right )=-x_1$ and $g_4\left ( x_1, x_2 \right )= -x_2$.
Thus the above constraints can be written as −
$g_1\left ( x_1,x_2 \right )\leq 0,$
$g_2\left ( x_1,x_2 \right )\leq 0,$
$g_3\left ( x_1,x_2 \right )\leq 0$ and
$g_4\left ( x_1,x_2 \right )\leq 0$. Thus, $I = \left \{ 1,2 \right \}$ and therefore $u_3=0, u_4=0$
$\bigtriangledown f\left ( \hat{x} \right )=\left ( -2,-2 \right ), \bigtriangledown g_1\left ( \hat{x} \right )=\left ( 4,2 \right )$ and $\bigtriangledown g_2\left ( \hat{x} \right )=\left ( 1,2 \right )$
Putting these values in the first condition of the Fritz-John conditions, we get −
$u_0=\frac{3}{2}u_2, \:\: u_1= \frac{1}{2}u_2$, and letting $u_2=1$ gives $u_0= \frac{3}{2}, \:\: u_1= \frac{1}{2}$
Thus the Fritz-John conditions are satisfied.
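As a sanity check, the dual feasibility equation at $\hat{x}=\left ( 2,1 \right )$ can be solved as a linear system after fixing $u_2=1$. This Python/NumPy sketch is our own illustration, not part of the original solution:

```python
import numpy as np

# Dual feasibility: u0*grad_f + u1*grad_g1 + u2*grad_g2 = 0 at x_hat = (2, 1).
grad_f  = np.array([-2.0, -2.0])
grad_g1 = np.array([ 4.0,  2.0])
grad_g2 = np.array([ 1.0,  2.0])

# Fix u2 = 1 and solve the 2x2 system for (u0, u1).
A = np.column_stack([grad_f, grad_g1])  # unknowns u0, u1
b = -grad_g2                            # u2 = 1 moved to the right-hand side
u0, u1 = np.linalg.solve(A, b)
print(u0, u1)                           # 1.5 0.5, i.e. u0 = 3/2, u1 = 1/2
```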
$\min f\left ( x_1,x_2 \right )=-x_1$
such that $x_2-\left ( 1-x_1 \right )^3 \leq 0$,
$-x_2 \leq 0$ and $\hat{x}=\left ( 1,0 \right )$
Let $g_1\left ( x_1,x_2 \right )=x_2-\left ( 1-x_1 \right )^3$,
$g_2\left ( x_1,x_2 \right )=-x_2$
Thus the above constraints can be written as −
$g_1\left ( x_1,x_2 \right )\leq 0,$
$g_2\left ( x_1,x_2 \right )\leq 0,$
Thus, $I=\left \{ 1,2 \right \}$
$\bigtriangledown f\left ( \hat{x} \right )=\left ( -1,0 \right )$
$\bigtriangledown g_1\left ( \hat{x} \right )=\left ( 0,1 \right )$ and $\bigtriangledown g_2\left ( \hat{x} \right )=\left ( 0, -1 \right )$
Putting these values in the first condition of the Fritz-John conditions, we get −
$u_0=0, \:\: u_1=u_2=a>0$
Thus the Fritz-John conditions are satisfied.
Karush-Kuhn-Tucker Optimality Necessary Conditions
Consider the problem −
$\min f\left ( x \right )$ such that $x \in X$, where X is an open set in $\mathbb{R}^n$ and $g_i\left ( x \right )\leq 0, i=1, 2,...,m$
Let $S=\left \{ x \in X:g_i\left ( x \right )\leq 0, \forall i \right \}$
Let $\hat{x} \in S$ and let f and $g_i, i \in I$ be differentiable at $\hat{x}$ and $g_i, i \in J$ be continuous at $\hat{x}$. Furthermore, $\bigtriangledown g_i\left ( \hat{x} \right ), i \in I$ are linearly independent. If $\hat{x}$ solves the above problem locally, then there exist $u_i, i \in I$ such that
$\bigtriangledown f\left ( \hat{x} \right )+\displaystyle\sum\limits_{i\in I} u_i \bigtriangledown g_i\left ( \hat{x} \right )=0$, $\:\:u_i \geq 0, i \in I$
If $g_i, i \in J$ are also differentiable at $\hat{x}$, then the above conditions can be written as −
$\bigtriangledown f\left ( \hat{x} \right )+\displaystyle\sum\limits_{i= 1}^m u_i \bigtriangledown g_i\left ( \hat{x} \right )=0$
$u_ig_i\left ( \hat{x} \right )=0, \forall i=1,2,...,m$
$u_i \geq 0, \forall i=1,2,...,m$
Example
Consider the following problem −
$\min f\left ( x_1,x_2 \right )=\left ( x_1-3 \right )^2+\left ( x_2-2 \right )^2$
such that $x_{1}^{2}+x_{2}^{2}\leq 5$,
$x_1+2x_2 \leq 4$, $x_1,x_2 \geq 0$ and $\hat{x}=\left ( 2,1 \right )$
Let $g_1\left ( x_1, x_2 \right )=x_{1}^{2}+x_{2}^{2}-5$,
$g_2\left ( x_1, x_2 \right )=x_{1}+2x_2-4$,
$g_3\left ( x_1, x_2 \right )=-x_{1}$ and $g_4\left ( x_1,x_2 \right )=-x_2$
Thus the above constraints can be written as −
$g_1\left ( x_1,x_2 \right )\leq 0, g_2\left ( x_1,x_2 \right ) \leq 0$,
$g_3\left ( x_1,x_2 \right )\leq 0$ and $g_4\left ( x_1,x_2 \right ) \leq 0$. Thus, $I=\left \{ 1,2 \right \}$ and therefore $u_3=0, \:\: u_4=0$
$\bigtriangledown f\left ( \hat{x} \right )=\left ( -2,-2 \right ), \bigtriangledown g_1\left ( \hat{x} \right )= \left ( 4,2 \right )$ and
$\bigtriangledown g_2\left ( \hat{x} \right )=\left ( 1,2 \right )$
Putting these values in the first condition of the Karush-Kuhn-Tucker conditions, we get −
$u_1=\frac{1}{3}$ and $u_2=\frac{2}{3}$
Thus the Karush-Kuhn-Tucker conditions are satisfied.
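The same verification works here. With only $g_1$ and $g_2$ active, dual feasibility is a 2x2 linear system in $u_1, u_2$ (an illustrative Python/NumPy sketch, not part of the original solution):

```python
import numpy as np

# KKT dual feasibility at x_hat = (2, 1): grad_f + u1*grad_g1 + u2*grad_g2 = 0.
grad_f  = np.array([-2.0, -2.0])
grad_g1 = np.array([ 4.0,  2.0])
grad_g2 = np.array([ 1.0,  2.0])

A = np.column_stack([grad_g1, grad_g2])
u = np.linalg.solve(A, -grad_f)
print(u)   # [0.3333... 0.6666...], i.e. u1 = 1/3, u2 = 2/3, both >= 0
```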
Algorithms for Convex Problems
Method of Steepest Descent
This method is also called the gradient method or Cauchy's method. It involves the following iteration −
$$x_{k+1}=x_k+\alpha_kd_k$$
$d_k= -\bigtriangledown f\left ( x_k \right )$ or $d_k= -\frac{\bigtriangledown f\left ( x_k \right )}{\left \| \bigtriangledown f\left ( x_k \right ) \right \|}$
Let $\phi\left ( \alpha \right )=f\left ( x_k +\alpha d_k \right )$
By differentiating $\phi$ and equating it to zero, we can get $\alpha$.
So the algorithm goes as follows −
Step 1 − Initialize $x_0$, $\varepsilon_1$, $\varepsilon_2$ and set $k=0$.
Step 2 − Set $d_k=-\bigtriangledown f\left ( x_k \right )$ or $d_k=-\frac{\bigtriangledown f\left ( x_k \right )}{\left \| \bigtriangledown f\left ( x_k \right ) \right \|}$.
Step 3 − Find $\alpha_k$ such that it minimizes $\phi\left ( \alpha \right )=f\left ( x_k+\alpha d_k \right )$.
Step 4 − Set $x_{k+1}=x_k+\alpha_kd_k$.
Step 5 − If $\left \| x_{k+1}-x_k \right \| <\varepsilon_1$ or $\left \| \bigtriangledown f\left ( x_{k+1} \right ) \right \| \leq \varepsilon_2$, go to step 6, otherwise set $k=k+1$ and go to step 2.
Step 6 − The optimal solution is $\hat{x}=x_{k+1}$.
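A minimal Python/NumPy sketch of this algorithm is given below. Since Step 3 asks for an exact minimizing $\alpha_k$, which is rarely available in closed form, this sketch substitutes a standard backtracking line search (an assumption on our part, not something the text prescribes):

```python
import numpy as np

def steepest_descent(f, grad, x0, eps1=1e-8, eps2=1e-8, max_iter=1000):
    """Gradient (Cauchy) method; the exact line search of Step 3 is
    replaced by simple backtracking."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        d = -grad(x)                       # Step 2: descent direction
        if np.linalg.norm(d) <= eps2:      # Step 5: gradient small enough
            break
        alpha = 1.0                        # Step 3: backtracking on phi(alpha)
        while f(x + alpha * d) > f(x) - 0.5 * alpha * (d @ d):
            alpha *= 0.5
        x_new = x + alpha * d              # Step 4
        if np.linalg.norm(x_new - x) < eps1:
            x = x_new
            break
        x = x_new
    return x

# Minimize f(x1, x2) = (x1 - 3)^2 + (x2 - 2)^2; the minimizer is (3, 2).
f = lambda x: (x[0] - 3)**2 + (x[1] - 2)**2
grad = lambda x: np.array([2 * (x[0] - 3), 2 * (x[1] - 2)])
print(steepest_descent(f, grad, [0.0, 0.0]))  # approx [3. 2.]
```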
Newton Method
Newton's method works on the following principle − it minimizes the second-order Taylor approximation of f at $x_k$:
$f\left ( x \right )\approx y\left ( x \right )=f\left ( x_k \right )+\left ( x-x_k \right )^T \bigtriangledown f\left ( x_k \right )+\frac{1}{2}\left ( x-x_k \right )^T H\left ( x_k \right )\left ( x-x_k \right )$
$\bigtriangledown y\left ( x \right )=\bigtriangledown f\left ( x_k \right )+H\left ( x_k \right )\left ( x-x_k \right )$
At $x_{k+1}$, $\bigtriangledown y\left ( x_{k+1} \right )=\bigtriangledown f\left ( x_k \right )+H\left ( x_k \right )\left ( x_{k+1}-x_k \right )$
For $x_{k+1}$ to be an optimal solution, $\bigtriangledown y\left ( x_{k+1} \right )=0$
Thus, $x_{k+1}=x_k-H\left ( x_k \right )^{-1} \bigtriangledown f\left ( x_k \right )$
Here $H\left ( x_k \right )$ should be non-singular.
Hence the algorithm goes as follows −
Step 1 − Initialize $x_0, \varepsilon$ and set $k=0$.
Step 2 − Compute $H\left ( x_k \right )$ and $\bigtriangledown f\left ( x_k \right )$.
Step 3 − Solve the linear system $H\left ( x_k \right )h\left ( x_k \right )=\bigtriangledown f\left ( x_k \right )$ for $h\left ( x_k \right )$.
Step 4 − Find $x_{k+1}=x_k-h\left ( x_k \right )$.
Step 5 − If $\left \| x_{k+1}-x_k \right \|< \varepsilon$ or $\left \| \bigtriangledown f\left ( x_{k+1} \right ) \right \| \leq \varepsilon$, then go to step 6, else set $k=k+1$ and go to step 2.
Step 6 − The optimal solution is $\hat{x}=x_{k+1}$.
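A corresponding Python/NumPy sketch (illustrative; the function names are ours) that solves the linear system of Step 3 instead of inverting $H\left ( x_k \right )$:

```python
import numpy as np

def newton(grad, hess, x0, eps=1e-10, max_iter=100):
    """Newton's method: solve H(x_k) h = grad f(x_k), then x_{k+1} = x_k - h."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            break
        h = np.linalg.solve(hess(x), g)   # assumes H(x_k) is non-singular
        x = x - h
    return x

# For f(x1, x2) = (x1 - 3)^2 + (x2 - 2)^2, Newton converges in one step.
grad = lambda x: np.array([2 * (x[0] - 3), 2 * (x[1] - 2)])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 2.0]])
print(newton(grad, hess, [0.0, 0.0]))  # [3. 2.]
```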
Conjugate Gradient Method
This method is used for solving problems of the following type −
$\min f\left ( x \right )=\frac{1}{2}x^T Qx-b^Tx$
where Q is a positive definite $nXn$ matrix and $b \in \mathbb{R}^n$ is a constant vector.
Given $x_0, \varepsilon$, compute $g_0=Qx_0-b$
Set $d_0=-g_0$. For $k=0,1,2,...$ −
Set $\alpha_k=\frac{g_{k}^{T}g_k}{d_{k}^{T}Qd_k}$
Compute $x_{k+1}=x_k+\alpha_kd_k$
Set $g_{k+1}=g_k+\alpha_kQd_k$
Compute $\beta_k=\frac{g_{k+1}^{T}g_{k+1}}{g_{k}^{T}g_k}$
Set $d_{k+1}=-g_{k+1}+\beta_kd_k$.
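A direct Python/NumPy transcription of this iteration (an illustrative sketch; `conjugate_gradient` is our own name):

```python
import numpy as np

def conjugate_gradient(Q, b, x0, eps=1e-10):
    """Minimize (1/2) x^T Q x - b^T x for symmetric positive definite Q.
    In exact arithmetic this terminates in at most n steps."""
    x = np.asarray(x0, dtype=float)
    g = Q @ x - b                     # g_0 = Q x_0 - b
    d = -g                            # d_0 = -g_0
    for _ in range(len(b)):
        if np.linalg.norm(g) <= eps:
            break
        alpha = (g @ g) / (d @ Q @ d)     # alpha_k
        x = x + alpha * d                  # x_{k+1}
        g_new = g + alpha * (Q @ d)        # g_{k+1} = g_k + alpha_k Q d_k
        beta = (g_new @ g_new) / (g @ g)   # beta_k
        d = -g_new + beta * d              # d_{k+1}
        g = g_new
    return x

Q = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(Q, b, [0.0, 0.0]))  # solves Qx = b: approx [0.0909 0.6364]
```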