Convexity

The notion of convexity plays an important role in mathematical optimization. We start with some definitions.

Lines and line segments

A line in \(\mathbb{R}^n\) is a set \(\{ \mathbf{x} + \lambda \mathbf{d} : \lambda \in \mathbb{R}\}\) for some \(\mathbf{x}, \mathbf{d} \in \mathbb{R}^n\) with \(\mathbf{d} \neq \mathbf{0}\).

For example, the line in \(\mathbb{R}^2\) defined by the equation \(y = x+1\) is given by the set \(\left\{ \begin{bmatrix} 0 \\1 \end{bmatrix} + \lambda \begin{bmatrix}1\\1\end{bmatrix} : \lambda \in \mathbb{R}\right\}\).

As a line segment is simply a portion of a line, we can use the definition for a line and restrict the values of \(\lambda\) to an interval. However, it is customary to specify a line segment by its endpoints. Hence, a line segment between \(\mathbf{u}, \mathbf{v} \in \mathbb{R}^n\), denoted by \([\mathbf{u}, \mathbf{v}]\), is the set \(\{ (1 -\lambda)\mathbf{u} + \lambda \mathbf{v} : 0 \leq \lambda \leq 1 \}\).

Note that when \(\lambda = 0\), \((1 -\lambda)\mathbf{u} + \lambda \mathbf{v} =\mathbf{u}\), and when \(\lambda = 1\), \((1 -\lambda)\mathbf{u} + \lambda \mathbf{v} = \mathbf{v}\). Therefore, we can think of \(\lambda\) as the fraction of the line segment away from the starting point \(\mathbf{u}\). For example, the midpoint of the line segment \([\mathbf{u}, \mathbf{v}]\) is given by \(\frac{1}{2}\mathbf{u} + \frac{1}{2} \mathbf{v}\).
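The parametrization above is easy to verify numerically. The following sketch (the helper name `segment_point` is our own) computes the point a fraction \(\lambda\) of the way along \([\mathbf{u}, \mathbf{v}]\):

```python
import numpy as np

def segment_point(u, v, lam):
    """Return (1 - lam) * u + lam * v, the point a fraction lam
    of the way along the segment from u to v."""
    return (1 - lam) * np.asarray(u, dtype=float) + lam * np.asarray(v, dtype=float)

u, v = [0.0, 1.0], [4.0, 3.0]
print(segment_point(u, v, 0.0))   # the endpoint u: [0. 1.]
print(segment_point(u, v, 1.0))   # the endpoint v: [4. 3.]
print(segment_point(u, v, 0.5))   # the midpoint:   [2. 2.]
```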

Convex sets

A set \(C \subseteq \mathbb{R}^n\) is said to be convex if for all \(\mathbf{x}, \mathbf{y} \in C\), \([\mathbf{x}, \mathbf{y}] \subseteq C\). We say that \(C\) is a convex set in \(\mathbb{R}^n\).

Examples.

  1. By definition, the empty set \(\emptyset\) is convex since the condition for convexity holds vacuously.

  2. The set \(\left \{ \begin{bmatrix} 0\\1\end{bmatrix}, \begin{bmatrix} 1\\0\end{bmatrix}\right \}\) consisting of just two elements from \(\mathbb{R}^2\) is not a convex set in \(\mathbb{R}^2\): for instance, the midpoint \(\begin{bmatrix} \frac{1}{2}\\ \frac{1}{2}\end{bmatrix}\) of the line segment between the two elements is not in the set.

  3. The line segment between \(\mathbf{u}, \mathbf{v} \in \mathbb{R}^n\) is a convex set. To see this, let \(C = [\mathbf{u},\mathbf{v}]\). Take any \(\mathbf{x},\mathbf{y}\in C\). Then, there exist \(\lambda_x, \lambda_y \in [0,1]\) such that \(\mathbf{x} = (1-\lambda_x) \mathbf{u} + \lambda_x \mathbf{v}\) and \(\mathbf{y} = (1-\lambda_y) \mathbf{u} + \lambda_y \mathbf{v}\). Any \(\mathbf{z} \in [\mathbf{x}, \mathbf{y}]\) can be written as \(\mathbf{z} = (1-\lambda)\mathbf{x} + \lambda \mathbf{y}\) for some \(\lambda \in [0,1]\). Hence, \begin{eqnarray*} \mathbf{z} & = & (1-\lambda) ((1-\lambda_x) \mathbf{u} + \lambda_x \mathbf{v})+ \lambda ((1-\lambda_y) \mathbf{u} + \lambda_y \mathbf{v}) \\ & = & (1 -\lambda_x + \lambda \lambda_x - \lambda \lambda_y)\mathbf{u} + (\lambda_x -\lambda\lambda_x + \lambda \lambda_y)\mathbf{v} \\ & = & (1 -\lambda')\mathbf{u} + \lambda' \mathbf{v} \end{eqnarray*} where \(\lambda' = \lambda_x - \lambda \lambda_x + \lambda \lambda_y\). Now, \(\lambda' = \lambda_x(1 - \lambda) + \lambda \lambda_y \geq 0\) since \(\lambda_x,\lambda_y \geq 0\) and \(0 \leq \lambda \leq 1\), and \(\lambda' = \lambda_x(1 - \lambda) + \lambda \lambda_y \leq (1-\lambda) + \lambda = 1\). Thus, \(\mathbf{z} \in C\), implying that \(C\) is convex.

Proposition 8.1. Given a collection of convex sets \((C_i)_{i \in I}\) where \(I\) is an index set, \(\displaystyle\bigcap_{i \in I} C_i\) is also a convex set.

In other words, the intersection of convex sets is a convex set. The proof of this proposition is left as an exercise.

In general, it is not easy to prove directly that a given set is convex. Fortunately, there is a family of sets that are easily shown to be convex and that will be sufficient for much of the work that we will be doing.

For a set in \(\mathbb{R}^2\), it is perhaps easier to sketch the set and see if it is convex.

Polyhedron

A halfspace in \(\mathbb{R}^n\) is a set \(\{ \mathbf{x} \in\mathbb{R}^n : \mathbf{a}^\mathsf{T} \mathbf{x} \geq \beta \}\) for some nonzero \(\mathbf{a} \in \mathbb{R}^n\) and \(\beta \in \mathbb{R}\). In other words, a halfspace is the set of points satisfying a single nontrivial linear inequality.

For example, the set \( \left \{\begin{bmatrix} x_1 \\ x_2\end{bmatrix} : x_1 \geq 0\right\}\) is a halfspace in \(\mathbb{R}^2\).

Proposition 8.2. Halfspaces are convex.

Proof. Consider a halfspace \(H\) given by \(\{ \mathbf{x} \in\mathbb{R}^n : \mathbf{a}^\mathsf{T} \mathbf{x} \geq \beta \}\) for some nonzero \(\mathbf{a} \in \mathbb{R}^n\) and \(\beta \in \mathbb{R}\). Let \(\mathbf{u}, \mathbf{v}\in H\). Let \(\mathbf{z} = (1-\lambda) \mathbf{u} + \lambda \mathbf{v}\) where \(\lambda \in [0,1]\). We show that \(\mathbf{z} \in H\). Note that \(\mathbf{z} \in H\) if and only if \(\mathbf{a}^\mathsf{T}\mathbf{z} \geq \beta\). Now \(\mathbf{a}^\mathsf{T}\mathbf{z} = (1-\lambda) \mathbf{a}^\mathsf{T} \mathbf{u} +\lambda \mathbf{a}^\mathsf{T} \mathbf{v}\). Since \(1-\lambda \geq 0\) and \(\mathbf{a}^\mathsf{T}\mathbf{u}\geq \beta\) (as \(\mathbf{u} \in H\)), we get that \((1-\lambda) \mathbf{a}^\mathsf{T} \mathbf{u}\geq (1-\lambda)\beta\). Similarly, we get that \(\lambda\mathbf{a}^\mathsf{T}\mathbf{v} \geq \lambda\beta\). Thus, \(\mathbf{a}^\mathsf{T}\mathbf{z} \geq (1-\lambda)\beta+\lambda(\beta) = \beta\) as desired.
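The proof can be illustrated numerically. The sketch below (the helper name `in_halfspace` is our own) checks that convex combinations of two points in a halfspace stay in the halfspace:

```python
import numpy as np

def in_halfspace(a, beta, x):
    """Membership test for the halfspace { x : a^T x >= beta }."""
    return float(np.dot(a, x)) >= beta

a, beta = np.array([1.0, 0.0]), 0.0          # the halfspace x1 >= 0
u, v = np.array([2.0, -1.0]), np.array([0.5, 3.0])
assert in_halfspace(a, beta, u) and in_halfspace(a, beta, v)

# Every convex combination (1 - lam) * u + lam * v stays in the halfspace.
for lam in np.linspace(0.0, 1.0, 11):
    assert in_halfspace(a, beta, (1 - lam) * u + lam * v)
```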

A polyhedron in \(\mathbb{R}^n\) is the intersection of finitely many halfspaces in \(\mathbb{R}^n\). Note that the feasible region of a linear programming problem is a polyhedron.
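Writing each halfspace as a row of a matrix, a polyhedron can be represented as \(\{\mathbf{x} : \mathbf{A}\mathbf{x} \geq \mathbf{b}\}\), and membership reduces to checking all the inequalities at once. A minimal sketch (the helper name `in_polyhedron` is our own):

```python
import numpy as np

def in_polyhedron(A, b, x):
    """Membership test for { x : A x >= b }, one row of A per halfspace."""
    return bool(np.all(A @ x >= b))

# The nonnegative quadrant in R^2 as an intersection of two halfspaces:
# x1 >= 0 and x2 >= 0.
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
b = np.zeros(2)
print(in_polyhedron(A, b, np.array([1.0, 2.0])))   # True
print(in_polyhedron(A, b, np.array([-1.0, 2.0])))  # False
```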

As halfspaces are convex and intersections of convex sets are convex, we have the following result.

Theorem 8.3. Polyhedra are convex.

Worked examples

  1. Show that \(P = \left \{ \begin{bmatrix} x_1\\x_2 \end{bmatrix} \in \mathbb{R}^2: x_1, x_2 \geq 0 \right \}\) is a convex set.

  2. Show that \(C = \left \{ \begin{bmatrix} x_1\\x_2 \end{bmatrix} \in \mathbb{R}^2: x_1^2 + x_2^2 \leq 1 \right \}\) is a convex set.
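Before writing a proof, a randomized sanity check can lend confidence to a claim of convexity. The sketch below tests the second worked example numerically; it is evidence, not a proof, and the helper name `in_disk` is our own:

```python
import numpy as np

rng = np.random.default_rng(0)

def in_disk(x):
    """Membership test for the closed unit disk { x : x1^2 + x2^2 <= 1 }."""
    return float(x @ x) <= 1.0

# Randomized check (not a proof): whenever x and y lie in the disk,
# every sampled convex combination of them lies in the disk as well.
for _ in range(1000):
    x = rng.uniform(-1, 1, size=2)
    y = rng.uniform(-1, 1, size=2)
    if in_disk(x) and in_disk(y):
        lam = rng.uniform()
        assert in_disk((1 - lam) * x + lam * y)
```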