Optimization problems seen in a typical first-year calculus course are often solved using differential calculus. In this course, we study a class of optimization problems that do not require calculus.
We consider two toy problems to introduce some fundamental ideas in optimization.
Let \(S\) be the set of all four-letter English words. What is the maximum number of \(\ell\)'s that a word in \(S\) can have?
Discussion. There are lots of four-letter words that contain the letter \(\ell\). For example, “line”, “long”, “tilt”, and “full”. From this short list alone, we know the maximum number of \(\ell\)'s is at least 2 and at most 4. As “llll” is not an English word, the maximum number cannot be 4. Can the maximum number be 3? Yes, because “lull” is a four-letter word with three \(\ell\)'s.
This example illustrates some fundamental ideas in optimization. To be able to say that 3 is the correct answer, we need to exhibit a word that has three \(\ell\)'s and provide an argument that rules out any value higher than 3. In this example, the only possible value higher than 3 is 4, which was easily ruled out. Unfortunately, life is not always this easy. For instance, if the question had asked for the maximum number of “y”'s instead of \(\ell\)'s, what would you do?
A pirate landed on an island with a knapsack that can hold 50 kg of treasure. He found a cave with the following items:

| Item | Weight | Value | Value/kg |
|---|---|---|---|
| Iron shield | 20 kg | $2800 | $140/kg |
| Gold chest | 40 kg | $4400 | $110/kg |
| Brass sceptre | 30 kg | $1200 | $40/kg |
Discussion. If the pirate does not take the gold chest, he can take both the iron shield and the brass sceptre for a total value of \$4000. If he takes the gold chest, he cannot take any of the remaining items. However, the value of the gold chest is \$4400, which is larger than the combined value of the iron shield and the brass sceptre. Hence, the pirate should take just the gold chest.
In the second example, we performed a case analysis and exhausted all the promising possibilities to arrive at our answer. Note that a greedy strategy that chooses items in descending value per weight would give us the suboptimal solution of taking the iron shield and brass sceptre. Even though there are problems for which the greedy approach returns an optimal solution, this is not one of them. In fact, the general version of this problem is the classic Binary Knapsack Problem and is known to be NP-hard. (Informally, NP-hard optimization problems are problems for which no algorithm that runs in a polynomial number of steps in the size of the problem is known.) Many optimization problems coming from real-world applications are NP-hard. Despite the theoretical difficulty, practitioners often devise methods that return good-enough solutions using approximation methods and heuristics. There are also ways to obtain bounds to gauge the quality of the solutions obtained. We will be looking at these issues in the context of linear programming in this course.
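To make this concrete, here is a minimal Python sketch (the data come from the table above; the function names are our own) comparing the greedy value-per-kilogram heuristic with an exhaustive search over all subsets of items:

```python
from itertools import combinations

# Item data from the table above: (name, weight in kg, value in $).
items = [("Iron shield", 20, 2800),
         ("Gold chest", 40, 4400),
         ("Brass sceptre", 30, 1200)]
CAPACITY = 50

def greedy_by_density(items, capacity):
    """Take items in descending value/weight order as long as they fit."""
    chosen, weight, value = [], 0, 0
    for name, w, v in sorted(items, key=lambda it: it[2] / it[1], reverse=True):
        if weight + w <= capacity:
            chosen.append(name)
            weight += w
            value += v
    return chosen, value

def exhaustive_search(items, capacity):
    """Try every subset of items; optimal, but exponential in the number of items."""
    best, best_value = [], 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            if sum(w for _, w, _ in subset) <= capacity:
                value = sum(v for _, _, v in subset)
                if value > best_value:
                    best_value = value
                    best = [name for name, _, _ in subset]
    return best, best_value

print(greedy_by_density(items, CAPACITY))   # (['Iron shield', 'Brass sceptre'], 4000)
print(exhaustive_search(items, CAPACITY))   # (['Gold chest'], 4400)
```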
A typical single-objective optimization problem consists of a domain set \(\mathcal{D}\), an objective function \(f:\mathcal{D} \rightarrow \mathbb{R}\), and constraints, which are predicates \(\mathcal{C}_i\) on \(\mathcal{D}\) for \(i = 1,\ldots,m\), where \(m\) is some nonnegative integer. One seeks to find, if possible, an element \(\mathbf{x} \in \mathcal{D}\) such that \(\mathcal{C}_i(\mathbf{x})\) holds for \(i = 1,\ldots,m\) and the value of \(f(\mathbf{x})\) is as high as possible (in the case of maximization) or as low as possible (in the case of minimization).
A compact way to write down a single-objective optimization problem is: \[\begin{array}{rl} \min & f(\mathbf{x}) \\ \text{s.t.} & \mathcal{C}_i(\mathbf{x}) \quad i = 1,\ldots,m \\ & \mathbf{x} \in \mathcal{D} \end{array}\] in the case of minimizing \(f(\mathbf{x})\), or \[\begin{array}{rl} \max & f(\mathbf{x}) \\ \text{s.t.} & \mathcal{C}_i(\mathbf{x}) \quad i = 1,\ldots,m \\ & \mathbf{x} \in \mathcal{D} \end{array}\] in the case of maximizing \(f(\mathbf{x})\). Here, “s.t.” is an abbreviation for “subject to.”
To be technically correct, \(\min\) should be replaced with \(\inf\) (and \(\max\) with \(\sup\)) since the minimum value is not necessarily attained. However, we will abuse notation and ignore the subtlety.
Some common domain sets are:

- \(\mathbb{R}^n_+\) (the set of \(n\)-tuples of nonnegative real numbers)
- \(\mathbb{Z}^n_+\) (the set of \(n\)-tuples of nonnegative integers)
- \(\{0,1\}^n\) (the set of binary \(n\)-tuples)
We can now formulate the Binary Knapsack Problem using the notation we have just introduced. Suppose that there are \(n\) items and item \(i\) has weight \(w_i\) and value \(v_i > 0\) for \(i = 1,\ldots, n\). Let \(K\) denote the capacity of the knapsack. Then the Binary Knapsack Problem can be formulated as: \[\begin{array}{rl} \max & \sum\limits_{i = 1}^n v_i x_i \\ \text{s.t.} & \sum\limits_{i = 1}^n w_i x_i \leq K \\ & x_i \in \{0,1\} \quad i = 1,\ldots, n. \end{array}\] Here, there is only one constraint given by the inequality modelling the capacity of the knapsack. For the instance discussed in the introduction, the formulation is: \[\begin{array}{rl} \max & 2800 x_1 + 4400 x_2 + 1200 x_3 \\ \text{s.t.} & 20 x_1 + 40 x_2 + 30 x_3 \leq 50\\ & x_1, x_2, x_3 \in \{0,1\} \end{array}\]
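This formulation can be handed to an off-the-shelf mixed-integer solver. The sketch below assumes SciPy version 1.9 or later, which provides the `milp` function:

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

v = np.array([2800, 4400, 1200])   # values of the three items
w = np.array([20, 40, 30])         # weights in kg
K = 50                             # knapsack capacity in kg

# milp minimizes, so maximize v @ x by minimizing -v @ x.
res = milp(
    c=-v,
    constraints=LinearConstraint(w, ub=K),   # the capacity constraint w @ x <= K
    integrality=np.ones(3),                  # every variable must be integer
    bounds=Bounds(0, 1),                     # 0 <= x_i <= 1, so x_i is 0 or 1
)
print(res.x, -res.fun)   # expected: [0. 1. 0.] 4400.0  (take only the gold chest)
```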
An element \(\mathbf{x} \in \mathcal{D}\) satisfying all the constraints (that is, \(\mathcal{C}_i(\mathbf{x})\) holds for each \(i = 1,\ldots,m\)) is called a feasible solution and its objective function value is \(f(\mathbf{x})\).
For a minimization (maximization) problem, a feasible solution \(\mathbf{x}^*\) such that \(f(\mathbf{x}^*) \leq f(\mathbf{x})\) (\(f(\mathbf{x}^*) \geq f(\mathbf{x})\)) for every feasible solution \(\mathbf{x}\) is called an optimal solution.
The objective function value of an optimal solution, if it exists, is the optimal value of the optimization problem. Note that the optimal value, if it exists, is unique but there can be multiple optimal solutions as the following example shows: \[\begin{array}{rl} \min & x+y \\ \text{s.t.} & x+y \geq 1 \\ & \begin{bmatrix} x \\ y \end{bmatrix} \in \mathbb{R}^2 \end{array}\] Here, \(\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1-t \\ t \end{bmatrix}\) is an optimal solution for every \(t \in \mathbb{R}\) and the optimal value for the problem is 1.
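A numerical solver returns just one of these infinitely many optimal solutions. The following sketch, assuming SciPy's `linprog` is available, solves this very problem (the constraint \(x+y \geq 1\) is rewritten as \(-x-y \leq -1\) since `linprog` expects \(\leq\) constraints):

```python
from scipy.optimize import linprog

# min x + y  s.t.  x + y >= 1, written as -x - y <= -1, with x and y free.
res = linprog(c=[1, 1], A_ub=[[-1, -1]], b_ub=[-1],
              bounds=[(None, None), (None, None)])
print(res.x, res.fun)   # one particular optimal solution; the optimal value is 1.0
```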
It is possible that there exists no element \(\mathbf{x} \in \mathcal{D}\) such that \(\mathcal{C}_i(\mathbf{x})\) holds for all \(i = 1,\ldots,m\). In such a case, the optimization problem is said to be infeasible. For example, the following is infeasible: \[\begin{array}{rl} \min & x \\ \text{s.t.} & x \leq -1 \\ & x \geq 0 \\ & x \in \mathbb{R} \end{array}\]
An optimization problem that is not infeasible can still fail to have an optimal solution. For example, the problem \[\begin{array}{rl} \max & x \\ \text{s.t.} & x \in \mathbb{R} \end{array}\] has feasible solutions with objective function value as large as we want. We call such a problem unbounded.
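Off-the-shelf solvers detect and report both situations. A small sketch, again assuming SciPy's `linprog` (whose status codes 2 and 3 indicate infeasibility and unboundedness, respectively), runs the two examples above:

```python
from scipy.optimize import linprog

# Infeasible: min x  s.t.  x <= -1 and x >= 0.
res1 = linprog(c=[1], A_ub=[[1]], b_ub=[-1], bounds=[(0, None)])
print(res1.status, res1.message)   # status 2 means infeasible

# Unbounded: max x over all of R, written as min -x with x free.
res2 = linprog(c=[-1], bounds=[(None, None)])
print(res2.status, res2.message)   # status 3 means unbounded
```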
The problem \[\begin{array}{rl} \min & e^{-x} \\ \text{s.t.} & x \in \mathbb{R} \end{array}\] has positive objective function value for every feasible solution. Even though the objective function value approaches zero as \(x\) approaches infinity, there is no feasible solution with objective function value 0. Note that this problem is not unbounded as the objective function value is bounded below by 0.
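A quick plain-Python check of this behaviour; note that in exact arithmetic \(e^{-x}\) is never 0 (in floating point, \(e^{-x}\) would eventually underflow to 0 for very large \(x\), but that is an artifact of the arithmetic, not of the problem):

```python
import math

# e^{-x} gets arbitrarily close to 0 but never equals 0 (in exact arithmetic).
for x in [0, 1, 10, 100]:
    print(x, math.exp(-x))
# 0 1.0
# 1 0.36787944117144233
# 10 4.5399929762484854e-05
# 100 3.720075976020836e-44
```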
Given an optimization problem, the most natural task is to find an optimal solution (provided that one exists) and demonstrate that it is optimal. However, depending on the context of the problem, one might be tasked to find
- a feasible solution, or show that none exists
- a local optimum
- a good bound on the optimal value
- a good (but not necessarily optimal) solution quickly
- all global optima
- a good solution that is robust to small changes in the problem data
- the \(n\) best solutions
The last two tasks listed above are often important in practice. For example, if problem data come from measurements or forecasts, one needs a solution that remains feasible when deviations from the data are taken into account. Having multiple good solutions allows decision makers to choose one with desirable properties that are not captured by, or are difficult to capture with, the problem constraints.
The computational difficulty of optimization problems depends on the properties of the domain set, constraints, and the objective function.
Problems without constraints are said to be unconstrained. For example, least-squares minimization in statistics can be formulated as an unconstrained problem. The following is also an unconstrained problem: \[\begin{array}{rl} \min & x^2 - 3x \\ \text{s.t.} & x \in \mathbb{R} \end{array}\]
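Such a one-dimensional unconstrained problem can be handed directly to a generic numerical optimizer; here is a minimal sketch assuming SciPy's `minimize_scalar`:

```python
from scipy.optimize import minimize_scalar

# Minimize x^2 - 3x over all of R; the exact minimizer is x = 3/2.
res = minimize_scalar(lambda x: x**2 - 3 * x)
print(res.x, res.fun)   # approximately 1.5 and -2.25
```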
Problems with linear constraints (that is, linear inequalities or equalities) and a linear objective function form the important class of problems in linear programming, the central topic of this course. Linear programming problems are the “easiest” to solve in the sense that efficient algorithms exist both in theory and in practice. Linear programming is also the backbone for solving more complex models.
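As a small illustration of this backbone role: dropping the integrality requirement of the pirate's knapsack problem (replacing \(x_i \in \{0,1\}\) with \(0 \leq x_i \leq 1\)) leaves a linear program whose optimal value is an upper bound on the true optimal value. A sketch, assuming SciPy's `linprog`:

```python
from scipy.optimize import linprog

v = [2800, 4400, 1200]   # item values
w = [20, 40, 30]         # item weights in kg

# LP relaxation: fractional amounts allowed, 0 <= x_i <= 1.
res = linprog(c=[-vi for vi in v], A_ub=[w], b_ub=[50], bounds=[(0, 1)] * 3)
print(res.x, -res.fun)   # x = [1, 0.75, 0], giving the bound 6100 >= 4400
```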
Convex problems are problems with a convex domain set, convex constraints and a convex objective function. Convex optimization problems have the property that every local optimum is also a global optimum. Such a property permits the development of effective algorithms that could also work well in practice. Linear programming is a special case of convex optimization. We will be looking at convexity a bit near the end of the course.
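The following sketch illustrates the local-equals-global property. The two objective functions are our own hypothetical choices; a local optimizer (SciPy's `minimize`) is started from several points, reaching the same optimum for the convex function but different local minima for the nonconvex one:

```python
from scipy.optimize import minimize

# Two illustrative objectives chosen by us (not from the text).
convex = lambda x: (x[0] - 1) ** 2               # unique global minimum at x = 1
nonconvex = lambda x: x[0] ** 4 - 2 * x[0] ** 2  # local minima at x = -1 and x = 1

for f, name in [(convex, "convex"), (nonconvex, "nonconvex")]:
    # Run a local optimizer from several starting points and collect the results.
    sols = {round(minimize(f, [x0]).x[0], 3) for x0 in (-2.0, 0.5, 2.0)}
    print(name, sols)
# convex: every start reaches x = 1.0
# nonconvex: different starts reach different local minima, x = -1.0 and x = 1.0
```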
Nonconvex problems (such as problems involving integer variables and/or nonlinear constraints that are not convex) are the hardest problems to solve. In general, nonconvex problems are NP-hard. Such problems often arise in scheduling and engineering applications.
Optimization problems arising in statistical learning and engineering applications, which are usually modelled as nonconvex continuous problems, will not be discussed, since they require a different set of techniques and methods.
This document will not go into the details of algorithms for solving the problems discussed. The reader is expected to use an off-the-shelf solver for the various tasks. However, it is important to be aware of the various types of algorithms or methods that exist for solving optimization problems.
Algorithms fall into two families: heuristics and exact methods.
Heuristics are normally quick to execute but do not provide guarantees of optimality. For example, the greedy heuristic for the Knapsack Problem is very quick but does not always return an optimal solution; in fact, there is no guarantee on how good the returned solution is. Other heuristic methods include ant colony, particle swarm, and evolutionary algorithms, to name a few. There are also heuristics that are stochastic in nature and come with proofs of convergence to an optimal solution; simulated annealing and multiple random starts are examples. Unfortunately, there is no guarantee on the running time needed to reach optimality, and there is no way to tell when one has reached an optimum.
Exact methods return a global optimum in finite time. However, most exact methods only guarantee that the constraints are approximately satisfied, with violations below some pre-specified tolerance. It is therefore possible for a returned solution to be infeasible for the actual problem. There also exist exact methods that fully control the error; with such a method, an optimum is usually reported as a box guaranteed to contain an optimal solution rather than as a single element. Returning boxes rather than single elements is necessary, for example, when the optimum cannot be expressed exactly as a vector of floating-point numbers. Such exact methods are used mostly in academic research and in areas such as medicine and avionics, where the tolerance for errors is practically zero.
Does the optimization problem \[\begin{array}{rl} \min & \dfrac{1}{x^2+1} \\ \text{s.t.} & x \in \mathbb{R} \end{array}\] have an optimal solution? Explain.
Solve the Binary Knapsack Problem where \(K = 10\), \(v_1 = 10\), \(v_2 = 9\), \(v_3 = 12\), \(v_4 = 11\), \(w_1 = 7\), \(w_2 = 4\), \(w_3 = 6\), \(w_4 = 3\).
Find the optimal value of \[\begin{array}{rl} \min & 2x + y \\ \text{s.t.} & x + y = 3\\ & x, y \geq 0 \\ & x, y \in \mathbb{R}. \end{array}\]