
Optimization of Nonlinear Functions via Piecewise Linearization


Photo by Jon Tyson on Unsplash

Let us consider a nonlinear function f(x), where x is a continuous variable. We would like to find the minimum value of this function f(x) by changing our decision variable x. The optimization problem can mathematically be formulated as follows:

min_x f(x)

Typically, one comes across some constraints that have to be considered. These may, for instance, be a minimum required temperature in a room, a limit on how much pressure is placed on a material, or the minimum amount of time you want/have to wait until you drink your next coffee.

These constraints come either in the shape of equalities or inequalities. For the sake of simplicity, we will assume that we only have inequality constraints, described by g(x)≤0. We can therefore add this constraint to the optimization problem as follows:

min_x f(x)
s.t. g(x) ≤ 0

If the functions (f(x) and g(x), or only one of them) are nonlinear, we would need to use nonlinear solvers. We try to avoid this, which is where a linearization step comes into play. So let's pause our discussion about optimization and first look at a method that allows us to approximate a nonlinear function.

To visualize and better understand this part, consider a logarithmic function f(x)=log(x) in the range x=[1,6]:

# Get some packages
import numpy as np
import matplotlib.pyplot as plt

# Create nonlinear function
def f(x):
    return np.log(x)

# Define lower and upper values and an x-range
xlb = 1
xub = 6
numx = 100
x = np.linspace(xlb, xub, numx)
y = f(x)

# Plot the function
plt.figure()
plt.plot(x, y, color='blue', linestyle='--', label='Nonlinear f(x)')
plt.xlabel('x')
plt.ylabel('f(x)')
plt.legend(frameon=False, loc='lower right')
plt.show()

With this piece of code, we can produce the following plot:

Figure 1. Logarithmic function as a nonlinear example.

Now, we start with the definition of a linear function, which has the following general form (with slope a and intersection b):

f(x) = a·x + b

Functions describe something. That might, for instance, be a physical behaviour or system. Such a system can usually be sampled: we choose an x and observe what our f(x) is (regardless of whether the system is linear or nonlinear). Example: if we cook spaghetti in a pot of water, we could wait 1 minute (x1) and observe how well our spaghetti is cooked (f(x1)). We can wait another 5 minutes (x2) and check again how well the spaghetti is done (f(x2)). I guess you know what I mean. 😄 🍝

If we have an environment from which we can get some sample points, we can use them to calculate the corresponding slope a and intersection b of the line that connects those two points. Let's say we have two points, f(x1) and f(x2). Since the linear model should hold at both points, we know that these two equations hold:

f(x1) = a·x1 + b
f(x2) = a·x2 + b

We now have two unknowns (a and b) and two equations, so we can solve for a and b:

a = (f(x2) − f(x1)) / (x2 − x1)
b = f(x1) − a·x1
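To make this concrete, here is a small sketch that computes a and b from these closed-form expressions and cross-checks them by solving the 2×2 linear system directly (the two sample points are taken from f(x)=log(x), purely for illustration):

```python
import numpy as np

# Two sample points (x1, f(x1)) and (x2, f(x2)) from the "environment"
x1, x2 = 1.0, 6.0
f1, f2 = np.log(x1), np.log(x2)

# Closed-form solution of the two equations
a = (f2 - f1) / (x2 - x1)
b = f1 - a * x1

# Cross-check: solve the 2x2 system [[x1, 1], [x2, 1]] @ [a, b] = [f1, f2]
a_sys, b_sys = np.linalg.solve(np.array([[x1, 1.0], [x2, 1.0]]),
                               np.array([f1, f2]))
assert np.isclose(a, a_sys) and np.isclose(b, b_sys)
print(f'a={a:.2f}, b={b:.2f}')
```

Both routes give the same line, as expected; the closed-form expressions are just the explicit solution of that 2×2 system.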

Using these expressions, we can try to approximate our function f(x)=log(x) in the range x=[1,6].

Let's do this in Python. First, we take the lower and upper bounds as our two points to define the segment (remember, Python starts indexing at zero, so don't be confused if it says "segment 0", which is just our first segment):

num_segments = 1
x_segments = np.linspace(xlb, xub, num_segments+1)
for i in range(num_segments):
    print(f'[+] Segment {i} goes from x={x_segments[i]:.2f} to x={x_segments[i+1]:.2f}.')
[+] Segment 0 goes from x=1.00 to x=6.00.

Then, we write a function that returns the slope and the intersection given two supporting points (sometimes also called “nodes”), namely x=[x1,x2] and y=[f(x1),f(x2)]:

def calc_slope_and_intersection(x, y):
    slope = (y[1:] - y[:-1])/(x[1:] - x[:-1])
    intersection = y[:-1] - slope*x[:-1]
    return float(slope), float(intersection)

We then calculate the slopes and intersections and store the values in lists:

slope, intersection = [], []
for i in range(num_segments):
    x_segment = x_segments[i:i+2]  # Get the x values for the segment
    y_segment = f(x_segment)  # Sample from nonlinear environment
    slope_, intersection_ = calc_slope_and_intersection(x_segment, y_segment)
    slope.append(slope_)
    intersection.append(intersection_)
    print(f'[+] Linear function of segment {i}: x*{slope[i]:.2f}+({intersection[i]:.2f}).')
[+] Linear function of segment 0: x*0.36+(-0.36).

The slope is calculated to be a=0.36, and the intersection has the same magnitude with opposite sign, b=−0.36. This brings us to the following linear equation as approximation:

f(x) ≈ 0.36·x − 0.36

We can plot this along with the original logarithmic function:

# Function that creates a linear function from slope and intersection
def get_linear(slope, intersection, xlb, xub):
    x_ = np.array([xlb, xub])
    return slope*x_ + intersection

# Plot the linear functions
plt.figure()
plt.plot(x, y, color='blue', linestyle='--', label='Nonlinear f(x)') # Nonlinear function
for i in range(num_segments):
    x_segment = x_segments[i:i+2]
    y_segment = get_linear(slope[i], intersection[i], x_segment[0], x_segment[1])
    plt.plot(x_segment, y_segment, color='orange', linestyle='-', label='Linear segment' if i==0 else '__nolabel__') # Linear segments
    plt.plot(x_segment, y_segment, color='black', linestyle='', marker='x', label='Nodes' if i==0 else '__nolabel__') # Nodes
plt.xlabel('x')
plt.ylabel('f(x)')
plt.legend(frameon=False, loc='lower right')
plt.show()

Figure 2. Linear approximation (solid orange line) of a log-function.

Well, that's not very accurate. However, it is linear, and it goes through the two nodes we used. So in essence, it does what we want. We can now sample the function more often (use more of those nodes), which brings us to the piecewise linearization technique.

Let us split the complete x-range into n segments (i=1,2,…,n) and perform the above-shown linearization of the function f(x) in each segment. In the equations below, the notation N describes the set of those segments, meaning N={1,2,…,n}. For each linear function, we need its individual slope and intersection:

f_i(x) = a_i·x + b_i, for all i ∈ N

Since we can define a number of desired values for x and then sample the corresponding values of f(x) from our (unknown) function, we can again calculate the slopes and intersections. Let us therefore define some more x-values for the different segments (the nodes). Say we use n=3.

The same code snippets from above can be used; just set the variable num_segments to 3. The x-ranges of the segments and their equations are given as follows:

[+] Segment 0 goes from x=1.00 to x=2.67.
[+] Segment 1 goes from x=2.67 to x=4.33.
[+] Segment 2 goes from x=4.33 to x=6.00.

[+] Linear function of segment 0: x*0.59+(-0.59).
[+] Linear function of segment 1: x*0.29+(0.20).
[+] Linear function of segment 2: x*0.20+(0.62).
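For convenience, here is a self-contained version of the snippets from above with num_segments set to 3, which reproduces these numbers:

```python
import numpy as np

def f(x):
    return np.log(x)

def calc_slope_and_intersection(x, y):
    slope = (y[1:] - y[:-1]) / (x[1:] - x[:-1])
    intersection = y[:-1] - slope * x[:-1]
    return float(slope), float(intersection)

xlb, xub, num_segments = 1, 6, 3
x_segments = np.linspace(xlb, xub, num_segments + 1)

slope, intersection = [], []
for i in range(num_segments):
    x_segment = x_segments[i:i+2]   # x values of the segment
    y_segment = f(x_segment)        # sample from the nonlinear environment
    slope_, intersection_ = calc_slope_and_intersection(x_segment, y_segment)
    slope.append(slope_)
    intersection.append(intersection_)
    print(f'[+] Linear function of segment {i}: x*{slope[i]:.2f}+({intersection[i]:.2f}).')
```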

Figure 3. Linear approximation (solid orange line) of a log-function (dashed blue line) using 3 segments.

Looks a bit better. We could further increase the number of segments, but living at the limits is not always what we want, so let's stick with n=3 for the time being. 😎

With this, we have identified a possible approximation of the nonlinear function f(x). Let's go back and replace this nonlinearity in our original optimization problem.

Again, the goal is to find the minimum of the function f(x), considering that it must be above a given specification K. We can formulate this as shown above, where our inequality constraint g(x)≤0 is essentially K−f(x)≤0:

min_x f(x)
s.t. K − f(x) ≤ 0
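For reference, this small problem could be handed directly to a nonlinear solver, which is exactly what we are trying to avoid. A sketch with SciPy's SLSQP, using a hypothetical specification K=1 (the true optimum is then x=e, since log(e)=1):

```python
import numpy as np
from scipy.optimize import minimize

K = 1.0  # hypothetical specification

res = minimize(lambda v: np.log(v[0]),   # objective f(x) = log(x)
               x0=[3.0],                 # initial guess
               method='SLSQP',
               bounds=[(1.0, 6.0)],
               # K - f(x) <= 0  is equivalent to  f(x) - K >= 0
               constraints=[{'type': 'ineq',
                             'fun': lambda v: np.log(v[0]) - K}])
print(res.x[0])  # close to e ≈ 2.718
```

We will use the same K below, so the piecewise-linear result can be compared against this nonlinear solution.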

Since we have linearized our function in segments, all segments must be considered in the optimization problem. This is possible by creating another variable z for each segment. This variable must be binary (either 0 or 1): it is 0 if the segment is inactive, and 1 if the segment is active (meaning the minimum we are looking for is located in this segment).

We can now sum up all the linear functions of the segments and multiply them by the corresponding z variable:

f(x) ≈ Σ_{i∈N} z_i·(a_i·x + b_i)

We also have to consider that our optimal solution can only be in one segment at a time. In other words, only one segment can be active, meaning the sum of all z variables must be 1:

Σ_{i∈N} z_i = 1

Or in a rather different form:

Well, it looks like we didn't do a very good job. This problem formulation is now nonlinear (because of the multiplication of x with z in the objective function and the constraint) and even has integer variables (z).

Nonetheless, there is a simple trick that allows us to rewrite such a "pseudo bilinear term" (which is a nonlinearity). First, we define an auxiliary variable x-tilde, which is the multiplication of the continuous variable (x) with the binary variable (z):

x̃_i = x·z_i

Next, we need to define the domain in which x-tilde is valid for the cases z=0 (inactive segment) and z=1 (active segment). If the segment is inactive, our auxiliary variable must be zero. We achieve this with the following inequalities:

x_min,i·z_i ≤ x̃_i ≤ x_max,i·z_i

Here x_min,i and x_max,i are the lower and upper "nodes" which we calculated above. You can easily verify that if z is zero, x-tilde is forced to be zero as well.

In addition, if z=1, the auxiliary variable x-tilde must be bounded by the "true" continuous variable x:

x − (1 − z_i)·x_ub ≤ x̃_i ≤ x

It is worth mentioning here that x_ub is the upper bound of the continuous variable x (not that of the intermediate nodes). With this, we can rewrite our optimization problem from above:

min Σ_{i∈N} (a_i·x̃_i + b_i·z_i)
s.t. K − Σ_{i∈N} (a_i·x̃_i + b_i·z_i) ≤ 0
     Σ_{i∈N} z_i = 1
     x_min,i·z_i ≤ x̃_i ≤ x_max,i·z_i, for all i ∈ N
     x − (1 − z_i)·x_ub ≤ x̃_i ≤ x, for all i ∈ N
     z_i ∈ {0, 1}, for all i ∈ N
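All functions are now linear, so an off-the-shelf mixed-integer linear programming (MILP) solver can handle the problem. Below is a sketch using scipy.optimize.milp with the three segments from above and the hypothetical specification K=1; the variable vector is ordered as [x, z_0, z_1, z_2, x̃_0, x̃_1, x̃_2]:

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Segments and their linear coefficients (as computed above)
xlb, xub, n = 1.0, 6.0, 3
nodes = np.linspace(xlb, xub, n + 1)
a = (np.log(nodes[1:]) - np.log(nodes[:-1])) / (nodes[1:] - nodes[:-1])
b = np.log(nodes[:-1]) - a * nodes[:-1]

K = 1.0  # hypothetical specification

# Variable vector: [x, z_0..z_2, xt_0..xt_2] (7 variables)
c = np.concatenate(([0.0], b, a))  # objective: sum_i (a_i*xt_i + b_i*z_i)

rows, lbs, ubs = [], [], []
# sum_i (a_i*xt_i + b_i*z_i) >= K
rows.append(np.concatenate(([0.0], b, a))); lbs.append(K); ubs.append(np.inf)
# sum_i z_i == 1
rows.append(np.concatenate(([0.0], np.ones(n), np.zeros(n)))); lbs.append(1.0); ubs.append(1.0)
for i in range(n):
    z = np.zeros(n); z[i] = 1.0
    xt = np.zeros(n); xt[i] = 1.0
    # xmin_i*z_i <= xt_i <= xmax_i*z_i
    rows.append(np.concatenate(([0.0], -nodes[i] * z, xt))); lbs.append(0.0); ubs.append(np.inf)
    rows.append(np.concatenate(([0.0], -nodes[i+1] * z, xt))); lbs.append(-np.inf); ubs.append(0.0)
    # x - (1 - z_i)*xub <= xt_i <= x
    rows.append(np.concatenate(([-1.0], np.zeros(n), xt))); lbs.append(-np.inf); ubs.append(0.0)
    rows.append(np.concatenate(([-1.0], -xub * z, xt))); lbs.append(-xub); ubs.append(np.inf)

cons = LinearConstraint(np.array(rows), np.array(lbs), np.array(ubs))
integrality = np.concatenate(([0], np.ones(n), np.zeros(n)))  # only z is integer
bounds = Bounds(lb=np.concatenate(([xlb], np.zeros(2 * n))),
                ub=np.concatenate(([xub], np.ones(n), xub * np.ones(n))))

res = milp(c=c, constraints=cons, integrality=integrality, bounds=bounds)
print(res.x[0], res.fun)
```

With this setup, the solver picks segment 1 as active and lands at x ≈ 2.73 with objective value K, reasonably close to the true optimum x = e ≈ 2.718 of the original nonlinear problem.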

It might be worth going through it again with a cup of coffee ☕
