Categories

# Gradient Descent Introduction

Gradient descent is an optimization algorithm for finding the maximum/minimum of some function. It is an iterative approach where you are calculating the slope of a derivative function to move step wise to the maximum/minimum value.   The slope on this example is at 2. By decreasing/increasing x step wise and comparing the result with the previous one, the algorithm moves towards the maximum/minimum value.

The step size is also commonly known as the learning rate. A smaller learning rate means a more precise result but as a drawback it requires more iterations and has therefore a worse run time. Gradient descent algorithm

If no further background information is available, a random initialization position can be used. Putting it all together one iteration process is defined with the formula: Implementing the gradient descent algorithm for finding the minimum at the example formula is straight forward. It basically needs a while loop with two exit conditions, one if the previous result is smaller than the new one and the second if a maximum of iterations are reached. The second condition is for preventing infinite loops.

cur_x = 4
rate = 0.001
precision = 0.000001
previous_step_size = 1
max_iters = 10000
iters = 0
df = lambda x: 2*x

while previous_step_size > precision and iters < max_iters:
prev_x = cur_x
cur_x = cur_x - rate * df(prev_x)
previous_step_size = abs(cur_x - prev_x)
iters = iters + 1
print("iteration", iters, "\nX Value is", cur_x)

print("The local minimum occurs at", cur_x)

The final result should be a value close to 0!