
Why ReLU function is not differentiable at x=0?

Last Updated : 06 Jan, 2025

The ReLU (Rectified Linear Unit) activation function introduces non-linearity into neural networks, enabling them to capture complex patterns in the data. It is defined as:

\text{ReLU}(x) = \max(0, x)

This means that for any input x, if x > 0, ReLU outputs x, and if x \leq 0, it outputs 0.

[Figure: ReLU Activation Function]
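As a minimal sketch (not part of the original article, assuming NumPy is available), the piecewise behaviour can be checked on a few sample inputs:

```python
import numpy as np

def relu(x):
    # ReLU(x) = max(0, x), applied element-wise
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))  # [0.  0.  0.  0.5 2. ]
```

Negative inputs and 0 are mapped to 0, while positive inputs pass through unchanged.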

Looking at the graph of ReLU, we can see that the function is continuous at x = 0: there is no abrupt jump or gap. Continuity is one of the properties a function must have in order to be differentiable.

Note: All differentiable functions are continuous, but not all continuous functions are differentiable.

Checking Differentiability at x=0

To determine if a function is differentiable at a point, we need to check that the derivative from the left matches the derivative from the right at that point.

Let’s compute the derivatives:

  • Left-hand derivative (x \to 0^{-}):
    For x < 0, f(x) = 0, so the derivative is f'(x) = 0.
  • Right-hand derivative (x \to 0^{+}):
    For x > 0, f(x) = x, so the derivative is f'(x) = 1.

At x = 0, the left-hand derivative is 0, and the right-hand derivative is 1. Since these derivatives are not equal, the function f(x) = \text{ReLU}(x) is not differentiable at x = 0.
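This can also be seen numerically. The following sketch (illustrative only, assuming NumPy) computes one-sided difference quotients at x = 0 for shrinking step sizes; the right-hand quotient stays at 1 while the left-hand quotient stays at 0:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# Step sizes shrinking toward 0
h = np.array([1e-1, 1e-3, 1e-6])

# Right-hand difference quotient: (f(0 + h) - f(0)) / h  -> 1
right = (relu(h) - relu(0.0)) / h

# Left-hand difference quotient:  (f(0) - f(0 - h)) / h  -> 0
left = (relu(0.0) - relu(-h)) / h

print(right)  # [1. 1. 1.]
print(left)   # [0. 0. 0.]
```

Since the two limits disagree, no single slope exists at x = 0.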

Handling Non-Differentiability in Practice

In practical applications, the non-differentiability of the ReLU function at x = 0 is generally not a problem. Most deep learning frameworks handle it by defining the derivative of ReLU at x = 0 as either 0 or 1 to simplify computation. This convention rarely causes issues during training because the input to a ReLU unit is rarely exactly 0 in real-world data.

During the backpropagation process in neural network training, we can adjust the weights using the simplified derivatives:

  • For x > 0, the slope is 1.
  • For x \leq 0, the slope is 0.

This simplification allows the network to continue training without complications, even though the function is not mathematically differentiable at 0.
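A minimal NumPy sketch of this convention is shown below (the helper name relu_grad is illustrative, and the derivative at x = 0 is set to 0, a common choice in practice):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def relu_grad(x):
    # Simplified derivative used in backpropagation:
    # slope 1 where x > 0, slope 0 where x <= 0 (including exactly x = 0)
    return np.where(x > 0, 1.0, 0.0)

x = np.array([-1.0, 0.0, 2.0])
upstream = np.ones_like(x)          # gradient arriving from the next layer
print(relu_grad(x) * upstream)      # [0. 0. 1.]
```

During backpropagation, the upstream gradient is simply multiplied by this 0/1 mask, so no special handling is needed at x = 0.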

