ELU: A Replacement for ReLU?

Keegan Fernandes
Oct 16, 2022

I recently found a research paper that improved a model's performance by using the ELU activation function instead of the ReLU function. In this blog I will explore the ELU function. I have also made a notebook on Kaggle that demonstrates how ELU works through code and interactive plots.

ReLU Function

Before explaining ELU, we'll start with the ReLU function. ReLU passes positive inputs through unchanged and converts negative inputs to zero: ReLU(x) = max(0, x). The function was intended to reduce model complexity and add non-linearity to the network. However, because every negative input maps to zero, the gradient for those neurons is also zero, so their weights may never update even if they carry important information. In other words, ReLU can result in inactive ("dead") neurons. The following is a graphical representation of the function.

In the ReLU function, all negative values become zero after passing through it.
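For concreteness, here is a minimal NumPy sketch of ReLU (the function name and sample values are my own illustration; the Kaggle notebook has the interactive version):

```python
import numpy as np

def relu(x):
    # ReLU passes positive inputs through unchanged
    # and maps every negative input to zero.
    return np.maximum(0, x)

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(relu(x))  # [0. 0. 0. 1. 3.]
```

Notice that for every negative input the output, and therefore the gradient, is zero, which is exactly how neurons can get stuck.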

ELU Function

Unlike ReLU, the ELU function smooths slowly toward a constant value: ELU(x) = x for x > 0 and α(eˣ − 1) for x ≤ 0, so negative inputs saturate at −α instead of being cut off at zero. One benefit of ELU is that it can pass negative values, unlike the ReLU function, while still keeping the network non-linear. One drawback is that it is slower to train than ReLU because of the exponential; the difference may not be noticeable on smaller networks, but the time piles up as networks get larger. The following is a graphical representation of the ELU function.

The Blue line shows the ELU function
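Here is the same kind of sketch for ELU (again, the name and values are illustrative; α is commonly set to 1):

```python
import numpy as np

def elu(x, alpha=1.0):
    # Identity for positive inputs; for negative inputs,
    # alpha * (exp(x) - 1) decays smoothly toward -alpha.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(elu(x))  # approx. [-0.95 -0.63  0.    1.    3.  ]
```

In practice you rarely write this yourself: frameworks ship it built in, e.g. torch.nn.ELU(alpha=1.0) in PyTorch or tf.keras.activations.elu in TensorFlow/Keras.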

Conclusion

While there has been much discussion about the effectiveness of ReLU and its alternatives, I find the ELU function quite attractive since it addresses the dying ReLU problem while still adding non-linearity to the network.

Bibliography

Here is the research paper where I first came across this use of ELU.

Jung, H.-K.; Choi, G.-S. Improved YOLOv5: Efficient Object Detection Using Drone Images under Various Conditions. Appl. Sci. 2022, 12, 7255. https://doi.org/10.3390/app12147255
