The word gradient sounds intimidating, but the idea is one you already understand intuitively. Think about standing on a hillside: the gradient points in the direction of steepest ascent, and its size tells you how steep that climb is.
In one dimension
If you have a simple curve, say a parabola, the gradient at any point is just the slope. To find the lowest point, you repeatedly step in the direction opposite the slope.
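A minimal sketch in Python makes this concrete; the parabola f(x) = x², the step size, and the starting point are illustrative choices, not taken from the text:

```python
# Gradient descent on the parabola f(x) = x**2, whose slope at x is 2*x.
# The step size (0.1) and starting point (5.0) are arbitrary choices for illustration.

def slope(x):
    return 2 * x  # derivative of x**2

x = 5.0
for _ in range(25):
    x -= 0.1 * slope(x)  # step opposite the slope

print(f"x after descent: {x:.6f}")  # approaches 0, the lowest point of the parabola
```

Each iteration moves x a little downhill, so it shrinks toward 0 regardless of which side of the minimum it starts on.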
In many dimensions
Neural networks have millions of weights, so the "hill" lives in a very high-dimensional space. The gradient is now a vector with one slope per dimension: each component is the partial derivative of the loss with respect to one weight.
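The same descent loop carries over unchanged, just with a vector instead of a single number. Here is a sketch on a toy two-dimensional "loss" f(w) = w₀² + 3w₁² (a stand-in for a real network's millions of dimensions; the function and starting weights are hypothetical):

```python
import numpy as np

# Toy 2-D "hill": f(w) = w[0]**2 + 3 * w[1]**2.
# Its gradient is a vector with one slope per dimension: (2*w[0], 6*w[1]).

def gradient(w):
    return np.array([2 * w[0], 6 * w[1]])

w = np.array([4.0, -2.0])  # arbitrary starting weights
for _ in range(50):
    w -= 0.1 * gradient(w)  # step opposite the gradient in every dimension at once

print(w)  # both components approach 0, the minimum
```

One update moves every weight simultaneously, each nudged by its own slope, which is exactly what training a network does at a vastly larger scale.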