What is Gradient Descent?
Gradient Descent — An optimization algorithm used to minimize the error in a neural network by adjusting weights iteratively.
Gradient descent is how neural networks learn. It computes the slope (gradient) of the error surface with respect to the weights and takes small steps downhill toward lower error. The learning rate controls step size — too large and the updates overshoot the minimum; too small and training converges painfully slowly.
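The downhill-stepping idea can be sketched in a few lines. This is a minimal toy example, not a neural network: it minimizes f(w) = (w - 3)^2, whose gradient is 2(w - 3) and whose minimum sits at w = 3; the function name and parameters are illustrative.

```python
# Minimal gradient descent on f(w) = (w - 3)^2, minimum at w = 3.
# The gradient f'(w) = 2 * (w - 3) points uphill, so we subtract it.

def gradient_descent(lr=0.1, steps=100):
    w = 0.0                  # starting weight (toy scalar problem)
    for _ in range(steps):
        grad = 2 * (w - 3)   # slope of the error surface at the current w
        w -= lr * grad       # step downhill, scaled by the learning rate
    return w

print(round(gradient_descent(), 4))  # approaches 3.0
```

The same loop underlies neural network training; the only difference is that the gradient is computed over millions of weights via backpropagation.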
Frequently Asked Questions
What variants of gradient descent exist?
Batch (uses all data per step), stochastic (one sample per step), and mini-batch (small batches per step). Adam, AdaGrad, and RMSprop are adaptive variants that adjust per-parameter learning rates automatically during training.
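The mini-batch variant can be sketched on a toy regression problem. Everything here is an assumption for illustration — fitting y = 2x with a single weight, a batch size of 4, and a hand-picked learning rate:

```python
import random

# Mini-batch SGD sketch: fit y = 2x with one weight w (toy problem).
# Each update estimates the gradient from a small random batch
# instead of the full dataset (batch) or a single sample (stochastic).

def minibatch_sgd(data, lr=0.01, batch_size=4, epochs=200):
    w = 0.0
    for _ in range(epochs):
        batch = random.sample(data, batch_size)
        # gradient of mean squared error (w*x - y)^2, averaged over the batch
        grad = sum(2 * (w * x - y) * x for x, y in batch) / batch_size
        w -= lr * grad
    return w

data = [(x, 2 * x) for x in range(1, 11)]
random.seed(0)
print(round(minibatch_sgd(data), 2))  # close to the true slope, 2.0
```

Mini-batching trades a noisier gradient estimate for far cheaper updates, which is why it is the default in practice.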
What is a learning rate?
The step size in gradient descent. It controls how much the weights change with each update and is often considered the most important hyperparameter in neural network training.
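The overshoot-versus-converge behavior is easy to demonstrate on a quadratic. A hypothetical sketch on f(w) = w^2 (gradient 2w), where any learning rate above 1.0 makes each step flip and enlarge w:

```python
# Learning-rate effects on f(w) = w^2, minimum at w = 0.
# A moderate rate shrinks w each step; an oversized one makes |w| grow.

def run(lr, steps=20, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w   # one gradient-descent update
    return w

print(abs(run(0.1)))   # shrinks toward the minimum (w is multiplied by 0.8 each step)
print(abs(run(1.1)))   # diverges (w is multiplied by -1.2 each step)
```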
Can gradient descent get stuck?
Yes. It can get trapped in local minima or stall at saddle points, where the gradient is near zero but the error is not minimal. Modern optimizers like Adam and techniques like learning rate scheduling help escape these traps.
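One common form of learning rate scheduling is step decay. This is a minimal sketch, assuming a policy of halving the rate every fixed number of steps; the function name and the decay interval are illustrative choices, not a standard API:

```python
# Step-decay schedule sketch: halve the learning rate every `decay_every` steps.
# Large early steps explore quickly; smaller later steps settle into a minimum.

def step_decay(initial_lr, step, decay_every=10, factor=0.5):
    return initial_lr * factor ** (step // decay_every)

print(step_decay(0.1, 0))    # 0.1
print(step_decay(0.1, 10))   # 0.05
print(step_decay(0.1, 25))   # 0.025
```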