Tuesday, December 31, 2024

Stochastic Gradient Descent

Running through the whole training data set to determine the next weight and bias values is ideal, but when the data set is massive it becomes computationally expensive to complete even a single epoch.
A compromise is to randomly pick a small subset of the inputs, called a minibatch, and use it to approximate the gradient descent calculation. This random sampling is what makes the method "stochastic."
The advantage of this approach is that it greatly reduces the computational demand while still providing a good approximation of the gradient of the loss function, which would be too costly to compute exactly over the full data set. The noise introduced by random sampling can even be helpful: it can nudge the optimizer out of shallow local minima.
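
To make the idea concrete, here is a minimal sketch of minibatch SGD for a simple linear model with a mean-squared-error loss. The synthetic data, learning rate, and batch size are illustrative assumptions, not anything from this post.

```python
import numpy as np

# Illustrative synthetic data: 1000 samples, 5 features (assumed values).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = np.array([1.0, -2.0, 0.5, 3.0, -1.0])
y = X @ true_w + rng.normal(scale=0.1, size=1000)

w = np.zeros(5)       # weights
b = 0.0               # bias
lr = 0.05             # learning rate (assumed)
batch_size = 32       # minibatch size (assumed)

for epoch in range(20):
    # Shuffle once per epoch, then walk through the data in minibatches;
    # the random shuffle is the "stochastic" part.
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        err = Xb @ w + b - yb
        # Gradients of the MSE loss on this minibatch approximate
        # the gradients over the full training set.
        grad_w = 2 * Xb.T @ err / len(idx)
        grad_b = 2 * err.mean()
        w -= lr * grad_w
        b -= lr * grad_b

print("learned weights:", np.round(w, 2))
print("learned bias:", round(b, 3))
```

Smaller batches mean cheaper but noisier updates; larger batches give more accurate gradient estimates at a higher cost per step, so the batch size is the knob for trading one against the other.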
