And so on until we get all the derivatives.
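The backpropagation formulas that produce these derivatives sit above this excerpt, so as a stand-in here is a minimal Python sketch that approximates each *dE/dw* numerically by central differences; the error function and weight values are made-up toy examples, not from the original:

```python
def numerical_gradients(error_fn, weights, eps=1e-5):
    """Approximate dE/dw for every weight by central differences."""
    grads = []
    for i in range(len(weights)):
        w_plus, w_minus = weights[:], weights[:]
        w_plus[i] += eps
        w_minus[i] -= eps
        grads.append((error_fn(w_plus) - error_fn(w_minus)) / (2 * eps))
    return grads

# toy error function: E(w) = (w0 + 2*w1 - 1)^2
error = lambda w: (w[0] + 2 * w[1] - 1) ** 2
print(numerical_gradients(error, [0.5, 0.5]))  # roughly [1.0, 2.0]
```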
Once we have them, we multiply them all by some value (the **learning rate**, a distance by which we move in the computed direction) and subtract the results from the current weights; this is the gradient descent step that lowers the total error.
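A minimal sketch of this update step, assuming the derivatives *dE/dw* have already been computed (all names and values here are illustrative):

```python
def gradient_descent_step(weights, gradients, learning_rate):
    """Nudge each weight against its error derivative dE/dw."""
    return [w - learning_rate * g for w, g in zip(weights, gradients)]

weights = [0.5, -1.2, 0.3]      # current weights
gradients = [0.1, -0.4, 0.05]   # dE/dw for each weight
weights = gradient_descent_step(weights, gradients, learning_rate=0.1)
print(weights)  # each weight moved a small step against its gradient
```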
Note that here we've only used one training sample, i.e. the error *E* was computed from the network's output against a single desired output. If more examples are used in a single update step, their per-sample gradients are typically averaged before the update.
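A sketch of that averaging, again with illustrative names and values: compute one gradient list per sample, then average per weight before applying the same update step:

```python
def average_gradients(per_sample_gradients):
    """Average per-sample gradients of E, weight by weight."""
    n = len(per_sample_gradients)
    return [sum(gs) / n for gs in zip(*per_sample_gradients)]

# gradients of E for the same three weights, from three training samples
sample_grads = [
    [0.2, -0.1, 0.4],
    [0.0, -0.3, 0.2],
    [0.1, -0.2, 0.3],
]
print(average_gradients(sample_grads))  # roughly [0.1, -0.2, 0.3]
```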