I love the opportunism in deep learning.
It’s not, “This seems like a good, fast approximation.”
It’s more like, “We want to invert the Hessian. I wish it were diagonal, but it’s ‘strongly nondiagonal’… What if we just only compute the diagonal anyway?”
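For a sense of what that shortcut looks like, here is a toy sketch in plain Python. Everything in it is an assumption for illustration: the 2×2 Hessian and gradient values are made up, and the function names are hypothetical rather than taken from any library or paper. It compares an exact Newton step (solving with the full Hessian) against the "pretend it's diagonal" version.

```python
def newton_step_full(hessian, grad):
    """Exact Newton step for a 2x2 Hessian: solve H @ step = grad."""
    (a, b), (c, d) = hessian
    det = a * d - b * c
    g0, g1 = grad
    # Closed-form 2x2 inverse applied to the gradient.
    return [(d * g0 - b * g1) / det, (a * g1 - c * g0) / det]


def newton_step_diagonal(hessian, grad):
    """Ignore off-diagonal terms: divide each gradient entry by H[i][i]."""
    return [g / hessian[i][i] for i, g in enumerate(grad)]


# Toy Hessian with large off-diagonal entries (illustrative values only).
H = [[2.0, 1.5], [1.5, 2.0]]
g = [1.0, 1.0]

full_step = newton_step_full(H, g)
diag_step = newton_step_diagonal(H, g)

print("full Hessian step:", full_step)
print("diagonal-only step:", diag_step)
```

With these numbers the full step is about 0.286 in each coordinate, while the diagonal-only step is 0.5 in each. The two disagree noticeably, but the diagonal version costs one division per parameter instead of a matrix solve.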