Stochastic Gradient Descent Tricks (Microsoft Research, 2012).pdf 419 KB