### Reading notes: Learning Deep Architectures for AI (Yoshua Bengio, 2009)

(The notes on Energy-based Models and Boltzmann Machines are not included here. I will try to add them sometime in the future.)

Assumption: the computational machinery necessary to express complex behaviors requires highly varying mathematical functions (e.g., highly non-linear functions).

Two things are discussed:

- Depth of architecture
- Locality of estimators
  - what matters for generalization is not dimensionality, …