CS089/CS189, Spring 2015
Deep Learning
Paper presentations
Each lecture, we will review two papers from the reading list. One student will be responsible for presenting each paper and for guiding the in-class discussion. When it is your turn to present, I will ask you to upload your slides to Blackboard (in PowerPoint, Keynote, or PDF format) by 11:59pm on the day before the lecture. You should aim for a 40-minute presentation. A rule of thumb is to devote about 20 minutes to describing the objectives of the work and the proposed technical solution, and about 5-10 minutes to presenting the experimental results. Finally, you should prepare a 10-minute discussion highlighting the contributions of the work (what differentiates this paper from previous work?) as well as its strengths and weaknesses (technical, applicative, or experimental). Don't be afraid to be controversial or to ask your classmates for their questions and opinions: it is your responsibility to lead an interactive discussion session, and you are free to choose the style. Try to conclude your presentation with a list of suggested extensions of the work presented. Don't simply repeat the future-work items discussed in the conclusion section of the paper: think independently about how you would choose to continue the research addressed in the article.

Written critiques
All students must upload a short written critique (about half a page of text) of each paper presented in class by 11:59pm on the day before the lecture. Please summarize the objectives and the technical approach in a couple of sentences. Discuss in more detail the contributions, strengths, and weaknesses of the work, exactly as if you were presenting the paper in class. Don't forget to include your list of proposed extensions to the method. I will review all critiques every few weeks and provide detailed feedback. Each student may opt not to submit critiques for up to 5 papers without any penalty.

Reading list


    Sparse Coding

  1. Emergence of simple-cell receptive field properties by learning a sparse code for natural images.
    B.A. Olshausen, and D.J. Field.
    Nature, 1996.

  2. Efficient sparse coding algorithms.
    H. Lee, A. Battle, R. Raina, and A. Y. Ng.
    NIPS, 2007.

    Autoencoders, RBMs & Deep Networks

  3. Tutorial on RBMs.

  4. Reducing the dimensionality of data with neural networks.
    G. Hinton and R. Salakhutdinov.
    Science, 2006.

  5. Greedy Layer-Wise Training of Deep Networks.
    Y. Bengio, P. Lamblin, P. Popovici, and H. Larochelle.
    NIPS, 2006.

  6. Extracting and Composing Robust Features with Denoising Autoencoders.
    P. Vincent, H. Larochelle, Y. Bengio, and P.A. Manzagol.
    ICML, 2008.

    Unsupervised Pretraining, and Other Regularization Schemes

  7. Why Does Unsupervised Pre-training Help Deep Learning?
    D. Erhan, Y. Bengio, A. Courville, P.A. Manzagol, P. Vincent, and S. Bengio.
    JMLR, 2010.

  8. Improving neural networks by preventing co-adaptation of feature detectors.
    G.E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R.R. Salakhutdinov.
    arXiv, 2012.

  9. Regularization of Neural Networks using DropConnect.
    L. Wan, M. Zeiler, S. Zhang, Y. LeCun, and R. Fergus.
    ICML, 2013.

  10. Deeply-Supervised Nets.
    C.Y. Lee, S. Xie, P. Gallagher, Z. Zhang, Z. Tu.
    AISTATS, 2015.

    Image Analysis & Recognition

  11. ImageNet Classification with Deep Convolutional Neural Networks.
    A. Krizhevsky, I. Sutskever, and G. Hinton.
    NIPS, 2012.

  12. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition.
    J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell.
    ICML, 2014.

  13. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks.
    P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun.
    ICLR, 2014.

  14. Very Deep Convolutional Networks for Large-Scale Image Recognition.
    K. Simonyan, A. Zisserman.
    ICLR, 2015.

    Pose, Motion and Video

  15. DeepPose: Human Pose Estimation via Deep Neural Networks.
    A. Toshev, C. Szegedy.
    CVPR, 2014.

  16. Two-Stream Convolutional Networks for Action Recognition in Videos.
    K. Simonyan, A. Zisserman.
    NIPS, 2014.

    Text Analysis & Natural Language Processing

  17. Distributed Representations of Words and Phrases and their Compositionality.
    T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean.
    NIPS, 2013.

  18. DeViSE: A Deep Visual-Semantic Embedding Model.
    A. Frome, G. Corrado, J. Shlens, S. Bengio, J. Dean, M.A. Ranzato, T. Mikolov.
    NIPS, 2013.

  19. Memory Networks.
    J. Weston, S. Chopra, A. Bordes.
    ICLR, 2015.

  20. Sequence to Sequence Learning with Neural Networks.
    I. Sutskever, O. Vinyals, and Q.V. Le.
    NIPS, 2014.

    Weakly Supervised Training of Deep Networks

  21. Weakly Supervised Semantic Segmentation with Convolutional Networks.
    P. O. Pinheiro, and R. Collobert.
    CVPR, 2015.

    Learning Deep Structured Models

  22. Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation.
    J.J. Tompson, A. Jain, Y. LeCun, and C. Bregler.
    NIPS, 2014.

  23. Deep Convolutional Neural Fields for Depth Estimation from a Single Image.
    F. Liu, C. Shen, and G. Lin.
    CVPR, 2015.

  24. Deep Structured Output Learning for Unconstrained Text Recognition.
    M. Jaderberg, K. Simonyan, A. Vedaldi, A. Zisserman.
    ICLR, 2015.


Other useful pointers and material