recent news


  1. Two papers to be presented at NeurIPS 2020:

  2. Spotlight presentation:
    Self-Supervised Learning by Cross-Modal Audio-Video Clustering,
    with Humam Alwassel, Dhruv Mahajan, Bernard Ghanem, and Du Tran. 

  3. COBE: Contextualized Object Embeddings from Narrated Instructional Video,
    with Gedas Bertasius.


  1. Three new papers to be presented at WACV 2021:

  2. Only Time Can Tell: Discovering Temporal Data for Temporal Modeling,
    with Laura Sevilla-Lara, Shengxin Zha, Zhicheng Yan, Vedanuj Goswami, and Matt Feiszli.

  3. Supervoxel Attention Graphs for Long-Range Video Modeling,
    with Yang Wang, Gedas Bertasius, Tae-Hyun Oh, Abhinav Gupta, and Minh Hoai.

  4. Learn like a Pathologist: Curriculum Learning by Annotator Agreement for Histopathology Image Classification,
    with J. Wei, A. Suriawinata, B. Ren, X. Liu, M. Lisovsky, L. Vaickus, C. Brown, M. Baker, M. Nasir-Moin, N. Tomita, J. Wei, and S. Hassanpour.


  1. New paper presented at BMVC 2020:

  2. Attentive Action and Context Factorization,
    with Yang Wang, Vinh Tran, Gedas Bertasius, and Minh Hoai.


  1. MaskProp nominated for CVPR 2020 paper award and ranked first in one of the EPIC-Kitchens CVPR 2020 challenges:

  2. Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation, by Gedas Bertasius and Lorenzo Torresani, was one of the 29 paper award nominees at CVPR 2020 (out of 1470 accepted papers).

  3. Our software submission based on this paper received the First Place Award on Unseen Kitchens and the Third Place Award on Seen Kitchens at the IEEE CVPR 2020 EPIC-Kitchens Object Detection in Video Challenge.


  1. Three papers presented at CVPR 2020:

  2. Oral presentation:
    Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation,
    with Gedas Bertasius.

  3. Listen to Look: Action Recognition by Previewing Audio,
    with Ruohan Gao, Tae-Hyun Oh, and Kristen Grauman.

  4. Correlation Networks for Video Classification,
    with Heng Wang, Du Tran, and Matt Feiszli.


  1. New paper presented at AISTATS 2020:

  2. Stein Variational Inference for Discrete Distributions,
    with Jun Han, Fan Ding, Xianglong Liu, Jian Peng, and Qiang Liu.


  1. Two papers presented at NeurIPS 2019:

  2. Learning Temporal Pose Estimation from Sparsely-Labeled Videos,
    with Gedas Bertasius, Christoph Feichtenhofer, Du Tran, and Jianbo Shi.

  3. STAR-Caps: Capsule Networks with Straight-Through Attentive Routing,
    with Karim Ahmed.


  1. Four papers presented at ICCV 2019:

  2. Oral presentation:
    SCSampler: Sampling Salient Clips from Video for Efficient Action Recognition,
    with Bruno Korbar and Du Tran.

  3. Video Classification with Channel-Separated Convolutional Networks,
    with Du Tran, Heng Wang, and Matt Feiszli.

  4. DistInit: Learning Video Representations without a Single Labeled Video,
    with Rohit Girdhar, Du Tran, and Deva Ramanan.

  5. HACS: Human Action Clips and Segments Dataset for Recognition and Temporal Localization,
    with Hang Zhao, Zhicheng Yan, and Antonio Torralba.


  1. I co-organized a tutorial at ICCV 2019:

  2. Tutorial on Visual Learning with Limited Labeled Data


  1. I co-organized a tutorial and a workshop at CVPR 2019:

  2. Tutorial on Action Classification and Video Modeling

  3. Workshop on Computer Vision for Global Challenges


  1. We released a new dataset for action recognition and temporal localization, named HACS. It includes over 1.5M video clips. Give it a try!


  1. New paper presented at NIPS 2018:

  2. Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization,
    with Bruno Korbar and Du Tran.


  1. Three papers presented at ECCV 2018:

  2. Object Detection in Video with Spatiotemporal Sampling Networks,
    with Gedas Bertasius and Jianbo Shi.

  3. MaskConnect: Connectivity Learning by Gradient Descent,
    with Karim Ahmed.

  4. Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset,
    with Jamie Ray, Heng Wang, Du Tran, Yufei Wang, Matt Feiszli, and Manohar Paluri.


  1. New paper presented at BMVC 2018:

  2. Self-Supervised Feature Learning for Semantic Segmentation of Overhead Imagery,
    with Suriya Singh, Anil Batra, Guan Pang, Saikat Basu, Manohar Paluri, and C.V. Jawahar.


  1. Three papers on video models presented at CVPR 2018:

  2. A Closer Look at Spatiotemporal Convolutions for Action Recognition,
    with Du Tran, Heng Wang, Jamie Ray, Yann LeCun, and Manohar Paluri.

  3. Detect-and-Track: Efficient Pose Estimation in Videos,
    with Rohit Girdhar, Georgia Gkioxari, Manohar Paluri, and Du Tran.

  4. What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets,
    with De-An Huang, Vignesh Ramanathan, Dhruv Mahajan, Juan Carlos Niebles,
    Fei-Fei Li, and Manohar Paluri.


  1. Together with collaborators, I organized two workshops at CVPR 2018:

  2. Brave New Ideas for Video Understanding

  3. DeepGlobe: A Challenge for Parsing the Earth through Satellite Images


  1. From January 2018 to May 2018, I was a Fulbright U.S. Scholar at Ashesi University in Ghana.


research overview


My research interests are in computer vision and machine learning. My current work is primarily focused on learning representations for image and video recognition. You can read more about the research of my group here.



previous affiliations


  1. Microsoft Research Cambridge, Machine Learning and Perception

  2. Riya/Like.com

  3. New York University, Computer Science

  4. Stanford University, Computer Science

  5. DigitalPersona

  6. IRST

  7. University of Milan, Computer Science

Lorenzo Torresani


Professor of Computer Science

Visual Learning Group

Dartmouth College


Research Scientist

Facebook AI


Email: LT@dartmouth.edu