Workshop on Multimodal Continual Learning

ICCV 2025 Workshop

TBD, 2025


Introduction

In recent years, the machine learning and computer vision community has made significant advances in continual learning (CL)—also known as lifelong learning or incremental learning—which enables models to learn new tasks incrementally while retaining previously acquired knowledge, without requiring full-data retraining. Most early work in CL has focused on unimodal data, such as images, primarily for classification problems. However, with the rise of powerful multimodal models, which unify images, videos, text, and even audio, multimodal continual learning (MCL) has emerged as a crucial research direction. Unlike unimodal CL, where knowledge retention is constrained within a single modality, MCL must handle multiple modalities simultaneously. This introduces new challenges, such as modality-specific forgetting, modality imbalance, and the preservation of cross-modal associations over time. This MCL workshop aims to address these challenges, explore emerging research opportunities, and advance the development of more inclusive and efficient continual learning systems. It will provide a platform for discussing cutting-edge research, identifying critical challenges, and fostering collaboration across academia, industry, and accessibility communities. This workshop is highly relevant to computer vision and machine learning researchers, AI practitioners, and those working on multimodal AI systems.


Call For Papers

Topics: This workshop will cover, but is not limited to, the following topics at the intersection of computer vision, multimodal learning, and continual learning:

  • Multimodal Continual Learning (MCL)
  • Multimodal Class-Incremental Learning
  • Multimodal Domain-Incremental Learning
  • Multimodal Task-Incremental Learning
  • Multimodal Continual Learning in Generative Models
  • Continual Learning in Multimodal Foundation Models
  • Continual Learning in Multimodal Large Language Models (MLLMs)
  • Audio-Visual Continual Learning
  • Vision-Language Continual Learning
  • Bias, Fairness, and Transparency in Continual Learning
  • Benchmarking and Evaluation
  • Applications

Submission: The workshop invites both extended abstract and full paper submissions. We use OpenReview to manage submissions. Submissions should be in the ICCV format; please download the ICCV 2025 Author Kit for detailed formatting instructions. Reviewing will be double-blind. Consistent with the review process for the ICCV 2025 main conference, submissions under review will be visible only to their assigned program committee members. The reviews and author responses will not be made public. All accepted full papers and extended abstracts will be invited to present a poster. A selection of outstanding full papers will also be invited for oral presentations.

  • Extended Abstracts: We accept 2-4 page extended abstracts. Accepted extended abstracts will not be published in the conference proceedings, allowing future submissions to archival conferences or journals. We also welcome already published papers that are within the scope of the workshop, including papers from the main ICCV conference.
  • Full Papers: Papers should be longer than 4 pages but no longer than 8 pages, excluding references, following the ICCV style. Accepted full papers will be published in the ICCV workshop proceedings.

Note: Since we are using the same submission system for all submissions, we have set the extended abstract deadline (August 01) as the workshop paper submission deadline on OpenReview. However, please note that the full paper submission deadline is much earlier (June 30), as we are required to provide paper information to the ICCV 2025 conference by their deadline.


Important Dates

  • Call for papers announced: May 12
  • Full paper submission deadline: June 30
  • Notifications to accepted full papers: July 10
  • Extended abstract submission deadline: Aug 01
  • Notifications to accepted extended abstracts: Aug 10
  • Camera-ready deadline (full papers and extended abstracts): Aug 16
  • Workshop date: TBA


Schedule (Hawaii Standard Time)

TBA


Invited Speakers (Tentative)


Ziwei Liu is an Associate Professor at MMLab@NTU, College of Computing and Data Science, Nanyang Technological University, Singapore. Previously, he was a research fellow at the Chinese University of Hong Kong with Prof. Dahua Lin and a postdoctoral researcher at the University of California, Berkeley with Prof. Stella Yu. His research interests include computer vision, machine learning, and computer graphics. Ziwei is the recipient of the PAMI Mark Everingham Prize, MIT TR Innovators under 35 Asia Pacific, the ICBS Frontiers of Science Award, a CVPR Best Paper Award candidacy, and the Asian Young Scientist Fellowship.


Marc'Aurelio Ranzato is a research scientist director at Google DeepMind in London. He is broadly interested in machine learning, computer vision, natural language processing, and artificial intelligence. His long-term endeavor is to enable machines to learn more efficiently and with less supervision by transferring and accruing knowledge over time. He joined Facebook as a founding member of the Facebook AI Research lab, and has been at Google DeepMind in London since August 2021, where he leads the continual learning team.


Bing Liu is a Distinguished Professor and the Peter L. and Deborah K. Wexler Professor of Computing at the University of Illinois at Chicago (UIC). He received his PhD in Artificial Intelligence from the University of Edinburgh. Before joining UIC, he was a faculty member (associate professor) at the School of Computing, National University of Singapore (NUS). He was also with Peking University for one year (2019-2020). His current research interests include lifelong or continual learning, open-world AI, continual learning for language models and dialogue systems, natural language processing, machine learning, and artificial general intelligence (AGI).


Zsolt Kira is an Associate Professor at the School of Interactive Computing in the College of Computing at Georgia Tech, and serves as an Associate Director of ML@GT, Georgia Tech's machine learning center. Previously, he was a Branch Chief at the Georgia Tech Research Institute (GTRI) and a Research Scientist at SRI International Sarnoff in Princeton. He leads the Robotics Perception and Learning (RIPL) lab. His research focuses on the intersection of learning methods for sensor processing and robotics, developing novel machine learning algorithms and formulations to address some of the more difficult perception problems in these areas. He is especially interested in moving beyond supervised learning (un/semi/self-supervised and continual/lifelong learning) as well as distributed perception (multimodal fusion, learning to incorporate information across a group of robots, etc.).



Organizers

Yunhui Guo
University of Texas at Dallas
Yapeng Tian
University of Texas at Dallas
Mingrui Liu
George Mason University
Sayna Ebrahimi
Google DeepMind
Henry Gouk
University of Edinburgh


Contact

To contact the organizers please use mclworkshop25@gmail.com.



Acknowledgments

Thanks to https://languagefor3dscenes.github.io/ for the webpage format.