CV/AI techniques are quickly taking a central role in driving this growth by enabling video
conferencing applications that deliver more natural, contextual, and relevant meeting
experiences. For example, high-quality video matting and synthesis are crucial to the
now-essential functionality of virtual background; gaze correction and gesture tracking can add
to interactive user engagement; automatic color and light correction can improve the user’s
visual appearance and self-image; and all of these have to be backed by high-efficiency video
compression/transmission and efficient edge processing, which can themselves benefit from
recent AI advances. These challenges have drawn increasing R&D attention; e.g., NVIDIA
recently released a fully accelerated platform for building video conferencing services with
many advanced AI features: https://developer.nvidia.com/maxine .
While AI-based video collaboration appears to be moving toward mainstream adoption, we
recognize that building the next-generation video conferencing system involves multi-fold
interdisciplinary challenges, and that many technical gaps remain to be closed.
Centered on this theme, the proposed workshop aims to provide the first comprehensive forum
for CVPR researchers to systematically discuss relevant techniques that we can contribute
as a community. Examples include but are not limited to:
- Image display and quality enhancement for teleconferencing
- Video compression and transmission for teleconferencing
- Video object segmentation, matting and synthesis (for virtual background, etc.)
- HCI (gesture recognition, head tracking, gaze tracking, etc.), AR and VR applications in video conferencing
- Efficient video processing on the edge and IoT camera devices
- Multi-modal information processing and fusion in video conferencing (audio transcription, image to text, video captioning, etc.)
- Societal and ethical aspects: privacy intrusion and protection, attention engagement, fatigue avoidance, etc.
- Emerging applications where video conferencing is the cornerstone: remote education, telemedicine, etc.
... and many more related topics.
We aim to collectively address this core question: which CV techniques are, or soon will be,
ready for the next-generation video conference, and how will they fundamentally change the
experience of remote work, education and more? We aim to bring together experts from
different fields to discuss recent advances on these topics and to explore new directions.
As one of the workshop outcomes, we expect to generate a joint report defining the key CV
problems, characterizing the technical demands and barriers, and discussing potential solutions
or directions.