[visionlist] SAVE THE DATE: JIVP Webinar by Liang Zheng, "Image generation with end-to-end training and benefits of a good VAE" (April 9, 2026 at 1PM CEST)

Mon Mar 30 18:23:12 -05 2026

(Apologies if you receive multiple copies of this message)

The next JIVP webinar will take place on Thursday, April 9, 2026 at 1:00 PM CEST, with Prof. Liang Zheng (Australian National University).

RSVP to join here: https://cassyni.com/events/EeUKWPASZtukyQQuuEu3Jm?cb=0.8dx2 

Title: Image generation with end-to-end training and benefits of a good VAE

Abstract: 

Latent diffusion models underlie modern image generation. They require a variational auto-encoder (VAE) for image encoding and decoding, and a diffusion transformer for generation. While end-to-end training has been the spirit of deep learning, it is surprising that latent diffusion models are not trained end-to-end, which causes representation bottlenecks. In this talk, I will introduce our work that jointly trains the VAE and diffusion transformer and show how it accelerates training and yields high-quality images. Further, I will discuss use cases where the resulting end-to-end trained VAEs bring significant benefits. These include higher-quality text-to-image generation and automatic agentic search of diffusion transformer architectures. I will conclude with new perspectives.

Bio:

Dr. Liang Zheng is an Associate Professor at the Australian National University and a Research Scientist at Canva. He is interested in representation learning for perception and generation. He contributed many useful datasets and methods to the object re-identification field that were later used in wider domains. He is currently working on both the pre-training and post-training aspects of image generation. He is a Program Chair for ACM MM’24, MM’28, and AVSS’24, and a General Chair for AVSS’27 and DICTA 2027. He is a regular area chair for important conferences and an Associate Editor for TPAMI. He has bachelor's degrees in Biology and Economics and a PhD degree in Computer Science from Tsinghua University.

—
__________________________________________
Dr. Giuseppe Valenzise
CNRS Senior Researcher (Directeur de recherche)
Université Paris-Saclay — CentraleSupelec — CNRS
Laboratoire des Signaux et Systèmes (L2S) — UMR 8506
3, rue Joliot Curie
91192 Gif-sur-Yvette Cedex, France
https://l2s.centralesupelec.fr/u/valenzise-giuseppe/

Editor in Chief, Journal on Image and Video Processing (Springer)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
“Immersive Video Technologies”
https://www.elsevier.com/books/immersive-video-technologies/valenzise/978-0-323-91755-1
