Date & Time: Thursday, December 4, 15:00–16:00
Venue: Lecture Room 1, Research Bldg. No. 7 (Room 107, 1st floor)
For cameras to be more widely deployed, they need to satisfy two important requirements.
First, they need to be self-sustaining: they should be able to capture and transmit visual information without a power supply and without being tethered to other devices or infrastructure.
Second, when used in applications that do not require identification of humans, they should preserve privacy.
We will present recent results on the creation of such self-sustaining imaging systems.
Our approach harvests the energy of the light falling on the camera to fully power it, enabling the camera to make visual measurements and wirelessly transmit them without a battery or an external power supply.
We will conclude the talk with a related result that enables forecasting the energy harvested by a solar panel from a single image taken close to the panel.
Date & Time: Friday, November 28, 13:15–14:45
Venue: Seminar Room 2, Research Bldg. No. 7 (Room 131, 1st floor)
Over the past decade, advances in image sensor technologies have transformed the 2D and 3D imaging capabilities of our smartphones, cars, robots, drones, and scientific instruments. As these technologies continue to evolve, what new capabilities might they unlock? I will discuss one possible point of convergence—the ultimate video camera—which is enabled by emerging single-photon image sensors and photon-processing algorithms. We will explore the extreme imaging capabilities of this camera within the broader historical context of high-speed imaging systems, highlighting its potential to capture the physical world in entirely new ways.
Date & Time: Tuesday, November 18, 10:30–12:00
Venue: Seminar Room 1, Research Bldg. No. 7 (Room 127, 1st floor)
In this talk, I will present some recent research from the Computer Vision Laboratory at National Tsing Hua University (NTHU). The NTHU CV Lab focuses on several key areas, including video understanding, face-related analysis, anomaly detection, and medical imaging. I will begin with two of our latest advances in video understanding, HERMES and VADER. HERMES introduces two versatile modules that can be seamlessly integrated into existing video–language models or deployed as a standalone framework for long-form video comprehension, achieving state-of-the-art performance across multiple benchmarks. VADER, in turn, is an LLM-driven framework for video anomaly reasoning that combines keyframe-level object-relation modeling with visual contextual cues to enhance anomaly interpretation. Next, I will discuss one of our recent works in anomaly detection, LFQUIAD, which integrates a quantization-driven autoencoder with a modular Anomaly Generation Module to improve representation learning. Finally, I will briefly present two medical imaging projects leveraging diffusion models: one generates paired 3D CT image–mask datasets, and the other synthesizes contrast-enhanced 3D CT volumes from non-contrast scans. Through these examples, I will highlight our lab's ongoing efforts toward building generalizable, interpretable, and efficient computer vision systems that bridge visual understanding and generative modeling.
Date & Time: Monday, July 7, 13:15–14:45
Venue: Information Lecture Room 2 (情報2講義室), 1st floor, Research Bldg. No. 7
Over the past decade, I’ve worked on perception systems spanning everyday robot manipulation, self-driving cars, and large-scale video understanding across academia and industry labs like Waymo, Google Research, DeepMind, and now Agility Robotics. In this talk, I’ll share perspectives on developments in the field and industry, and lessons from deploying computer vision systems in the real world. I’ll tell you why construction cones are harder than they look, what it takes to build vision-language models that understand YouTube videos without human supervision, how to transfer fundamental research to products that touch billions of users, and I’ll revisit how we once tried to enable robots to use tools they had never seen before—and how we've come back full circle with new tools at our disposal in my work at Agility Robotics. Beyond the research itself, I’ll reflect on navigating careers between academia and industry, choosing impactful problems, and the exciting challenges ahead in embodied AI.