IST COLLOQUIUM 2025

Can a Camera be Self-Sustaining?


T. C. Chang Professor Shree K. Nayar

Dept. of Computer Science, Columbia University, USA
Speaker Biography: Shree K. Nayar is the T. C. Chang Professor of Computer Science at Columbia University. He heads the Columbia Imaging and Vision Laboratory (CAVE), which develops computational imaging and computer vision systems. Nayar received his PhD in Electrical and Computer Engineering from the Robotics Institute at Carnegie Mellon University. For his research and teaching, he has received several honors, including the David Marr Prize (1990 and 1995), the David and Lucile Packard Fellowship (1992), the National Young Investigator Award (1993), the NTT Distinguished Scientific Achievement Award (1994), the Keck Foundation Award for Excellence in Teaching (1995), the Columbia Great Teacher Award (2006), the Carnegie Mellon Alumni Achievement Award (2009), the Sony Appreciation Honor (2014), the Columbia Engineering Distinguished Faculty Teaching Award (2015), the IEEE PAMI Distinguished Researcher Award (2019), the Funai Achievement Award (2021), and the Okawa Prize (2022). For his contributions to computer vision and computational imaging, he was elected to the National Academy of Engineering in 2008, the American Academy of Arts and Sciences in 2011, and the National Academy of Inventors in 2014.
Date & Time: Thursday, December 4, 15:00-16:00

Venue: Research Bldg. No. 7, Lecture Room 1 (Room 107, 1st floor)

For cameras to be more widely deployed, they need to satisfy two important requirements. First, they need to be self-sustaining: they should be able to capture and transmit visual information without a power supply and without being tethered to other devices or infrastructure. Second, when used in applications that do not require identifying humans, they should preserve privacy.
We will present recent results on the creation of such self-sustaining imaging systems. Our approach uses energy harvested from the light falling on the camera to fully power it, enabling the camera to make visual measurements and wirelessly transmit them without a battery or an external power supply. We will conclude the talk with a related result that enables forecasting the energy harvested by a solar panel from a single image taken close to the panel.
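Since the abstract turns on an energy budget, here is a minimal back-of-the-envelope sketch of the self-sustaining condition: the camera is viable when the power harvested from incident light covers the energy spent capturing and transmitting frames. All constants are illustrative assumptions, not figures from the talk.

```python
# Hypothetical energy-budget sketch for a self-powered camera.
# Every constant below is an assumption for illustration only.

LUX_TO_W_PER_M2 = 1.0 / 683.0  # rough radiometric conversion for white light

def sustainable_fps(illuminance_lux: float,
                    harvester_area_m2: float = 1e-4,    # ~1 cm^2 pixel array
                    harvest_efficiency: float = 0.15,   # photovoltaic efficiency
                    energy_per_capture_j: float = 2e-6,    # capture one frame
                    energy_per_transmit_j: float = 8e-6):  # transmit one frame
    """Average frame rate the harvested power can sustain."""
    irradiance = illuminance_lux * LUX_TO_W_PER_M2                         # W/m^2
    harvested_power = irradiance * harvester_area_m2 * harvest_efficiency  # W
    energy_per_frame = energy_per_capture_j + energy_per_transmit_j        # J/frame
    return harvested_power / energy_per_frame                              # frames/s

# Office lighting (~300 lux) vs. daylight (~30,000 lux):
for lux in (300, 30_000):
    print(f"{lux:>6} lux -> {sustainable_fps(lux):6.2f} fps")
```

Under these made-up numbers, indoor light sustains under one frame per second while daylight supports video rates, which is why harvesting and transmission efficiency dominate the design.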

The Ultimate Video Camera


Prof. Kyros Kutulakos

Dept. of Computer Science, University of Toronto, Canada
Speaker Biography: Kyros Kutulakos is a Professor of Computer Science at the University of Toronto and an expert in computational imaging and computer vision. His research over the past decade has focused on combining programmable light sources, sensors, optics, and algorithms to create cameras with unique capabilities, from seeing through scattering media and looking around corners to robustly capturing surfaces with complex material properties in 3D. He is currently leading efforts to harness the potential of technologies such as single-photon cameras and programmable-pixel image sensors for applications in extreme computer vision and scientific imaging. Kyros is a recipient of an Alfred P. Sloan Fellowship, an NSF CAREER Award, and eight paper awards at ICCV, CVPR, and ECCV, including two Marr Prizes (1999 and 2023) and the CVPR Best Paper Award in 2019.
Date & Time: Friday, November 28, 13:15-14:45

Venue: Research Bldg. No. 7, Seminar Room 2 (Room 131, 1st floor)

Over the past decade, advances in image sensor technologies have transformed the 2D and 3D imaging capabilities of our smartphones, cars, robots, drones, and scientific instruments. As these technologies continue to evolve, what new capabilities might they unlock? I will discuss one possible point of convergence—the ultimate video camera—which is enabled by emerging single-photon image sensors and photon-processing algorithms. We will explore the extreme imaging capabilities of this camera within the broader historical context of high-speed imaging systems, highlighting its potential to capture the physical world in entirely new ways.
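To make the idea of photon-processing concrete, here is a hedged sketch (not taken from the talk) of how a single-photon sensor such as a SPAD array forms an image: each pixel reports one binary photon detection per micro-exposure, and the underlying photon flux is recovered by inverting the Bernoulli/Poisson observation model, flux = -ln(1 - p) / t. All parameters are illustrative.

```python
# Toy single-photon (one-bit) imaging model; parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
EXPOSURE = 1e-4  # seconds per one-bit micro-frame (assumed)

def capture_binary_frames(flux, n_frames):
    """Each pixel detects a photon with P = 1 - exp(-flux * EXPOSURE)."""
    p_detect = 1.0 - np.exp(-flux * EXPOSURE)
    return rng.random((n_frames, *flux.shape)) < p_detect

def estimate_flux(binary_frames):
    """Maximum-likelihood flux from the per-pixel detection rate."""
    p_hat = binary_frames.mean(axis=0)
    p_hat = np.clip(p_hat, 0.0, 1.0 - 1e-6)  # guard log(0) at saturation
    return -np.log(1.0 - p_hat) / EXPOSURE

# Toy scene: a bright square (photons/s per pixel) on a dim background.
flux = np.full((32, 32), 2_000.0)
flux[8:24, 8:24] = 20_000.0

frames = capture_binary_frames(flux, n_frames=5_000)
print("mean relative error:", np.mean(np.abs(estimate_flux(frames) - flux) / flux))
```

Aggregating many such one-bit frames is what gives these sensors their extreme dynamic range and effective frame rates; real systems also handle dead time, dark counts, and motion, which this sketch omits.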

Advancing Visual Understanding and Generation: Recent Research from NTHU Computer Vision Lab


Prof. Shang-Hong Lai

Dept. of Computer Science, National Tsing Hua University, Taiwan
Speaker Biography: Shang-Hong Lai is a Professor in the Department of Computer Science at National Tsing Hua University (NTHU), Taiwan, where he also serves as Associate Dean of the College of Electrical Engineering and Computer Science (EECS). From 2018 to 2022, Dr. Lai was on leave at the Microsoft AI R&D Center in Taiwan, where he worked as a Principal Research Manager leading a science team focused on face-related AI research. Dr. Lai’s research interests span computer vision, image processing, and machine learning. He has authored over 300 publications in leading international journals and conferences in these fields and holds approximately 30 patents related to computer vision and medical imaging technologies. He has served as Area, Theme, or Program Chair for major international conferences such as CVPR, ICCV, ECCV, NeurIPS, ICML, IJCAI, ACCV, and ICPR, and as an Associate Editor for the International Journal of Computer Vision (IJCV), IEEE Transactions on Image Processing (TIP), and Pattern Recognition.
Date & Time: Tuesday, November 18, 10:30-12:00

Venue: Research Bldg. No. 7, Seminar Room 1 (Room 127, 1st floor)

In this talk, I will present recent research from the Computer Vision Laboratory at National Tsing Hua University (NTHU). The NTHU CV Lab focuses on several key areas, including video understanding, face-related analysis, anomaly detection, and medical imaging. I will begin with two of our latest advances in video understanding, HERMES and VADER. HERMES introduces two versatile modules that can be seamlessly integrated into existing video-language models or deployed as a standalone framework for long-form video comprehension, achieving state-of-the-art performance across multiple benchmarks. VADER, on the other hand, is an LLM-driven framework for video anomaly reasoning, which combines keyframe-level object-relation modeling with visual contextual cues to enhance anomaly interpretation. Next, I will discuss one of our recent works in anomaly detection, LFQUIAD, which integrates a quantization-driven autoencoder with a modular Anomaly Generation Module to improve representation learning. Finally, I will briefly present two medical imaging projects leveraging diffusion models: one for generating paired 3D CT image-mask datasets, and the other for synthesizing contrast-enhanced 3D CT volumes from non-contrast scans. Through these examples, I will highlight our lab's ongoing efforts toward building generalizable, interpretable, and efficient computer vision systems that bridge visual understanding and generative modeling.
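The abstract does not detail LFQUIAD's architecture, but the general recipe it builds on, an autoencoder with a quantized latent space scored by reconstruction error, can be sketched as follows. The model, layer sizes, and codebook here are hypothetical stand-ins, not the lab's implementation.

```python
# Hypothetical sketch of reconstruction-based anomaly detection with a
# vector-quantized latent space; NOT the actual LFQUIAD model.
import torch
import torch.nn as nn

class QuantizedAE(nn.Module):
    def __init__(self, dim=64, n_codes=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, dim))
        self.codebook = nn.Embedding(n_codes, dim)  # learned discrete codes
        self.decoder = nn.Sequential(nn.Linear(dim, 28 * 28), nn.Sigmoid())

    def forward(self, x):
        z = self.encoder(x)
        # Snap each latent to its nearest codebook entry (quantization),
        # passing gradients through the non-differentiable step.
        nearest = torch.cdist(z, self.codebook.weight).argmin(dim=1)
        z_q = z + (self.codebook(nearest) - z).detach()  # straight-through
        return self.decoder(z_q).view_as(x)

def anomaly_score(model, x):
    """Per-image reconstruction error; high values suggest anomalies."""
    with torch.no_grad():
        return ((model(x) - x) ** 2).flatten(1).mean(dim=1)

model = QuantizedAE()  # in practice, trained on normal data only
print(anomaly_score(model, torch.rand(8, 1, 28, 28)))
```

Trained only on normal data, such a model reconstructs normal inputs well and anomalous ones poorly, so thresholding the score yields a detector; per the abstract, a separate anomaly generation module could supply synthetic anomalies to sharpen that boundary.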

Vision at Work: Reflections on Real-World Computer Vision from Robots to Video Understanding and Back


Austin Myers

Agility Robotics
Speaker Biography: Austin Myers is a research engineer currently working on embodied perception at Agility Robotics. His career has spanned academia and industry, with roles at Waymo, Google Research, and DeepMind, where he developed vision systems for robot manipulation, self-driving vehicles, and large-scale video understanding. Austin received his PhD from the University of Maryland, focusing on understanding the affordances of object parts through geometric reasoning. His broad research interests lie at the intersection of joint video and language representation learning, large multimodal models, and embodied perception for everyday robots.
Date & Time: Monday, July 7, 13:15-14:45

Venue: Research Bldg. No. 7, Information Lecture Room 2 (1st floor)

Over the past decade, I’ve worked on perception systems spanning everyday robot manipulation, self-driving cars, and large-scale video understanding across academia and industry labs like Waymo, Google Research, DeepMind, and now Agility Robotics. In this talk, I’ll share perspectives on developments in the field and industry, and lessons from deploying computer vision systems in the real world. I’ll tell you why construction cones are harder than they look, what it takes to build vision-language models that understand YouTube videos without human supervision, how to transfer fundamental research to products that touch billions of users, and I’ll revisit how we once tried to enable robots to use tools they had never seen before—and how we've come back full circle with new tools at our disposal in my work at Agility Robotics. Beyond the research itself, I’ll reflect on navigating careers between academia and industry, choosing impactful problems, and the exciting challenges ahead in embodied AI.

For the 2024 talks, click here >>