SCIS/ResWORK Seminar by Dr. Tarek Abdelzaher, Dr. Soujanya Poria, Dr. Lin Shao & Dr. Archan Misra

Date of Event: 17 Sep 2025
Talk #1: Edge AI Services and Foundation Models for Internet of Things Applications

Speaker: Dr. Tarek Abdelzaher, Professor, Department of Computer Science, University of Illinois at Urbana-Champaign


This talk explores advances in self-supervised AI and the challenges of applying these methods in IoT contexts, particularly on lower-end distributed embedded devices with multimodal specialized sensors and limited training data. It highlights issues in adapting self-supervised training pipelines for embedded sensing, developing foundation models for IoT to handle multimodal time-series data, incorporating spatial understanding for reconstructing physical environments from distributed sensors, and addressing data scarcity in specialized sensor domains. Initial empirical results are presented on training small foundation models for embedded sensor data.


Talk #2: Generative Planning and Contact Synthesis for General-Purpose Robotic Manipulation

Speaker: Dr. Lin Shao, Assistant Professor, Department of Computer Science at the School of Computing, National University of Singapore

This talk focuses on developing foundation models that generalize across tasks, objects, and robot embodiments to enable robots to operate effectively in unstructured environments. Two recent methods are presented: FLIP, a flow-centric generative planning framework that produces long-horizon task plans from images and natural language instructions, supporting general-purpose manipulation and video generation; and a contact synthesis model, which formulates manipulation as a contact synthesis problem using point cloud data, object properties, target motion, and manipulation masks to output contact points and forces for execution. The talk concludes with future directions for advancing general-purpose robotic intelligence.


Talk #3: 10 Open Challenges Steering the Future of Vision-Language-Action Models

Speaker: Dr. Soujanya Poria, Associate Professor, School of Electrical and Electronic Engineering, Nanyang Technological University

This talk examines the rise of vision-language-action (VLA) models in embodied AI, emphasizing their ability to translate natural language instructions into real-world actions. It outlines ten milestones that define progress and challenges—covering multimodality, reasoning, data, evaluation, generalization across robots, efficiency, whole-body coordination, safety, intelligent agents, and human collaboration. Emerging trends such as spatial understanding, modeling world dynamics, post-training refinements, and synthetic data generation are highlighted as key directions. Together, these advances form a roadmap toward deploying VLA models as trustworthy, widely adopted embodied intelligence with broad societal impact.


Talk #4: Efficient, Embodied AI for Collaborative Human-Machine Tasking

Speaker: Dr. Archan Misra, Vice Provost (Research) and Lee Kong Chian Professor of Computer Science, Singapore Management University

This talk discusses advances in machine intelligence for perception, decision-making, and navigation that enable robots to function as co-workers in diverse environments beyond manufacturing. Current embodied AI models remain too large and complex to execute on resource-constrained platforms. To address this, the talk introduces research directions aimed at reducing the sensing and computational overheads of key embodied AI tasks, including 2D/3D visual grounding of human instructions and robotic task planning in dynamic environments.