NSF Workshop on the Science of Safe AI
The NSF Workshop on the Science of Safe AI will be held at the University of Pennsylvania in Philadelphia on February 26, 2025, hosted by Penn Engineering’s ASSET Center for Trustworthy AI. This will be an invitation-only, one-day workshop focused on exploring the future of safe AI. The workshop will also serve as the PI Meeting for the NSF program on Safe Learning-Enabled Systems (SLES). It will discuss a variety of themes related to AI safety studied in the AI/ML community in order to broaden the scope of SLES. Further details about the workshop will be shared soon.
Registration is now closed. If you have questions, please email Maggie Weglos at mweglos@seas.upenn.edu.
Date: Wednesday, February 26, 2025
Time: 8:00 am – 6:00 pm
Location: Singh Center for Nanotechnology, Glandt Forum Room
Agenda
8:00 – 8:30 am: Coffee and light breakfast
8:30 – 8:35 am: Welcome by Rajeev Alur, Workshop Organizer
8:35 – 8:45 am: Welcome by Michael Littman, Division Director, IIS, NSF
8:45 – 9:10 am: Moshe Vardi (Rice University), “Autonomous Systems with Neural Controllers: From Verification to Falsification”
9:10 – 9:35 am: Olga Russakovsky (Princeton University), “Data, Models, Society”
9:35 – 10:00 am: Aaron Roth (University of Pennsylvania), “Task Specific Uncertainty Quantification”
10:00 – 10:30 am: Coffee Break
10:30 am – 12:00 pm: Lightning talks by SLES PIs
Andreas Malikopoulos, “Improving Safety by Synthesizing Interacting Model-based and Model-free Learning Approaches”
Cho-Jui Hsieh, “Formal and Empirical Robustness of Sequential Generative Models”
Claire Tomlin, “Certified Learning of Safety Certificates”
Daniel Brown, “High-Confidence Guarantees for Safe Reward and Policy Learning Under Uncertainty”
Dinesh Jayaraman, “Specification-Guided Safety for Vision-Based Robot Learners”
Dung Hoang Tran, “ProbStar Temporal Logic for Verification of LES’s Temporal Properties”
Han Zhao, “Efficient Model Editing for Safe Information Localization and Stitching”
Hanghang Tong, “NetSafe: Towards a Computational Foundation of Safe Graph Neural Networks”
Hanlin Zhang, “The Impossibility of Strong Watermarking”
Lars Lindemann, “A Neurosymbolic Approach for Safe Multi-Agent Systems”
Madhur Behl, “CRASH – Challenging Reinforcement Learning Based Adversarial Scenarios for Safety Hardening”
Ming Jin, “Antifragility through Test-Time Adaptation”
Momotaz Begum, “Learning Safe Policy from Human Demonstrations to Support Robot-Assisted Aging-in-Place”
Naira Hovakimyan, “DeSimplex: Data Enabled Simplex for Safe Operation of Autonomous Systems”
Nikolai Matni, “Domain Randomization is Sample Efficient for Controlling an Unknown System”
Sandhya Saisubramanian, “No Bad Surprises: Aligning Agent and Human Norms via Specification Refinements”
Sharon Li, “Foundations of Safety-Aware Learning in the Wild”
Tianyi Zhang, “Testing and Debugging Multi-module Autonomous Vehicles in Near-Collision Traffic Scenarios”
Weiming Xiang, “Foundations of Qualitative and Quantitative Safety Assessment of Learning-enabled Systems”
Xian Yu, “Safe Distributional-Reinforcement Learning-Enabled Systems: Theories, Algorithms and Experiments”
Xueru Zhang, “Long-Term Safety for Human-AI Ecosystem”
12:00 – 1:30 pm: Lunch and poster session
Alexander Robey, “Jailbreaking LLM-Controlled Robots”
Andrea Bajcsy, “Generalizing Safety Beyond Collision-Avoidance”
Dung Hoang Tran, “ProbStar Temporal Logic for Verification of LES’s Temporal Properties”
Han Zhao, “Efficient Model Editing for Safe Information Localization and Stitching”
Huan Zhang, “Advancements in Neural Network Verification and Applications in Control, Robotics, Graphics, and More”
Kai Shu, “Combating Misinformation Risks in the Era of Large Language Models”
Lars Lindemann, “A Neurosymbolic Approach for Safe Multi-Agent Systems”
Madhur Behl, “CRASH – Challenging Reinforcement Learning Based Adversarial Scenarios for Safety Hardening”
Mayur Naik, “IRIS: LLM-Assisted Static Analysis for Detecting Security Vulnerabilities”
Ming Jin, “Antifragility through Test-Time Adaptation”
Naira Hovakimyan, “Guaranteed Tubes for Safe Learning across Autonomy Architectures”
Osbert Bastani, “Leveraging Specifications and Large Language Models for Reliable Reward Design”
René Vidal, “Certified Robustness against Sparse Adversarial Perturbations via Data Localization”
Sanghamitra Dutta, “Model Reconstruction Using Counterfactual Explanations: A Perspective from Polytope Theory”
Sayan Mitra, “Formal Visual Reasoning with Abstract Rendering and Perception Contracts”
Shreyas Kousik, “Making Robot Embodied AI Safe and Fast”
Sourya Dey, “Investigating Knowledge Closure of Large Language Models via Token Embeddings”
Taylor Johnson, “Verification of Neural Networks and Neural Network Control Systems”
ThanhVu Nguyen, “NeuralSAT: a DPLL(T) approach to DNN Verification”
Xi Peng, “Orchestrating Model, System, and Hardware for Safe Learning in Autonomous Vehicles”
Ziyu Yao, “Efficient but Vulnerable: Benchmarking and Defending LLMs in Batch Prompting Attacks”
1:30 – 3:00 pm: Four Parallel Working Group Sessions
Working Group A: Defining Safety
Location: Singh 035
Lead: Hadas Kress-Gazit (Cornell)
Questions:
- What types of safety assurance do people need when systems are deployed for everyday use versus critical decision making?
- What are examples of safety properties that can be formalized rigorously?
- What are the opportunities and challenges in the context of existing research?
Working Group B: Design for Safety
Location: Singh 221
Lead: René Vidal (Penn)
Questions:
- What are the current trends in incorporating safety as a design goal, in addition to accuracy, during training?
- What’s missing, and what are new opportunities?
Working Group C: Safety Analysis
Location: Glandt Forum Room
Lead: Corina Pasareanu (CMU and NASA Ames)
Questions:
- What types of formal guarantees are possible using verification/testing/monitoring tools?
- What are the trade-offs between worst-case guarantees and statistical guarantees?
- What are new challenges/opportunities?
Working Group D: Attacks and Defenses
Location: Singh 313
Lead: Greg Durrett (UT Austin)
Questions:
- What are new types of attacks on LLMs, VLMs, and AI agents?
- What are potential defenses?
3:00 – 4:00 pm: Coffee break; poster presentations continue
4:00 – 5:00 pm: 10-minute reports from each working group lead, Q&A, and concluding remarks
5:00 – 6:00 pm: Light reception
Organizer – Rajeev Alur
NSF Program Managers – Anindya Banerjee, David Corman, Pavithra Prabhakar, Jie Yang
Local Organization Point of Contact – Maggie Weglos