This event has passed.

Eric Wong (University of Pennsylvania): “Provable vs Impossible Trust: Reasoning, Steering, and Safety”

Start: 2025-09-03T12:00:00-04:00
End: 2025-09-03T13:15:00-04:00
Location: Amy Gutmann Hall, Room 414


Abstract:

In this talk, I will discuss a collection of highlights from our recent work in trustworthy AI:
(1) Certifying reasoning explanations with reliability guarantees and aligning them with expert knowledge,
(2) Simple yet effective steering inspired by theoretical rule-following mechanisms for transformers, and
(3) The impossibility of monitoring stateless attackers and what safety defenses should be doing.

Biography:

Eric Wong is an Assistant Professor in the Department of Computer and Information Science at the University of Pennsylvania. He researches the foundations of robust systems, building on elements of machine learning and optimization to debug, understand, and develop reliable systems.

Seminar Recording: https://drive.google.com/file/d/1FNeVVPXb_vZiNWVexFTgTFoVKBM_QnqQ/view?usp=sharing

Details

Date: September 3
Time: 12:00 PM - 1:15 PM

Venue

Amy Gutmann Hall, Room 414
3333 Chestnut Street
Philadelphia, PA 19104, United States