
  • This event has passed.

Hamed Hassani (University of Pennsylvania): “Robustness in the Era of LLMs: Jailbreaking Attacks and Defenses”

September 25 @ 12:00 PM - 1:15 PM

Abstract: 

Despite efforts to align large language models (LLMs) with human intentions, popular LLMs such as ChatGPT, Llama, Claude, and Gemini are susceptible to jailbreaking attacks, wherein an adversary fools a targeted LLM into generating objectionable content. For this reason, interest has grown in improving the robustness of LLMs against such attacks. In this talk, we review the current state of the jailbreaking literature, including new questions about robust generalization, discussions of new black-box attacks on LLMs, defenses against jailbreaking attacks, and a new leaderboard to evaluate the robust generalization of production LLMs.

Biography:
Hamed Hassani is currently an associate professor in the Electrical and Systems Engineering Department, the Computer and Information Systems Department, and the Department of Statistics and Data Science at the University of Pennsylvania. Prior to that, he was a research fellow at the Simons Institute for the Theory of Computing (UC Berkeley), affiliated with the program on Foundations of Machine Learning, and a post-doctoral researcher at the Institute of Machine Learning at ETH Zurich. He received a Ph.D. degree in Computer and Communication Sciences from EPFL, Lausanne. He is the recipient of the 2014 IEEE Information Theory Society Thomas M. Cover Dissertation Award, the 2015 IEEE International Symposium on Information Theory Student Paper Award, a 2017 Simons-Berkeley Fellowship, the 2018 NSF-CRII Research Initiative Award, the 2020 Air Force Office of Scientific Research (AFOSR) Young Investigator Award, the 2020 National Science Foundation (NSF) CAREER Award, the 2020 Intel Rising Star Award, and the 2023 IEEE Communications Society & Information Theory Society Joint Paper Award. He was a Distinguished Lecturer of the IEEE Information Theory Society in 2022-23. Moreover, he was selected as the recipient of the 2023 IEEE Information Theory Society’s James L. Massey Research and Teaching Award for Young Scholars.

Details

Date:
September 25
Time:
12:00 PM - 1:15 PM

Venue

Raisler Lounge (Room 225), Towne Building
220 S 33rd Street
Philadelphia, PA United States