  • This event has passed.

Atlas Wang (University of Texas at Austin): “Scaling Your Large Language Models on a Budget”

January 17 @ 12:00 PM - 1:15 PM

Presentation Abstract: 

As the sizes of Large Language Models (LLMs) continue to grow exponentially, it becomes imperative to explore novel computing paradigms that address the dual challenge of scaling these models while respecting constraints on compute and data resources. This presentation will delve into several strategies aimed at alleviating this dilemma: (1) refraining from training models entirely from scratch, instead using readily available pre-trained models to optimize the training starting point of a new, larger model; (2) leveraging this concept of progressive initialization to enhance compute and data efficiency during the neural scaling process; (3) integrating hardness-aware data sampling and more memory-efficient optimizers (work in progress). The talk will conclude with a few informal thoughts and reflections.

Speaker Bio:

Atlas Wang is a tenured Associate Professor and holds the Temple Foundation Endowed Faculty Fellowship #7 in the Chandra Family Department of Electrical and Computer Engineering at The University of Texas at Austin. He is also a faculty member of UT Computer Science and the Oden Institute. In a part-time role, he also serves as the Director of AI Research & Technology for Picsart, where he leads the development of cutting-edge, GenAI-powered tools for creative visual editing. Prof. Wang has broad research interests spanning the theory and applications of machine learning (ML). At present, his core research mission is to leverage, understand, and expand the role of low dimensionality in machine learning and optimization, with impact across topics such as efficient scaling, training, and inference of large language models (LLMs); robustness and trustworthiness; learning to optimize (L2O); and generative vision. Prof. Wang has received many research awards and is fortunate to work with a sizable group of accomplished students (https://vita-group.github.io/).

 

Details

Date: January 17
Time: 12:00 PM - 1:15 PM

Venue

Raisler Lounge (Room 225), Towne Building
220 S 33rd Street
Philadelphia, PA United States