OpenXLA: Compiling Machine Learning for Peak Performance

Authors
  • MLSys @UCSD
  • Host: Hao Zhang

This week, our MLSys seminar is pleased to present a talk by Dr. Jinliang Wei, scheduled for Thursday (5/9), 5:00 to 6:30 pm. We welcome all interested students and faculty to attend the talk on Zoom: https://ucsd.zoom.us/j/8430869005.

Talk title: OpenXLA: Compiling Machine Learning for Peak Performance

Talk Abstract: Many domain-specific accelerators (DSAs) have been developed in the past couple of years to meet ML's rapidly increasing computational demands, and a good compiler is essential to the success of DSAs. XLA is a compiler developed at Google and open-sourced as OpenXLA. XLA is widely used within Google and significantly boosts the performance of many ML applications running on TPUs. It supports various ML frameworks, such as TensorFlow, JAX, and PyTorch, as well as many hardware backends besides TPUs, such as GPUs and CPUs. OpenXLA is developed collaboratively by Google and many other leading ML hardware and software organizations, and welcomes external collaboration and contributions. This talk provides a gentle introduction to XLA. We will discuss some of XLA's key optimizations, such as fusion and layout assignment, and techniques to support distributed training and serving, such as SPMD and collective-matmul.
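As a small taste of what the talk covers (not an example from the speaker): JAX programs are compiled with XLA via `jax.jit`, and fusion is one of the key optimizations the abstract mentions. The sketch below uses a chain of elementwise ops followed by a reduction, a typical fusion candidate that XLA can compile into fewer kernels than eager execution would launch. The function name and inputs are illustrative, not from the talk.

```python
# Minimal sketch of XLA compilation through JAX's jit.
# XLA can fuse the multiply, add, square, and sum into fewer kernels.
import jax
import jax.numpy as jnp

def scaled_residual_norm(x, y):
    # Elementwise ops feeding a reduction: a classic fusion candidate.
    return jnp.sum((2.0 * x + y) ** 2)

# jax.jit traces the function once and hands the trace to XLA to compile.
fast_fn = jax.jit(scaled_residual_norm)

x = jnp.ones(1024)
y = jnp.zeros(1024)
result = fast_fn(x, y)  # same value as the uncompiled call
```

The compiled and uncompiled versions compute the same value; the difference is that XLA optimizes the whole traced computation (fusion, layout assignment, and so on) before execution.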


Bio: Jinliang is a software engineer at Google, working on ML performance with a special focus on TPUs and scalability. Prior to Google, Jinliang received his Ph.D. from CMU, where his thesis was on scalable ML systems.