We are a group of faculty, researchers, and students working at the intersection of machine learning and systems.
Our current members span the Computer Science and Engineering Department (CSE) and the Halıcıoğlu Data Science Institute (HDSI) at the University of California, San Diego. Our research covers a broad range of topics, from building next-generation systems for machine learning to developing innovative algorithms.
Congratulations to MLSys group student Hanxian Huang on being selected as a 2024 MLCommons Rising Star. She was one of 41 junior researchers selected from over 170 applicants globally. MLCommons Rising Stars are selected for their excellence in Machine Learning (ML) and Systems research, and they stand out for their current and future contributions and potential.
Numerous domain-specific accelerators (DSAs) have been developed recently to address the growing computational needs of machine learning. The success of these DSAs hinges on effective ML compilers such as Google's XLA, which enhances ML performance on various hardware, supports multiple frameworks, and is being further advanced through collaborative development in OpenXLA.
PyTorch 2 leverages new technologies like TorchDynamo and TorchInductor to significantly enhance training and inference speeds without compromising its ease of use, flexibility, and Pythonic environment. TorchDynamo optimizes unmodified PyTorch code at the Python bytecode level, while TorchInductor translates programs for efficient execution on GPUs and CPUs, maintaining the dynamism inherent in PyTorch and allowing for easy user customization.
As modern LLMs find ubiquitous use, they are being deployed at an unforeseen scale. This has driven a large-scale expansion of GPU datacenters, which is now running into an energy wall worldwide. This talk will focus on the properties of generative LLMs that can be used to make the deployment of these models more power-efficient. The talk will also introduce POLCA and Splitwise, two techniques for reducing the power consumption of LLM serving.