All Posts

  • Published on
    Speaker: Dr. Jason Ansel, Meta AI
    Preview of PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph
    PyTorch 2 uses new technologies such as TorchDynamo and TorchInductor to significantly speed up training and inference without compromising PyTorch's ease of use, flexibility, or Pythonic environment. TorchDynamo optimizes unmodified PyTorch code at the Python bytecode level, while TorchInductor translates programs for efficient execution on GPUs and CPUs, preserving the dynamism inherent in PyTorch and allowing for easy user customization.
  • Published on
    Speaker: Dr. Esha Choukse, Microsoft
    Preview of Rapid LLM deployments: with great power comes great responsibility
    Modern LLMs now have so many use cases that they are being deployed at an unprecedented scale. This has driven a large-scale expansion of GPU datacenters, which is now running into an energy wall worldwide. This talk will focus on the properties of generative LLMs that can be used to make the deployment of these models more power-efficient. The talk will also introduce POLCA and Splitwise, two techniques for reducing the power consumption of LLM serving.