Publications

6 results for Sara Kokkila Schumacher

vllm-triton-backend: How to get state-of-the-art performance on NVIDIA and AMD with just triton
- - Burkhard Ringlein
  - Thomas Parnell
  - et al.
- 2025
- PyTorch Conference 2025
Talk
Triton in Action: Real-World Optimizations for Mamba2 and vLLM
- - Jamie Yang
  - Sara Kokkila Schumacher
  - et al.
- 2025
- Triton Developer Conference 2025
Poster
The Anatomy of a Triton Attention Backend
- - Burkhard Ringlein
  - Jan van Lunteren
  - et al.
- 2025
- Triton Developer Conference 2025
Poster
Lowering the Barrier: A Science Gateway for Scalable Machine Learning
- - Vismayak Mohanarajan
  - Luigi Marini
  - et al.
- 2025
- eScience 2025
Short paper
Efficient and Cost-Effective HPC on the Cloud
- - Aditya Bhosale
  - Laxmikant Kale
  - et al.
- 2025
- FlexScience 2025
Workshop paper
Automated Data Management and Learning-based Scheduling for Ray-based Hybrid HPC-Cloud Systems
- - Tingkai Liu
  - Huili Tao
  - et al.
- 2024
- Euro-Par 2024
Conference paper