vllm-triton-backend: How to get state-of-the-art performance on NVIDIA and AMD with just triton. Burkhard Ringlein, Thomas Parnell, et al. 2025. PyTorch Conference 2025.
Automated Data Management and Learning-based Scheduling for Ray-based Hybrid HPC-Cloud Systems. Tingkai Liu, Huili Tao, et al. 2024. Euro-PAR 2024.