Publications

7 results at CLOUD 2025

Routing Strategies for RoCE Networks in AI Clouds
- - Abdul Alim
  - Ali Sydney
  - et al.
- 2025
- CLOUD 2025
Conference paper
A Lossless Compression for AI Models
- - Moshik Hershcovitch
  - Andrew Wood
  - et al.
- 2025
- CLOUD 2025
Conference paper
ClusterLink: Redefining Application Connectivity for the Multi-cloud Era
- - Kfir Toledo
  - Pravein Govindan Kannan
  - et al.
- 2025
- CLOUD 2025
Conference paper
Mind the Memory Gap: Unveiling GPU Bottlenecks in Large-Batch LLM Inference
- - Pol G. Recasens
  - Ferran Agullo
  - et al.
- 2025
- CLOUD 2025
Conference paper
Causal Latency Modelling for Cloud Microservices
- - Christopher Lohse
  - Diego Tsutsumi
  - et al.
- 2025
- CLOUD 2025
Conference paper
Speeding up Model Loading with Fastsafetensors
- - Takeshi Yoshimura
  - Tatsuhiro Chiba
  - et al.
- 2025
- CLOUD 2025
Conference paper
Towards Efficient Key-Value Cache Management for Prefix Prefilling in LLM Inference
- - Yue Zhu
  - Hao Yu
  - et al.
- 2025
- CLOUD 2025
Short paper