Towards Pareto Optimal Throughput in Small Language Model Serving. Pol G. Recasens, Yue Zhu, et al. 2024. EuroMLSys 2024.
Characterizing Training Performance and Energy for Foundation Models and Image Classifiers on Multi-Instance GPUs. Connor Espenshade, Rachel Peng, et al. 2024. EuroMLSys 2024.
Reducing Datacenter Compute Carbon Footprint by Harnessing the Power of Specialization: Principles, Metrics, Challenges and Opportunities. Tamar Eilam, Pradip Bose, et al. 2024. IEEE Trans Semicond Manuf.