Paper: On the Optimality of the Probability Ranking Scheme in Storage Applications. P.C. Yue, C.K. Wong. Journal of the ACM.
Conference paper: A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models. Takuma Udagawa, Aashka Trivedi, et al. EMNLP 2023.