Hybrid reinforcement learning with expert state sequences
Xiaoxiao Guo, Shiyu Chang, et al.
AAAI 2019
The least-squares predictor for a random process generated by linear difference equations is known to obey similar linear difference equations. A stability theory is developed for such equations. Conditions under which the infinite covariance matrix of the process, considered as a bounded operator from ℓ₂ to ℓ₂, has a bounded inverse are shown to be both necessary and sufficient for the stability of the optimum predictor. The same conditions also ensure the convergence of an algorithm for recursively factoring the infinite covariance matrix as a product of upper and lower triangular factors. Finally, it is shown that the stability obtained in this fashion is equivalent to uniform asymptotic stability.
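As a rough illustration of the setup described in this abstract, here is a minimal sketch in LaTeX, assuming a scalar autoregressive model; the model form and symbols are illustrative assumptions, not the cited paper's notation.

% Assumed AR(n) process driven by white noise w_k (illustrative only):
\[
  x_k + a_1 x_{k-1} + \dots + a_n x_{k-n} = w_k ,
\]
% its one-step least-squares predictor obeys a difference equation of the same order:
\[
  \hat{x}_{k \mid k-1} = -\sum_{i=1}^{n} a_i \, x_{k-i} ,
\]
% and the stability and invertibility conditions concern the covariance operator
\[
  R = (r_{ij}), \qquad r_{ij} = \mathbb{E}[x_i x_j], \qquad R : \ell_2 \to \ell_2 ,
\]
% factored recursively as R = L D L^{\top}, with L lower triangular and L^{\top} upper triangular.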
Annina Riedhauser, Viacheslav Snigirev, et al.
CLEO 2023
Kenneth L. Clarkson, Elad Hazan, et al.
Journal of the ACM
R. Sebastian, M. Weise, et al.
ECPPM 2022