Statistical inference using SGD
Tianyang Li, Anastasios Kyrillidis, et al.
AAAI 2018
A rank-r matrix X ∈ R^{m×n} can be written as a product UV^⊤, where U ∈ R^{m×r} and V ∈ R^{n×r}. One can exploit this observation in optimization: e.g., consider minimizing a convex function f(X) over rank-r matrices, where the set of low-rank matrices is modeled via the factorization UV^⊤. Though this parameterization reduces the number of variables and is more computationally efficient (of particular interest is the case r ≪ min{m, n}), it comes at a cost: f(UV^⊤) becomes nonconvex in U and V. We study this parameterization for generic convex objectives f and focus on first-order, gradient descent algorithms. We propose the bifactored gradient descent (BFGD) algorithm, an efficient first-order method that operates directly on the U, V factors. We show that when f is (restricted) smooth, BFGD has local sublinear convergence; when f is both (restricted) smooth and (restricted) strongly convex, it has local linear convergence. For several applications, we provide simple and efficient initialization schemes that yield initial conditions good enough for the above convergence results to hold globally. Extensive experimental results support our claim that BFGD is an efficient and accurate nonconvex method compared to state-of-the-art approaches.
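The factored-gradient idea in the abstract can be sketched in a few lines: run gradient descent directly on (U, V), with gradients ∇_U f = ∇f(UV^⊤) V and ∇_V f = ∇f(UV^⊤)^⊤ U by the chain rule. The toy objective below, f(X) = ½‖X − M‖_F², the rank-r target M, the random initialization, and the step size η are illustrative assumptions; the paper handles generic convex f and uses problem-specific initializations.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 20, 15, 3

# Build a rank-r target M with known singular values (3, 2, 1), an
# illustrative choice that makes the step size easy to reason about.
Q1, _ = np.linalg.qr(rng.standard_normal((m, r)))
Q2, _ = np.linalg.qr(rng.standard_normal((n, r)))
M = Q1 @ np.diag([3.0, 2.0, 1.0]) @ Q2.T

# Small random initialization of the factors (hypothetical; the paper
# proposes tailored initialization schemes for its convergence results).
U = 0.1 * rng.standard_normal((m, r))
V = 0.1 * rng.standard_normal((n, r))

eta = 0.1  # step size, small relative to the largest singular value of M
for _ in range(1500):
    G = U @ V.T - M                      # ∇f(X) for f(X) = 0.5 * ||X - M||_F^2
    U, V = U - eta * (G @ V), V - eta * (G.T @ U)   # chain-rule updates

err = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
print(f"relative error: {err:.2e}")
```

Note that only U and V (mr + nr numbers) are ever stored and updated, which is the computational advantage the abstract points to when r ≪ min{m, n}.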
Michail Vlachos, Francesco Fusco, et al.
CIKM 2014
Anastasios Kyrillidis, Amir Kalev, et al.
npj Quantum Information