Analysis of institutional authors

Rozada, S (Corresponding Author); Marques, AG (Author)

April 25, 2024

MATRIX LOW-RANK TRUST REGION POLICY OPTIMIZATION

Published in: 2023 IEEE 9th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), pp. 406-410, 2023-01-01. DOI: 10.1109/CAMSAP58249.2023.10403480

Authors: Rozada, Sergio; Marques, Antonio G

Affiliations

King Juan Carlos Univ, Dept Signal Theory & Communicat - Author

Abstract

Most methods in reinforcement learning use a Policy Gradient (PG) approach to learn a parametric stochastic policy that maps states to actions. The standard approach is to implement such a mapping via a neural network (NN) whose parameters are optimized using stochastic gradient descent. However, PG methods are prone to large policy updates that can render learning inefficient. Trust region algorithms, like Trust Region Policy Optimization (TRPO), constrain the policy update step, ensuring monotonic improvements. This paper introduces low-rank matrix-based models as an efficient alternative for estimating the parameters of TRPO algorithms. By gathering the stochastic policy's parameters into a matrix and applying matrix-completion techniques, we promote and enforce low rank. Our numerical studies demonstrate that low-rank matrix-based policy models effectively reduce both computational and sample complexities compared to NN models, while maintaining comparable aggregated rewards.
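
The idea of gathering a stochastic policy's parameters into a low-rank matrix can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a discrete state and action space, a softmax policy whose logits form a matrix factored as L @ R.T, and the hypothetical helper names init_low_rank_policy, action_probs, and n_params. The TRPO update itself is not shown.

```python
import numpy as np

def init_low_rank_policy(n_states, n_actions, rank, seed=0):
    """Hypothetical helper: random factors of a rank-`rank` logit matrix."""
    rng = np.random.default_rng(seed)
    L = 0.1 * rng.standard_normal((n_states, rank))   # state factors
    R = 0.1 * rng.standard_normal((n_actions, rank))  # action factors
    return L, R

def action_probs(L, R, state):
    """Softmax over one row of the implicit logit matrix L @ R.T."""
    logits = L[state] @ R.T           # shape (n_actions,)
    logits = logits - logits.max()    # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def n_params(L, R):
    """Number of free parameters stored in the two factors."""
    return L.size + R.size

# Assumed toy sizes: 100 states, 20 actions, rank 3.
L, R = init_low_rank_policy(n_states=100, n_actions=20, rank=3)
probs = action_probs(L, R, state=7)

# The factors hold 100*3 + 20*3 = 360 parameters,
# versus 100*20 = 2000 for a full logit matrix.
factored_count = n_params(L, R)
full_count = L.shape[0] * R.shape[0]
```

In this sketch the rank is enforced by construction through the factorization. Whether the paper enforces low rank this way or through another matrix-completion technique is not stated in the abstract.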

Keywords

Approximation; Factorization; Gradient; Gradient approach; Gradient methods; Learning systems; Low-rank matrices; Matrix; Matrix factorization; Matrix factorizations; Optimization; Policy gradient; Policy gradients; Policy optimization; Reinforcement learning; Reinforcement learnings; Stochastic policy; Stochastic systems; TRPO; Trust region; Trust region policy optimization

Quality index

Impact and social visibility

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • The work has been submitted to a journal whose editorial policy allows Open Access publication.

Leadership analysis of institutional authors

The institution's authors hold leadership positions in this work, appearing as first or last author: First Author (Rozada Doval, Sergio) and Last Author (García Marqués, Antonio).

The corresponding author is Rozada Doval, Sergio.