Abstract
This study applies the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to reinforcement learning tasks. As an evolutionary algorithm, CMA-ES optimises agent behaviour without requiring gradient information, which makes it particularly suitable for complex control problems. The study documents the implementation, training methodology, and performance analysis of a CMA-ES agent in the standardised CartPole-v1 environment.
"Evolution strategies represent a compelling alternative to traditional policy gradient methods, offering robust performance characteristics without backpropagation requirements."
— Journal of Evolutionary Computation, 2023
This investigation contributes to the growing body of literature on sample-efficient evolutionary algorithms in reinforcement learning, examining the convergence properties and stability of CMA-ES in policy optimisation.
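To make the gradient-free optimisation loop concrete, the following is a minimal sketch under stated assumptions rather than the implementation used in this study. It is a simplified evolution strategy in the spirit of CMA-ES: it samples candidate parameters from a Gaussian, ranks them by fitness, and updates the mean and covariance from the best candidates. It omits the evolution paths and step-size control of full CMA-ES. The function `episode_return` is a hypothetical stand-in for a CartPole-v1 rollout, and the names `simple_es`, `popsize`, `elite`, and `learning_rate` are illustrative assumptions.

```python
import numpy as np


def episode_return(params):
    # Hypothetical stand-in for the return of one CartPole-v1 episode.
    # A real agent would run the policy in the environment instead.
    target = np.array([0.5, -1.0, 2.0, 0.0])
    return -float(np.sum((params - target) ** 2))


def simple_es(fitness, dim, popsize=16, elite=4, sigma0=1.0,
              learning_rate=0.3, generations=60, seed=0):
    """Simplified evolution strategy (not full CMA-ES): no evolution
    paths and no step-size adaptation, only a rank-based covariance update."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    cov = sigma0 ** 2 * np.eye(dim)

    # Log-decreasing recombination weights, a common choice in CMA-ES.
    weights = np.log(elite + 0.5) - np.log(np.arange(1, elite + 1))
    weights /= weights.sum()

    for _ in range(generations):
        # Sample a population of candidate parameter vectors.
        population = rng.multivariate_normal(mean, cov, size=popsize)
        scores = np.array([fitness(x) for x in population])

        # Keep the highest-scoring candidates, best first.
        best = population[np.argsort(scores)[::-1][:elite]]

        # Steps are measured from the old mean, so the covariance update
        # also captures the direction in which the mean is moving.
        steps = best - mean
        mean = mean + weights @ steps
        elite_cov = (weights[:, None] * steps).T @ steps
        cov = (1 - learning_rate) * cov + learning_rate * elite_cov
        cov += 1e-12 * np.eye(dim)  # small jitter for numerical stability

    return mean


best_params = simple_es(episode_return, dim=4)
print(best_params, episode_return(best_params))
```

No gradient of `episode_return` is computed at any point: only the ranking of sampled candidates drives the update, which is the property that allows such methods to be applied to non-differentiable episode returns.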