Quizlearn.app
What is the primary role of the reward function in policy gradient algorithms?
To define the goal of the task
To generate training data
To update the policy parameters
To estimate the policy's performance