EpsilonGreedyExploration.Epsilon Property
Namespace: Accord.MachineLearning
The value determines the amount of exploration driven by the policy. If the value is high, the policy leans toward exploration: it chooses a random action, excluding the current best one. If the value is low, the policy is greedier: it chooses the best action found so far.
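
The behavior described above can be sketched in a few lines. This is a language-agnostic illustration written in Python, not the Accord.NET implementation: the function name `choose_action`, its parameters, and the assumption that epsilon is a probability in [0, 1] are hypothetical and chosen only for this example.

```python
import random


def choose_action(estimates, epsilon, rng=random):
    """Return an action index given a list of estimated action values.

    Hypothetical helper for illustration only; epsilon is assumed to be
    a probability in [0, 1].
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")

    # Greedy choice: the action with the highest estimate so far.
    greedy = max(range(len(estimates)), key=lambda a: estimates[a])

    # With probability epsilon, explore: pick uniformly among the
    # actions other than the greedy one (if any alternative exists).
    if len(estimates) > 1 and rng.random() < epsilon:
        others = [a for a in range(len(estimates)) if a != greedy]
        return rng.choice(others)

    # Otherwise exploit the best action found so far.
    return greedy
```

Under these assumptions, an epsilon of 0 always returns the greedy action, while an epsilon of 1 never returns it (as long as more than one action is available).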