SingleAgentEnvRunner API#
rllib.env.single_agent_env_runner.SingleAgentEnvRunner#
- class ray.rllib.env.single_agent_env_runner.SingleAgentEnvRunner(*, config: AlgorithmConfig, **kwargs)[source]#
The generic environment runner for the single agent case.
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
- __init__(*, config: AlgorithmConfig, **kwargs)[source]#
Initializes a SingleAgentEnvRunner instance.
- Parameters:
config – An
AlgorithmConfig
object containing all settings needed to build thisEnvRunner
class.
- sample(*, num_timesteps: int = None, num_episodes: int = None, explore: bool = None, random_actions: bool = False, force_reset: bool = False) List[SingleAgentEpisode] [source]#
Runs and returns a sample (n timesteps or m episodes) on the env(s).
- Parameters:
num_timesteps – The number of timesteps to sample during this call. Note that only one of
num_timetseps
ornum_episodes
may be provided.num_episodes – The number of episodes to sample during this call. Note that only one of
num_timetseps
ornum_episodes
may be provided.explore – If True, will use the RLModule’s
forward_exploration()
method to compute actions. If False, will use the RLModule’sforward_inference()
method. If None (default), will use theexplore
boolean setting fromself.config
passed into this EnvRunner’s constructor. You can change this setting in your config viaconfig.env_runners(explore=True|False)
.random_actions – If True, actions will be sampled randomly (from the action space of the environment). If False (default), actions or action distribution parameters are computed by the RLModule.
force_reset – Whether to force-reset all (vector) environments before sampling. Useful if you would like to collect a clean slate of new episodes via this call. Note that when sampling n episodes (
num_episodes != None
), this is fixed to True.
- Returns:
A list of
SingleAgentEpisode
instances, carrying the sampled data.
- get_metrics() Dict [source]#
Returns metrics (in any form) of the thus far collected, completed episodes.
- Returns:
Metrics of any form.