Base Agent

class rlforge.agents.base_agent.BaseAgent

Abstract base class for all RLForge agents.

This class defines the interface that every agent in RLForge must implement. It provides a consistent structure for interacting with environments, handling episodes, and selecting actions. Any agent that inherits from BaseAgent and implements these methods can be used within the RLForge framework.

Notes

  • All methods are abstract and must be implemented by subclasses.

  • The interface is designed to be environment-agnostic, so agents can be applied to both discrete and continuous tasks.
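As a rough illustration of what implementing the interface involves, the sketch below defines a stand-in BaseAgent with the five documented abstract methods and a trivial subclass. The stand-in class is an assumption that mirrors this page rather than the actual RLForge source, and RandomAgent, num_actions, seed, and episode_return are hypothetical names introduced only for the example.

```python
from abc import ABC, abstractmethod
import random


class BaseAgent(ABC):
    """Stand-in mirroring the documented interface (not the RLForge source)."""

    @abstractmethod
    def reset(self): ...

    @abstractmethod
    def start(self, state): ...

    @abstractmethod
    def step(self, reward, state): ...

    @abstractmethod
    def end(self, reward): ...

    @abstractmethod
    def select_action(self, state): ...


class RandomAgent(BaseAgent):
    """Hypothetical agent that acts uniformly at random and only tracks return."""

    def __init__(self, num_actions, seed=None):
        self.num_actions = num_actions
        self.rng = random.Random(seed)
        self.episode_return = 0.0

    def reset(self):
        # Clear per-episode statistics.
        self.episode_return = 0.0

    def start(self, state):
        return self.select_action(state)

    def step(self, reward, state):
        self.episode_return += reward
        return self.select_action(state)

    def end(self, reward):
        self.episode_return += reward

    def select_action(self, state):
        # The state is ignored; any discrete action is equally likely.
        return self.rng.randrange(self.num_actions)
```

Because every method is abstract, instantiating the stand-in BaseAgent directly raises TypeError; only a subclass that overrides all five methods can be created.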

abstract end(reward)

Complete an episode.

This method is called when the environment signals that the episode has terminated. The agent can use the final reward to update its estimates.

Parameters

reward : float

The final reward received at the end of the episode.
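One common way to use the final reward is a terminal value update, in which the target is the reward alone because there is no successor state to bootstrap from. The fragment below is a minimal sketch under that assumption; TerminalUpdateSketch, step_size, q, last_state, and last_action are hypothetical names, not part of RLForge.

```python
from collections import defaultdict


class TerminalUpdateSketch:
    """Hypothetical tabular agent fragment; only the pieces end() needs."""

    def __init__(self, step_size=0.5):
        self.step_size = step_size
        self.q = defaultdict(float)  # (state, action) -> value estimate
        self.last_state = None
        self.last_action = None

    def end(self, reward):
        # No next state exists at termination, so the update target is
        # just the final reward (no bootstrapped next-state value).
        key = (self.last_state, self.last_action)
        self.q[key] += self.step_size * (reward - self.q[key])
```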

abstract reset()

Reset the agent’s internal state.

This method is called between episodes to clear any temporary variables or statistics. It ensures that each episode starts from a clean state, without residual information from previous runs.

abstract select_action(state)

Select an action given the current state.

This method encapsulates the agent’s policy. Depending on the implementation, it may be deterministic (e.g., greedy) or stochastic (e.g., epsilon-greedy, softmax, Gaussian).

Parameters

state : object

The current state observed from the environment.

Returns

action : int or float or numpy.ndarray

The action chosen by the agent.
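For a concrete picture of a stochastic policy, the sketch below shows epsilon-greedy selection over a list of action-value estimates. It is a standalone helper, not RLForge code: the function name epsilon_greedy and the parameters q_values, epsilon, and rng are assumptions made for the example. A select_action implementation could delegate to something like it.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        # Explore: every action index is equally likely.
        return rng.randrange(len(q_values))
    best = max(q_values)
    # Exploit: break ties among equally valued actions at random.
    candidates = [a for a, q in enumerate(q_values) if q == best]
    return rng.choice(candidates)
```

With epsilon set to 0 the helper is purely greedy, which corresponds to the deterministic case mentioned above.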

abstract start(state)

Begin a new episode.

Parameters

state : object

The initial state observed from the environment.

Returns

action : int or float or numpy.ndarray

The first action selected by the agent given the initial state.

abstract step(reward, state)

Take a step in the environment.

This method is called after the agent receives a reward and the next state from the environment. The agent should update its internal estimates and return the next action.

Parameters

reward : float

The reward received from the previous action.

state : object

The new state observed from the environment.

Returns

action : int or float or numpy.ndarray

The next action chosen by the agent.
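The way start, step, and end fit together over one episode can be sketched with a small driver loop. Everything here is illustrative: CountdownEnv, LoggingAgent, and run_episode are hypothetical, the environment's step() return order (reward, state, done) is an assumption, and calling reset() immediately before start() is one plausible reading of "between episodes" rather than documented RLForge behavior.

```python
class CountdownEnv:
    """Hypothetical toy environment: the episode ends after `length` steps."""

    def __init__(self, length=3):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.length
        return 1.0, self.t, done  # reward, next state, terminal flag


class LoggingAgent:
    """Hypothetical agent that records which interface methods were called."""

    def __init__(self):
        self.calls = []

    def reset(self):
        self.calls = []

    def start(self, state):
        self.calls.append("start")
        return 0

    def step(self, reward, state):
        self.calls.append("step")
        return 0

    def end(self, reward):
        self.calls.append("end")


def run_episode(env, agent):
    """Drive one episode and return the sum of rewards."""
    agent.reset()
    action = agent.start(env.reset())
    total = 0.0
    while True:
        reward, state, done = env.step(action)
        total += reward
        if done:
            # Terminal transition: the agent sees only the final reward.
            agent.end(reward)
            return total
        action = agent.step(reward, state)
```

In this sketch the final transition is delivered through end rather than step, so step is called once for every non-terminal transition and end exactly once per episode.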