Shortcut Maze
- class rlforge.environments.shortcut_maze.ShortcutMaze(shortcut_episodes=20, render_mode=None)
Grid-world environment with a dynamic shortcut for testing agent adaptability.
The Shortcut Maze is a 6x9 grid where the agent starts at the bottom and must reach the terminal goal at the top-right corner. Initially, a wall blocks the direct path, forcing the agent to take a longer route. After a fixed number of episodes, a shortcut opens, giving a shorter route to the goal that the agent can exploit only if it adapts its policy.
Features
Discrete state space: each cell in the grid corresponds to a unique state.
Discrete action space: four possible moves (UP, RIGHT, DOWN, LEFT).
Obstacles: a row of blocked cells initially prevents direct access.
Dynamic environment: after shortcut_episodes episodes, one obstacle is removed, creating a shortcut.
Terminal state: reaching cell (0, 8) ends the episode with reward 1.
All other transitions yield reward 0.
Compatible with the Gymnasium API.
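The state and action encodings listed above can be illustrated with a small sketch. The row-major state numbering, the row/column deltas per action, and the helper names (cell_to_state, state_to_cell, move) are assumptions made for illustration; the source does not state how the environment numbers its cells internally.

```python
N_ROWS, N_COLS = 6, 9  # grid size stated in the description

# Action indices follow the documented order (0=UP, 1=RIGHT, 2=DOWN, 3=LEFT).
# The deltas are an assumption: row 0 is taken to be the top row, so UP
# decreases the row index.
ACTION_DELTAS = {
    0: (-1, 0),  # UP
    1: (0, 1),   # RIGHT
    2: (1, 0),   # DOWN
    3: (0, -1),  # LEFT
}


def cell_to_state(row: int, col: int) -> int:
    """Map a (row, col) cell to a state index, assuming row-major order."""
    return row * N_COLS + col


def state_to_cell(state: int) -> tuple[int, int]:
    """Inverse of cell_to_state under the same row-major assumption."""
    return divmod(state, N_COLS)


def move(cell, action, blocked=frozenset()):
    """Deterministic move; stay in place on hitting a border or an obstacle."""
    row, col = cell
    d_row, d_col = ACTION_DELTAS[action]
    nxt = (row + d_row, col + d_col)
    in_bounds = 0 <= nxt[0] < N_ROWS and 0 <= nxt[1] < N_COLS
    return nxt if in_bounds and nxt not in blocked else cell
```

Under this assumed numbering, the terminal cell (0, 8) would be state 8 and the bottom-right cell (5, 8) would be state 53.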
Notes
Transition probabilities are deterministic (always 1.0).
This environment is useful for studying how agents adapt to non-stationary dynamics and benefit from planning or exploration.
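To make the effect of the shortcut concrete, the following self-contained sketch compares shortest-path lengths before and after the wall opens. It does not use the library. The start cell (5, 3), the wall along row 3, the position of the removed obstacle, and the convention that the shortcut is active from 0-based episode index shortcut_episodes onward are all assumptions borrowed from the common textbook version of this maze; the actual layout in ShortcutMaze may differ.

```python
from collections import deque

N_ROWS, N_COLS = 6, 9
GOAL = (0, 8)    # documented terminal cell
START = (5, 3)   # assumed start cell on the bottom row
WALL_ROW = 3     # assumed wall position


def blocked_cells(episode: int, shortcut_episodes: int = 20) -> frozenset:
    """Return the obstacle cells in effect for a 0-based episode index."""
    # Hypothetical layout: wall across columns 1..8, leaving a gap at column 0.
    wall = {(WALL_ROW, col) for col in range(1, N_COLS)}
    if episode >= shortcut_episodes:
        # Assumed shortcut: the rightmost wall cell is removed.
        wall.discard((WALL_ROW, N_COLS - 1))
    return frozenset(wall)


def shortest_path_length(blocked) -> int:
    """Breadth-first search from START to GOAL over 4-connected free cells."""
    queue = deque([(START, 0)])
    seen = {START}
    while queue:
        (row, col), dist = queue.popleft()
        if (row, col) == GOAL:
            return dist
        for d_row, d_col in ((-1, 0), (0, 1), (1, 0), (0, -1)):
            nxt = (row + d_row, col + d_col)
            if (0 <= nxt[0] < N_ROWS and 0 <= nxt[1] < N_COLS
                    and nxt not in blocked and nxt not in seen):
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("goal unreachable")


before = shortest_path_length(blocked_cells(episode=0))
after = shortest_path_length(blocked_cells(episode=20))
print(before, after)  # 16 10 under the assumed layout
```

With this assumed layout the optimal route shrinks from 16 to 10 steps once the shortcut opens. An agent that has stopped exploring keeps following the 16-step route, which is the adaptation problem the environment is meant to expose.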
- reset(*, seed: int | None = None, options: dict | None = None)
Reset the environment to its initial state.
Parameters
- seed : int, optional
Random seed for reproducibility.
- options : dict, optional
Additional options (unused).
Returns
- observation : int
The starting state index.
- info : dict
Additional information, including the probability of the starting state.
- step(a)
Execute one step in the environment.
The transition model depends on whether the shortcut has opened. Before shortcut_episodes episodes, transitions follow P1. Afterward, transitions follow P2.
Parameters
- a : int
Action index (0=UP, 1=RIGHT, 2=DOWN, 3=LEFT).
Returns
- observation : int
The new state index.
- reward : float
Reward obtained from the transition.
- terminated : bool
Whether the episode has ended.
- truncated : bool
Always False (no time limit).
- info : dict
Additional information, including the transition probability.
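Because truncated is always False, an agent that never reaches the goal would loop forever unless the caller caps the episode length. The sketch below shows one way to drive the documented reset/step signatures with such a cap. CorridorStub, run_episode, and max_steps are hypothetical names introduced here; the stub is a stand-in with the same call signatures so the example runs without the library, and it is not the ShortcutMaze itself.

```python
import random


class CorridorStub:
    """Hypothetical stand-in exposing the same reset/step signatures.

    A 1-D corridor of five states; state 4 is terminal with reward 1.
    It exists only to make the loop below runnable on its own.
    """

    def reset(self, *, seed=None, options=None):
        self.state = 0
        return self.state, {}

    def step(self, a):
        # Action 1 (RIGHT) advances; every other action stays in place.
        if a == 1:
            self.state += 1
        terminated = self.state == 4
        reward = 1.0 if terminated else 0.0
        return self.state, reward, terminated, False, {}


def run_episode(env, policy, max_steps=1000, seed=None):
    """Run one episode and return (total_reward, steps_taken).

    The caller-side max_steps cap is needed because the environment
    itself never sets truncated to True.
    """
    observation, info = env.reset(seed=seed)
    total_reward = 0.0
    for steps in range(1, max_steps + 1):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        if terminated or truncated:
            return total_reward, steps
    return total_reward, max_steps


rng = random.Random(0)
random_policy = lambda obs: rng.randrange(4)  # uniform over the four actions
print(run_episode(CorridorStub(), random_policy))
```

Passing a ShortcutMaze instance in place of CorridorStub should work the same way, assuming the class follows the signatures documented above. Whether the shortcut counter advances on reset or on termination is not stated in the source, so a capped episode that never terminates may or may not count toward shortcut_episodes.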