Shortcut Maze

class rlforge.environments.shortcut_maze.ShortcutMaze(shortcut_episodes=20, render_mode=None)

Grid-world environment with a dynamic shortcut for testing agent adaptability.

The Shortcut Maze is a 6x9 grid where the agent starts at the bottom and must reach the terminal goal in the top-right corner. Initially, a wall blocks the direct path, forcing the agent to take a longer route. After a fixed number of episodes, a shortcut opens, giving a shorter path to the goal that the agent can exploit only if it adapts its policy.

Features

Discrete state space: each cell in the grid corresponds to a unique state.
Discrete action space: four possible moves (UP, RIGHT, DOWN, LEFT).
Obstacles: a row of blocked cells initially prevents direct access.
Dynamic environment: after shortcut_episodes episodes, one obstacle is removed, creating a shortcut.
Terminal state: reaching cell (0, 8) ends the episode with reward 1.
Rewards: all other transitions yield reward 0.
Compatible with the Gymnasium API.
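
The state and action encoding listed above can be sketched as follows. This is a minimal illustration, not the rlforge implementation: the row-major state indexing, the helper names (to_state, to_cell, move), and the rule that a blocked or out-of-grid move leaves the agent in place are all assumptions. Only the grid size, the goal cell (0, 8), and the action numbering come from this page.

```python
N_ROWS, N_COLS = 6, 9                      # grid size stated in the docs
UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3         # action indices stated in the docs
DELTAS = {UP: (-1, 0), RIGHT: (0, 1), DOWN: (1, 0), LEFT: (0, -1)}


def to_state(row, col):
    """Map a (row, col) cell to a state index (row-major order is assumed)."""
    return row * N_COLS + col


def to_cell(state):
    """Inverse of to_state: recover (row, col) from a state index."""
    return divmod(state, N_COLS)


def move(state, action, blocked=frozenset()):
    """Deterministic move; walls and grid edges leave the state unchanged (assumed)."""
    row, col = to_cell(state)
    d_row, d_col = DELTAS[action]
    new_row, new_col = row + d_row, col + d_col
    inside = 0 <= new_row < N_ROWS and 0 <= new_col < N_COLS
    if not inside or (new_row, new_col) in blocked:
        return state
    return to_state(new_row, new_col)


GOAL = to_state(0, 8)                      # top-right corner
```

Under these assumptions the 54 cells map to indices 0 through 53, and the goal cell (0, 8) has index 8.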

Notes

Transitions are deterministic: every transition occurs with probability 1.0.
This environment is useful for studying how agents adapt to non-stationary dynamics and benefit from planning or exploration.

reset(*, seed: int | None = None, options: dict | None = None)

Reset the environment to its initial state.

Parameters

seed (int, optional): Random seed for reproducibility.
options (dict, optional): Additional options (unused).

Returns

observation (int): The starting state index.
info (dict): Additional information, including the probability of the starting state.

step(a)

Execute one step in the environment.

The transition model depends on whether the shortcut has opened. Until shortcut_episodes episodes have elapsed, transitions follow P1, the model with the wall intact. Afterward, transitions follow P2, the model with the shortcut open.
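
One way to picture the two models is as a pair of lookup tables built from the same grid with different sets of blocked cells. The sketch below is illustrative only: the wall position (row 3, columns 1 to 8), the cell that opens (3, 8), the helper names build_model and active_model, and the exact switching boundary (episode count reaching shortcut_episodes) are assumptions, not details confirmed by this page.

```python
N_ROWS, N_COLS = 6, 9
DELTAS = [(-1, 0), (0, 1), (1, 0), (0, -1)]   # UP, RIGHT, DOWN, LEFT
GOAL = (0, 8)


def build_model(blocked):
    """Return {(state, action): (next_state, reward, terminated)} for one layout."""
    model = {}
    for row in range(N_ROWS):
        for col in range(N_COLS):
            if (row, col) in blocked:
                continue                        # the agent can never occupy a wall cell
            for action, (d_row, d_col) in enumerate(DELTAS):
                n_row, n_col = row + d_row, col + d_col
                inside = 0 <= n_row < N_ROWS and 0 <= n_col < N_COLS
                if not inside or (n_row, n_col) in blocked:
                    n_row, n_col = row, col     # bump: stay in place (assumed)
                done = (n_row, n_col) == GOAL
                key = (row * N_COLS + col, action)
                model[key] = (n_row * N_COLS + n_col, 1.0 if done else 0.0, done)
    return model


WALL = {(3, col) for col in range(1, 9)}       # assumed wall layout
P1 = build_model(WALL)                         # wall intact
P2 = build_model(WALL - {(3, 8)})              # one cell removed (assumed position)


def active_model(episodes_completed, shortcut_episodes=20):
    """Pick the model in force; the boundary condition here is an assumption."""
    return P1 if episodes_completed < shortcut_episodes else P2
```

With this layout, moving UP from cell (4, 8) leaves the agent in place under P1 but advances it to (3, 8) under P2, which is the only difference an agent can observe when the shortcut opens.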

Parameters

a (int): Action index (0=UP, 1=RIGHT, 2=DOWN, 3=LEFT).

Returns

observation (int): The new state index.
reward (float): Reward obtained from the transition.
terminated (bool): Whether the episode has ended.
truncated (bool): Always False (no time limit).
info (dict): Additional information, including the transition probability.
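
The reset and step signatures above suggest an interaction loop along the following lines. Because truncated is always False, the caller has to impose its own step cap. The sketch is self-contained and does not import rlforge: CorridorStandIn is a hypothetical one-row stand-in that only mimics the documented return shapes, run_episode is an illustrative helper, and the info key "prob" is an assumed name. With rlforge installed, an instance of ShortcutMaze could presumably be passed to run_episode in its place.

```python
def run_episode(env, policy, max_steps=10_000, seed=None):
    """Roll out one episode and return (total_reward, steps_taken)."""
    obs, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):                 # manual cap: env never truncates
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += reward
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps


class CorridorStandIn:
    """Hypothetical 1x9 corridor exposing the same reset/step shapes."""

    def __init__(self):
        self.state = 0

    def reset(self, *, seed=None, options=None):
        self.state = 0
        return self.state, {"prob": 1.0}       # info key name is an assumption

    def step(self, a):
        if a == 1:                             # RIGHT
            self.state = min(self.state + 1, 8)
        elif a == 3:                           # LEFT
            self.state = max(self.state - 1, 0)
        terminated = self.state == 8           # goal at the right end
        reward = 1.0 if terminated else 0.0
        return self.state, reward, terminated, False, {"prob": 1.0}


always_right = lambda obs: 1
always_left = lambda obs: 3
```

A policy that always moves right reaches the stand-in's goal in 8 steps with total reward 1.0, while one that always moves left never terminates and is stopped only by max_steps.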