Shortcut Maze

class rlforge.environments.shortcut_maze.ShortcutMaze(shortcut_episodes=20, render_mode=None)

Grid-world environment with a dynamic shortcut for testing agent adaptability.

The Shortcut Maze is a 6x9 grid where the agent starts at the bottom and must reach the terminal goal at the top-right corner. Initially, a wall blocks the direct path, forcing the agent to take a longer route. After a fixed number of episodes, a shortcut opens, so an agent that adapts its policy can reach the goal in fewer steps.

Features

  • Discrete state space: each cell in the grid corresponds to a unique state.

  • Discrete action space: four possible moves (UP, RIGHT, DOWN, LEFT).

  • Obstacles: a row of blocked cells initially prevents direct access.

  • Dynamic environment: after shortcut_episodes episodes, one obstacle is removed, creating a shortcut.

  • Terminal state: reaching cell (0, 8) ends the episode with reward 1.

  • All other transitions yield reward 0.

  • Compatible with the Gymnasium API.

Notes

  • Transition probabilities are deterministic (always 1.0).

  • This environment is useful for studying how agents adapt to non-stationary dynamics and benefit from planning or exploration.
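The grid dynamics described above can be sketched as a small deterministic step function. This is an illustration under stated assumptions, not the library's implementation: the row-major state indexing, the `walls` argument, and the rule that a blocked or out-of-bounds move leaves the agent in place are all hypothetical. Only the grid size, the goal cell, the rewards, and the action ordering come from this page.

```python
N_ROWS, N_COLS = 6, 9
GOAL = (0, 8)

# Action index -> (row delta, column delta), using the ordering documented for step().
MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}


def to_state(row, col):
    """Map a cell to a state index (row-major order is an assumption)."""
    return row * N_COLS + col


def to_cell(state):
    """Inverse of to_state: state index -> (row, col)."""
    return divmod(state, N_COLS)


def move(state, action, walls):
    """Deterministic transition.

    Assumption: a move into a wall or off the grid keeps the agent where it is.
    Returns (next_state, reward, terminated).
    """
    row, col = to_cell(state)
    d_row, d_col = MOVES[action]
    new_row, new_col = row + d_row, col + d_col
    in_bounds = 0 <= new_row < N_ROWS and 0 <= new_col < N_COLS
    if not in_bounds or (new_row, new_col) in walls:
        new_row, new_col = row, col
    terminated = (new_row, new_col) == GOAL
    reward = 1.0 if terminated else 0.0
    return to_state(new_row, new_col), reward, terminated
```

Because every transition has probability 1.0, the same `(state, action)` pair always yields the same result for a given wall layout, which is why a plain function suffices here.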

reset(*, seed: int | None = None, options: dict | None = None)

Reset the environment to its initial state.

Parameters

seed : int, optional

Random seed for reproducibility.

options : dict, optional

Additional options (unused).

Returns

observation : int

The starting state index.

info : dict

Additional information, including the probability of the starting state.

step(a)

Execute one step in the environment.

The transition model depends on whether the shortcut has opened. Until shortcut_episodes episodes have elapsed, transitions follow P1 (wall intact). Afterward, transitions follow P2 (shortcut open).
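The switch between the two models can be sketched as an episode counter that selects a wall layout. Everything named here is hypothetical: the `ModelSwitch` class, the `build_wall` helper, the wall position (row 3, columns 1 to 8) and the cell that opens, (3, 8). The page does not say whether the counter advances on reset or on termination, or whether the comparison is strict. This sketch advances it explicitly and uses a strict `<`.

```python
def build_wall(open_shortcut):
    """Hypothetical wall layout: row 3, columns 1-8; the shortcut removes (3, 8)."""
    wall = {(3, col) for col in range(1, 9)}
    if open_shortcut:
        wall.discard((3, 8))
    return wall


class ModelSwitch:
    """Pick the P1 or P2 wall layout from the number of completed episodes."""

    def __init__(self, shortcut_episodes=20):
        self.shortcut_episodes = shortcut_episodes
        self.episodes_completed = 0
        self.walls_p1 = build_wall(open_shortcut=False)  # stands in for P1
        self.walls_p2 = build_wall(open_shortcut=True)   # stands in for P2

    def end_episode(self):
        """Advance the counter (assumed to happen once per finished episode)."""
        self.episodes_completed += 1

    def active_walls(self):
        """Return the P1 layout before the threshold, the P2 layout from then on."""
        if self.episodes_completed < self.shortcut_episodes:
            return self.walls_p1
        return self.walls_p2
```

Precomputing both layouts keeps the per-step cost to a single comparison, which mirrors the idea of holding two fixed transition models and choosing between them.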

Parameters

a : int

Action index (0=UP, 1=RIGHT, 2=DOWN, 3=LEFT).

Returns

observation : int

The new state index.

reward : float

Reward obtained from the transition.

terminated : bool

Whether the episode has ended.

truncated : bool

Always False (no time limit).

info : dict

Additional information, including the transition probability.
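A typical interaction loop built on the reset and step signatures above might look like the following sketch. Since truncated is always False, the loop applies its own step cap so that a poor policy cannot run forever. The `run_episode` helper, the `CorridorStub` stand-in environment, and the `"prob"` key in the info dict are hypothetical and exist only so the example runs without rlforge installed.

```python
def run_episode(env, policy, max_steps=10_000, seed=None):
    """Roll out one episode against any env with Gymnasium-style reset/step.

    Returns (total_reward, steps_taken). The cap on steps is enforced by the
    caller because the environment itself never truncates.
    """
    observation, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    for steps in range(1, max_steps + 1):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        if terminated or truncated:
            break
    return total_reward, steps


class CorridorStub:
    """Hypothetical stand-in: a 1x4 corridor with the goal at state 3."""

    def reset(self, *, seed=None, options=None):
        self.state = 0
        return self.state, {"prob": 1.0}  # key name is an assumption

    def step(self, a):
        if a == 1:  # RIGHT
            self.state = min(self.state + 1, 3)
        elif a == 3:  # LEFT
            self.state = max(self.state - 1, 0)
        terminated = self.state == 3
        reward = 1.0 if terminated else 0.0
        return self.state, reward, terminated, False, {"prob": 1.0}


# Always move RIGHT; the stub reaches its goal on the third step.
total_reward, steps = run_episode(CorridorStub(), policy=lambda obs: 1)
```

With rlforge available, an instance of `rlforge.environments.shortcut_maze.ShortcutMaze(shortcut_episodes=20)` could presumably be passed in place of the stub, since it exposes the same reset and step signatures.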