Trajectory Tracking

class rlforge.environments.trajectory_tracking.TrajectoryTracking(x_range=(-2, 2), y_range=(-2, 2), initial_state=(0, 0, 0), trajectory=[(1, 1)], d_min=0.05, obstacles=[], dt=0.01)

Mobile robot environment for trajectory tracking tasks.

The TrajectoryTracking environment models a robot navigating in a bounded 2D workspace. The agent’s objective is to follow a sequence of waypoints (trajectory) while avoiding obstacles and staying within the workspace limits.

Features

  • State space: three-dimensional vector [x, y, theta].
      * x, y: robot position in the plane.
      * theta: robot orientation (wrapped to [-π, π]).

  • Action space: discrete set of velocity commands.
      * Forward motion.
      * Rotate counterclockwise.
      * Rotate clockwise.

  • Reward:
      * Negative distance and heading error to the current waypoint.
      * Large penalty if leaving the workspace.
      * Penalty if colliding with an obstacle.
      * Positive reward when reaching waypoints (scaled by waypoint index).

  • Terminal condition: reaching the final waypoint in the trajectory.

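  • Compatible with the Gymnasium API, with one difference: reset() returns the same 5-element tuple as step() rather than Gymnasium's (observation, info) pair.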

Notes

  • Obstacles are defined as circles with center coordinates and radius.

  • The environment uses simple kinematic equations with Euler integration.

  • Waypoints are reached when the robot is within a distance threshold d_min.
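The notes above can be illustrated with a minimal sketch of one Euler step for a unicycle-style robot, together with the waypoint and obstacle checks. This is not the library's implementation: the velocity values in ACTIONS, the (cx, cy, radius) obstacle layout, the use of <= in the distance tests, and all function names are assumptions made for illustration.

```python
import math

# Hypothetical action table: (linear velocity, angular velocity).
# The actual velocity magnitudes used by the environment are not documented.
ACTIONS = {
    0: (1.0, 0.0),   # forward motion
    1: (0.0, 1.0),   # rotate counterclockwise
    2: (0.0, -1.0),  # rotate clockwise
}


def wrap_angle(theta):
    """Wrap an angle in radians to [-pi, pi]."""
    return math.atan2(math.sin(theta), math.cos(theta))


def euler_step(state, action, dt=0.01):
    """Advance (x, y, theta) by one explicit Euler step of length dt."""
    x, y, theta = state
    v, omega = ACTIONS[action]
    x_new = x + v * math.cos(theta) * dt
    y_new = y + v * math.sin(theta) * dt
    theta_new = wrap_angle(theta + omega * dt)
    return (x_new, y_new, theta_new)


def waypoint_reached(state, waypoint, d_min=0.05):
    """True when the robot is within d_min of the waypoint (<= is assumed)."""
    return math.hypot(waypoint[0] - state[0], waypoint[1] - state[1]) <= d_min


def in_collision(state, obstacles):
    """True when the robot position lies inside any circular obstacle.

    Each obstacle is assumed to be a (cx, cy, radius) triple.
    """
    return any(
        math.hypot(state[0] - cx, state[1] - cy) <= radius
        for cx, cy, radius in obstacles
    )
```

With the default dt=0.01, a single forward action moves the robot only a small distance, so reaching a waypoint at (1, 1) from the origin takes many steps.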

reset()

Reset the robot to its initial state and restart the trajectory.

Returns

observation : tuple

A 5-element tuple (state, reward, terminated, truncated, info):

  - state (numpy.ndarray): [x, y, theta].
  - reward (float): initial reward (set to -1).
  - terminated (bool): always False.
  - truncated (bool): always False.
  - info (dict or None): unused, set to None.

step(action)

Advance the robot dynamics by one time step.

Parameters

action : int

Index of the chosen action:

  - 0: forward motion
  - 1: rotate counterclockwise
  - 2: rotate clockwise

Returns

observation : tuple

A 5-element tuple (state, reward, terminated, truncated, info):

  - state (numpy.ndarray): [x, y, theta].
  - reward (float): reward based on distance, heading error, penalties, and waypoint progress.
  - terminated (bool): True if the final waypoint is reached.
  - truncated (bool): always False (no time limit).
  - info (dict or None): unused, set to None.
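Because truncated is always False, the caller has to impose its own step limit. The following sketch shows one way to drive any object that exposes the documented reset() and step() return tuples. The names run_episode, max_steps, and StubEnv are hypothetical; StubEnv is only a stand-in so the example runs without the library installed, and the choice to ignore the initial reward returned by reset() is an assumption.

```python
def run_episode(env, policy, max_steps=1000):
    """Run one episode with a caller-side step cap.

    Returns (total_reward, steps, terminated). The initial reward
    returned by reset() is ignored here (an assumption).
    """
    state, _, terminated, _, _ = env.reset()
    total_reward = 0.0
    steps = 0
    while not terminated and steps < max_steps:
        action = policy(state)
        state, reward, terminated, _, _ = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps, terminated


class StubEnv:
    """Stand-in with the documented 5-tuple interface.

    It is not the real environment: it terminates after three steps
    and returns a constant reward.
    """

    def __init__(self):
        self.count = 0

    def reset(self):
        self.count = 0
        return [0.0, 0.0, 0.0], -1, False, False, None

    def step(self, action):
        self.count += 1
        return [0.0, 0.0, 0.0], -0.5, self.count >= 3, False, None


# Always choose action 0 (forward motion).
total, steps, done = run_episode(StubEnv(), policy=lambda s: 0, max_steps=10)
```

In practice, a TrajectoryTracking instance would be passed in place of StubEnv, with max_steps chosen large enough for the robot to reach the final waypoint at the configured dt.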

Notes

  • Position is updated from the current orientation using simple kinematics (Euler integration with time step dt).

  • Orientation is wrapped to [-π, π].

  • Leaving the workspace yields a large penalty, and the robot is returned to its previous state.

  • Colliding with an obstacle yields a penalty, and the robot is returned to its previous state.

  • Reaching a waypoint yields a positive reward proportional to the waypoint index.

  • Reaching the final waypoint ends the episode.
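The reward terms listed in these notes could be combined as in the sketch below. Only the structure follows the description above. The weights (w_heading, out_penalty, collision_penalty, waypoint_bonus), the additive combination of the terms, and the use of waypoint_index + 1 as the scale factor are all assumptions; the values actually used by the environment are not documented here.

```python
import math


def sketch_reward(state, waypoint, waypoint_index,
                  left_workspace=False, collided=False, reached=False,
                  w_heading=0.1, out_penalty=100.0,
                  collision_penalty=10.0, waypoint_bonus=10.0):
    """Illustrative reward with hypothetical weights, not the library's."""
    x, y, theta = state
    dx, dy = waypoint[0] - x, waypoint[1] - y

    # Distance to the current waypoint.
    distance = math.hypot(dx, dy)

    # Absolute heading error, wrapped so it lies in [0, pi].
    bearing = math.atan2(dy, dx)
    diff = bearing - theta
    heading_error = abs(math.atan2(math.sin(diff), math.cos(diff)))

    # Shaping term: negative distance and heading error.
    reward = -(distance + w_heading * heading_error)

    if left_workspace:
        reward -= out_penalty        # large penalty for leaving the workspace
    if collided:
        reward -= collision_penalty  # penalty for hitting an obstacle
    if reached:
        # Bonus grows with the waypoint index (1-based scaling is assumed).
        reward += waypoint_bonus * (waypoint_index + 1)
    return reward
```

Scaling the bonus by the waypoint index makes later waypoints worth more than earlier ones, which rewards progress along the whole trajectory rather than only reaching the first target.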