Trajectory Tracking
- class rlforge.environments.trajectory_tracking.TrajectoryTracking(x_range=(-2, 2), y_range=(-2, 2), initial_state=(0, 0, 0), trajectory=[(1, 1)], d_min=0.05, obstacles=[], dt=0.01)
Mobile robot environment for trajectory tracking tasks.
The TrajectoryTracking environment models a robot navigating in a bounded 2D workspace. The agent’s objective is to follow a sequence of waypoints (trajectory) while avoiding obstacles and staying within the workspace limits.
Features
State space: three-dimensional vector [x, y, theta].
* x, y: robot position in the plane.
* theta: robot orientation (wrapped to [-π, π]).
Action space: discrete set of velocity commands.
* Forward motion.
* Rotate counterclockwise.
* Rotate clockwise.
Reward:
* Negative distance and heading error to the current waypoint.
* Large penalty if leaving the workspace.
* Penalty if colliding with an obstacle.
* Positive reward when reaching waypoints (scaled by waypoint index).
Terminal condition: reaching the final waypoint in the trajectory.
Compatible with the Gymnasium API.
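The distance and heading-error terms of the reward can be illustrated with a minimal sketch. The helper names (`wrap_angle`, `tracking_errors`, `shaping_reward`) are hypothetical and not part of rlforge, and the equal unit weighting of the two error terms is an assumption; this page does not specify the weights or the penalty and bonus magnitudes.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def tracking_errors(state, waypoint):
    """Return (distance, heading_error) from the robot pose to a waypoint."""
    x, y, theta = state
    wx, wy = waypoint
    distance = math.hypot(wx - x, wy - y)
    # Bearing of the waypoint as seen from the robot position.
    bearing = math.atan2(wy - y, wx - x)
    # Signed difference between that bearing and the robot heading.
    heading_error = wrap_angle(bearing - theta)
    return distance, heading_error


def shaping_reward(state, waypoint):
    """Negative tracking error. Assumption: unit weight on both terms."""
    distance, heading_error = tracking_errors(state, waypoint)
    return -(distance + abs(heading_error))
```

With this form the reward approaches zero as the robot nears the waypoint while facing it; the workspace and collision penalties and the waypoint bonus would be added on top of this term.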
Notes
Obstacles are defined as circles with center coordinates and radius.
The environment uses simple kinematic equations with Euler integration.
Waypoints are reached when the robot is within a distance threshold d_min.
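The geometric tests implied by these notes can be sketched as follows. The function names are hypothetical, the `(cx, cy, radius)` ordering of an obstacle tuple is an assumption (the notes only say that an obstacle has a center and a radius), and whether the boundaries are inclusive is also assumed rather than taken from the library.

```python
import math


def in_workspace(x, y, x_range=(-2, 2), y_range=(-2, 2)):
    """True if (x, y) lies inside the rectangular workspace (bounds inclusive)."""
    return x_range[0] <= x <= x_range[1] and y_range[0] <= y <= y_range[1]


def hits_obstacle(x, y, obstacles):
    """True if (x, y) lies inside any circular obstacle.

    Assumption: each obstacle is a (cx, cy, radius) tuple.
    """
    return any(math.hypot(x - cx, y - cy) <= r for cx, cy, r in obstacles)


def waypoint_reached(x, y, waypoint, d_min=0.05):
    """True if (x, y) is closer than d_min to the waypoint."""
    wx, wy = waypoint
    return math.hypot(wx - x, wy - y) < d_min
```

For example, with the default `d_min=0.05`, a robot at `(0.98, 1.0)` would count as having reached the waypoint `(1, 1)`, since it is 0.02 away.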
- reset()
Reset the robot to its initial state and restart the trajectory.
Returns
- observation (tuple)
A 5-element tuple (state, reward, terminated, truncated, info):
- state (numpy.ndarray): [x, y, theta].
- reward (float): initial reward (set to -1).
- terminated (bool): always False.
- truncated (bool): always False.
- info (dict or None): unused, set to None.
- step(action)
Advance the robot dynamics by one time step.
Parameters
- action (int)
Index of the chosen action:
- 0: forward motion
- 1: rotate counterclockwise
- 2: rotate clockwise
Returns
- observation (tuple)
A 5-element tuple (state, reward, terminated, truncated, info):
- state (numpy.ndarray): [x, y, theta].
- reward (float): reward based on distance, heading error, penalties, and waypoint progress.
- terminated (bool): True if the final waypoint is reached.
- truncated (bool): always False (no time limit).
- info (dict or None): unused, set to None.
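An episode loop built on the documented 5-tuple return values might look like the sketch below. `run_episode`, `random_policy`, and `CountdownEnv` are hypothetical names introduced for illustration; `CountdownEnv` is only a stand-in that mimics the documented interface so the example runs without rlforge installed. Because truncated is documented as always False, the caller has to cap the number of steps itself.

```python
import random


def run_episode(env, policy, max_steps=1000):
    """Roll out one episode and return (total_reward, steps, terminated)."""
    # reset() is documented to return the same 5-tuple shape as step();
    # its initial reward is not added to the episode total here.
    state, reward, terminated, truncated, info = env.reset()
    total_reward, steps = 0.0, 0
    while not (terminated or truncated) and steps < max_steps:
        action = policy(state)
        state, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps, terminated


def random_policy(state):
    """Pick one of the three documented action indices uniformly."""
    return random.randrange(3)


class CountdownEnv:
    """Stand-in with the documented 5-tuple interface; not part of rlforge."""

    def __init__(self, steps_to_goal=3):
        self.steps_to_goal = steps_to_goal
        self.remaining = steps_to_goal

    def reset(self):
        self.remaining = self.steps_to_goal
        return (0.0, 0.0, 0.0), -1, False, False, None

    def step(self, action):
        # Only forward motion (action 0) makes progress in this toy stand-in.
        if action == 0:
            self.remaining -= 1
        done = self.remaining <= 0
        return (0.0, 0.0, 0.0), (1.0 if done else -1.0), done, False, None
```

Passing a TrajectoryTracking instance in place of `CountdownEnv` should work the same way, provided its return values match the signatures listed above.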
Notes
Position is updated from the current orientation using simple kinematics.
Orientation is wrapped to [-π, π].
Leaving the workspace yields a large penalty and reverts the robot to its previous state.
Colliding with an obstacle yields a penalty and reverts the robot to its previous state.
Reaching a waypoint yields a positive reward proportional to the waypoint index.
Reaching the final waypoint ends the episode.
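The update described in these notes can be sketched as a single Euler step. This is an illustration under stated assumptions, not the library's implementation: the function name `euler_step`, the linear speed `v`, and the angular speed `omega` are hypothetical, and the sketch assumes each action either translates or rotates the robot, never both.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def euler_step(state, action, dt=0.01, v=1.0, omega=1.0,
               x_range=(-2, 2), y_range=(-2, 2)):
    """Apply one Euler step; return (new_state, left_workspace).

    Assumptions: v and omega are illustrative speeds, and actions
    0/1/2 mean forward / rotate counterclockwise / rotate clockwise.
    """
    x, y, theta = state
    if action == 0:
        # Move along the current heading.
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
    elif action == 1:
        theta = wrap_angle(theta + omega * dt)
    elif action == 2:
        theta = wrap_angle(theta - omega * dt)
    else:
        raise ValueError(f"unknown action index: {action}")

    inside = x_range[0] <= x <= x_range[1] and y_range[0] <= y <= y_range[1]
    if not inside:
        # Revert to the previous state, as the notes describe.
        return state, True
    return (x, y, theta), False
```

An obstacle collision would be handled the same way: test the candidate position, and return the previous state together with a flag when the test fails.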