# Gymnasium Environments
The 3we project provides Gymnasium-compatible environments (Gymnasium is the maintained successor to the OpenAI Gym API) for training and evaluating reinforcement learning agents. These environments wrap the simulation backend and expose the standard `reset()` / `step()` interface.
## Quick Start

```python
import gymnasium as gym
import threewe.gym  # registers environments

env = gym.make("3we/Navigation-v1")
obs, info = env.reset()

for _ in range(1000):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

env.close()
```
## Available Environments

| Environment ID | Description | Action Space | Observation Space |
|---|---|---|---|
| `3we/Navigation-v1` | Navigate to a goal in a cluttered room. | `Box(3,)` - `[vx, vy, omega]` | `Dict(image, lidar, pose, velocity, goal)` |
| `3we/Exploration-v1` | Maximize area covered in an unknown map. | `Box(3,)` - `[vx, vy, omega]` | `Dict(image, lidar, pose, velocity, map)` |
| `3we/ObjectNav-v1` | Navigate to an object by category name. | `Box(3,)` - `[vx, vy, omega]` | `Dict(image, lidar, pose, velocity, object_goal)` |
| `3we/VLN-v1` | Follow natural-language navigation instructions. | `Box(3,)` - `[vx, vy, omega]` | `Dict(image, lidar, pose, velocity, instruction)` |
## Observation Details

For `3we/Navigation-v1`:

```python
obs = {
    "image": np.ndarray,     # shape (64, 64, 3), uint8, RGB
    "lidar": np.ndarray,     # shape (360,), float32, range in meters
    "pose": np.ndarray,      # shape (3,), float32, [x, y, theta]
    "velocity": np.ndarray,  # shape (3,), float32, [vx, vy, omega]
    "goal": np.ndarray,      # shape (2,), float32, [goal_x, goal_y]
}
```
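Since `pose` is a world-frame `[x, y, theta]` and `goal` is a world-frame position, a common preprocessing step for goal-conditioned policies is to express the goal in the robot's own frame. A minimal sketch (`goal_in_robot_frame` is a hypothetical helper, not part of the `threewe.gym` API):

```python
import numpy as np

def goal_in_robot_frame(pose, goal):
    """Rotate the world-frame goal offset into the robot's frame.

    pose: [x, y, theta] world-frame robot pose (theta in radians)
    goal: [goal_x, goal_y] world-frame goal position
    Returns the goal position relative to the robot, expressed in
    the robot's own coordinate frame.
    """
    x, y, theta = pose
    dx, dy = goal[0] - x, goal[1] - y
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    # Inverse rotation: world-frame offset -> robot frame
    return np.array([cos_t * dx + sin_t * dy,
                     -sin_t * dx + cos_t * dy], dtype=np.float32)
```

For example, a robot at the origin facing +y (`theta = pi/2`) with the goal one meter along +y sees the goal straight ahead: `[1.0, 0.0]` in its own frame.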
## Reward Structure

The navigation environment uses a shaped reward:
- Distance reward: `-0.1 * delta_distance_to_goal` per step
- Arrival bonus: `+10.0` when within 0.1 m of the goal
- Collision penalty: `-5.0` on contact with an obstacle
- Time penalty: `-0.01` per step to encourage efficiency
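Assuming the terms above simply sum each step, the per-step reward can be sketched as follows (illustrative only; the actual environment implementation may combine or clip the terms differently):

```python
def shaped_reward(prev_dist, dist, collided, goal_tolerance=0.1):
    """Illustrative per-step reward combining the terms listed above."""
    reward = -0.1 * (dist - prev_dist)  # distance shaping: positive when closing in
    reward += -0.01                     # time penalty
    if collided:
        reward += -5.0                  # collision penalty
    if dist <= goal_tolerance:
        reward += 10.0                  # arrival bonus
    return reward
```

Note that the distance term rewards progress toward the goal (negative delta), so an agent moving closer earns a small positive reward even before arriving.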
## Custom Configuration

Pass configuration via `gym.make` kwargs:

```python
env = gym.make(
    "3we/Navigation-v1",
    render_mode="human",
    max_episode_steps=500,
    room_size=5.0,
    num_obstacles=8,
    goal_tolerance=0.1,
)
```
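The `max_episode_steps` kwarg is handled by Gymnasium's standard `TimeLimit` wrapper, which raises the `truncated` flag once the step budget runs out. In sketch form (a simplified stand-in, not the wrapper's actual code):

```python
class TimeLimitSketch:
    """Simplified illustration of how max_episode_steps drives `truncated`."""

    def __init__(self, max_episode_steps):
        self.max_episode_steps = max_episode_steps
        self.elapsed = 0

    def reset(self):
        self.elapsed = 0

    def step(self):
        self.elapsed += 1
        # truncated fires when the step budget is exhausted, regardless of
        # whether the task itself finished (terminated)
        return self.elapsed >= self.max_episode_steps
```

This is why the training loop checks `terminated or truncated`: hitting the step limit ends the episode without implying success or failure.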
## Rendering

Environments support Gymnasium render modes:
```python
env = gym.make("3we/Navigation-v1", render_mode="human")      # opens a window
env = gym.make("3we/Navigation-v1", render_mode="rgb_array")  # returns frames
```
## Vectorized Environments

For parallel training with Stable-Baselines3 or similar:
```python
from gymnasium.vector import AsyncVectorEnv

envs = AsyncVectorEnv([
    lambda: gym.make("3we/Navigation-v1") for _ in range(8)
])
```
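Gymnasium's vector environments auto-reset each sub-environment when its episode ends, so per-env bookkeeping such as episode returns must reset in step with it. A minimal sketch of that pattern (`accumulate_returns` is a hypothetical helper, not part of Gymnasium):

```python
import numpy as np

def accumulate_returns(rewards, terminateds, truncateds, running, finished):
    """Accumulate per-env episode returns under vectorized auto-reset.

    rewards, terminateds, truncateds: batched step outputs, shape (num_envs,)
    running: per-env return accumulators (mutated in place)
    finished: list collecting completed episode returns
    """
    running += rewards
    done = np.logical_or(terminateds, truncateds)
    for i in np.flatnonzero(done):
        finished.append(float(running[i]))
        running[i] = 0.0  # the sub-env auto-resets, so restart its accumulator
    return running, finished
```

Called once per batched `envs.step(...)`, this keeps episode statistics aligned with the auto-resetting sub-environments.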
## Sim-to-Real Compatibility

These environments use the same physics parameters and sensor models as the full simulation, ensuring policies trained here transfer well to hardware. See the Sim-to-Real Transfer guide for domain randomization settings.