legged_robot_config.py
This document provides detailed descriptions of all configuration parameters defined in legged_robot_config.py.
LeggedRobotCfg
env Environment Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | int | 4096 | Number of parallel environments |
| | int | 48 | Size of the observation vector |
| | int | None | Number of privileged observations. If not None, step() returns priviledge_obs_buf (critic obs for asymmetric training), otherwise returns None |
| | int | 12 | Size of the action vector |
| | bool | True | Whether to send timeout information to the algorithm |
| | float | 20 | Episode length in seconds |
| | float | 2.0 | Spacing between environments in the scene, only for plane terrain |
| | float | 0.1 | Time before a fail state leads to environment reset (seconds) |
| | bool | False | Whether to enable debug drawings in the simulator |
| | bool | False | Whether to obtain height measurements around the base |
| | bool | False | Whether to obtain height measurements around the feet (9 points around each foot) |
| | bool | False | Whether to draw all height points of the terrain |
| | bool | False | Whether to draw key body points for mimic tasks |
| | float | -0.1 | Maximum allowed projected gravity in z-axis direction |
terrain Terrain Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | str | 'plane' | Terrain mesh type: 'plane', 'heightfield', 'trimesh' |
| | float | 200.0 | Plane size [m], default is 200x200x10 |
| | float | 0.1 | Distance between height samples in x and y direction [m] |
| | float | 0.005 | Distance between height samples in z direction [m] |
| | float | 5 | Length of the border surrounding the terrain [m] |
| | float | 1.0 | Height of the border surrounding the terrain [m] |
| | bool | False | Whether to use terrain curriculum, starting from easier terrains and gradually increasing difficulty |
| | float | 1.0 | Coefficient of static friction of the terrain |
| | float | 1.0 | Coefficient of dynamic friction of the terrain |
| | float | 0. | Coefficient of restitution of the terrain |
| | bool | False | Whether to obtain terrain height information around feet (default: 9 points around each foot) |
| | bool | False | Whether to obtain height measurements |
| | list | [-0.8, …, 0.8] | X positions of height samples around the base (relative to the robot base); together with the Y positions they cover a 1.6 m x 1 m rectangular area |
| | list | [-0.5, …, 0.5] | Y positions of height samples around the base (relative to the robot base) |
| | bool | False | Whether to select a unique terrain type and pass all arguments |
| | dict | None | Dictionary of arguments for selected terrain |
| | int | 1 | Starting curriculum level |
| | float | 6.0 | Length of each subterrain [m] (X direction) |
| | float | 6.0 | Width of each subterrain [m] (Y direction) |
| | float | 3.0 | Size of the flat platform at the center of each subterrain [m] |
| | int | 4 | Number of terrain rows (levels), X direction |
| | int | 4 | Number of terrain columns (types), Y direction |
| | list | [0.1, 0.1, 0.35, 0.25, 0.2] | Terrain type proportions: [smooth slope, rough slope, stairs up, stairs down, discrete] |
| | float | 0.75 | Trimesh only: slopes above this threshold will be corrected to vertical surfaces |
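
The terrain type proportions are easiest to read as cumulative thresholds. The sketch below assumes a generator that assigns one terrain type per column by comparing the column's fractional position against those thresholds, as legged_gym-style terrain generators commonly do. The helper name `terrain_type_for_column` and the 0.001 offset are illustrative assumptions and are not taken from legged_robot_config.py.

```python
from bisect import bisect_right
from itertools import accumulate

# Order matches the proportions list documented above.
TERRAIN_TYPES = ["smooth slope", "rough slope", "stairs up", "stairs down", "discrete"]

def terrain_type_for_column(col, num_cols, proportions):
    """Pick a terrain type for one column (hypothetical helper, assumed scheme)."""
    thresholds = list(accumulate(proportions))  # running sum of the proportions
    choice = col / num_cols + 0.001             # fractional position of this column
    index = min(bisect_right(thresholds, choice), len(proportions) - 1)
    return TERRAIN_TYPES[index]

proportions = [0.1, 0.1, 0.35, 0.25, 0.2]
layout = [terrain_type_for_column(c, 4, proportions) for c in range(4)]
```

Under this assumed scheme, a small number of columns cannot represent every type, so the column count and the proportions should be chosen together.
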
init_state Initial State Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | list | [0.0, 0.0, 1.] | Initial position x, y, z [m] |
| | list | [0.0, 0.0, 0.0, 1.0] | Initial orientation quaternion x, y, z, w. Note: Gym format is xyzw, Genesis format is wxyz |
| | list | [0.0, 0.0, 0.0] | Initial linear velocity x, y, z [m/s] |
| | list | [0.0, 0.0, 0.0] | Initial angular velocity x, y, z [rad/s] |
| | float | 0.0 | Roll angle randomization range |
| | float | 0.0 | Pitch angle randomization range |
| | float | 0.0 | Yaw angle randomization range |
| | dict | {"joint_a": 0., "joint_b": 0.} | Target joint angles when action = 0.0 |
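
Because the initial orientation is stored in xyzw order while Genesis expects wxyz, the value has to be reordered when it crosses that boundary. A minimal sketch of the reordering follows; the function names are hypothetical and where the conversion actually happens in the codebase is not stated here.

```python
def xyzw_to_wxyz(quat):
    """Reorder a quaternion from (x, y, z, w) to (w, x, y, z)."""
    x, y, z, w = quat
    return [w, x, y, z]

def wxyz_to_xyzw(quat):
    """Reorder a quaternion from (w, x, y, z) to (x, y, z, w)."""
    w, x, y, z = quat
    return [x, y, z, w]

identity_xyzw = [0.0, 0.0, 0.0, 1.0]           # default from the table above
identity_wxyz = xyzw_to_wxyz(identity_xyzw)    # identity rotation in wxyz order
```
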
control Control Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | str | 'P' | Control type: 'P' (position), 'V' (velocity), 'T' (torque) |
| | dict | {'joint_a': 10.0, 'joint_b': 15.} | PD control stiffness [N·m/rad] |
| | dict | {'joint_a': 1.0, 'joint_b': 1.5} | PD control damping [N·m·s/rad] |
| | float | 0.5 | Action scale factor, target angle = actionScale * action + defaultAngle |
| | float | 0.02 | Control timestep [s]; the default of 0.02 s corresponds to a 50 Hz control frequency |
| | int | 4 | Number of control action updates per policy update |
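
The action scaling rule, target angle = actionScale * action + defaultAngle, can be combined with the stiffness and damping gains as sketched below. The PD law used here is the standard textbook form and is an assumption about how position control ('P') is realized; the function names are hypothetical.

```python
def target_angle(action, action_scale, default_angle):
    """Map a policy action to a joint target angle [rad]."""
    return action_scale * action + default_angle

def pd_torque(target, q, qd, kp, kd):
    """Textbook PD law (assumed): stiffness on position error, damping on velocity."""
    return kp * (target - q) - kd * qd

# Defaults for joint_a from the table: action scale 0.5, stiffness 10.0, damping 1.0.
target = target_angle(action=0.2, action_scale=0.5, default_angle=0.0)
torque = pd_torque(target, q=0.0, qd=0.0, kp=10.0, kd=1.0)
```

With a zero action the target equals the default joint angle, which is why the default joint angles act as the nominal pose.
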
asset Asset Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | str | None | Asset name |
| | str | "" | URDF file path |
| | str | "" | Foot body name, bodies containing this substring will be considered as feet |
| | list | [] | List of important bodies to be tracked in mimic tasks |
| | list | [] | Penalize contacts on links containing these substrings |
| | list | [] | Terminate episode after contacts on links containing these substrings |
| | bool | False | Whether to fix the base link to the world frame |
| | bool | False | Whether to obtain contact states of specific links, can be used for privileged policy |
| | list | ["thigh", "calf", "foot"] | Link names for obtaining contact states |
| | str | "" | Full name of the base link |
| | int | 0 | Self-collision setting: 1 to disable, 0 to enable (bitwise filter) |
| | list | ["joint_a", "joint_b"] | Sequence of DOFs in actions and observations |
| | list | [0.0] | Armature of each DOF |
| | list | [] | Genesis only: links that are not merged due to fixed joints |
| | list | [] | DOF velocity limits [rad/s], obtained from URDF |
| | bool | False | IsaacGym/IsaacLab only: whether to disable gravity |
| | bool | True | IsaacGym/IsaacLab only: whether to merge bodies connected by fixed joints |
| | int | 3 | IsaacGym/IsaacLab only: DOF drive mode (0 none, 1 position target, 2 velocity target, 3 effort) |
| | bool | False | IsaacGym/IsaacLab only: replace collision cylinders with capsules, improves simulation speed and stability |
| | bool | False | IsaacGym/IsaacLab only: whether to flip certain .obj meshes from y-up to z-up |
| | float | 0.001 | Density |
| | float | 0. | Angular damping |
| | float | 0. | Linear damping |
| | float | 1000. | Maximum angular velocity |
| | float | 1000. | Maximum linear velocity |
| | float | 0. | Armature |
| | float | 0.01 | Thickness |
rewards Reward Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | bool | True | Whether to only use positive rewards |
| | float | 0.25 | Tracking reward calculation parameter: reward = exp(-error^2/sigma) |
| | float | 1. | Joint position soft limit (percentage of URDF limits), values above this limit are penalized |
| | float | 1. | Joint velocity soft limit |
| | float | 1. | Torque soft limit |
| | float | 1. | Target base height |
| | float | 0.04 | Desired foot clearance above ground [m] |
| | float | 0.0 | Height of the foot coordinate origin above ground [m] |
| | float | 0.01 | Sigma value for foot clearance tracking |
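
The tracking reward formula, reward = exp(-error^2/sigma), maps a tracking error to a value in (0, 1], with sigma controlling how quickly the reward falls off. A small sketch with the default sigma of 0.25 follows; the function name is hypothetical.

```python
import math

def tracking_reward(error, sigma=0.25):
    """Exponential tracking reward: 1.0 at zero error, decaying as the error grows."""
    return math.exp(-(error ** 2) / sigma)

perfect = tracking_reward(0.0)      # no tracking error
half_mps_off = tracking_reward(0.5) # error of 0.5 gives exp(-1)
```

A smaller sigma makes the reward more selective: the same error yields a lower reward.
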
scales Reward Weights
| Parameter | Type | Default | Description |
|---|---|---|---|
| | float | -0.0 | Termination penalty weight |
| | float | 0 | Linear velocity tracking reward weight |
| | float | 0 | Angular velocity tracking reward weight |
| | float | 0 | Z-direction linear velocity penalty weight |
| | float | 0 | XY-plane angular velocity penalty weight |
| | float | -0. | Orientation penalty weight |
| | float | 0 | Torque penalty weight |
| | float | -0. | Joint velocity penalty weight |
| | float | 0 | Joint acceleration penalty weight |
| | float | -0. | Base height penalty weight |
| | float | 0 | Feet air time reward weight |
| | float | 0 | Collision penalty weight |
| | float | -0.0 | Feet stumble penalty weight |
| | float | 0 | Action rate penalty weight |
| | float | -0. | Joint position penalty weight when standing still |
commands Command Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | bool | False | Whether to use command curriculum |
| | float | 1. | Maximum curriculum level |
| | int | 4 | Number of commands, default: lin_vel_x, lin_vel_y, ang_vel_yaw, heading |
| | float | 10. | Time interval for command resampling [s] |
| | bool | True | If True: compute angular velocity command from heading error |
| | float | 0.8 | Curriculum learning threshold, increase command range when tracking reward exceeds this threshold |
| | float | 0.4 | Probability of sampling zero command during resampling, encourages robot to learn standing still behavior |
ranges Command Ranges
| Parameter | Type | Default | Description |
|---|---|---|---|
| | list | [-1.0, 1.0] | X-direction linear velocity range [m/s] |
| | list | [-1.0, 1.0] | Y-direction linear velocity range [m/s] |
| | list | [-1, 1] | Yaw angular velocity range [rad/s] |
| | list | [-3.14, 3.14] | Heading angle range [rad] |
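
Command resampling combines the ranges above with the zero-command probability. The sketch below assumes uniform sampling within each range and assumes that a zero command sets every component to zero, including heading; both are assumptions, and `sample_command` is a hypothetical helper.

```python
import random

# Default ranges from the table; the keys follow the command names listed for the 4 commands.
RANGES = {
    "lin_vel_x": (-1.0, 1.0),
    "lin_vel_y": (-1.0, 1.0),
    "ang_vel_yaw": (-1.0, 1.0),
    "heading": (-3.14, 3.14),
}

def sample_command(rng, ranges, zero_prob):
    """Return an all-zero command with probability zero_prob, else a uniform sample."""
    if rng.random() < zero_prob:
        return {name: 0.0 for name in ranges}
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}

rng = random.Random(0)
commands = [sample_command(rng, RANGES, zero_prob=0.4) for _ in range(1000)]
zero_fraction = sum(all(v == 0.0 for v in c.values()) for c in commands) / len(commands)
```

Over many draws, `zero_fraction` should be close to the configured probability, which is how standing-still commands end up in a fixed share of the training data.
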
domain_rand Domain Randomization Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | bool | True | Whether to randomize rigid body friction coefficient |
| | list | [0.5, 1.25] | Friction coefficient randomization range |
| | bool | True | Whether to randomize base mass |
| | list | [-1., 1.] | Added mass range [kg] |
| | bool | True | Whether to apply random velocity perturbations to the base |
| | float | 15 | Time interval for pushing robots [s] |
| | float | 1. | Maximum push velocity in XY plane [m/s] |
| | bool | True | Whether to randomize center of mass position to simulate modeling errors |
| | list | [-0.01, 0.01] | CoM x-position randomization range [m] |
| | list | [-0.01, 0.01] | CoM y-position randomization range [m] |
| | list | [-0.01, 0.01] | CoM z-position randomization range [m] |
| | bool | False | Whether to apply random delay to actions to simulate control loop latency |
| | list | [0, 1] | Control delay step range |
| | bool | False | Whether to randomize PD gains by a scale factor |
| | list | [0.8, 1.2] | Kp scale range |
| | list | [0.8, 1.2] | Kd scale range |
| | bool | False | Whether to randomize joint armature (significantly slows simulation in Genesis) |
| | list | [0.0, 0.05] | Joint armature range [Nms/rad] |
| | bool | False | Whether to randomize joint friction |
| | list | [0.0, 0.1] | Joint friction range |
| | bool | False | Whether to randomize joint damping |
| | list | [0.0, 1.0] | Joint damping range |
| | bool | False | Whether to apply random push forces to robot links |
| | float | 10.0 | Maximum magnitude of random push force applied to each link [N] |
| | float | 15.0 | Time interval between random pushes [s] |
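
Each randomization range is a [low, high] pair from which one value is drawn per environment. The sketch below assumes uniform sampling, which is not stated in this document; the dictionary keys and `sample_randomization` are hypothetical labels chosen for illustration.

```python
import random

# Default ranges copied from the table above; the keys are hypothetical labels.
DEFAULT_RANGES = {
    "friction": (0.5, 1.25),
    "added_mass_kg": (-1.0, 1.0),
    "com_shift_x_m": (-0.01, 0.01),
    "kp_scale": (0.8, 1.2),
    "kd_scale": (0.8, 1.2),
}

def sample_randomization(rng, ranges):
    """Draw one uniform sample per range for a single environment (assumed distribution)."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}

rng = random.Random(0)
sample = sample_randomization(rng, DEFAULT_RANGES)
randomized_kp = 10.0 * sample["kp_scale"]  # nominal stiffness of 10.0 scaled for this environment
```
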
normalization Normalization Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | float | 100. | Observation clipping range |
| | float | 100. | Action clipping range |
obs_scales Observation Scales
| Parameter | Type | Default | Description |
|---|---|---|---|
| | float | 1.0 | Linear velocity observation scale factor |
| | float | 0.25 | Angular velocity observation scale factor |
| | float | 1.0 | Joint position observation scale factor |
| | float | 0.05 | Joint velocity observation scale factor |
| | float | 5.0 | Height measurement observation scale factor |
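
The scale factors and the clipping range work together: a raw quantity is multiplied by its scale factor, and the result is limited to the clipping range. The sketch below assumes that order (scale first, then clip) and uses a hypothetical helper name.

```python
def scale_and_clip(values, scale, clip):
    """Scale raw observation values, then clip them to [-clip, clip]."""
    return [max(-clip, min(clip, v * scale)) for v in values]

# Angular velocity uses a scale of 0.25; observations are clipped at 100.
ang_vel_obs = scale_and_clip([2.0, -4.0, 800.0], scale=0.25, clip=100.0)
```

Here the first two values are only scaled, while the third (800 * 0.25 = 200) exceeds the range and is clipped to 100.
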
noise Noise Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | bool | True | Whether to add noise |
| | float | 1.0 | Noise scale factor |
noise_scales Noise Scales
| Parameter | Type | Default | Description |
|---|---|---|---|
| | float | 0.01 | Joint position noise scale |
| | float | 0.5 | Joint velocity noise scale |
| | float | 0.1 | Linear velocity noise scale |
| | float | 0.2 | Angular velocity noise scale |
| | float | 0.05 | Gravity observation noise scale |
| | float | 0.1 | Height measurement noise scale |
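
One plausible way the per-quantity noise scale, the global noise scale factor, and the observation scale combine is as a product that sets the amplitude of uniform noise. This mirrors a common legged_gym-style scheme and is an assumption here, as is the helper name `add_uniform_noise`.

```python
import random

def add_uniform_noise(values, noise_scale, noise_level, obs_scale, rng):
    """Add uniform noise in [-amplitude, amplitude] to already-scaled observations."""
    amplitude = noise_scale * noise_level * obs_scale
    return [v + (2.0 * rng.random() - 1.0) * amplitude for v in values]

rng = random.Random(0)
scaled_joint_vel = [1.0, -2.0]
# Joint velocity: noise scale 0.5, global factor 1.0, observation scale 0.05.
noisy = add_uniform_noise(scaled_joint_vel, 0.5, 1.0, 0.05, rng)
```

With these defaults the amplitude is 0.5 * 1.0 * 0.05 = 0.025, so each noisy value stays within 0.025 of its original.
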
constraints Constraints Configuration
Used for the CaT (Constraints as Terminations) method.
| Parameter | Type | Default | Description |
|---|---|---|---|
| | class | - | Constraint limits configuration |
viewer Viewer Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | int | 0 | Reference environment index |
| | list | [4.0, 4.0, 2.0] | Camera position relative to robot position [m] |
| | list | [0., 0, 0.] | Point the camera looks at (relative to robot position) [m] |
| | list | [0, 1, 2, 3, 4] | Genesis only: list of environment indices to render |
sensor Sensor Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | bool | False | Whether to add depth camera |
| | bool | False | Whether to use warp-based model |
depth_camera_config Depth Camera Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | int | 1 | Number of sensors |
| | int | 1 | History frames for depth images |
| | float | 0.1 | Near clipping plane |
| | float | 10.0 | Far clipping plane |
| | float | 0.1 | Near plane |
| | float | 10.0 | Far plane |
| | tuple | (80, 60) | Image resolution |
| | float | 75 | Horizontal field of view [degrees] |
| | tuple | (0.3, 0.0, 0.1) | Camera position |
| | tuple | (0.0, 0.0, 0.0) | Camera Euler angles |
| | int | 5 | Decimation factor |
| | bool | True | Warp only: whether to calculate depth |
| | bool | False | Warp only: whether to use segmentation camera |
| | bool | False | Warp only: whether to return point cloud |
| | bool | False | Warp only: whether point cloud is in world frame |
sim Simulation Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | float | 0.005 | Simulation timestep [s]; the default of 0.005 s corresponds to 200 Hz |
| | int | 1 | Number of substeps |
| | int | 100 | Genesis only: maximum number of collision pairs, more collision pairs will occupy more GPU memory and slow down simulation |
| | int | 2 | Genesis only: number of IK targets, fewer targets will reduce memory usage |
| | list | [0., 0., -9.81] | IsaacGym only: gravity acceleration [m/s^2] |
| | int | 1 | IsaacGym only: up axis direction, 0 for y, 1 for z |
| | bool | True | IsaacGym only: whether to use GPU pipeline |
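
The timing defaults in this document are mutually consistent: a 0.005 s simulation timestep repeated 4 times per policy update gives the 0.02 s control timestep, and a 20 s episode then spans 1000 policy steps. The sketch below checks that arithmetic; the function names are hypothetical, and whether the code derives one value from the others or reads each independently is not stated here.

```python
def control_timestep(sim_dt, decimation):
    """Policy step duration [s] if each policy step spans `decimation` simulation steps."""
    return sim_dt * decimation

def steps_per_episode(episode_length_s, control_dt):
    """Number of policy steps in one episode, rounded to the nearest integer."""
    return round(episode_length_s / control_dt)

dt = control_timestep(sim_dt=0.005, decimation=4)
max_steps = steps_per_episode(episode_length_s=20.0, control_dt=dt)
policy_rate_hz = 1.0 / dt
```
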
physx PhysX Engine Configuration (IsaacGym only)
| Parameter | Type | Default | Description |
|---|---|---|---|
| | bool | True | Whether to use GPU |
| | int | 0 | Number of subscenes |
| | int | 10 | Number of threads |
| | int | 1 | Solver type: 0 for pgs, 1 for tgs |
| | int | 4 | Number of position iterations |
| | int | 0 | Number of velocity iterations |
| | float | 0.01 | Contact offset [m] |
| | float | 0.0 | Rest offset [m] |
| | float | 0.5 | Bounce threshold velocity [m/s] |
| | float | 1.0 | Maximum depenetration velocity |
| | int | 2**23 | Maximum GPU contact pairs |
| | int | 5 | Default buffer size multiplier |
| | int | 2 | Contact collection mode: 0 never, 1 last substep, 2 all substeps |
LeggedRobotCfgPPO
PPO algorithm configuration class.
Basic Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| | int | 1 | Random seed |
| | str | 'OnPolicyRunner' | Runner class name |
policy Policy Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | float | LeggedRobotCfg.normalization.clip_actions | Action clipping range |
| | float | 1.0 | Initial noise standard deviation |
| | list | [512, 256, 128] | Actor network hidden layer dimensions |
| | list | [512, 256, 128] | Critic network hidden layer dimensions |
| | str | 'elu' | Activation function, options: elu, relu, selu, crelu, lrelu, tanh, sigmoid |
RNN Configuration (for ActorCriticRecurrent only)
| Parameter | Type | Default | Description |
|---|---|---|---|
| | str | 'lstm' | RNN type |
| | int | 512 | RNN hidden layer size |
| | int | 1 | Number of RNN layers |
algorithm Algorithm Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | float | 1.0 | Value loss coefficient |
| | bool | True | Whether to use clipped value loss |
| | float | 0.2 | PPO clipping parameter |
| | float | 0.01 | Entropy coefficient |
| | int | 5 | Number of learning epochs |
| | int | 4 | Number of mini batches, mini batch size = num_envs * nsteps / nminibatches |
| | float | 1.e-3 | Learning rate |
| | str | 'adaptive' | Learning rate schedule: 'adaptive' or 'fixed' |
| | float | 0.99 | Discount factor |
| | float | 0.95 | GAE lambda parameter |
| | float | 0.01 | Target KL divergence |
| | float | 1. | Maximum gradient clipping norm |
| | bool | False | Whether to use SPO (Simple Policy Optimization). Note: SPO may be incompatible with the default PPO parameters; recommended settings: learning_rate=2.5e-4, schedule='fixed' |
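
The mini-batch formula, num_envs * nsteps / nminibatches, can be evaluated with the defaults (4096 environments, 24 steps per environment, 4 mini batches). The second function sketches how an 'adaptive' schedule may use the target KL divergence; the 2x and 0.5x thresholds, the factor of 1.5, and the learning rate bounds are assumptions modeled on common PPO implementations, not values from this document.

```python
def mini_batch_size(num_envs, nsteps, nminibatches):
    """Samples per mini batch for one policy update iteration."""
    return num_envs * nsteps // nminibatches

def adapt_learning_rate(lr, kl, target_kl, factor=1.5, lr_min=1e-5, lr_max=1e-2):
    """Assumed adaptive rule: shrink lr when KL is too high, grow it when KL is too low."""
    if kl > 2.0 * target_kl:
        return max(lr_min, lr / factor)
    if 0.0 < kl < 0.5 * target_kl:
        return min(lr_max, lr * factor)
    return lr

batch = mini_batch_size(num_envs=4096, nsteps=24, nminibatches=4)
lowered = adapt_learning_rate(1e-3, kl=0.05, target_kl=0.01)
raised = adapt_learning_rate(1e-3, kl=0.001, target_kl=0.01)
```
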
runner Runner Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| | str | 'ActorCritic' | Policy class name |
| | str | 'PPO' | Algorithm class name |
| | int | 24 | Number of steps per environment per iteration |
| | int | 1500 | Maximum number of policy update iterations |
| | bool | False | Whether to sync logs to wandb |
| | int | 50 | Interval, in iterations, at which to check for potential saves |
| | str | 'test' | Experiment name |
| | str | '' | Run name |
| | bool | False | Whether to resume from checkpoint |
| | int | -1 | Run ID to load, -1 for last run |
| | int | -1 | Checkpoint ID, -1 for last saved model |
| | str | None | Resume path, updated from load_run and checkpoint |
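
The resume path is derived from load_run and checkpoint, with -1 meaning "latest" for both. The sketch below shows one way such a resolution could work on plain lists. The helper `resolve_resume_path`, the timestamp-style run names, and the `model_<iteration>.pt` file pattern are all assumptions for illustration and are not taken from this document.

```python
def resolve_resume_path(runs, checkpoints, load_run=-1, checkpoint=-1):
    """Pick a run and a checkpoint, treating -1 as 'latest' (hypothetical helper).

    runs: run directory names that sort chronologically (assumed naming).
    checkpoints: saved iteration numbers available in the chosen run.
    """
    run = sorted(runs)[-1] if load_run == -1 else load_run
    iteration = max(checkpoints) if checkpoint == -1 else checkpoint
    return f"{run}/model_{iteration}.pt"  # file name pattern is an assumption

runs = ["2024-01-01_10-00-00", "2024-01-02_09-30-00"]
latest = resolve_resume_path(runs, checkpoints=[0, 50, 100])
pinned = resolve_resume_path(runs, [0, 50, 100], load_run=runs[0], checkpoint=50)
```
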