legged_robot_config.py#

This document provides detailed descriptions of all configuration parameters defined in legged_robot_config.py.


LeggedRobotCfg#

env Environment Configuration#

Parameter

Type

Default

Description

num_envs

int

4096

Number of parallel environments

num_observations

int

48

Size of the observation vector

num_privileged_obs

int

None

Number of privileged observations. If not None, step() returns priviledge_obs_buf (critic obs for asymmetric training), otherwise returns None

num_actions

int

12

Size of the action vector

send_timeouts

bool

True

Whether to send timeout information to the algorithm

episode_length_s

float

20

Episode length in seconds

env_spacing

float

2.0

Spacing between environments in the scene, only for plane terrain

fail_to_terminal_time_s

float

0.1

Time before a fail state leads to environment reset (seconds)

debug

bool

False

Whether to enable debug drawings in the simulator

debug_draw_height_points_around_base

bool

False

Whether to obtain height measurements around the base

debug_draw_height_points_around_feet

bool

False

Whether to obtain height measurements around the feet (9 points around each foot)

debug_draw_terrain_height_points

bool

False

Whether to draw all height points of the terrain

debug_draw_key_body_points

bool

False

Whether to draw key body points for mimic tasks

max_projected_gravity

float

-0.1

Maximum allowed projected gravity in z-axis direction


terrain Terrain Configuration#

Parameter

Type

Default

Description

mesh_type

str

‘plane’

Terrain mesh type: ‘plane’, ‘heightfield’, ‘trimesh’

plane_length

float

200.0

Plane size [m], default is 200x200x10

horizontal_scale

float

0.1

Distance between height samples in x and y direction [m]

vertical_scale

float

0.005

Distance between height samples in z direction [m]

border_size

float

5

Length of the border surrounding the terrain [m]

border_height

float

1.0

Height of the border surrounding the terrain [m]

curriculum

bool

False

Whether to use terrain curriculum, starting from easier terrains and gradually increasing difficulty

static_friction

float

1.0

Coefficient of static friction of the terrain

dynamic_friction

float

1.0

Coefficient of dynamic friction of the terrain

restitution

float

0.

Coefficient of restitution of the terrain

obtain_terrain_info_around_feet

bool

False

Whether to obtain terrain height information around feet (default: 9 points around each foot)

measure_heights

bool

False

Whether to obtain height measurements

measured_points_x

list

[-0.8, …, 0.8]

X positions of height sampling around the base (relative to robot base), 1m x 1.6m rectangular area

measured_points_y

list

[-0.5, …, 0.5]

Y positions of height sampling around the base (relative to robot base)

selected

bool

False

Whether to select a unique terrain type and pass all arguments

terrain_kwargs

dict

None

Dictionary of arguments for selected terrain

max_init_terrain_level

int

1

Starting curriculum level

terrain_length

float

6.0

Length of each subterrain [m] (X direction)

terrain_width

float

6.0

Width of each subterrain [m] (Y direction)

platform_size

float

3.0

Size of the flat platform at the center of each subterrain [m]

num_rows

int

4

Number of terrain rows (levels), X direction

num_cols

int

4

Number of terrain columns (types), Y direction

terrain_proportions

list

[0.1, 0.1, 0.35, 0.25, 0.2]

Terrain type proportions: [smooth slope, rough slope, stairs up, stairs down, discrete]

slope_treshold

float

0.75

Trimesh only: slopes above this threshold will be corrected to vertical surfaces


init_state Initial State Configuration#

Parameter

Type

Default

Description

pos

list

[0.0, 0.0, 1.]

Initial position x, y, z [m]

rot

list

[0.0, 0.0, 0.0, 1.0]

Initial orientation quaternion x, y, z, w. Note: Gym format is xyzw, Genesis format is wxyz

lin_vel

list

[0.0, 0.0, 0.0]

Initial linear velocity x, y, z [m/s]

ang_vel

list

[0.0, 0.0, 0.0]

Initial angular velocity x, y, z [rad/s]

roll_random_scale

float

0.0

Roll angle randomization range

pitch_random_scale

float

0.0

Pitch angle randomization range

yaw_random_scale

float

0.0

Yaw angle randomization range

default_joint_angles

dict

{“joint_a”: 0., “joint_b”: 0.}

Target joint angles when action = 0.0


control Control Configuration#

Parameter

Type

Default

Description

control_type

str

‘P’

Control type: ‘P’ (position), ‘V’ (velocity), ‘T’ (torque)

stiffness

dict

{‘joint_a’: 10.0, ‘joint_b’: 15.}

PD control stiffness [N*m/rad]

damping

dict

{‘joint_a’: 1.0, ‘joint_b’: 1.5}

PD control damping [Nms/rad]

action_scale

float

0.5

Action scale factor, target angle = actionScale * action + defaultAngle

dt

float

0.02

Control frequency, default 50Hz

decimation

int

4

Ratio of control action updates to policy updates


asset Asset Configuration#

Parameter

Type

Default

Description

name

str

None

Asset name

file

str

“”

URDF file path

foot_name

str

“”

Foot body name, bodies containing this substring will be considered as feet

key_bodies

list

[]

List of important bodies to be tracked in mimic tasks

penalize_contacts_on

list

[]

Penalize contacts on links containing these substrings

terminate_after_contacts_on

list

[]

Terminate episode after contacts on links containing these substrings

fix_base_link

bool

False

Whether to fix the base link to the world frame

obtain_link_contact_states

bool

False

Whether to obtain contact states of specific links, can be used for privileged policy

contact_state_link_names

list

[“thigh”, “calf”, “foot”]

Link names for obtaining contact states

base_link_name

str

“”

Full name of the base link

self_collisions

int

0

Self-collision setting: 1 to disable, 0 to enable (bitwise filter)

dof_names

list

[“joint_a”, “joint_b”]

Sequence of DOFs in actions and observations

dof_armature

list

[0.0]

Armature of each DOF

links_to_keep

list

[]

Genesis only: links that are not merged due to fixed joints

dof_vel_limits

list

[]

DOF velocity limits [rad/s], obtained from URDF

disable_gravity

bool

False

IsaacGym/IsaacLab only: whether to disable gravity

collapse_fixed_joints

bool

True

IsaacGym/IsaacLab only: whether to merge bodies connected by fixed joints

default_dof_drive_mode

int

3

IsaacGym/IsaacLab only: DOF drive mode (0 none, 1 position target, 2 velocity target, 3 effort)

replace_cylinder_with_capsule

bool

False

IsaacGym/IsaacLab only: replace collision cylinders with capsules, improves simulation speed and stability

flip_visual_attachments

bool

False

IsaacGym/IsaacLab only: whether to flip certain .obj meshes from y-up to z-up

density

float

0.001

Density

angular_damping

float

0.

Angular damping

linear_damping

float

0.

Linear damping

max_angular_velocity

float

1000.

Maximum angular velocity

max_linear_velocity

float

1000.

Maximum linear velocity

armature

float

0.

Armature

thickness

float

0.01

Thickness


rewards Reward Configuration#

Parameter

Type

Default

Description

only_positive_rewards

bool

True

Whether to only use positive rewards

tracking_sigma

float

0.25

Tracking reward calculation parameter: reward = exp(-error^2/sigma)

soft_dof_pos_limit

float

1.

Joint position soft limit (percentage of URDF limits), values above this limit are penalized

soft_dof_vel_limit

float

1.

Joint velocity soft limit

soft_torque_limit

float

1.

Torque soft limit

base_height_target

float

1.

Target base height

foot_clearance_target

float

0.04

Desired foot clearance above ground [m]

foot_height_offset

float

0.0

Height of the foot coordinate origin above ground [m]

foot_clearance_tracking_sigma

float

0.01

Sigma value for foot clearance tracking

scales Reward Weights#

Parameter

Type

Default

Description

termination

float

-0.0

Termination penalty weight

tracking_lin_vel

float

0

Linear velocity tracking reward weight

tracking_ang_vel

float

0

Angular velocity tracking reward weight

lin_vel_z

float

0

Z-direction linear velocity penalty weight

ang_vel_xy

float

0

XY-plane angular velocity penalty weight

orientation

float

-0.

Orientation penalty weight

torques

float

0

Torque penalty weight

dof_vel

float

-0.

Joint velocity penalty weight

dof_acc

float

0

Joint acceleration penalty weight

base_height

float

-0.

Base height penalty weight

feet_air_time

float

0

Feet air time reward weight

collision

float

0

Collision penalty weight

feet_stumble

float

-0.0

Feet stumble penalty weight

action_rate

float

0

Action rate penalty weight

dof_pos_stand_still

float

-0.

Joint position penalty weight when standing still


commands Command Configuration#

Parameter

Type

Default

Description

curriculum

bool

False

Whether to use command curriculum

max_curriculum

float

1.

Maximum curriculum level

num_commands

int

4

Number of commands, default: lin_vel_x, lin_vel_y, ang_vel_yaw, heading

resampling_time

float

10.

Time interval for command resampling [s]

heading_command

bool

True

If True: compute angular velocity command from heading error

curriculum_threshold

float

0.8

Curriculum learning threshold, increase command range when tracking reward exceeds this threshold

zero_cmd_prob

float

0.4

Probability of sampling zero command during resampling, encourages robot to learn standing still behavior

ranges Command Ranges#

Parameter

Type

Default

Description

lin_vel_x

list

[-1.0, 1.0]

X-direction linear velocity range [m/s]

lin_vel_y

list

[-1.0, 1.0]

Y-direction linear velocity range [m/s]

ang_vel_yaw

list

[-1, 1]

Yaw angular velocity range [rad/s]

heading

list

[-3.14, 3.14]

Heading angle range [rad]


domain_rand Domain Randomization Configuration#

Parameter

Type

Default

Description

randomize_friction

bool

True

Whether to randomize rigid body friction coefficient

friction_range

list

[0.5, 1.25]

Friction coefficient randomization range

randomize_base_mass

bool

True

Whether to randomize base mass

added_mass_range

list

[-1., 1.]

Added mass range [kg]

push_robots

bool

True

Whether to apply random velocity perturbations to the base

push_interval_s

float

15

Time interval for pushing robots [s]

max_push_vel_xy

float

1.

Maximum push velocity in XY plane [m/s]

randomize_com_displacement

bool

True

Whether to randomize center of mass position to simulate modeling errors

com_pos_x_range

list

[-0.01, 0.01]

CoM x-position randomization range [m]

com_pos_y_range

list

[-0.01, 0.01]

CoM y-position randomization range [m]

com_pos_z_range

list

[-0.01, 0.01]

CoM z-position randomization range [m]

randomize_ctrl_delay

bool

False

Whether to apply random delay to actions to simulate control loop latency

ctrl_delay_step_range

list

[0, 1]

Control delay step range

randomize_pd_gain

bool

False

Whether to randomize PD gains by a scale factor

kp_range

list

[0.8, 1.2]

Kp scale range

kd_range

list

[0.8, 1.2]

Kd scale range

randomize_joint_armature

bool

False

Whether to randomize joint armature (significantly slows simulation in Genesis)

joint_armature_range

list

[0.0, 0.05]

Joint armature range [Nms/rad]

randomize_joint_friction

bool

False

Whether to randomize joint friction

joint_friction_range

list

[0.0, 0.1]

Joint friction range

randomize_joint_damping

bool

False

Whether to randomize joint damping

joint_damping_range

list

[0.0, 1.0]

Joint damping range

push_links

bool

False

Whether to apply random push forces to robot links

max_push_force

float

10.0

Maximum magnitude of random push force applied to each link [N]

push_links_interval_s

float

15.0

Time interval between random pushes [s]


normalization Normalization Configuration#

Parameter

Type

Default

Description

clip_observations

float

100.

Observation clipping range

clip_actions

float

100.

Action clipping range

obs_scales Observation Scales#

Parameter

Type

Default

Description

lin_vel

float

1.0

Linear velocity observation scale factor

ang_vel

float

0.25

Angular velocity observation scale factor

dof_pos

float

1.0

Joint position observation scale factor

dof_vel

float

0.05

Joint velocity observation scale factor

height_measurements

float

5.0

Height measurement observation scale factor


noise Noise Configuration#

Parameter

Type

Default

Description

add_noise

bool

True

Whether to add noise

noise_level

float

1.0

Noise scale factor

noise_scales Noise Scales#

Parameter

Type

Default

Description

dof_pos

float

0.01

Joint position noise scale

dof_vel

float

0.5

Joint velocity noise scale

lin_vel

float

0.1

Linear velocity noise scale

ang_vel

float

0.2

Angular velocity noise scale

gravity

float

0.05

Gravity observation noise scale

height_measurements

float

0.1

Height measurement noise scale


constraints Constraints Configuration#

Used for CaT (Constraints as Termination) method.

Parameter

Type

Default

Description

limits

class

-

Constraint limits configuration


viewer Viewer Configuration#

Parameter

Type

Default

Description

ref_env

int

0

Reference environment index

pos

list

[4.0, 4.0, 2.0]

Camera position relative to robot position [m]

lookat

list

[0., 0, 0.]

Point the camera looks at (relative to robot position) [m]

rendered_envs_idx

list

[0, 1, 2, 3, 4]

Genesis only: list of environment indices to render


sensor Sensor Configuration#

Parameter

Type

Default

Description

add_depth

bool

False

Whether to add depth camera

use_warp

bool

False

Whether to use warp-based model

depth_camera_config Depth Camera Configuration#

Parameter

Type

Default

Description

num_sensors

int

1

Number of sensors

num_history

int

1

History frames for depth images

near_clip

float

0.1

Near clipping plane

far_clip

float

10.0

Far clipping plane

near_plane

float

0.1

Near plane

far_plane

float

10.0

Far plane

resolution

tuple

(80, 60)

Image resolution

horizontal_fov_deg

float

75

Horizontal field of view [degrees]

pos

tuple

(0.3, 0.0, 0.1)

Camera position

euler

tuple

(0.0, 0.0, 0.0)

Camera Euler angles

decimation

int

5

Decimation factor

calculate_depth

bool

True

Warp only: whether to calculate depth

segmentation_camera

bool

False

Warp only: whether to use segmentation camera

return_pointcloud

bool

False

Warp only: whether to return point cloud

pointcloud_in_world_frame

bool

False

Warp only: whether point cloud is in world frame


sim Simulation Configuration#

Parameter

Type

Default

Description

dt

float

0.005

Simulation timestep, default 200Hz

substeps

int

1

Number of substeps

max_collision_pairs

int

100

Genesis only: maximum number of collision pairs, more collision pairs will occupy more GPU memory and slow down simulation

IK_max_targets

int

2

Genesis only: number of IK targets, fewer targets will reduce memory usage

gravity

list

[0., 0., -9.81]

IsaacGym only: gravity acceleration [m/s^2]

up_axis

int

1

IsaacGym only: up axis direction, 0 for y, 1 for z

use_gpu_pipeline

bool

True

IsaacGym only: whether to use GPU pipeline

physx PhysX Engine Configuration (IsaacGym only)#

Parameter

Type

Default

Description

use_gpu

bool

True

Whether to use GPU

num_subscenes

int

0

Number of subscenes

num_threads

int

10

Number of threads

solver_type

int

1

Solver type: 0 for pgs, 1 for tgs

num_position_iterations

int

4

Number of position iterations

num_velocity_iterations

int

0

Number of velocity iterations

contact_offset

float

0.01

Contact offset [m]

rest_offset

float

0.0

Rest offset [m]

bounce_threshold_velocity

float

0.5

Bounce threshold velocity [m/s]

max_depenetration_velocity

float

1.0

Maximum depenetration velocity

max_gpu_contact_pairs

int

2**23

Maximum GPU contact pairs

default_buffer_size_multiplier

int

5

Default buffer size multiplier

contact_collection

int

2

Contact collection mode: 0 never, 1 last substep, 2 all substeps


LeggedRobotCfgPPO#

PPO algorithm configuration class.

Basic Parameters#

Parameter

Type

Default

Description

seed

int

1

Random seed

runner_class_name

str

‘OnPolicyRunner’

Runner class name


policy Policy Configuration#

Parameter

Type

Default

Description

clip_actions

float

LeggedRobotCfg.normalization.clip_actions

Action clipping range

init_noise_std

float

1.0

Initial noise standard deviation

actor_hidden_dims

list

[512, 256, 128]

Actor network hidden layer dimensions

critic_hidden_dims

list

[512, 256, 128]

Critic network hidden layer dimensions

activation

str

‘elu’

Activation function, options: elu, relu, selu, crelu, lrelu, tanh, sigmoid

RNN Configuration (for ActorCriticRecurrent only)#

Parameter

Type

Default

Description

rnn_type

str

‘lstm’

RNN type

rnn_hidden_size

int

512

RNN hidden layer size

rnn_num_layers

int

1

Number of RNN layers


algorithm Algorithm Configuration#

Parameter

Type

Default

Description

value_loss_coef

float

1.0

Value loss coefficient

use_clipped_value_loss

bool

True

Whether to use clipped value loss

clip_param

float

0.2

PPO clipping parameter

entropy_coef

float

0.01

Entropy coefficient

num_learning_epochs

int

5

Number of learning epochs

num_mini_batches

int

4

Number of mini batches, mini batch size = num_envs * nsteps / nminibatches

learning_rate

float

1.e-3

Learning rate

schedule

str

‘adaptive’

Learning rate schedule: ‘adaptive’ or ‘fixed’

gamma

float

0.99

Discount factor

lam

float

0.95

GAE lambda parameter

desired_kl

float

0.01

Target KL divergence

max_grad_norm

float

1.

Maximum gradient clipping norm

use_spo

bool

False

Whether to use SPO (Simple Policy Optimization). Note: SPO may be incompatible with default PPO parameters, recommended: learning_rate=2.5e-4, schedule=’fixed’


runner Runner Configuration#

Parameter

Type

Default

Description

policy_class_name

str

‘ActorCritic’

Policy class name

algorithm_class_name

str

‘PPO’

Algorithm class name

num_steps_per_env

int

24

Number of steps per environment per iteration

max_iterations

int

1500

Maximum number of policy update iterations

sync_wandb

bool

False

Whether to sync logs to wandb

save_interval

int

50

Check for potential saves every this many iterations

experiment_name

str

‘test’

Experiment name

run_name

str

‘’

Run name

resume

bool

False

Whether to resume from checkpoint

load_run

int

-1

Run ID to load, -1 for last run

checkpoint

int

-1

Checkpoint ID, -1 for last saved model

resume_path

str

None

Resume path, updated from load_run and checkpoint