Sim Scene Configuration#
This page explains the simulation scene API in python/rcs/envs/scenes.py and shows how the example scene configs in python/rcs/envs/configs.py fit together.
If you are new to the frame conventions first, read RCS Conventions.
world frame
└── root frame
├── all composed scene objects live here
└── shared base frame
(same kinematic node as root frame)
(common robot-coordinate frame)
├── i-th robot base
│ └── i-th robot attachment_site
├── j-th robot base
│ └── j-th robot attachment_site
└── ...
The simple mental model#
When building an RCS sim scene, it helps to think about five frames:
World frame
The global MuJoCo frame.
This is the outermost reference for the whole scene.
Root frame
The scene-local frame for the composed robot setup.
In the usual composed setup, all robot-scene objects are placed from here.
Shared base frame
The common coordinate frame used for all robot actions and observations.
It is attached to the same kinematic node as the root frame, but represents the common robot coordinate convention.
i-th robot base frame
The base frame of one specific robot.
Low-level kinematics and Cartesian commands are expressed here.
i-th robot
attachment_siteThe end-effector mounting frame for that robot, before any optional
tcp_offset.Use this for wrist-mounted objects, cameras, and tools.
A good rule of thumb is:
want the outer global reference -> world frame
want the composed scene placement frame -> root frame
want one common coordinate frame for all robots -> shared base frame
want per-robot kinematics -> i-th robot base frame
want wrist or tool mounting -> i-th robot
attachment_site
If needed, extra world_frame_objects can still be placed directly in world coordinates.
How the main placement frames work#
For composed robot scenes, SimEnvCreator.create_model() combines three scene-placement transforms in this order:
robot2world = root_frame_to_world * shared_base_frame_to_root_frame * robot_to_shared_base_frame[robot_name]
In simple terms:
root_frame_to_worldplaces the whole rig into the MuJoCo worldshared_base_frame_to_root_framedefines the shared robot coordinate frame relative to the root framerobot_to_shared_base_frameplaces each robot relative to that shared robot frame
Single-robot intuition#
For a simple single-arm scene, all three can often be identity transforms:
root_frame_to_world = rcs.common.Pose()
shared_base_frame_to_root_frame = rcs.common.Pose()
robot_to_shared_base_frame = {"robot": rcs.common.Pose()}
That means the robot base, shared base, root frame, and world origin all coincide.
Dual-arm intuition#
The FR3 duo example in python/rcs/envs/configs.py uses these frames more meaningfully:
root_frame_to_worldkeeps the whole duo rig aligned with the worldshared_base_frame_to_root_framelifts the shared base to the duo mount heightrobot_to_shared_base_frameoffsets the left and right robots from the center
That is why the dual-arm example can expose one common action frame while still placing each robot correctly.
SimEnvCreatorConfig keys#
This is the main top-level config for the scene API.
Key |
What it controls |
Typical use |
|---|---|---|
|
Maps robot names to |
Define one robot or multiple named robots such as |
|
MuJoCo runtime settings ( |
Realtime, async control, frequency, convergence behavior |
|
Action representation |
For example |
|
Optional task-specific config |
Add pick/place or other task logic |
|
Base scene XML path or scene key |
Usually from |
|
Optional gripper config per robot |
Add one gripper per robot |
|
Optional camera config dictionary |
Define resolution, type, and frame rate for named cameras |
|
Relative action limit |
Limit per-step Cartesian delta |
|
Relative action reference |
Usually |
|
Per-robot offset relative to the shared base frame |
Multi-robot layouts |
|
Add gravity compensation to the composed scene |
Often useful for manipulation scenes |
|
Wrapper behavior flags |
Binary gripper, home-on-reset, depth output |
|
GUI toggle |
|
|
Offset from shared base frame to root frame |
Move shared command origin inside the rig |
|
Offset from root frame to MuJoCo world |
Place the whole setup in the room |
|
Use a pre-combined robot MJCF instead of composing robots one by one |
Advanced custom scenes |
|
Objects placed directly in world coordinates |
Loose props, room-fixed assets |
|
Objects placed in root-frame coordinates |
Tables, mounts, fixtures that should move with the rig |
|
Objects attached in a robot attachment-site frame |
Wrist mounts, end-effector payloads |
|
Cameras to add to the scene |
Fixed overhead cameras or wrist cameras |
|
Pose offsets for mounted grippers |
Align visual or tool frames |
|
Internal helper used after prefixing |
Usually ignore this in user code |
What usually lives inside the nested configs#
The scene config mostly wires together three lower-level config types.
robot_cfgs: dict[str, SimRobotConfig]#
Each robot entry usually defines things such as:
robot type
kinematic model path
attachment_sitetcp_offsetjoint names and actuator names
base link name
degrees of freedom, joint limits, and
q_home
The single-arm and dual-arm examples in python/rcs/envs/configs.py are good templates for this.
gripper_cfgs: dict[str, SimGripperConfig]#
Each gripper entry usually defines:
gripper type
gripper joint names and actuator name
min/max width or actuator range
collision geometry settings
callback timing
camera_cfgs: dict[str, SimCameraConfig]#
Each camera entry usually defines:
camera identifier
camera type
resolution
frame rate
A useful pattern is:
camera_cfgsdefines the camera runtime propertiescamera_addsdefines where that camera is placed in the scene
The most important nested configs#
WrapperConfig#
WrapperConfig controls behavior of the environment wrappers around the raw simulation.
Key |
Meaning |
|---|---|
|
If |
|
If |
|
If |
CameraAdderConfig#
CameraAdderConfig describes how a camera is added to the scene.
Key |
Meaning |
|---|---|
|
Optional camera XML asset to insert directly |
|
Camera field of view, used when creating a camera directly |
|
Camera pose offset |
|
Attachment site to use if mounted on a robot |
|
If set, mount the camera on that robot; otherwise add it as a scene camera |
The important frame detail is:
if
robot_nameis not set,offsetis interpreted in the root frame and then moved into world byroot_frame_to_worldif
robot_nameis set,offsetis interpreted relative to that robot’s attachment site
Easy examples#
Minimal single-robot scene#
This is the basic shape used by EmptyWorldFR3 in python/rcs/envs/configs.py:
cfg = SimEnvCreatorConfig(
robot_cfgs={"robot": robot_cfg},
sim_cfg=SimConfig(async_control=False, realtime=True, frequency=1),
control_mode=ControlMode.CARTESIAN_TQuat,
scene=SCENE_PATHS["empty_world"],
gripper_cfgs={"robot": gripper_cfg},
camera_cfgs={"bird_eye": bird_eye_cfg, "wrist": wrist_cfg},
robot_to_shared_base_frame={"robot": rcs.common.Pose()},
shared_base_frame_to_root_frame=rcs.common.Pose(),
root_frame_to_world=rcs.common.Pose(),
)
What this means in plain language:
there is one robot named
robotit uses Cartesian
tquatactionsthe base scene is the empty world
there is one gripper on the robot
there are two cameras
all high-level frames start at the same origin
Dual-arm scene#
This is the important part of the EmptyWorldFR3Duo example:
robot_cfgs = {"left": robot_cfg_left, "right": robot_cfg_right}
robot_to_shared_base_frame = {
"left": DEFAULT_TRANSFORMS["FR3_DUOMOUNT_LEFT_ROBOT"],
"right": DEFAULT_TRANSFORMS["FR3_DUOMOUNT_RIGHT_ROBOT"],
}
shared_base_frame_to_root_frame = DEFAULT_TRANSFORMS["FR3_DUOMOUNT_HEIGHT_OFFSET"]
root_frame_to_world = rcs.common.Pose()
In plain language:
the shared base frame sits at the logical center of the duo setup
the left and right robot bases are offset from that center
the whole setup can still be moved together by changing
root_frame_to_world
Object placement: which dictionary should I use?#
world_frame_objects#
Use this when the object should stay fixed in the room.
world_frame_objects = {
"cube": (OBJECT_PATHS["green_cube"], rcs.common.Pose(translation=np.array([0.5, 0.0, 0.2]))),
}
Example meaning: place a cube at a fixed world position.
root_frame_objects#
Use this when the object belongs to the rig and should move together with it.
root_frame_objects = {
"duo_mount": (OBJECT_PATHS["fr3_duo_mount"], DEFAULT_TRANSFORMS["FR3_DUOMOUNT_BASE"]),
}
Example meaning: the duo mount is part of the setup, not a free world object.
robot_frame_objects#
Use this when the object should be attached to one robot’s tool frame.
robot_frame_objects = {
"left": {
"left_d405_mount": (
OBJECT_PATHS["robotiq_d405_mount"],
DEFAULT_TRANSFORMS["FR3_ROBOTIQ_WRIST_D405_MOUNT"],
)
}
}
Example meaning: attach a wrist mount to the left robot only.
Camera depth units#
When depth is enabled, the camera wrapper exposes depth images as scaled metric depth:
sim depth is first converted to meters in
python/rcs/camera/sim.pycamera frames use
BaseCameraSet.DEPTH_SCALE = 1000depth is then stored as
uint16
So in practice:
divide by
1000to get metersor read the values directly as millimeters
Example:
depth[y, x] == 1500means the point is about 1.5 m away from the camera
Camera placement#
Fixed scene camera#
This pattern from EmptyWorldFR3 adds an overhead camera:
camera_adds = {
"bird_eye": CameraAdderConfig(
fovy=60.0,
offset=rcs.common.Pose(
translation=np.array([0.271, 0.0, 2.080]),
quaternion=np.array([0.0060, -0.0060, -0.7067, 0.7074]),
),
)
}
Because robot_name is not set, this pose is interpreted in the root frame.
Wrist camera#
This pattern mounts a camera to a robot:
camera_adds = {
"wrist": CameraAdderConfig(
fovy=60.0,
offset=some_pose,
robot_name="robot",
)
}
Because robot_name is set, offset is interpreted relative to that robot’s attachment site.
Common mistakes#
Mixing up world and root frame
If the whole rig should move together, use
root_frame_to_worldorroot_frame_objects, notworld_frame_objects.
Using the wrong frame for camera offsets
Scene cameras use root-frame offsets.
Robot-mounted cameras use attachment-site offsets.
Forgetting matching camera names
If a camera is added without
xml_path, its name must also exist incamera_cfgs.
Putting wrist-mounted assets into world objects
Use
robot_frame_objectsfor things that should follow the robot wrist.
Using
alternative_combined_robot_mjcfwithout the expected prefixesThe docstring in
scenes.pyrequires names likerobot{robot_name}.
A practical workflow#
When building a new scene, this usually works well:
Start with one robot and identity transforms.
Add
root_frame_objectsfor mounts or fixtures.Add
world_frame_objectsonly for room-fixed props.Add
robot_frame_objectsfor wrist payloads.Add cameras with
camera_adds.Only then introduce non-trivial
shared_base_frame_to_root_frameandrobot_to_shared_base_frameoffsets.
That order keeps the frame reasoning much easier.