Isaac Sim 4.5 Domain Randomization: Production Tutorial (2026)

Sim-trained robot policies still fall over the first time they meet a fluorescent tube they have not seen in simulation. That gap — the sim-to-real gap — is the single biggest reason promising RL policies die between a demo reel and a deployed cell. Isaac Sim 4.5 domain randomization, built on OpenUSD and the rebuilt omni.replicator.core graph, is the most direct lever we have to close it in 2026 without hand-labeling another terabyte of real images.

This tutorial is the one I wish I had when I first wired Isaac Lab 2.0 into a four-node H100 training cluster. It is hands-on: by the end you will have a reproducible Isaac Sim 4.5 project that randomizes lighting, materials, masses, friction, and sensor noise across 4,096 vectorized environments, exports a policy to ONNX, and validates it on a held-out sim-to-real bench. No marketing diagrams — just code, configs, and the gotchas that cost me two weekends so they do not cost you one.

What this post covers: the Isaac Sim 4.5 → Isaac Lab → policy stack, a working setup script, three real Replicator randomization recipes, training and validation on RTX GPUs, the trade-offs nobody puts in slide decks, and a deployment checklist.

Why Sim-to-Real Still Hurts in 2026

The sim-to-real gap is not one problem; it is four stacked problems. Visual gap — the renderer’s photons do not match the camera’s photons. Dynamics gap — PhysX assumptions about contact, friction, and motor torque differ from steel-on-steel reality. Sensor gap — your sim depth camera is noise-free; the real D455 is not. Distribution gap — the training scene is one of a thousand possible warehouses.

Domain randomization (DR), first formalized by Tobin et al. (2017) and refined through ADR (Akkaya et al., 2019) and DeepMind’s quadruped work, treats the real world as just another sample from a wide distribution the policy already saw. If the policy is robust to every lighting condition between 500 and 5,000 lux, the warehouse’s actual 1,800-lux fluorescents are inside the training distribution by construction.

Why Isaac Sim 4.5 specifically? Three reasons. First, the entire scene representation is OpenUSD — every material, light, and joint is a USD attribute that Replicator can rewrite per-frame without rebuilding the stage. Second, the renderer is RTX path-tracing with real-time fallback, so randomized materials produce photometrically plausible images, not flat-shaded approximations. Third, Isaac Lab 2.0 wraps the simulator in a GPU-tensor environment API that scales linearly to thousands of parallel envs on a single H100, making DR cheap enough to do at training scale rather than as a post-hoc data augmentation.

The version matters. Isaac Sim 4.5 ships on Kit 106.5, PhysX 5.4, and tightly couples with Isaac Lab 2.0. The Replicator graph was rewritten between 4.2 and 4.5 — recipes from older NVIDIA samples often break silently. This tutorial targets 4.5 exclusively.

A short history check helps frame why we are doing this here and not in MuJoCo or Gazebo. MuJoCo MJX is excellent for fast physics-only DR and is what you want if your policy is state-based. Gazebo (formerly Ignition) Fortress works for ROS-native teams who do not need photoreal visuals. Isaac Sim’s value proposition is the combination: photoreal rendering, GPU-batched PhysX, and a USD-native asset pipeline that survives going to a real PLM system. If your stack is already on OpenUSD, as much of the DCC ecosystem increasingly is in 2026, Isaac Sim is the option that does not force a re-authoring step.

System Overview: Isaac Sim, Isaac Lab, and the DR Loop

Before you write a single randomization line, it pays to internalize how the four layers — assets, simulator, environment manager, and policy trainer — actually fit together. The diagram below is the mental model I use every time I debug a stuck training run.

Assets live as .usd/.usdc files (scene), URDF/MJCF (robot), and MaterialX/MDL (surfaces). Everything is composed in an OpenUSD stage at load time. Isaac Sim 4.5 runs the stage: Kit 106.5 is the application shell, RTX is the renderer, PhysX 5.4 GPU is the dynamics engine, and omni.replicator.core is the randomization graph that writes USD attributes between simulation steps. Isaac Lab 2.0 wraps that simulator with three managers — InteractiveScene (USD + sensors), EnvManager (vectorized envs), and EventManager (hooks DR onto reset/interval). Policy training is a separate process (RSL-RL or RL Games) that consumes the GPU tensor batches Isaac Lab emits and runs PPO/SAC, exporting the final policy to ONNX/TensorRT for the real robot.

Replicator is the part most tutorials gloss over. It is not a Python loop that randomizes things; it is a scene graph that, once registered, fires automatically on triggers you define (reset, interval, frame). This matters because doing randomization in a naive env.reset() Python loop is 8–12× slower at 4,096 envs than letting Replicator’s graph batch-write USD attributes on the GPU.

Prerequisites

Isaac Sim 4.5 (workstation install or container — installation guide).
Isaac Lab 2.0 (GitHub; install via ./isaaclab.sh --install).
Python 3.10+ (3.10.12 tested; Kit 106.5 ships its own interpreter — do not mix system Python).
RTX GPU with ≥16 GB VRAM. Ampere or Ada strongly recommended; Turing works for small batches only. I run RTX 6000 Ada (48 GB) on the dev box and 2× H100 80 GB per worker in the cluster.
Linux (Ubuntu 22.04 LTS or 24.04). Windows works for development but headless training requires Linux + Vulkan. A Linux container (Enroot/Singularity) on a Slurm cluster is the production path.
NVIDIA driver ≥ 550.54 for the full Isaac Sim 4.5 RTX feature set.

Setup Script

Drop this in setup_env.sh at the project root and run once:

#!/usr/bin/env bash
set -euo pipefail

# 1. Clone Isaac Lab 2.0 and pin to the 4.5-compatible tag
git clone https://github.com/isaac-sim/IsaacLab.git
cd IsaacLab
git checkout v2.0.2  # 4.5-compatible tag

# 2. Symlink your Isaac Sim 4.5 install
ln -s "${HOME}/.local/share/ov/pkg/isaac-sim-4.5.0" _isaac_sim

# 3. Install Isaac Lab and dev extras into Kit's Python
./isaaclab.sh --install rsl_rl

# 4. Sanity check: render a single frame headless
./isaaclab.sh -p scripts/tutorials/00_sim/spawn_prims.py \
  --headless --enable_cameras

If the last command returns a .png and exits cleanly, you have a working install. If it segfaults at startup, you are almost certainly on a driver older than 550 or missing the Vulkan loader; fix that before going further.

Project Structure

Keep the project predictable. This is the layout I use across teams:

isaac-dr-tutorial/
├── envs/
│   ├── warehouse_pick/
│   │   ├── warehouse_pick_env_cfg.py   # env config (Isaac Lab cfg)
│   │   ├── mdp/                        # rewards, terminations, observations
│   │   └── usd/                        # USD scene + robot
├── randomization/
│   ├── lighting_dr.py
│   ├── material_dr.py
│   └── physics_dr.py
├── train/
│   ├── train.py                        # RSL-RL entrypoint
│   └── eval.py                         # sim-to-real eval harness
├── deploy/
│   ├── export_onnx.py
│   └── tensorrt_engine.py
└── setup_env.sh

randomization/ is where every DR recipe lives. Keep them as standalone modules — that way you can A/B test “lighting only” vs “lighting + materials” without surgery.

Step-by-Step: Three Domain Randomization Recipes

DR in Isaac Sim 4.5 happens in two places. Continuous parameters that PhysX or RTX cares about (mass, friction, light intensity, material albedo) are best set through Isaac Lab’s EventManager with Replicator distributions, because they need to be in sync with the GPU tensor view. One-shot stage randomizations (asset swaps, HDRI rotation) go directly into the Replicator graph.

The diagram above shows the full per-reset flow: sample DR parameters from the configured distributions, push them through the Replicator graph, render the new observation, and start the rollout. The key insight is that DR is a single graph evaluation, not 4,096 Python loops — that is what makes it tractable at scale.

Recipe 1: Lighting Randomization

Lighting is the highest-leverage DR knob. A policy that learned only one lighting condition will fail spectacularly under fluorescent flicker, sunlight through a skylight, or amber loading-dock lamps. The recipe below randomizes intensity, color temperature, and dome rotation per reset, and periodically swaps the HDRI among six environment maps (studio, warehouse fluorescent, outdoor noon, outdoor dusk, indoor warm, mixed specular).

# randomization/lighting_dr.py
import omni.replicator.core as rep
from isaaclab.utils import configclass
from isaaclab.managers import EventTermCfg, SceneEntityCfg
from isaaclab.envs.mdp import randomize_visual_light_attributes

@configclass
class LightingDRCfg:
    """Lighting DR registered with Isaac Lab's EventManager."""

    reset_dome_intensity = EventTermCfg(
        func=randomize_visual_light_attributes,
        mode="reset",
        params={
            "asset_cfg": SceneEntityCfg("dome_light"),
            "intensity_range": (500.0, 5000.0),         # lux
            "color_temperature_range": (3500.0, 7500.0),# Kelvin
            "rotation_range": ((0.0, 360.0),
                               (0.0, 360.0),
                               (0.0, 360.0)),
        },
    )


def build_lighting_replicator_graph(stage_dome_path: str = "/World/DomeLight"):
    """Scene-global HDRI swap via Replicator. Fires every 200 rendered frames."""
    hdri_paths = [
        "/Isaac/Environments/Studio/studio_4k.hdr",
        "/Isaac/Environments/Warehouse/warehouse_flicker.hdr",
        "/Isaac/Environments/Outdoor/sun_noon.hdr",
        "/Isaac/Environments/Outdoor/sun_dusk.hdr",
        "/Isaac/Environments/Indoor/warm_bulbs.hdr",
        "/Isaac/Environments/Mixed/spotlit_room.hdr",
    ]

    with rep.trigger.on_frame(interval=200):
        dome = rep.get.prim_at_path(stage_dome_path)
        with dome:
            rep.modify.attribute("inputs:texture:file",
                                 rep.distribution.choice(hdri_paths))
            rep.modify.attribute("inputs:intensity",
                                 rep.distribution.uniform(800.0, 4500.0))
    return rep.orchestrator

Two things to flag. randomize_visual_light_attributes is the Isaac Lab MDP helper that updates rasterizer lights efficiently per env. The Replicator graph above is scene-global: use it for HDRI swaps, which are too expensive to do per env. Mixing the two is what most production setups actually look like.

Recipe 2: Material Randomization

Material randomization is the second-biggest visual lever. The trick in 4.5 is to randomize on the MDL parameters directly, not by swapping entire material prims — the latter triggers a stage recomposition that nukes your training throughput.

# randomization/material_dr.py
import omni.replicator.core as rep
from isaaclab.utils import configclass
from isaaclab.managers import EventTermCfg, SceneEntityCfg
from isaaclab.envs.mdp import randomize_visual_material


@configclass
class MaterialDRCfg:
    randomize_object_material = EventTermCfg(
        func=randomize_visual_material,
        mode="reset",
        params={
            "asset_cfg": SceneEntityCfg("target_object",
                                        body_names=["box_.*"]),
            "albedo_range": ((0.05, 0.95),  # R
                             (0.05, 0.95),  # G
                             (0.05, 0.95)), # B
            "roughness_range": (0.1, 0.9),
            "metallic_range": (0.0, 1.0),
        },
    )


def build_material_replicator_graph(target_paths: list[str]):
    """Swap whole MDL materials on hero assets every 50 rendered frames.

    This is the expensive path (full material swaps), so keep the target list short.
    """
    with rep.trigger.on_frame(interval=50):
        prims = rep.get.prims(path_pattern="|".join(target_paths))
        with prims:
            rep.randomizer.materials(
                materials=rep.distribution.choice([
                    "/Isaac/Materials/Plastic/Plastic_Smooth.mdl",
                    "/Isaac/Materials/Metal/Brushed_Aluminum.mdl",
                    "/Isaac/Materials/Cardboard/Cardboard_Worn.mdl",
                    "/Isaac/Materials/Rubber/Rubber_Matte.mdl",
                ]),
                project_uvw=True,
            )
    return rep.orchestrator

Keep the lower bound of roughness_range above 0.0: perfectly mirror-like surfaces blow up RTX denoiser variance and your loss curve will look like static.

Recipe 3: Physics Randomization (Mass, Friction, Damping)

Visual DR closes the visual gap. Physics DR closes the dynamics gap — and is the recipe most often skipped because debugging it is harder. The block below randomizes mass, friction, and joint damping at reset on the manipulated object and the robot itself.

# randomization/physics_dr.py
from isaaclab.utils import configclass
from isaaclab.managers import EventTermCfg, SceneEntityCfg
from isaaclab.envs.mdp import (
    randomize_rigid_body_mass,
    randomize_rigid_body_material,
    randomize_actuator_gains,
)


@configclass
class PhysicsDRCfg:
    # Object mass: scale nominal by 0.8–1.2×, sampled uniformly in log space
    # (log-uniform balances relative changes; it does not make the extremes rarer)
    randomize_object_mass = EventTermCfg(
        func=randomize_rigid_body_mass,
        mode="reset",
        params={
            "asset_cfg": SceneEntityCfg("target_object",
                                        body_names=["box_.*"]),
            "mass_distribution_params": (0.8, 1.2),
            "operation": "scale",
            "distribution": "log_uniform",
            "recompute_inertia": True,
        },
    )

    # Friction: static + dynamic, sampled jointly
    randomize_contact_friction = EventTermCfg(
        func=randomize_rigid_body_material,
        mode="reset",
        params={
            "asset_cfg": SceneEntityCfg("robot",
                                        body_names=[".*_gripper_.*"]),
            "static_friction_range": (0.4, 1.4),
            "dynamic_friction_range": (0.3, 1.2),
            "restitution_range": (0.0, 0.1),
            "num_buckets": 64,
        },
    )

    # Actuator gains: ±20 % around URDF-declared stiffness/damping
    randomize_actuator_gains = EventTermCfg(
        func=randomize_actuator_gains,
        mode="reset",
        params={
            "asset_cfg": SceneEntityCfg("robot",
                                        joint_names=[".*"]),
            "stiffness_distribution_params": (0.8, 1.2),
            "damping_distribution_params": (0.8, 1.2),
            "operation": "scale",
            "distribution": "uniform",
        },
    )

The num_buckets=64 parameter on friction is non-obvious and important. PhysX 5.4 cannot have a unique friction value per env at scale; instead your range is sampled into N buckets and each env is assigned to one of them. 64 buckets across 4,096 envs gives you 64 envs per bucket on average: enough variety, cheap enough to run.
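To make the bucketing idea concrete, here is a minimal standalone sketch in plain Python. It is not the Isaac Lab implementation: the function name assign_friction_buckets is hypothetical, the random per-env assignment is an assumption about how buckets are handed out, and clamping dynamic friction to the static value is a choice made for this sketch only.

```python
import random
from collections import Counter


def assign_friction_buckets(num_envs, num_buckets, static_range, dynamic_range, seed=0):
    """Sample `num_buckets` friction pairs, then give each env one bucket at random."""
    rng = random.Random(seed)
    buckets = []
    for _ in range(num_buckets):
        static = rng.uniform(*static_range)
        # Sketch-only choice: never let dynamic friction exceed static friction.
        dynamic = min(rng.uniform(*dynamic_range), static)
        buckets.append((static, dynamic))
    # Each env draws a bucket index independently, so bucket sizes are uneven.
    bucket_ids = [rng.randrange(num_buckets) for _ in range(num_envs)]
    return buckets, bucket_ids


buckets, bucket_ids = assign_friction_buckets(
    num_envs=4096,
    num_buckets=64,
    static_range=(0.4, 1.4),
    dynamic_range=(0.3, 1.2),
)
counts = Counter(bucket_ids)
print(f"friction pairs in use: {len(counts)} of {len(buckets)}")
print(f"envs per bucket: mean {4096 / 64:.0f}, "
      f"min {min(counts.values())}, max {max(counts.values())}")
```

With random assignment the 64-envs-per-bucket figure is an average, so some buckets end up noticeably fuller than others. That is worth remembering when you slice validation results by friction value.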

Register all three configs in your EnvCfg.events. Isaac Lab’s EventManager will dispatch them in the order declared, so put physics before lighting if your reward depends on contact (so the friction bucket is set before the first physics step).

A worked example of how to attach all three to an environment config:

# envs/warehouse_pick/warehouse_pick_env_cfg.py
from isaaclab.envs import ManagerBasedRLEnvCfg
from isaaclab.utils import configclass
from randomization.lighting_dr import LightingDRCfg
from randomization.material_dr import MaterialDRCfg
from randomization.physics_dr import PhysicsDRCfg


@configclass
class EventCfg(LightingDRCfg, MaterialDRCfg, PhysicsDRCfg):
    """Order matters: fields are collected in reverse MRO, so the last base
    listed (physics) is declared first, then materials, then lights."""
    pass


@configclass
class WarehousePickEnvCfg(ManagerBasedRLEnvCfg):
    # ... scene, observations, actions, rewards, terminations ...
    events: EventCfg = EventCfg()

    def __post_init__(self):
        self.decimation = 4
        self.episode_length_s = 8.0
        self.sim.dt = 1.0 / 120.0
        self.sim.render_interval = self.decimation

The multiple inheritance pattern is idiomatic Isaac Lab — each DR config contributes its event terms to the combined EventCfg, and the EventManager auto-discovers them via the EventTermCfg annotations. This is the part of the API that finally clicked after I stopped trying to register events imperatively.
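Because dispatch order follows the order in which the combined class declares its terms, it is worth checking that order rather than assuming it. The sketch below uses plain dataclasses with hypothetical stand-in classes, under the assumption that configclass orders fields the same way @dataclass does. Standard dataclasses collect inherited fields in reverse MRO, so the base listed last contributes its fields first.

```python
from dataclasses import dataclass, fields


@dataclass
class PhysicsTerms:
    randomize_object_mass: str = "physics"


@dataclass
class MaterialTerms:
    randomize_object_material: str = "material"


@dataclass
class LightingTerms:
    reset_dome_intensity: str = "lighting"


@dataclass
class PhysicsListedFirst(PhysicsTerms, MaterialTerms, LightingTerms):
    pass


@dataclass
class PhysicsListedLast(LightingTerms, MaterialTerms, PhysicsTerms):
    pass


def declared_order(cfg_cls):
    """Return the field defaults in the order the dataclass declares them."""
    return [f.default for f in fields(cfg_cls)]


print("physics listed first ->", declared_order(PhysicsListedFirst))
print("physics listed last  ->", declared_order(PhysicsListedLast))
```

Running the same check on your real EventCfg (for example, by printing the keys of an instance's __dict__) confirms which term the manager will see first.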

Training, RTX Performance, and Sim-to-Real Validation

With the three recipes wired in, training is largely the standard Isaac Lab loop — but with a few RTX-specific knobs that materially change throughput.

The sequence above is what happens per rollout step at 4,096 envs. The two dominant costs are PhysX step and RTX render. On a single H100 80 GB I see ~85k env-steps/s at 4,096 envs when rendering 64×64 RGB observations, dropping to ~32k env-steps/s at 256×256. For most manipulation tasks 84×84 is sufficient (DeepMind precedent) and keeps training under 12 hours for 2B steps.
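The throughput figures translate directly into wall-clock budgets. A small arithmetic sketch, assuming constant throughput for the whole run (real runs lose time to policy updates, checkpointing, and logging); the helper names are mine:

```python
def training_hours(total_env_steps: float, env_steps_per_second: float) -> float:
    """Wall-clock hours to collect `total_env_steps` at a constant throughput."""
    return total_env_steps / env_steps_per_second / 3600.0


def min_throughput(total_env_steps: float, budget_hours: float) -> float:
    """Throughput (env-steps/s) needed to finish within `budget_hours`."""
    return total_env_steps / (budget_hours * 3600.0)


TOTAL_STEPS = 2_000_000_000  # the 2B-step budget quoted above

for label, rate in (("64x64", 85_000), ("256x256", 32_000)):
    print(f"{label}: {training_hours(TOTAL_STEPS, rate):.1f} h for 2B steps")

print(f"needed for a 12 h run: {min_throughput(TOTAL_STEPS, 12):,.0f} env-steps/s")
```

At the quoted rates, 64×64 lands around 6.5 hours and 256×256 around 17.4 hours, so staying under 12 hours means sustaining roughly 46k env-steps/s or better.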

Three settings that matter for RTX throughput in 4.5:

sim.render.rendering_mode = "performance" — disables path tracing, uses the real-time renderer. Path tracing is for hero shots, not training. Performance mode is 3–4× faster and visually plausible enough for policy learning.
sim.render.antialiasing_mode = "DLSS" with dlss_mode = "balanced" — lets RTX render internally at lower resolution and upscale. For 256×256 final observations this is ~20 % wallclock saving with no measurable policy quality loss.
sim.physx.gpu_max_rigid_contact_count — bump from default 524,288 to 2,097,152 if you see PhysX: contact buffer overflow warnings. They silently corrupt episodes.

Validation must happen on a held-out DR seed set. Pick 256 seeds you never trained on, run 50 episodes each, and report success rate, mean episode reward, and 95th-percentile time-to-completion. Anything less measures how well the policy memorized its training seeds, not how well it generalizes.
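The three reported numbers are simple to compute once episode results are collected. A minimal sketch, assuming each episode is a dict with hypothetical keys success, reward, and time_s, and using the nearest-rank definition of a percentile (other definitions interpolate and give slightly different values):

```python
import math
from statistics import mean


def percentile(values, q):
    """Nearest-rank percentile: smallest value with at least q% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


def summarize(episodes):
    """Summarize episodes given as dicts with 'success', 'reward', and 'time_s'."""
    successes = [e for e in episodes if e["success"]]
    return {
        "success_rate": len(successes) / len(episodes),
        "mean_reward": mean(e["reward"] for e in episodes),
        # Completion time is only defined for episodes that finished the task.
        "p95_time_s": percentile([e["time_s"] for e in successes], 95) if successes else None,
    }


# Synthetic stand-in data: every 10th episode fails, times grow with the index.
episodes = [
    {"success": i % 10 != 0, "reward": float(i % 5), "time_s": 2.0 + 0.1 * i}
    for i in range(100)
]
print(summarize(episodes))
```

Computing the percentile over successful episodes only is a design choice; if you prefer to penalize failures, assign them the episode timeout instead and say so in the report.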

The final sim-to-real step is straightforward but discipline-heavy. Export with torch.onnx.export (opset 17, dynamic batch axis), build a TensorRT engine on the deployment target (Jetson AGX Orin for most robots in 2026), and run a 100-trial bench on the physical robot in the same configuration. If your sim success rate is 95 % and your real success rate is < 70 %, the gap is almost always (in this order): real camera intrinsics not replicated in the sim camera, gripper friction range too narrow, or lighting distribution missing the actual deployment lighting. Fix in that order.

A minimal training launcher that ties it together:

./isaaclab.sh -p train/train.py \
  --task Isaac-WarehousePick-v0 \
  --num_envs 4096 \
  --headless \
  --enable_cameras \
  --max_iterations 6000 \
  --seed 42 \
  --experiment_name dr_v1_lighting_material_physics \
  --logger wandb \
  --log_project_name isaac-dr-tutorial

Two things worth checking on the first run: the GPU utilization should be > 85 % on each H100 (if not, you are either disk-bound on USD loading — fix by warming the asset cache — or stuck on Python overhead, fix by raising num_envs), and the W&B episode/success_rate curve should climb out of zero within the first 200 iterations. If it stays at zero, your reward is wrong, not your DR.

Trade-offs and Gotchas

Domain randomization is not free. Over-randomization is the failure mode nobody talks about: if your distribution is so wide that every episode looks like a different planet, the policy converges to the safest, slowest, most generic motion that “works on average” and fails on every specific real-world instance. Symptom: training reward plateaus low and never recovers. Fix: tighten ranges, especially friction and mass.

GPU memory is the second wall. Each env at 256×256 RGB consumes ~200 KB of observation buffer per step; at 4,096 envs and 64 steps per rollout the rollout buffer alone is ~50 GB before policy gradients. Run nvidia-smi --query-gpu=memory.used --format=csv -lms 500 during the first rollout; if you hit > 90 %, drop to 2,048 envs before OOM kills the run silently at update 18.
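The buffer estimate is plain arithmetic and easy to rerun for your own resolution. A sketch assuming observations are stored as uint8 (one byte per channel value); storing them as float32 multiplies every figure by four:

```python
def rollout_obs_bytes(height, width, channels, num_envs, steps, bytes_per_value=1):
    """Bytes needed to hold image observations for one rollout."""
    return height * width * channels * bytes_per_value * num_envs * steps


per_env_step = rollout_obs_bytes(256, 256, 3, num_envs=1, steps=1)
full = rollout_obs_bytes(256, 256, 3, num_envs=4096, steps=64)
halved = rollout_obs_bytes(256, 256, 3, num_envs=2048, steps=64)

print(f"per env, per step: {per_env_step / 1e3:.0f} KB")
print(f"4,096 envs x 64 steps: {full / 1e9:.1f} GB")
print(f"2,048 envs x 64 steps: {halved / 1e9:.1f} GB")
```

Halving the env count halves the buffer, which is why dropping to 2,048 envs is the first thing to try when memory climbs past 90 %.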

Headless mode quirks in 4.5: the --headless flag still requires --enable_cameras if you want any camera observations rendered, and the two together require Vulkan with the swrast fallback disabled (VK_LOADER_DRIVERS_DISABLE=swrast). On a fresh Ubuntu cluster image this is the single most common failure — symptom is a job that prints “headless OK” and then renders solid black frames forever.

Replicator API gotchas: in 4.5, rep.modify.attribute() on a USD float attribute will silently no-op if the type does not match — pass float not np.float32. And rep.trigger.on_frame(interval=N) counts render frames, not physics steps; with physics_steps_per_render = 4 your interval is effectively 4× longer than it looks.
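Converting a trigger interval into physics steps and simulated seconds makes the second gotcha tangible. This sketch assumes four physics steps per rendered frame and a physics dt of 1/120 s, matching the env config earlier; the helper name trigger_period is mine:

```python
def trigger_period(interval_frames, physics_steps_per_render, physics_dt):
    """Return (physics steps, simulated seconds) between two trigger firings."""
    physics_steps = interval_frames * physics_steps_per_render
    return physics_steps, physics_steps * physics_dt


EPISODE_LENGTH_S = 8.0

for name, interval in (("HDRI swap", 200), ("material swap", 50)):
    steps, seconds = trigger_period(
        interval, physics_steps_per_render=4, physics_dt=1.0 / 120.0
    )
    print(f"{name}: every {steps} physics steps = {seconds:.2f} s "
          f"({EPISODE_LENGTH_S / seconds:.1f} firings per episode)")
```

With these numbers the HDRI changes roughly once per 8-second episode and the hero materials change four to five times, which may or may not be what you intended when you typed 200 and 50.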

Practical Recommendations

A short, opinionated checklist before you turn on the cluster:

Start narrow, widen weekly. Begin with 50 % of your intended DR range. Train, validate, widen. Automatic domain randomization (ADR) is overkill for most pick-place tasks; manual widening is fine.
Always randomize lighting, materials, mass, friction. Skip any of these and your sim-to-real gap will dominate your error budget.
Never randomize a parameter beyond how much it actually varies on the real robot. A 5 % motor torque variation on a Franka is realistic; a 200 % one is fantasy and hurts the policy.
Validate on 256 held-out seeds, every checkpoint. Save the wandb run; you will need it for the post-mortem.
Performance-mode rendering for training, path-traced for eval visuals only.
Pin every Isaac Sim and Isaac Lab version in requirements.txt; version bumps can break Replicator graphs, as the 4.2 → 4.5 rewrite did.
Containerize. Build an Enroot/Singularity image with Isaac Sim 4.5 baked in; commit the .sqsh hash to git.
Log DR seeds per episode. When the policy fails on the real robot, you want to replay the closest sim episode.
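For the last item, one way to make episodes replayable is to derive every episode's DR parameters from a (run seed, episode index) pair and log them as JSON lines. The sketch below is standalone Python, not Isaac Lab code: DR_RANGES, the function names, and the range-normalized distance are all illustrative assumptions, and in a real run the sampling happens inside the event terms on the GPU.

```python
import json
import random

# Hypothetical subset of DR parameters and their (low, high) ranges.
DR_RANGES = {
    "light_intensity": (500.0, 5000.0),
    "static_friction": (0.4, 1.4),
    "mass_scale": (0.8, 1.2),
}


def sample_dr_params(run_seed, episode_index):
    """Deterministically sample one episode's DR parameters."""
    # String seeds are hashed reproducibly, so any process can replay a draw.
    rng = random.Random(f"{run_seed}:{episode_index}")
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in DR_RANGES.items()}


def log_record(run_seed, episode_index):
    """One JSON line per episode: enough to replay it later."""
    return json.dumps({
        "run_seed": run_seed,
        "episode": episode_index,
        "params": sample_dr_params(run_seed, episode_index),
    })


def normalized_distance(params, measured):
    """Squared distance over the measured keys, each scaled by its DR range width."""
    return sum(
        ((params[k] - measured[k]) / (DR_RANGES[k][1] - DR_RANGES[k][0])) ** 2
        for k in measured
    )


def closest_episode(records, measured):
    """Logged episode whose parameters are nearest to the real measurements."""
    return min(records, key=lambda r: normalized_distance(r["params"], measured))


records = [json.loads(log_record(42, i)) for i in range(1000)]
measured = {"light_intensity": 1800.0, "static_friction": 0.9}
best = closest_episode(records, measured)
print(best["episode"], best["params"])
```

Because the parameters are a pure function of the seed pair, the log only needs the two integers; the params field is there for searching without re-sampling.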

The cluster topology above is what a production setup looks like: dev workstation pushes USD to a Nucleus/S3 registry, head node shards envs across 4× workers each with 2× H100, gradients aggregate at the head, best checkpoint goes to W&B Artifacts, eval harness pulls the ONNX/TensorRT engine to the physical robot. Anything less than this is fine for a demo and not enough for a production deployment.

FAQ

Isaac Sim vs Isaac Lab — which do I install?

Both. Isaac Sim 4.5 is the simulator (Kit + PhysX + RTX + Replicator). Isaac Lab 2.0 is the GPU-tensor RL environment framework that uses Isaac Sim under the hood. For RL training you almost always interact with Isaac Lab’s ManagerBasedRLEnv API and never touch Isaac Sim directly. You still install Isaac Sim first; Isaac Lab symlinks it.

Is an RTX GPU strictly required?

For training, effectively yes. The renderer and PhysX GPU pipeline are CUDA-only and tuned for Ampere/Ada/Hopper. A non-RTX GPU (e.g., A100) works for headless physics-only training, but you lose camera observations. For deployment, the policy is just an ONNX or TensorRT model and runs on any CUDA target including Jetson Orin.

Can I run Isaac Sim 4.5 in a Linux container?

Yes. NVIDIA publishes official Isaac Sim 4.5 containers on NGC. You need nvidia-container-toolkit, a host driver ≥ 550, and the Vulkan loader inside the container. Enroot is the production choice on Slurm; Docker is fine for dev. Headless rendering works; GUI inside a container is possible via WebRTC streaming but is not the path I recommend.

How many DR samples / how wide should the distribution be?

There is no closed-form answer, but as a rule of thumb: pick the widest plausible real-world range (e.g., warehouse lighting is realistically 300–4,000 lux), then expand by 30 % to cover edge conditions you cannot predict. For sample count, train until validation success rate on held-out seeds plateaus — that is when the distribution is “covered”. For most manipulation tasks this is 1–2B env-steps.
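"Expand by 30 %" can be read more than one way. The sketch below picks a multiplicative reading (divide the lower bound and multiply the upper bound by 1.3), which keeps a positive quantity such as lux positive. It also shows one possible reading of starting at 50 % of the range, namely keeping the central half of the width. Both readings and both helper names are assumptions of this sketch.

```python
def expand_range(low, high, fraction):
    """Widen a positive range multiplicatively on both ends."""
    if not 0 < low < high:
        raise ValueError("expected 0 < low < high")
    return low / (1.0 + fraction), high * (1.0 + fraction)


def starting_range(low, high, keep=0.5):
    """Keep the central `keep` share of the range's width."""
    center = (low + high) / 2.0
    half_width = (high - low) / 2.0 * keep
    return center - half_width, center + half_width


# Plausible warehouse lighting from the rule of thumb above, in lux.
target = expand_range(300.0, 4000.0, 0.30)
first_pass = starting_range(*target, keep=0.5)

print(f"target DR range:   {target[0]:.0f} to {target[1]:.0f} lux")
print(f"first-pass range:  {first_pass[0]:.0f} to {first_pass[1]:.0f} lux")
```

Under these readings the target range is about 231 to 5,200 lux and the first pass trains on about 1,473 to 3,958 lux, which already contains an 1,800-lux deployment site.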

My sim-to-real gap is still large after DR. What now?

In this order: (1) verify camera intrinsics match — most teams skip this and pay for it; (2) verify gripper friction range covers the actual surface — measure with a force gauge; (3) add sensor-noise randomization (Gaussian on RGB, dropout on depth); (4) consider system identification on the robot’s motor model and tighten the actuator DR around the measured values. Wider DR is rarely the answer once you are past the obvious wins.
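Step (3) is cheap to prototype outside the simulator before wiring it into an observation term. A minimal sketch on flat Python lists, assuming 8-bit RGB values and a depth sensor that reports 0.0 for invalid pixels; the sigma and dropout probability are placeholders to be fitted against captures from your actual camera:

```python
import random


def add_rgb_noise(pixels, sigma, rng):
    """Add zero-mean Gaussian noise to 8-bit values and clamp to [0, 255]."""
    return [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in pixels]


def drop_depth(depths_m, drop_prob, rng, invalid=0.0):
    """Replace each depth reading with `invalid` with probability `drop_prob`."""
    return [invalid if rng.random() < drop_prob else d for d in depths_m]


rng = random.Random(0)
rgb = [128] * 12        # stand-in for a flattened RGB patch
depth = [1.5] * 1000    # stand-in for a flattened depth image, in meters

noisy_rgb = add_rgb_noise(rgb, sigma=5.0, rng=rng)
sparse_depth = drop_depth(depth, drop_prob=0.05, rng=rng)

dropped = sum(1 for d in sparse_depth if d == 0.0)
print(noisy_rgb)
print(f"dropped {dropped} of {len(sparse_depth)} depth readings")
```

In training you would apply the same two operations to batched GPU tensors and randomize sigma and drop_prob per episode, so the noise level itself becomes part of the DR distribution.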

Does DR work for non-vision policies (state-based)?

Yes, but the win shrinks. Without images you skip lighting and materials entirely; physics DR alone closes a meaningful but smaller gap (~30–50 % of the total in my experience). State-based policies trained with physics DR transfer reasonably well to real robots that share the same observation space, but the moment you add a camera at deploy time, the visual gap that DR could have handled bites you. Plan the observation modality at the start.