WORLD · aws-gpu

BAGEL World Model

BAGEL is the image generation and editing foundation model option for producing pi0.7-style visual subgoal images.

Runtime Target

Benchmark the public ByteDance-Seed/BAGEL-7B-MoT checkpoint on a single Nebius H200, covering both reduced and full robot-camera subgoal generation.

Plan: MAX
Provider: nebius
Accelerator: nvidia-h200-141gb
Simulator: bagel-subgoal-image-generation
Embodiment: robot camera subgoal image
Placement: cloud-only

Status

Working Now

Catalog metadata, paid gating, autoscaling, and a bounded Nebius H200 benchmark workload are wired and validated for the public BAGEL base checkpoint.

Ready To Deploy

The lifecycle workload downloads ByteDance-Seed/BAGEL-7B-MoT with hf_transfer, runs image-conditioned subgoal generation across CFG, no-CFG, low-res, and TaylorSeer cases, writes timing metrics, and shuts the instance down.
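The workload described above can be sketched roughly as follows. This is a minimal illustration, not the actual lifecycle script: the `Case` fields, the step counts and resolutions in `CASES`, the `generate` callable, and the output path are all hypothetical placeholders. The only real library call is `huggingface_hub.snapshot_download`, and the `HF_HUB_ENABLE_HF_TRANSFER` environment variable is assumed to be honored by the installed `huggingface_hub` version (it also requires the `hf_transfer` package).

```python
import json
import os
import time
from dataclasses import asdict, dataclass

# Assumption: this switch must be set before huggingface_hub is imported.
os.environ.setdefault("HF_HUB_ENABLE_HF_TRANSFER", "1")

REPO_ID = "ByteDance-Seed/BAGEL-7B-MoT"


@dataclass
class Case:
    """One benchmark configuration (field names are hypothetical)."""
    name: str
    steps: int
    use_cfg: bool
    resolution: int
    taylorseer: bool


# Placeholder values; the real step counts and resolutions may differ.
CASES = [
    Case("cfg", steps=25, use_cfg=True, resolution=1024, taylorseer=False),
    Case("no_cfg", steps=25, use_cfg=False, resolution=1024, taylorseer=False),
    Case("low_res", steps=25, use_cfg=True, resolution=512, taylorseer=False),
    Case("taylorseer", steps=25, use_cfg=True, resolution=1024, taylorseer=True),
]


def download_checkpoint(local_dir):
    """Fetch the public checkpoint; imported lazily so the harness runs offline."""
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


def run_benchmark(generate, cases, out_path):
    """Time generate(case) for each case and write the results as JSON."""
    results = []
    for case in cases:
        start = time.perf_counter()
        generate(case)  # hypothetical: runs one subgoal image generation
        elapsed = time.perf_counter() - start
        results.append({**asdict(case), "seconds": round(elapsed, 3)})
    with open(out_path, "w") as f:
        json.dump(results, f, indent=2)
    return results
```

In a real run, `generate` would wrap the loaded model's image-conditioned inference call, and instance shutdown would follow once the metrics file is written. Both are left out here because they depend on deployment details not given on this page.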

Runtime Notes

Physical Intelligence fine-tuned its BAGEL-initialized world model on robot and egocentric video data; those weights are not public, so this page reports the public base-checkpoint benchmark.

Measured H200 Timing

Load: 6.617 s
Fastest 1 step: 2.138 s
Full 25 step: 5.233 s
TaylorSeer 25 step: 2.689 s
Peak memory: 30.678 GiB
GPU: NVIDIA H200
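As a quick reading aid, the two 25-step timings above can be compared directly. The sketch below uses only the figures reported on this page. The per-step values are simple averages of total wall time, so they include any fixed per-call overhead and should not be read as the marginal cost of one denoising step.

```python
# Timings copied from the measured H200 figures above (seconds).
FULL_25_STEP_S = 5.233
TAYLORSEER_25_STEP_S = 2.689
STEPS = 25


def speedup(baseline_s, accelerated_s):
    """Ratio of baseline wall time to accelerated wall time."""
    return baseline_s / accelerated_s


def mean_seconds_per_step(total_s, steps):
    """Average wall time per step, fixed overhead included."""
    return total_s / steps


if __name__ == "__main__":
    ratio = speedup(FULL_25_STEP_S, TAYLORSEER_25_STEP_S)
    print(f"TaylorSeer speedup over full 25 step: {ratio:.2f}x")
    print(f"Full, mean s/step: {mean_seconds_per_step(FULL_25_STEP_S, STEPS):.3f}")
    print(f"TaylorSeer, mean s/step: {mean_seconds_per_step(TAYLORSEER_25_STEP_S, STEPS):.3f}")
```

On these numbers the TaylorSeer case runs a little under twice as fast as the full 25-step case.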