verl-e2e-testing

General · 0 installs · Updated 19d ago
Verified · Curated · NVIDIA

SKILL.md preview

---
name: verl-e2e-testing
description: External verl end-to-end validation workflow for Megatron-Bridge model/provider changes. Covers running a small verl Megatron backend job from a Bridge checkout, choosing LoRA/DDP plus optional save/resume and parallelism variants, setting PYTHONPATH so verl imports the local Bridge tree, and reporting pass/fail evidence.
when_to_use: Adding or changing a Megatron-Bridge model/provider and needing downstream verl compatibility validation; checking non-vanilla Bridge provider paths; testing PEFT/LoRA, DDP, checkpoint behavior, or explicitly requested advanced variants through verl; 'does this model work in verl', 'run verl e2e', 'external RL loop validation'.
---

# verl E2E Testing

Validate a Megatron-Bridge model addition through verl's Megatron backend. This catches integration issues that Bridge-only conversion tests miss: provider configuration, HF import through Bridge, PEFT wrapping, DDP wrapping, optimizer setup, rollout/ref wiring, and checkpoint ownership by an external RL loop.

Use this as an external compatibility smoke test after the Bridge unit and functional tests for a new model provider are green.

This is not a replacement for Bridge model parity tests. The default verl PPO run proves that the provider can survive an external RL training loop; architecture-specific correctness still comes from Bridge import/export, logits/roundtrip, and model-specific inference tests.

## Scope

Think in coverage levels. Start with Level 0 and add only the levels justified by the change.

| Level | Required when | What it proves |
|---|---|---|
| 0: LoRA + DDP smoke | Any new provider or provider config change that claims verl compatibility | verl can import the local Bridge provider, apply PEFT, wrap with Megatron DDP, build optimizer state, run rollout/ref/critic wiring, and finish one PPO step |
| 1: Save/resume | PEFT, checkpointing, HF export, adapter export, optimizer state, or resume behavior changed | verl-owned checkpoint scheduling can save and reload Bridge-built model state |
| 2: Parallelism stress | Provider finalization, mpu-derived settings, TP/PP/CP/EP, sequence parallel, or dispatcher behavior changed | provider settings remain correct under non-trivial Megatron parallel state |
| 3: Optional Megatron-FSDP | Only when downstream explicitly asks for verl Megatron-FSDP coverage or the change directly touches that integration path | the same provider works when verl selects Megatron-FSDP instead of DDP |
| 4: Architecture-specific e2e | VLM, MoE, MTP, QAT/ModelOpt, quantized weights, or custom layer behavior is involved | the part of the architecture not exercised by text-only GSM8K also has a targeted runtime check |
| 5: Convergence / learning signal | Optimizer, scheduler, loss, reward, PEFT trainability, gradient flow, or model-specific training stability changed | metrics move in the expected direction over a short run and do not silently produce zero/NaN/unstable updates |
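The table can be read as a lookup from "what changed" to "which levels to run". The following is a minimal sketch of that lookup, not part of verl or Megatron-Bridge: the function name `select_levels` and the change-area keywords are hypothetical, chosen only to mirror the "Required when" column.

```bash
# Hypothetical helper: map change-area keywords to the coverage levels above.
# The keywords are illustrative assumptions, not verl or Bridge identifiers.
select_levels() {
  local levels area
  levels="0"   # Level 0 is always the starting point
  for area in "$@"; do
    case "$area" in
      peft|checkpoint|export|resume)    levels="$levels 1" ;;
      tp|pp|cp|ep|sequence-parallel)    levels="$levels 2" ;;
      fsdp)                             levels="$levels 3" ;;
      vlm|moe|mtp|qat|quantized)        levels="$levels 4" ;;
      optimizer|scheduler|loss|reward)  levels="$levels 5" ;;
      *) echo "unknown change area: $area" >&2; return 1 ;;
    esac
  done
  # One level per line, sorted numerically and de-duplicated,
  # then joined back onto a single line by xargs.
  echo "$levels" | tr ' ' '\n' | sort -un | xargs
}
```

For example, `select_levels checkpoint tp` prints `0 1 2`: the smoke test plus save/resume and parallelism stress, and nothing else.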

The default Level 0 target is a short, non-vanilla Bridge run in verl with LoRA enabled and Megatron DDP selected:

```bash
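USE_MBRIDGE=True              # build the model through Megatron-Bridge
VANILLA_MBRIDGE=False         # take the non-vanilla Bridge path
VALUE_VANILLA_MBRIDGE=False   # same choice for the value (critic) model, judging by the name
LORA_RANK=4                   # enable LoRA with a small rank
USE_MEGATRON_FSDP=False       # select Megatron DDP rather than Megatron-FSDP
TOTAL_TRAIN_STEPS=1           # stop after one PPO step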
```

This is intentionally small. It exercises the Bridge-facing path in verl without making Megatron-Bridge own rollout scheduling, reward handling, optimizer scheduling, or checkpoint orchestration.

Level 0 is not a convergence test. It only proves the training loop can complete one update. Use Level 5 when the question is whether the model actually learns under verl.
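Because Level 0 depends on a specific combination of settings, it can help to confirm the environment before launching. The sketch below exports the Level 0 values listed above and checks them. The `check_level0` function is hypothetical, and treating any positive integer `LORA_RANK` as "LoRA enabled" is an assumption rather than documented verl behavior.

```bash
# Level 0 settings from this section, exported so a launch script inherits them.
export USE_MBRIDGE=True
export VANILLA_MBRIDGE=False
export VALUE_VANILLA_MBRIDGE=False
export LORA_RANK=4
export USE_MEGATRON_FSDP=False
export TOTAL_TRAIN_STEPS=1

# Hypothetical preflight: return non-zero if the environment drifted from Level 0.
check_level0() {
  local failed pair name want got
  failed=0
  for pair in USE_MBRIDGE=True VANILLA_MBRIDGE=False \
              VALUE_VANILLA_MBRIDGE=False USE_MEGATRON_FSDP=False \
              TOTAL_TRAIN_STEPS=1; do
    name="${pair%%=*}"
    want="${pair#*=}"
    # Portable indirect lookup; the names come from the fixed list above.
    eval "got=\${$name:-}"
    if [ "$got" != "$want" ]; then
      echo "Level 0 mismatch: $name='$got' (expected '$want')" >&2
      failed=1
    fi
  done
  # Assumption: a positive integer rank means LoRA is enabled.
  case "${LORA_RANK:-}" in
    ''|0|*[!0-9]*)
      echo "Level 0 mismatch: LORA_RANK must be a positive integer" >&2
      failed=1
      ;;
  esac
  return "$failed"
}
```

Calling `check_level0` immediately before the launch command reports every mismatched variable on stderr rather than stopping at the first one, so a stale `USE_MEGATRON_FSDP=True` left over from a Level 3 run shows up in the same pass as any other drift.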

Megatron-FSDP is not part of the default validation expected for current provider compatibility work. Run it only for Level 3 coverage when FSDP is explicitly in scope:

```bash
USE_MEGATRON_FSDP=True
ALL_OFFLOAD=False
COMMON_PP=1
COMMON_VPP=null
COMMON_CP=1
COMMON_TP=1
INFER_TP=1
```
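Since Level 3 is opt-in, one way to keep the FSDP overrides from leaking into later Level 0 runs is to apply them to a single command only. This is a sketch under that assumption: `level3_env` and `with_level3` are hypothetical helper names, and the command passed to `with_level3` stands in for whatever launch script is actually used.

```bash
# Hypothetical helper: print the Level 3 overrides as NAME=VALUE lines.
level3_env() {
  cat <<'EOF'
USE_MEGATRON_FSDP=True
ALL_OFFLOAD=False
COMMON_PP=1
COMMON_VPP=null
COMMON_CP=1
COMMON_TP=1
INFER_TP=1
EOF
}

# Hypothetical helper: run one command with the Level 3 overrides applied.
# The substitution is deliberately unquoted so each NAME=VALUE line becomes
# its own argument to env; none of the values contain whitespace.
with_level3() {
  env $(level3_env) "$@"
}
```

A call such as `with_level3 sh -c 'echo "$USE_MEGATRON_FSDP"'` prints `True`, while the calling shell keeps its own value. Other exported settings, such as `LORA_RANK`, are inherited unchanged.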

## Repos

Use explicit repo variables. Do not rely on an installed `megatron-bridge` wheel; the purpo