SUT ID: LMBD260507
STAC-AI

STAC-AI Benchmark Results on Lambda 1-Click Cluster Cloud Instance with NVIDIA B200 SXM6 Blackwell Series GPUs

Type: Audited

Specs: STAC-AI™ LANG6

STAC recently completed a STAC-AI™ LANG6 (Inference-only) benchmark audit on a Lambda 1-Click Cluster (a Lambda Cloud virtual instance) powered by 8x NVIDIA B200 SXM6 Blackwell Series GPUs.

Stack Under Test (SUT):

STAC-AI™ LANG6 (Inference-Only) Pack for NVIDIA TensorRT-LLM (Rev D), 'rtx6000dev' branch, commit 8a410552, 12 March 2026
NVIDIA TensorRT-LLM 1.2.0rc2 (PyTorch backend)
NVIDIA TensorRT Model Optimizer 0.37.0 (NVFP4 quantization of Llama-3.1-8B-Instruct and Llama-3.1-70B-Instruct)
PyTorch 2.9.0a0+145a3a7bda.nv25.10; transformers 4.56.0; cuDNN 9.14.0; NCCL 2.27.7-1+cuda13.0; CUDA 13.0
Ubuntu 24.04.3 LTS; Podman 4.9.3 with NVIDIA Container Toolkit 1.18.1
Lambda 1-Click Cluster — 1 × 8×B200 SXM6 cloud node (Lambda Cloud virtual instance)
8 × NVIDIA B200 SXM6 Blackwell GPUs, each with 180 GiB HBM3e (NVIDIA driver 580.126.09)
104 physical Intel® Xeon® Platinum 8570 cores (208 logical) and 2.8 TiB of guest RAM
22 TB virtualized NVMe storage via BlueField-3 200GbE/NDR200 DPU networking

Key Results Summary:

Batch workloads
- 2.2x the throughput in the EDGAR4a small model test (52,823 vs. 23,607 words/s)¹
- 3.6x the throughput in the EDGAR4b large model test (12,040 vs. 3,351 words/s)²
- 2.3x the throughput in the EDGAR5a small model test (2,220 vs. 954 words/s)³
- 2.7x the throughput in the EDGAR5b large model test (350 vs. 132 words/s)⁴
Interactive workloads
- EDGAR4a: At 165 inf/s, this system achieved a Response⁵ time 11.15x faster and a Reaction⁶ time 5.09x faster
- EDGAR4b: At 20 inf/s, this system achieved a Response⁷ time 6.2x faster and a Reaction⁸ time 5.49x faster
- EDGAR5a: At 2.40 inf/s, this system achieved a Response⁹ time 10.94x faster and a Reaction¹⁰ time 4.5x faster
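
The batch multipliers above can be reproduced from the quoted throughput pairs. The following is a minimal sketch, not part of the STAC report: the names `batch_results` and `speedup` are illustrative, and the second value in each pair is assumed to be the comparison system identified in the corresponding footnote.

```python
# Throughput pairs in words/s as quoted in the batch results:
# (this system, comparison system). The comparison system is an assumption
# here; the report's footnotes identify it.
batch_results = {
    "EDGAR4a": (52_823, 23_607),
    "EDGAR4b": (12_040, 3_351),
    "EDGAR5a": (2_220, 954),
    "EDGAR5b": (350, 132),
}


def speedup(new: float, baseline: float) -> float:
    """Return the throughput ratio of the new system over the baseline."""
    return new / baseline


# Round to one decimal place, matching the precision used in the summary.
speedups = {
    name: round(speedup(new, baseline), 1)
    for name, (new, baseline) in batch_results.items()
}

for name, ratio in speedups.items():
    print(f"{name}: {ratio}x the throughput")
```

Rounded to one decimal place, the ratios agree with the 2.2x, 3.6x, 2.3x, and 2.7x figures listed under the batch workloads.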

The benchmark report is available to all STAC Observer members. STAC Insights subscribers gain access to detailed visualizations, configuration data, benchmark code, and the ability to run these tests in their own labs. Please log in to access the reports. For subscription options, contact us.

