Examples

This section provides practical examples of using LeaderWorkerSet (LWS) in various scenarios, both with and without a specific inference runtime:

Infrastructure Examples

  • Horizontal Pod Autoscaler (HPA) - Autoscale LWS replicas based on resource metrics

Inference Runtime Examples

  • vLLM - Deploy distributed inference with vLLM on GPUs/TPUs
  • TensorRT-LLM - High-performance inference with TensorRT-LLM
  • SGLang - Fast inference with SGLang's structured generation runtime
  • llama.cpp - CPU-based inference with llama.cpp

Each example includes detailed configuration files, deployment instructions, and best practices for production use.
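As a shared reference for the examples below, a minimal LeaderWorkerSet manifest follows the shape sketched here. The name, group size, and image tags are illustrative placeholders, not taken from any specific example:

```yaml
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: example-lws        # placeholder name
spec:
  replicas: 2              # number of leader/worker groups
  leaderWorkerTemplate:
    size: 4                # pods per group, including the leader
    leaderTemplate:        # optional; workerTemplate is used for all pods if omitted
      spec:
        containers:
        - name: leader
          image: example/leader:latest   # placeholder image
    workerTemplate:
      spec:
        containers:
        - name: worker
          image: example/worker:latest   # placeholder image
```

Each example then customizes the leader and worker pod templates for its runtime.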


vLLM

An example of using vLLM with LWS
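A common pattern for distributed vLLM on LWS — sketched here, not the exact manifest from the example — is to start a Ray head plus the vLLM server on the leader and have each worker join the Ray cluster. The model name, image tag, parallelism degree, and GPU counts below are illustrative assumptions:

```yaml
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: vllm
spec:
  replicas: 1
  leaderWorkerTemplate:
    size: 2   # leader + 1 worker
    leaderTemplate:
      spec:
        containers:
        - name: vllm-leader
          image: vllm/vllm-openai:latest   # illustrative tag
          command: ["sh", "-c"]
          args:
          - ray start --head --port=6379 &&
            vllm serve meta-llama/Llama-2-7b-hf --tensor-parallel-size 2
          resources:
            limits:
              nvidia.com/gpu: "1"
    workerTemplate:
      spec:
        containers:
        - name: vllm-worker
          image: vllm/vllm-openai:latest
          command: ["sh", "-c"]
          # LWS injects the leader's address into each pod via LWS_LEADER_ADDRESS
          args:
          - ray start --address=$(LWS_LEADER_ADDRESS):6379 --block
          resources:
            limits:
              nvidia.com/gpu: "1"
```

The published example includes the full manifest plus a Service and deployment instructions.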

TensorRT-LLM

An example of using TensorRT-LLM with LWS

llama.cpp

An example of using llama.cpp with LWS

SGLang

An example of using SGLang with LWS

Horizontal Pod Autoscaler (HPA)

An example of using Horizontal Pod Autoscaler with LeaderWorkerSet
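Because LeaderWorkerSet exposes the scale subresource, a standard autoscaling/v2 HPA can target it directly; scaling changes the number of leader/worker groups. The target name and thresholds below are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: lws-hpa
spec:
  scaleTargetRef:
    apiVersion: leaderworkerset.x-k8s.io/v1
    kind: LeaderWorkerSet
    name: example-lws      # illustrative target name
  minReplicas: 1
  maxReplicas: 5           # each replica is a whole leader/worker group
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
```

Note that scaling operates on entire groups, so each step up or down adds or removes `size` pods at once.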

Last modified September 22, 2025: add hpa docs (c1e9ac6)