Examples
Triton-distributed provides end-to-end model implementations with distributed inference support.
| Model | Description |
| --- | --- |
| Dense Model | Dense transformer model (e.g., Qwen, LLaMA) |
| Qwen MoE Model | Qwen MoE model |