Tutorials

We provide a list of tutorials for writing various distributed operations with Triton-distributed. We recommend that you first read the technical report, which contains design and implementation details, and then work through these tutorials.

Distributed Notify and Wait

Intra-node AllGather

Inter-node AllGather

Low Latency All-to-All Communication

Intra-node ReduceScatter

Inter-node ReduceScatter

Overlapping AllGather GEMM

Overlapping GEMM ReduceScatter

Overlapping AllGather GEMM on AMD GPU

Overlapping GEMM ReduceScatter on AMD GPU
