Kernels

Triton-distributed provides a set of optimized distributed kernels for both NVIDIA and AMD GPUs. These kernels implement patterns that overlap computation with communication.
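To make the idea of overlapping concrete, here is a minimal CPU-side sketch in plain Python. It does not use the Triton-distributed API; `fetch_chunk`, `compute`, and `overlapped_sum_of_squares` are hypothetical names chosen for illustration, and a worker thread stands in for the communication engine. The next chunk is requested before the current one is processed, so the transfer and the computation can proceed at the same time.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_chunk(chunks, i):
    # Hypothetical stand-in for communication: a real kernel would copy
    # this chunk from a remote device rather than from a local list.
    return list(chunks[i])


def compute(chunk):
    # Hypothetical stand-in for computation on a chunk that has arrived.
    return sum(x * x for x in chunk)


def overlapped_sum_of_squares(chunks):
    """Prefetch chunk i+1 on a worker thread while computing on chunk i."""
    if not chunks:
        return 0
    total = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch_chunk, chunks, 0)
        for i in range(len(chunks)):
            current = pending.result()  # wait for chunk i to arrive
            if i + 1 < len(chunks):
                # Start the next transfer before computing on this chunk.
                pending = pool.submit(fetch_chunk, chunks, i + 1)
            total += compute(current)  # runs while the next fetch is in flight
    return total


chunks = [[1, 2], [3, 4], [5]]
result = overlapped_sum_of_squares(chunks)
print(result)
```

The result matches a purely sequential loop; only the scheduling differs. On a GPU the same structure is typically expressed with signals or barriers between producer and consumer tiles rather than with threads and futures.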