Ulysses SP All-to-All Layer

High-level layer for Ulysses-style Sequence Parallelism All-to-All.

Description

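This layer implements the All-to-All communication pattern used in Ulysses sequence parallelism. In that scheme, each rank starts with a slice of the sequence for all attention heads. An All-to-All before attention regroups the data so that each rank holds the full sequence for a subset of heads, and a second All-to-All after attention restores the sequence-sharded layout.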
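
The layout change behind this pattern can be illustrated with a small single-process simulation. This is only a sketch of the general Ulysses idea (sequence-sharded to head-sharded and back), not the API of this layer: the function names (all_to_all, seq_to_head_shard, head_to_seq_shard) are hypothetical, ranks are simulated as entries of a Python list, and tensors are nested lists indexed as x[seq][head]. It assumes the sequence length and head count are both divisible by the number of ranks.

```python
def all_to_all(send):
    """Simulated collective: send[src][dst] is the chunk rank `src` sends to rank `dst`.
    Returns recv where recv[dst][src] is the chunk rank `dst` received from rank `src`."""
    world = len(send)
    return [[send[src][dst] for src in range(world)] for dst in range(world)]


def split_heads(x, world):
    """Split x[seq][head] into `world` chunks along the head axis."""
    per_rank = len(x[0]) // world
    return [[row[g * per_rank:(g + 1) * per_rank] for row in x] for g in range(world)]


def split_seq(x, world):
    """Split x[seq][head] into `world` chunks along the sequence axis."""
    per_rank = len(x) // world
    return [x[g * per_rank:(g + 1) * per_rank] for g in range(world)]


def seq_to_head_shard(local_shards):
    """Each rank holds (seq / world, heads); afterwards each holds (seq, heads / world)."""
    world = len(local_shards)
    recv = all_to_all([split_heads(x, world) for x in local_shards])
    # Chunks arrive ordered by source rank, i.e. by sequence slice: stack the rows.
    return [[row for chunk in chunks for row in chunk] for chunks in recv]


def head_to_seq_shard(head_shards):
    """Inverse step: each rank holds (seq, heads / world); afterwards (seq / world, heads)."""
    world = len(head_shards)
    recv = all_to_all([split_seq(x, world) for x in head_shards])
    # Chunks arrive ordered by source rank, i.e. by head group: join them row by row.
    return [[[h for part in parts for h in part] for parts in zip(*chunks)]
            for chunks in recv]


if __name__ == "__main__":
    world, seq_len, num_heads = 2, 4, 4
    # Label every element with its (sequence index, head index) to make the layout visible.
    full = [[(s, h) for h in range(num_heads)] for s in range(seq_len)]
    seq_sharded = split_seq(full, world)
    head_sharded = seq_to_head_shard(seq_sharded)
    for rank, shard in enumerate(head_sharded):
        print(f"rank {rank}: {len(shard)} positions x {len(shard[0])} heads")
    assert head_to_seq_shard(head_sharded) == seq_sharded
```

In a real multi-GPU setting the list-of-lists exchange would be replaced by an actual collective; the simulation only shows which slice of the data ends up on which rank.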

See python/triton_dist/layers/nvidia/ulysses_sp_a2a_layer.py for implementation details.