Tensor Parallel Attention (AMD)
===============================

Tensor Parallel Attention layer for AMD GPUs.

See ``python/triton_dist/layers/amd/tp_attn.py`` for implementation details.