@melonedo
melonedo / generic-intra-warp-shuffle-generator.py
Created August 9, 2025 17:04
Generate generic intra-warp shuffle code for use with CUDA
# Based on the algorithm proposed in https://github.com/triton-lang/triton/pull/7558
from dataclasses import dataclass
from enum import Enum
def inverse_permutation(P):
    """
    Given a permutation P, return its inverse permutation.
    """