Reference

This page lists the full public API. Everything else under PTX, PTX.IR, PTX.Codegen, and PTX.Parser is internal and may change without notice.

Authoring

PTX.@ptx_str (Macro)
ptx"opcode.mod1.mod2..."

Construct an Operation{op, mods} singleton, where op::Symbol is the opcode (the first segment) and mods::Tuple{Vararg{Symbol}} is the modifier chain. The string is split on the . character, and each segment becomes one Symbol verbatim, so :: (the PTX sub-namespace separator), digit-leading tokens (3d, m16n8k32), and underscores in modifier names all pass through unchanged.
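The split rule above can be modeled in a few lines of plain Julia. This is a toy sketch, not the package's implementation: split_segments is a hypothetical helper, and it returns a plain (op, mods) tuple rather than an Operation singleton.

```julia
# Toy model of the segment split; `split_segments` is a hypothetical name,
# not part of PTX.
function split_segments(s::AbstractString)
    parts = split(s, '.')
    # Consecutive, leading, or trailing dots (and the empty string) all
    # produce an empty part, which is rejected.
    any(isempty, parts) && throw(ArgumentError("empty segment in \"$s\""))
    op = Symbol(first(parts))                        # first segment is the opcode
    mods = Tuple(Symbol(p) for p in parts[2:end])    # the rest are modifiers
    return op, mods
end

op, mods = split_segments("cp.async.bulk.tensor.3d.shared::cta.global")
```

Because Symbol accepts arbitrary strings, segments such as 3d and shared::cta need no escaping in this model.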

Supports $x / $(expr) interpolation. The macro expands to a chain of type-domain * compositions:

  • Static segments fold into the initial Operation{op, mods}() and subsequent Chain{(...)}() constants.
  • A glued interpolation (literal chars adjacent on either side, e.g. x$(N) or $(N)d) emits Symbol(pre, expr, post) and composes via Operation * Symbol. With a compile-time-constant interpolated value (e.g. N from a Val{N} unwrap) the whole site folds to the same singleton the literal form would produce — making it safe to use inside device kernels.
  • A bare interpolation (between two . separators) emits _ptx_dyn_seg(expr), which yields a Chain. For host-side use, the value may be a String containing . characters, which is split into multiple modifier segments.
  • An interpolated opcode (first segment) falls back to _ptx_op_from_string(...). Device kernels should keep the opcode literal.
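The difference between a glued and a bare interpolation can be sketched with ordinary Symbol construction. The helper names glued and bare are hypothetical, and the real macro composes Operation and Chain types rather than returning Symbols and tuples, so treat this only as an illustration of how segments are formed.

```julia
# Glued interpolation: literal text on either side of the value is
# concatenated with it into a single Symbol (one modifier segment).
glued(pre, value, post) = Symbol(pre, value, post)

# Bare interpolation: the value stands alone between two dots, so a String
# containing '.' contributes several segments.
bare(value) = Tuple(Symbol(p) for p in split(string(value), '.'))

g1 = glued("x", 4, "")     # like x$(N) with N = 4
g2 = glued("", 3, "d")     # like $(N)d with N = 3
b1 = bare("shared.b32")    # like .$(s). with s = "shared.b32"
```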

Examples:

ptx"add.f32"(a, b)
ptx"mma.sync.aligned.m16n8k16.row.col.f32.bf16.bf16.f32"(a, b, c)
ptx"cp.async.bulk.tensor.3d.shared::cta.global.tile.mbarrier::complete_tx::bytes"(...)
ptx"bar.sync"(Val(0))
ptx"mov.u32"(sreg"%tid.x")

dt = "u32"
ptx"mov.$dt"(x)              # ≡ ptx"mov.u32"(x)
ptx"st.$(space).b32"(p, v)   # $(...) for non-identifier exprs

# Glued — folds to a literal singleton when N is Val-known:
@inline f(p, ::Val{N}) where {N} =
    ptx"ldmatrix.sync.aligned.m8n8.x$(N).shared.b16"(p)

Empty segments (consecutive dots, a leading or trailing dot, or an empty string) raise an error at macro expansion on the static path and at runtime on the interpolated path.

See also: @mod_str for modifier-only chains usable on the right side of *.

PTX.@sreg_str (Macro)
sreg"name"

Construct a SpecialReg{Symbol("%name")} singleton — a compile-time literal for a PTX special register. Bakes the verbatim asm token, so underscore-bearing names (%cluster_ctarank, %lanemask_eq, %total_smem_size) round-trip losslessly. Accepts either form:

sreg"tid.x"            ≡ sreg"%tid.x"            → "%tid.x"
sreg"cluster_ctarank"  ≡ sreg"%cluster_ctarank"  → "%cluster_ctarank"
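The equivalence of the two spellings can be pictured as a normalization step that adds the leading % only when it is missing. The sketch below is an assumption about that behavior written in plain Julia; normalize_sreg is a hypothetical name and is not part of the package.

```julia
# Toy model: prepend "%" unless the name already carries it.
# `normalize_sreg` is a hypothetical helper, not the package's code.
normalize_sreg(name::AbstractString) =
    startswith(name, "%") ? String(name) : string("%", name)

a = normalize_sreg("tid.x")
b = normalize_sreg("%tid.x")
# The normalized token can be used as a Symbol, underscores included.
c = Symbol(normalize_sreg("cluster_ctarank"))
```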
PTX.@mod_str (Macro)
mod"mod1.mod2..."

Construct a Chain{mods} singleton — a sequence of PTX modifiers with no opcode. Splits on . like @ptx_str but does not accept $ interpolation: there is no opcode context to interpolate against, and the type-level * (below) already handles compile-time composition.

mod"" is the empty chain Chain{()}. Composing it via * is a no-op — useful as an "absent modifier" sentinel for conditional helpers.
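The sentinel pattern can be illustrated with a self-contained toy model. The Chain type and the * method below are stand-ins defined only for this sketch (the package's own definitions may differ), and maybe_relaxed is a hypothetical conditional helper.

```julia
# Toy stand-in for a modifier chain: the modifiers live in the type parameter.
struct Chain{mods} end

# Composition concatenates the two modifier tuples at the type level.
Base.:*(::Chain{a}, ::Chain{b}) where {a, b} = Chain{(a..., b...)}()

const EMPTY = Chain{()}()   # plays the role of mod""

# Hypothetical helper: contributes a modifier only when asked to.
maybe_relaxed(flag::Bool) = flag ? Chain{(:relaxed,)}() : EMPTY

base = Chain{(:sync, :aligned)}()
with_mod = base * maybe_relaxed(true)
without_mod = base * maybe_relaxed(false)   # composing EMPTY changes nothing
```

In this model, callers can always write base * helper(...) without branching on whether the helper produced a modifier.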


Transpiler

PTX.Codegen.ptx_to_julia (Function)
ptx_to_julia(source) -> String

Parse PTX source text and emit Julia source code. Returns a string of one or more function ... end definitions calling the ptx"..." macro.

PTX.Codegen.ir_to_julia (Function)
ir_to_julia(mod::IR.Module) -> String

Convert a parsed IR.Module into Julia source code. Each Function in the module becomes one Julia function definition.


Parser

PTX.Parser.tokenize (Function)
tokenize(source::AbstractString) -> Vector{Token}

Tokenize PTX source text. The token stream includes NEWLINE / COMMENT tokens (needed for formatting preservation) and ends with an EOF token. Raises LexError on unrecognizable characters.

PTX.Parser.Token (Type)
Token

A single token from the PTX source. leading_whitespace carries the spaces/tabs that preceded this token on the same line, used by the parser to reconstruct FormattingInfo for byte-identical round-trip.

PTX.Parser.parse (Function)
parse(source::AbstractString) -> IR.Module

Parse PTX source text into a Module. The module header (.version / .target / .address_size) is parsed structurally; everything between it and TokenKind.EOF is collected as a flat sequence of statements, with blank lines and comments preserved. Function body parsing is deferred: for now, a function body arrives as an Instruction placeholder for the opening brace, followed by the body statements, followed by a placeholder for the closing brace.

PTX.Parser.ParseError (Type)
ParseError

Raised by parse on input the recursive-descent parser cannot consume. Carries the source line and col of the failed token.


IR

PTX.IR.format (Function)
format(mod::IR.Module) -> String

Reconstruct PTX text from a parsed IR.Module. Returns mod.raw_source verbatim when set (the lossless fast path used by parser-produced IR). Otherwise the module is emitted structurally (header, leading prelude, then each directive), with each statement using formatting.raw_line when present and field-driven reconstruction when not.

Per-statement format(stmt) methods (one per IR.Statement kind) implement the structural fallback and can be called individually.


Index