Reference
The full public API. Everything else under PTX, PTX.IR, PTX.Codegen, PTX.Parser is internal and may change without notice.
Authoring
PTX.@ptx_str — Macro
ptx"opcode.mod1.mod2..."
Construct an Operation{op, mods} singleton, where op::Symbol is the opcode (first segment) and mods::Tuple{Vararg{Symbol}} is the modifier chain. The string is split on . and each segment becomes one Symbol verbatim, so :: (PTX sub-namespace separator), digit-leading tokens (3d, m16n8k32), and underscores in modifier names all pass through unchanged.
Supports $x / $(expr) interpolation. The macro expands to a chain of type-domain * compositions:
- Static segments fold into the initial Operation{op, mods}() and subsequent Chain{(...)}() constants.
- A glued interpolation (literal chars adjacent on either side, e.g. x$(N) or $(N)d) emits Symbol(pre, expr, post) and composes via Operation * Symbol. With a compile-time-constant interpolated value (e.g. N from a Val{N} unwrap) the whole site folds to the same singleton the literal form would produce, which makes it safe to use inside device kernels.
- A bare interpolation (between two .s) emits _ptx_dyn_seg(expr), which yields a Chain. It supports String values containing . (split into multiple modifier segments) for host-side use.
- An interpolated opcode (first segment) falls back to _ptx_op_from_string(...). Device kernels should keep the opcode literal.
Examples:
ptx"add.f32"(a, b)
ptx"mma.sync.aligned.m16n8k16.row.col.f32.bf16.bf16.f32"(a, b, c)
ptx"cp.async.bulk.tensor.3d.shared::cta.global.tile.mbarrier::complete_tx::bytes"(...)
ptx"bar.sync"(Val(0))
ptx"mov.u32"(sreg"%tid.x")
dt = "u32"
ptx"mov.$dt"(x) # ≡ ptx"mov.u32"(x)
ptx"st.$(space).b32"(p, v) # $(...) for non-identifier exprs
# Glued — folds to a literal singleton when N is Val-known:
@inline f(p, ::Val{N}) where {N} =
    ptx"ldmatrix.sync.aligned.m8n8.x$(N).shared.b16"(p)
Empty literal parts (consecutive ., leading/trailing ., or empty string) raise an error at macro expansion on the static path and at runtime on the interpolated path.
See also: @mod_str for modifier-only chains usable on the right side of *.
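The splitting rule described above can be illustrated with a standalone sketch. This is a toy model and not the package's implementation: ToyOperation and toy_operation are hypothetical names, the work happens at runtime rather than at macro expansion, and interpolation is not modelled.

```julia
# Toy stand-in for the documented Operation{op, mods} singleton (hypothetical type).
struct ToyOperation{op, mods} end

# Split a spec on '.', turn every segment into a Symbol verbatim,
# and reject empty segments (consecutive, leading, or trailing dots, or "").
function toy_operation(spec::AbstractString)
    parts = split(spec, '.')
    any(isempty, parts) && throw(ArgumentError("empty segment in \"$spec\""))
    op = Symbol(parts[1])                          # first segment is the opcode
    mods = Tuple(Symbol(p) for p in parts[2:end])  # the rest is the modifier chain
    return ToyOperation{op, mods}()
end

simple = toy_operation("add.f32")
tensor = toy_operation("cp.async.bulk.tensor.3d.shared::cta.global")
```

Because segments are taken verbatim, tokens such as 3d and shared::cta become Symbol("3d") and Symbol("shared::cta") without any escaping, and two equal specs yield the same singleton instance.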
PTX.@sreg_str — Macro
sreg"name"
Construct a SpecialReg{Symbol("%name")} singleton, a compile-time literal for a PTX special register. It bakes in the verbatim asm token, so underscore-bearing names (%cluster_ctarank, %lanemask_eq, %total_smem_size) round-trip losslessly. Accepts either form:
sreg"tid.x" ≡ sreg"%tid.x" → "%tid.x"
sreg"cluster_ctarank" ≡ sreg"%cluster_ctarank" → "%cluster_ctarank"
PTX.@mod_str — Macro
mod"mod1.mod2..."
Construct a Chain{mods} singleton, a sequence of PTX modifiers with no opcode. Splits on . like @ptx_str but does not accept $ interpolation: there is no opcode context to interpolate against, and the type-level * (below) already handles compile-time composition.
mod"" is the empty chain Chain{()}. Composing it via * is a no-op — useful as an "absent modifier" sentinel for conditional helpers.
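The "absent modifier" sentinel idea can be sketched with a self-contained toy type. ToyChain and volatile_mod are hypothetical names chosen for illustration; the package's own Chain type and its * methods may be defined differently.

```julia
# Toy stand-in for a modifier chain; the real PTX Chain type may differ.
struct ToyChain{mods} end

# Concatenate two chains in the type domain.
Base.:*(::ToyChain{a}, ::ToyChain{b}) where {a, b} = ToyChain{(a..., b...)}()

# Hypothetical conditional helper: returns the empty chain when the modifier is absent.
volatile_mod(enabled::Bool) = enabled ? ToyChain{(:volatile,)}() : ToyChain{()}()

base = ToyChain{(:global, :b32)}()
with_mod = volatile_mod(true) * base      # modifier prepended
without_mod = volatile_mod(false) * base  # empty chain leaves base unchanged
```

Since composing with the empty chain returns the other operand unchanged, a helper can always return a chain and the call site never needs a branch.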
Transpiler
PTX.Codegen.ptx_to_julia — Function
ptx_to_julia(source) -> String
Parse PTX source text and emit Julia source code. Returns a string of one or more function ... end definitions calling the ptx"..." macro.
PTX.Codegen.ir_to_julia — Function
ir_to_julia(mod::IR.Module) -> String
Convert a parsed IR.Module into Julia source code. Each Function in the module becomes one Julia function definition.
Parser
PTX.Parser.tokenize — Function
tokenize(source::AbstractString) -> Vector{Token}
Tokenize PTX source text. The token stream includes NEWLINE / COMMENT tokens (needed for formatting preservation) and ends with an EOF token. Raises LexError on unrecognizable characters.
PTX.Parser.Token — Type
Token
A single token from the PTX source. leading_whitespace carries the spaces/tabs that preceded this token on the same line, used by the parser to reconstruct FormattingInfo for byte-identical round-trip.
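Why storing the leading whitespace on each token is enough to rebuild a line can be shown with a small standalone sketch. ToyToken, toy_tokenize and toy_format are hypothetical names; the real Token has its own fields and the real tokenizer also classifies tokens by kind, which this sketch does not attempt.

```julia
# Hypothetical stand-in for a token: only the text and the whitespace before it.
struct ToyToken
    leading_whitespace::String
    text::String
end

# Split one line into whitespace-separated tokens, remembering the
# spaces/tabs that preceded each one. Trailing whitespace is not captured.
function toy_tokenize(line::AbstractString)
    tokens = ToyToken[]
    for m in eachmatch(r"([ \t]*)([^ \t]+)", line)
        push!(tokens, ToyToken(m.captures[1], m.captures[2]))
    end
    return tokens
end

# Re-emit each token with its own leading whitespace to recover the line.
toy_format(tokens) = join(t.leading_whitespace * t.text for t in tokens)

line = "\tadd.f32  %f1, %f2;"
tokens = toy_tokenize(line)
```

Here toy_format(tokens) reproduces line exactly, including the tab and the double space, because no whitespace was normalized away during tokenization.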
PTX.Parser.TokenKind — Module
TokenKind
Token kinds produced by tokenize. EnumX-nested so values are referenced as TokenKind.IDENTIFIER, TokenKind.NEWLINE, etc.
PTX.Parser.LexError — Type
LexError
Raised by tokenize on unrecognizable input.
PTX.Parser.parse — Function
parse(source::AbstractString) -> IR.Module
Parse PTX source text into a Module. The module header (.version / .target / .address_size) is parsed structurally; everything between it and TokenKind.EOF is collected as a flat sequence of statements, with blank lines and comments preserved. Function-body parsing is deferred: for now, a function body arrives as an Instruction opening-brace placeholder, followed by the body statements, followed by a closing-brace placeholder.
PTX.Parser.ParseError — Type
ParseError
Raised by parse on input the recursive-descent parser cannot consume. Carries the source line and col of the failed token.
IR
PTX.IR.format — Function
format(mod::IR.Module) -> String
Reconstruct PTX text from a parsed IR.Module. Returns mod.raw_source verbatim when set (the lossless fast path used by parser-produced IR); otherwise emits the module structurally (header, leading prelude, then each directive), falling back per statement to formatting.raw_line when present and to field-driven reconstruction when not.
Per-statement format(stmt) methods (one per IR.Statement kind) implement the structural fallback and can be called individually.
Index
- PTX.Parser.TokenKind
- PTX.Parser.LexError
- PTX.Parser.ParseError
- PTX.Parser.Token
- PTX.Codegen.ir_to_julia
- PTX.Codegen.ptx_to_julia
- PTX.IR.format
- PTX.Parser.parse
- PTX.Parser.tokenize
- PTX.layout_for_a
- PTX.layout_for_mn_major
- PTX.pick_gmma_layout
- PTX.tensor_map_encode_tiled
- PTX.tensor_map_tile_2d
- PTX.@mod_str
- PTX.@ptx_str
- PTX.@sreg_str