pto.ttri¶
pto.ttri is part of the Irregular And Complex instruction set.
Summary¶
Generate a triangular (lower/upper) mask tile.
Mechanism¶
Generates a lower- or upper-triangular mask tile of ones and zeros. The orientation is selected by the compile-time template parameter isUpperOrLower (0 = lower, 1 = upper). pto.ttri is a tile instruction, and its architecture-visible behavior cannot be reduced to a plain elementwise compute pattern.
Let R = dst.GetValidRow() and C = dst.GetValidCol(). Let d = diagonal.
Lower-triangular (isUpperOrLower=0) conceptually produces:
Upper-triangular (isUpperOrLower=1) conceptually produces:
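As a host-side illustration of these two cases, the sketch below assumes the familiar tril/triu convention: the lower mask is 1 where j <= i + d, and the upper mask is 1 where j >= i + d, for 0 <= i < R and 0 <= j < C. That convention and the helper name ReferenceTri are assumptions made for illustration, not taken from the PTO headers; the backend sources in include/pto/npu/*/TTri.hpp remain the authority.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side reference; NOT part of the PTO headers.
// ASSUMPTION (tril/triu-style convention, diagonal offset d):
//   lower (isUpperOrLower = 0): element is 1 where j <= i + d
//   upper (isUpperOrLower = 1): element is 1 where j >= i + d
template <int isUpperOrLower>
std::vector<std::vector<int>> ReferenceTri(std::size_t rows, std::size_t cols, int diagonal)
{
    std::vector<std::vector<int>> mask(rows, std::vector<int>(cols, 0));
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            // Use signed arithmetic so negative diagonal offsets work.
            const long long col = static_cast<long long>(j);
            const long long edge = static_cast<long long>(i) + diagonal;
            const bool one = (isUpperOrLower == 0) ? (col <= edge) : (col >= edge);
            mask[i][j] = one ? 1 : 0;
        }
    }
    return mask;
}
```

Under that assumption, ReferenceTri<0>(3, 3, 0) gives the rows {1,0,0}, {1,1,0}, {1,1,1}, and ReferenceTri<1>(3, 3, 1) gives {0,1,1}, {0,0,1}, {0,0,0}.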
Syntax¶
Textual spelling is defined by the PTO ISA syntax-and-operands pages.
AS Level 1 (SSA)¶
%dst = pto.ttri %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>
AS Level 2 (DPS)¶
pto.ttri ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
IR Level 1 (SSA)¶
%dst = pto.ttri %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>
IR Level 2 (DPS)¶
pto.ttri ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
C++ Intrinsic¶
Declared in include/pto/common/pto_instr.hpp:
template <typename TileData, int isUpperOrLower, typename... WaitEvents>
PTO_INST RecordEvent TTRI(TileData &dst, int diagonal, WaitEvents &... events);
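Because isUpperOrLower follows TileData in the template parameter list and cannot be deduced from the call arguments, a caller has to spell out both explicitly, while the WaitEvents pack is deduced. The sketch below shows only that call shape. MockTile, MockEvent, and MockTTRI are hypothetical stand-ins that mirror the declaration above; they are not the PTO types, and the real tile and event definitions are not reproduced here.

```cpp
// Hypothetical stand-ins for illustration only; NOT the PTO types.
struct MockEvent {};

struct MockTile {
    int lastOrientation = -1;  // records the template argument that was used
    int lastDiagonal = 0;      // records the runtime diagonal offset
};

// Mirrors the parameter order of the declared intrinsic:
//   <TileData, isUpperOrLower, WaitEvents...>(dst, diagonal, events...)
// The body only records its arguments; it does not build a mask.
template <typename TileData, int isUpperOrLower, typename... WaitEvents>
MockEvent MockTTRI(TileData &dst, int diagonal, WaitEvents &... /*events*/)
{
    dst.lastOrientation = isUpperOrLower;
    dst.lastDiagonal = diagonal;
    return MockEvent{};
}

// Call-site sketch: TileData and isUpperOrLower are explicit,
// the trailing event type is deduced from the argument.
inline MockTile IssueUpperMask(int diagonal)
{
    MockTile tile;
    MockEvent prior;  // stand-in for an event the instruction waits on
    MockTTRI<MockTile, 1>(tile, diagonal, prior);
    return tile;
}
```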
Inputs¶
- diagonal is the diagonal offset.
- isUpperOrLower (template parameter): 0 for lower triangular, 1 for upper triangular.
- dst names the destination tile. The operation iterates over dst's valid region.
Expected Outputs¶
dst holds a triangular mask (1s on one side of the diagonal, 0s elsewhere).
Side Effects¶
No architectural side effects beyond producing the destination tile. Does not implicitly fence unrelated traffic.
Constraints¶
- isUpperOrLower must be 0 (lower) or 1 (upper).
- Destination tile must be row-major on some targets (see include/pto/npu/*/TTri.hpp).
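One way portable code could guard both constraints at compile time is sketched below. The descriptor types and the kIsRowMajor trait are hypothetical names introduced for this example; the actual layout traits are defined by the PTO headers and may look different.

```cpp
// Hypothetical tile descriptors; the real layout traits live in the PTO headers.
struct RowMajorTile { static constexpr bool kIsRowMajor = true; };
struct ColMajorTile { static constexpr bool kIsRowMajor = false; };

// Compile-time predicate combining the two documented constraints:
// the orientation must be 0 or 1, and the destination must be row-major
// (required on some targets, so portable code assumes it everywhere).
template <typename TileData, int isUpperOrLower>
constexpr bool IsPortableTriRequest()
{
    return (isUpperOrLower == 0 || isUpperOrLower == 1) && TileData::kIsRowMajor;
}

static_assert(IsPortableTriRequest<RowMajorTile, 0>(), "lower mask, row-major tile");
static_assert(IsPortableTriRequest<RowMajorTile, 1>(), "upper mask, row-major tile");
static_assert(!IsPortableTriRequest<RowMajorTile, 2>(), "orientation outside {0, 1}");
static_assert(!IsPortableTriRequest<ColMajorTile, 0>(), "layout some targets reject");
```

A wrapper around the intrinsic could place a static_assert on this predicate so that an unsupported combination fails at build time rather than in the verifier.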
Exceptions¶
- Illegal operand tuples, unsupported types, invalid layout combinations, or unsupported target-profile modes are rejected by the verifier or by the selected backend instruction set.
- Programs must not rely on behavior outside the documented legal domain of this operation, even if one backend currently accepts it.
Target-Profile Restrictions¶
- pto.ttri preserves PTO-visible semantics across CPU simulation, A2/A3-class targets, and A5-class targets, but concrete support subsets may differ by profile.
- Portable code must rely only on the documented type, layout, shape, and mode combinations that the selected target profile guarantees.
Examples¶
See related examples in docs/isa/ and docs/coding/tutorials/.
Auto Mode¶
# Auto mode: compiler/runtime-managed placement and scheduling.
%dst = pto.ttri %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>
Manual Mode¶
# Manual mode: bind resources explicitly before issuing the instruction.
# Optional for tile operands:
# pto.tassign %arg0, @tile(0x1000)
# pto.tassign %arg1, @tile(0x2000)
%dst = pto.ttri %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>
PTO Assembly Form¶
# AS Level 1 (SSA)
%dst = pto.ttri %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>
# AS Level 2 (DPS)
pto.ttri ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
Related Ops / Instruction Set Links¶
- Instruction set overview: Irregular And Complex
- Previous op in instruction set: pto.tci
- Next op in instruction set: pto.tpartadd