pto.txor¶
pto.txor is part of the Elementwise Tile Tile instruction set.
Summary¶
Elementwise bitwise XOR of two tiles.
Mechanism¶
For each element (i, j) in the valid region of dst:
\[ \mathrm{dst}_{i,j} = \mathrm{src0}_{i,j} \oplus \mathrm{src1}_{i,j} \]
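The formula above can be sketched as a host-side reference model. This is a minimal illustration, not the PTO implementation: RefTile and RefTXor are hypothetical names, the tile is modelled as a row-major std::vector, and leaving elements outside the valid region untouched is an assumption of this model that the page does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for a tile (NOT pto::Tile):
// row-major storage plus a valid region inside the allocated shape.
template <typename T>
struct RefTile {
    std::size_t rows, cols;            // allocated shape
    std::size_t validRows, validCols;  // valid region
    std::vector<T> data;

    RefTile(std::size_t r, std::size_t c, std::size_t vr, std::size_t vc, T fill = T{})
        : rows(r), cols(c), validRows(vr), validCols(vc), data(r * c, fill)
    {
        assert(vr <= r && vc <= c);
    }

    T &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    const T &at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Reference semantics: dst[i][j] = src0[i][j] ^ src1[i][j] over dst's valid region.
// Assumption of this model: elements outside the valid region are left unchanged.
template <typename T>
void RefTXor(RefTile<T> &dst, const RefTile<T> &src0, const RefTile<T> &src1)
{
    // Source valid shapes are expected to match dst.
    assert(src0.validRows == dst.validRows && src0.validCols == dst.validCols);
    assert(src1.validRows == dst.validRows && src1.validCols == dst.validCols);

    for (std::size_t i = 0; i < dst.validRows; ++i) {
        for (std::size_t j = 0; j < dst.validCols; ++j) {
            // Narrow integer operands are promoted to int by ^, so cast back to T.
            dst.at(i, j) = static_cast<T>(src0.at(i, j) ^ src1.at(i, j));
        }
    }
}
```

A model like this is mainly useful for checking device results on the host; it says nothing about how the hardware schedules the operation.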
Syntax¶
Textual spelling is defined by the PTO ISA syntax-and-operands pages.
Synchronous form:
%dst = txor %src0, %src1 : !pto.tile<...>
AS Level 1 (SSA)¶
%dst = pto.txor %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>
AS Level 2 (DPS)¶
pto.txor ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
C++ Intrinsic¶
Declared in include/pto/common/pto_instr.hpp:
template <typename TileDataDst, typename TileDataSrc0, typename TileDataSrc1, typename TileDataTmp,
typename... WaitEvents>
PTO_INST RecordEvent TXOR(TileDataDst &dst, TileDataSrc0 &src0, TileDataSrc1 &src1, TileDataTmp &tmp, WaitEvents &... events);
Inputs¶
| Operand | Role | Description |
|---|---|---|
| %src0 | Left tile | First source tile; read at (i, j) for each (i, j) in dst valid region |
| %src1 | Right tile | Second source tile; read at (i, j) for each (i, j) in dst valid region |
| %tmp | Temporary tile | Temporary working tile required by A2/A3 for computation |
| WaitEvents... | Optional synchronisation | RecordEvent tokens to wait on before issuing the operation |
Expected Outputs¶
| Result | Type | Description |
|---|---|---|
| %dst | !pto.tile<...> | Destination tile; all (i, j) in its valid region contain src0[i,j] ^ src1[i,j] after the operation |
Side Effects¶
No architectural side effects beyond producing the destination tile. Does not implicitly fence unrelated traffic.
Constraints¶
- The op iterates over dst.GetValidRow() / dst.GetValidCol().
Exceptions¶
- Illegal operand tuples, unsupported types, invalid layout combinations, or unsupported target-profile modes are rejected by the verifier or by the selected backend instruction set.
- Programs must not rely on behavior outside the documented legal domain of this operation, even if one backend currently accepts it.
Target-Profile Restrictions¶
- Implementation checks (A5):
  - dst, src0, and src1 element types must match.
  - Supported element types are uint8_t, int8_t, uint16_t, int16_t, uint32_t, and int32_t.
  - dst, src0, and src1 must be row-major.
  - src0.GetValidRow()/GetValidCol() and src1.GetValidRow()/GetValidCol() must match dst.
- Implementation checks (A2A3):
  - dst, src0, src1, and tmp element types must match.
  - Supported element types are uint8_t, int8_t, uint16_t, and int16_t.
  - dst, src0, src1, and tmp must be row-major.
  - src0, src1, and tmp valid shapes must match dst.
  - In manual mode, dst, src0, src1, and tmp must not overlap in memory.
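The element-type rules above can be mirrored as a compile-time check. The following is only a sketch, assuming C++17: ProfileA5, ProfileA2A3, kIsOneOf, TXorElemSupported, and TXorTypesLegal are hypothetical names invented for illustration and are not PTO identifiers. It covers the type-matching and supported-type rules only, not layout or valid-shape checks.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical profile tags (illustrative only, not PTO identifiers).
struct ProfileA5 {};
struct ProfileA2A3 {};

// True when T is exactly one of the Allowed types.
template <typename T, typename... Allowed>
inline constexpr bool kIsOneOf = (std::is_same_v<T, Allowed> || ...);

// Per-profile list of supported element types, copied from the restrictions above.
template <typename Profile, typename T>
struct TXorElemSupported : std::false_type {};

template <typename T>
struct TXorElemSupported<ProfileA5, T>
    : std::bool_constant<kIsOneOf<T, std::uint8_t, std::int8_t, std::uint16_t,
                                  std::int16_t, std::uint32_t, std::int32_t>> {};

template <typename T>
struct TXorElemSupported<ProfileA2A3, T>
    : std::bool_constant<kIsOneOf<T, std::uint8_t, std::int8_t, std::uint16_t,
                                  std::int16_t>> {};

// Dst is the destination element type; Others are the element types of
// src0, src1 and, where a temporary tile is passed, tmp.
// All must be identical and supported on the chosen profile.
template <typename Profile, typename Dst, typename... Others>
constexpr bool TXorTypesLegal()
{
    return (std::is_same_v<Dst, Others> && ...) && TXorElemSupported<Profile, Dst>::value;
}

// Example: 32-bit elements pass the A5 list but not the A2/A3 list.
static_assert(TXorTypesLegal<ProfileA5, std::uint32_t, std::uint32_t, std::uint32_t>());
static_assert(!TXorTypesLegal<ProfileA2A3, std::uint32_t, std::uint32_t, std::uint32_t, std::uint32_t>());
```

Checking this way surfaces an unsupported tuple at compile time in user code, but the verifier and backend remain the authority on what is legal.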
Performance¶
A2/A3 Throughput¶
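On A2/A3, TXOR compiles to CCE vector instructions, and its cost is described by the TBinOp.hpp performance model. The throughput is identical to TADD (binary arithmetic). Because TXOR accepts only integer element types, the INT column is the one that applies: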
| Metric | Value (FP) | Value (INT) |
|---|---|---|
| Startup latency | 14 | 14 |
| Completion latency | 19 | 17 |
| Per-repeat throughput | 2 | 2 |
| Pipeline interval | 18 | 18 |
Examples¶
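#include <pto/pto-inst.hpp>

using namespace pto;

void example() {
    // All four tiles share one element type and shape, as the implementation checks require.
    // uint32_t is listed as supported on A5 only; on A2/A3 use an 8- or 16-bit integer type.
    using TileDst = Tile<TileType::Vec, uint32_t, 16, 16>;
    using TileSrc0 = Tile<TileType::Vec, uint32_t, 16, 16>;
    using TileSrc1 = Tile<TileType::Vec, uint32_t, 16, 16>;
    using TileTmp = Tile<TileType::Vec, uint32_t, 16, 16>;

    TileDst dst;
    TileSrc0 src0;
    TileSrc1 src1;
    TileTmp tmp;

    // dst = src0 ^ src1, elementwise over dst's valid region.
    TXOR(dst, src0, src1, tmp);
}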
Auto Mode¶
# Auto mode: compiler/runtime-managed placement and scheduling.
%dst = pto.txor %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>
Manual Mode¶
# Manual mode: bind resources explicitly before issuing the instruction.
# Optional for tile operands:
# pto.tassign %arg0, @tile(0x1000)
# pto.tassign %arg1, @tile(0x2000)
%dst = pto.txor %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>
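When tiles are bound to explicit addresses in manual mode, the A2/A3 no-overlap rule for dst, src0, src1, and tmp can be checked on the host before issuing the instruction. The sketch below is an assumption-laden illustration: TileRange, Overlaps, and AllDisjoint are hypothetical helpers, tiles are treated as contiguous byte ranges, and only 0x1000 and 0x2000 come from the comment above (the other two addresses are made up).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical description of where a tile lives: base address and size in bytes.
struct TileRange {
    std::uintptr_t base;
    std::size_t bytes;
};

// Two half-open ranges [base, base + bytes) overlap when each starts before the other ends.
constexpr bool Overlaps(const TileRange &a, const TileRange &b)
{
    return a.base < b.base + b.bytes && b.base < a.base + a.bytes;
}

// True when no pair of ranges in the array overlaps.
template <std::size_t N>
constexpr bool AllDisjoint(const TileRange (&ranges)[N])
{
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = i + 1; j < N; ++j) {
            if (Overlaps(ranges[i], ranges[j])) {
                return false;
            }
        }
    }
    return true;
}

// A 16x16 tile of 16-bit elements occupies 512 bytes in this model.
constexpr std::size_t kTileBytes = 16 * 16 * sizeof(std::uint16_t);

// src0 and src1 use the addresses from the manual-mode comment; the addresses
// for dst and tmp are invented for this example.
constexpr TileRange kPlacement[] = {
    {0x1000, kTileBytes},  // src0
    {0x2000, kTileBytes},  // src1
    {0x3000, kTileBytes},  // dst (assumed)
    {0x4000, kTileBytes},  // tmp (assumed)
};
static_assert(AllDisjoint(kPlacement), "manual-mode tiles must not overlap");
```

A real tile may have padding or a non-contiguous footprint, so the byte size used here should come from the actual tile type rather than from this arithmetic.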
PTO Assembly Form¶
%dst = txor %src0, %src1 : !pto.tile<...>
# AS Level 2 (DPS)
pto.txor ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
Related Ops / Instruction Set Links¶
- Instruction set overview: Elementwise Tile Tile
- Previous op in instruction set: pto.tshr
- Next op in instruction set: pto.tlog