pto.txor

pto.txor is part of the Elementwise Tile-Tile instruction set.

Summary

Elementwise bitwise XOR of two tiles.

Mechanism

For each element (i, j) in the valid region of dst, the destination receives the bitwise XOR of the corresponding source elements:

\[ \mathrm{dst}_{i,j} = \mathrm{src0}_{i,j} \oplus \mathrm{src1}_{i,j} \]
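
The formula above can be sketched as a scalar host-side reference model. This is only an illustration under stated assumptions: RefTile and RefTxor are hypothetical names that do not exist in the PTO headers, the valid region is assumed to start at (0, 0), and storage is assumed to be row-major. The model leaves elements outside the valid region unwritten, which is a choice of this sketch and not a statement about hardware behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tile: row-major storage plus a valid region.
// This is NOT pto::Tile; it only illustrates the elementwise semantics.
template <typename T>
struct RefTile {
  std::size_t rows, cols;            // allocated shape
  std::size_t validRows, validCols;  // valid region, assumed anchored at (0, 0)
  std::vector<T> data;               // rows * cols elements, zero-initialized

  RefTile(std::size_t r, std::size_t c, std::size_t vr, std::size_t vc)
      : rows(r), cols(c), validRows(vr), validCols(vc), data(r * c, T{}) {
    assert(vr <= r && vc <= c);  // the valid region must fit in the allocation
  }
  T &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
  const T &at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Scalar reference: dst(i, j) = src0(i, j) ^ src1(i, j) over dst's valid region.
template <typename T>
void RefTxor(RefTile<T> &dst, const RefTile<T> &src0, const RefTile<T> &src1) {
  // Source valid shapes are expected to match dst.
  assert(src0.validRows == dst.validRows && src0.validCols == dst.validCols);
  assert(src1.validRows == dst.validRows && src1.validCols == dst.validCols);
  for (std::size_t i = 0; i < dst.validRows; ++i) {
    for (std::size_t j = 0; j < dst.validCols; ++j) {
      // Narrow types promote to int under ^, so cast back to T.
      dst.at(i, j) = static_cast<T>(src0.at(i, j) ^ src1.at(i, j));
    }
  }
}
```

A model like this is mainly useful for checking device results element by element on the host.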

Syntax

Textual spelling is defined by the PTO ISA syntax-and-operands pages.

Synchronous form:

%dst = txor %src0, %src1 : !pto.tile<...>

AS Level 1 (SSA)

%dst = pto.txor %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

AS Level 2 (DPS)

pto.txor ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)

C++ Intrinsic

Declared in include/pto/common/pto_instr.hpp:

template <typename TileDataDst, typename TileDataSrc0, typename TileDataSrc1, typename TileDataTmp,
          typename... WaitEvents>
PTO_INST RecordEvent TXOR(TileDataDst &dst, TileDataSrc0 &src0, TileDataSrc1 &src1, TileDataTmp &tmp, WaitEvents &... events);

Inputs

Operand         Role                      Description
%src0           Left tile                 First source tile; read at (i, j) for each (i, j) in dst valid region
%src1           Right tile                Second source tile; read at (i, j) for each (i, j) in dst valid region
%tmp            Temporary tile            Temporary working tile required by A2/A3 for computation
WaitEvents...   Optional synchronisation  RecordEvent tokens to wait on before issuing the operation

Expected Outputs

Result  Type            Description
%dst    !pto.tile<...>  Destination tile; all (i, j) in its valid region contain src0[i,j] ^ src1[i,j] after the operation

Side Effects

No architectural side effects beyond producing the destination tile. Does not implicitly fence unrelated traffic.

Constraints

  • The operation iterates over the valid region of dst: dst.GetValidRow() rows by dst.GetValidCol() columns.

Exceptions

  • Illegal operand tuples, unsupported types, invalid layout combinations, or unsupported target-profile modes are rejected by the verifier or by the selected backend instruction set.
  • Programs must not rely on behavior outside the documented legal domain of this operation, even if one backend currently accepts it.

Target-Profile Restrictions

  • Implementation checks (A5):

    • dst, src0, and src1 element types must match.
    • Supported element types are uint8_t, int8_t, uint16_t, int16_t, uint32_t, and int32_t.
    • dst, src0, and src1 must be row-major.
    • src0.GetValidRow()/GetValidCol() and src1.GetValidRow()/GetValidCol() must match dst.
  • Implementation checks (A2/A3):

    • dst, src0, src1, and tmp element types must match.
    • Supported element types are uint8_t, int8_t, uint16_t, and int16_t.
    • dst, src0, src1, and tmp must be row-major.
    • src0, src1, and tmp valid shapes must match dst.
    • In manual mode, dst, src0, src1, and tmp must not overlap in memory.
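
The element-type and overlap checks listed above can be sketched in plain C++. This is a minimal illustration, not the backend's actual verification code: TxorTypeOkA2A3, TxorTypeOkA5, ByteRange, Overlaps, and AllDisjoint are hypothetical names, and modelling each tile as a nonempty byte range is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// 8- and 16-bit integer types, listed for both A2/A3 and A5.
template <typename T>
constexpr bool IsNarrowInt() {
  return std::is_same<T, std::uint8_t>::value || std::is_same<T, std::int8_t>::value ||
         std::is_same<T, std::uint16_t>::value || std::is_same<T, std::int16_t>::value;
}

// Hypothetical predicates mirroring the documented element-type lists.
template <typename T>
constexpr bool TxorTypeOkA2A3() {
  return IsNarrowInt<T>();
}

template <typename T>
constexpr bool TxorTypeOkA5() {
  return IsNarrowInt<T>() || std::is_same<T, std::uint32_t>::value ||
         std::is_same<T, std::int32_t>::value;
}

// Assumed model of a tile's placement: a half-open byte range [begin, begin + size).
struct ByteRange {
  std::uintptr_t begin;
  std::size_t size;  // assumed nonzero
};

constexpr bool Overlaps(ByteRange a, ByteRange b) {
  return a.begin < b.begin + b.size && b.begin < a.begin + a.size;
}

// True when no two ranges overlap, e.g. for dst, src0, src1, and tmp in manual mode.
inline bool AllDisjoint(const ByteRange *ranges, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    for (std::size_t j = i + 1; j < count; ++j) {
      if (Overlaps(ranges[i], ranges[j])) return false;
    }
  }
  return true;
}

static_assert(TxorTypeOkA5<std::uint32_t>(), "uint32_t is listed for A5");
static_assert(!TxorTypeOkA2A3<std::uint32_t>(), "uint32_t is not listed for A2/A3");
```

Expressing the type lists as constexpr predicates lets an unsupported combination fail at compile time, while the overlap test has to run wherever the placement addresses are known.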

Performance

A2/A3 Throughput

On A2/A3, TXOR compiles to CCE vector instructions, and its cost is described by the TBinOp.hpp performance model. The throughput is identical to that of TADD (binary arithmetic):

Metric                 Value (FP)  Value (INT)
Startup latency        14          14
Completion latency     19          17
Per-repeat throughput  2           2
Pipeline interval      18          18

Examples

#include <pto/pto-inst.hpp>

using namespace pto;

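void example() {
  // uint32_t is accepted by the A5 checks only; A2/A3 supports 8- and 16-bit integer types.
  // All tiles share one element type and one shape, as the implementation checks require.
  using TileDst = Tile<TileType::Vec, uint32_t, 16, 16>;
  using TileSrc0 = Tile<TileType::Vec, uint32_t, 16, 16>;
  using TileSrc1 = Tile<TileType::Vec, uint32_t, 16, 16>;
  using TileTmp = Tile<TileType::Vec, uint32_t, 16, 16>;
  TileDst dst;
  TileSrc0 src0;
  TileSrc1 src1;
  TileTmp tmp;  // temporary working tile, required by A2/A3
  TXOR(dst, src0, src1, tmp);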
}

Auto Mode

# Auto mode: compiler/runtime-managed placement and scheduling.
%dst = pto.txor %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

Manual Mode

# Manual mode: bind resources explicitly before issuing the instruction.
# Optional for tile operands:
# pto.tassign %arg0, @tile(0x1000)
# pto.tassign %arg1, @tile(0x2000)
%dst = pto.txor %src0, %src1 : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

PTO Assembly Form

# Synchronous form
%dst = txor %src0, %src1 : !pto.tile<...>
# AS Level 2 (DPS)
pto.txor ins(%src0, %src1 : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)