pto.tgatherb

pto.tgatherb is part of the Irregular And Complex instruction set.

Summary

Gather elements using byte offsets.

Mechanism

Gather elements using byte offsets. It belongs to the tile instructions and carries architecture-visible behavior that is not reducible to a plain elementwise compute pattern.

For each element in the valid region:

\[ \mathrm{dst}_{i,j} = *\left(\mathrm{srcBase} + \mathrm{offset}_{i,j}\right) \]

On A2/A3 and A5, out-of-range offsets produce undefined results; on the CPU simulator, out-of-range offsets are clamped to the source tile boundary.

Syntax

Textual spelling is defined by the PTO ISA syntax-and-operands pages.

Synchronous form:

%dst = tgatherb %src, %offsets : !pto.tile<...> -> !pto.tile<...>

AS Level 1 (SSA)

%dst = pto.tgatherb %src, %offsets : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

AS Level 2 (DPS)

pto.tgatherb ins(%src, %offsets : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)

IR Level 1 (SSA)

%dst = pto.tgatherb %src, %offsets : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

IR Level 2 (DPS)

pto.tgatherb ins(%src, %offsets : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)

C++ Intrinsic

Declared in include/pto/common/pto_instr.hpp:

template <typename TileDataDst, typename TileDataSrc, typename TileDataOffset, typename... WaitEvents>
PTO_INST RecordEvent TGATHERB(TileDataDst &dst, TileDataSrc &src, TileDataOffset &offset, WaitEvents &... events);

Inputs

  • src is the source tile.
  • offset is an offset tile providing byte offsets for each destination element.
  • dst names the destination tile. The operation iterates over dst's valid region.

Expected Outputs

dst holds elements gathered from src using byte offsets from offset.

Side Effects

No architectural side effects beyond producing the destination tile. Does not implicitly fence unrelated traffic.

Constraints

Constraints

  • Offset interpretation:
    • Offsets are interpreted as uint32_t values (byte offsets) by the implementation.
    • Offset bounds are not validated by explicit runtime assertions; on A2/A3 and A5, out-of-range offsets produce undefined results; on the CPU simulator, out-of-range offsets are clamped to the source tile boundary.

Exceptions

Exceptions

  • Illegal operand tuples, unsupported types, invalid layout combinations, or unsupported target-profile modes are rejected by the verifier or by the selected backend instruction set.
  • Programs must not rely on behavior outside the documented legal domain of this operation, even if one backend currently accepts it.

Target-Profile Restrictions

Target-Profile Restrictions
  • Implementation checks (A2A3):

    • Destination layout must be row-major (TileDataDst::isRowMajor).
    • Destination element size must be 1, 2, or 4 bytes (enforced via static_assert in the helper).
    • SrcTileData::DType/DstTileData::DType must be int8_t or uint8_t or int16_t or uint16_t or int32_t or uint32_t or half or bfloat16_t or float.
  • Implementation checks (A5):

    • Destination element size must be 1, 2, or 4 bytes.
    • SrcTileData::DType/DstTileData::DType must be int8_t or uint8_t or int16_t or uint16_t or int32_t or uint32_t or half or bfloat16_t or float.

Examples

Auto

#include <pto/pto-inst.hpp>

using namespace pto;

void example_auto() {
  using SrcT = Tile<TileType::Vec, uint8_t, 1, 256>;
  using OffT = Tile<TileType::Vec, uint32_t, 1, 256>;
  using DstT = Tile<TileType::Vec, uint8_t, 1, 256>;
  SrcT src;
  OffT off;
  DstT dst;
  TGATHERB(dst, src, off);
}

Manual

#include <pto/pto-inst.hpp>

using namespace pto;

void example_manual() {
  using SrcT = Tile<TileType::Vec, uint8_t, 1, 256>;
  using OffT = Tile<TileType::Vec, uint32_t, 1, 256>;
  using DstT = Tile<TileType::Vec, uint8_t, 1, 256>;
  SrcT src;
  OffT off;
  DstT dst;
  TASSIGN(src, 0x1000);
  TASSIGN(off, 0x2000);
  TASSIGN(dst, 0x3000);
  TGATHERB(dst, src, off);
}

Auto Mode

# Auto mode: compiler/runtime-managed placement and scheduling.
%dst = pto.tgatherb %src, %offsets : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

Manual Mode

# Manual mode: bind resources explicitly before issuing the instruction.
# Optional for tile operands:
# pto.tassign %arg0, @tile(0x1000)
# pto.tassign %arg1, @tile(0x2000)
%dst = pto.tgatherb %src, %offsets : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

PTO Assembly Form

%dst = tgatherb %src, %offsets : !pto.tile<...> -> !pto.tile<...>
# AS Level 2 (DPS)
pto.tgatherb ins(%src, %offsets : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)