pto.tcolmin

pto.tcolmin is part of the Reduce And Expand instruction set.

Summary

Reduce each column by taking the minimum across rows.

Mechanism

Let R = src.GetValidRow() and C = src.GetValidCol(). For 0 <= j < C:

\[ \mathrm{dst}_{0,j} = \min_{0 \le i < R} \mathrm{src}_{i,j} \]
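The formula above can be read as a plain scalar loop. The following is a minimal host-side sketch, not the PTO implementation: `ref_col_min`, its parameters, and the use of a row-major `std::vector` with an explicit `stride` are assumptions made for illustration only. It mirrors the documented early return for an empty valid region by returning an empty result, and it does not model NaN behavior for floating-point inputs, which this page does not specify.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar reference model of the column-min reduction.
// `src` is a row-major buffer with `stride` elements per row; only the
// top-left validRows x validCols region is read.
template <typename T>
std::vector<T> ref_col_min(const std::vector<T> &src, std::size_t stride,
                           std::size_t validRows, std::size_t validCols) {
  assert(validCols <= stride);
  assert(validRows * stride <= src.size());

  // Assumption: an empty valid region produces no output, mirroring the
  // early return documented under Target-Profile Restrictions.
  if (validRows == 0 || validCols == 0) {
    return {};
  }

  std::vector<T> dst(validCols);
  for (std::size_t j = 0; j < validCols; ++j) {
    T m = src[j];  // start from row 0 of column j
    for (std::size_t i = 1; i < validRows; ++i) {
      m = std::min(m, src[i * stride + j]);
    }
    dst[j] = m;  // corresponds to dst[0, j]
  }
  return dst;
}
```

Keeping `stride` separate from `validCols` reflects that a tile's valid region can be smaller than its static shape; elements outside the valid region never contribute to the minimum.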

Syntax

Textual spelling is defined by the PTO ISA syntax-and-operands pages.

Synchronous form:

%dst = tcolmin %src : !pto.tile<...> -> !pto.tile<...>

AS Level 1 (SSA)

%dst = pto.tcolmin %src : !pto.tile<...> -> !pto.tile<...>

AS Level 2 (DPS)

pto.tcolmin ins(%src : !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)

C++ Intrinsic

Declared in include/pto/common/pto_instr.hpp:

template <typename TileDataOut, typename TileDataIn, typename... WaitEvents>
PTO_INST RecordEvent TCOLMIN(TileDataOut &dst, TileDataIn &src, WaitEvents &... events);

Inputs

  • src is the source tile.
  • dst names the destination tile. The operation iterates over dst's valid region.

Expected Outputs

dst holds the column-wise minimum: for each column j in the valid region of src, dst[0,j] is the minimum of src[i,j] over the valid rows i.
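As a small worked instance of this rule, assume a source tile whose valid region is 3 x 3 (the values are chosen only for illustration). Each entry of row 0 of dst is the smallest value in the corresponding column of src:

\[ \mathrm{src} = \begin{pmatrix} 3 & 7 & 5 \\ 1 & 9 & 5 \\ 2 & 8 & 4 \end{pmatrix} \quad \Longrightarrow \quad \left( \mathrm{dst}_{0,0},\ \mathrm{dst}_{0,1},\ \mathrm{dst}_{0,2} \right) = \left( 1,\ 7,\ 4 \right) \]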

Side Effects

No architectural side effects beyond producing the destination tile. Does not implicitly fence unrelated traffic.

Constraints

General constraints / checks

  • dst and src must be TileType::Vec.

  • dst and src must use standard ND layout: row-major and non-fractal (BLayout::RowMajor, SLayout::NoneBox).

  • dst and src must use the same element type.

  • Runtime checks:

      • src.GetValidCol() == dst.GetValidCol()

  • Supported element types: half, float, int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, bfloat16_t.
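The checks listed above split naturally into compile-time properties of the tile types and one runtime comparison of valid column counts. The sketch below illustrates that split under stated assumptions: `TileDesc`, `TileKind`, `BLayoutKind`, `SLayoutKind`, and both helper functions are hypothetical stand-ins written for this example and are not the definitions in the PTO headers. The element-type list is left out because half and bfloat16_t are not standard C++ types.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for the tile properties named in the constraints.
// These are NOT the PTO definitions of TileType, BLayout, or SLayout.
enum class TileKind { Vec, Mat };
enum class BLayoutKind { RowMajor, ColMajor };
enum class SLayoutKind { NoneBox, Box };

template <typename T, TileKind K, BLayoutKind B, SLayoutKind S>
struct TileDesc {
  using DType = T;
  static constexpr TileKind kind = K;
  static constexpr BLayoutKind blayout = B;
  static constexpr SLayoutKind slayout = S;
  int validRow = 0;  // stand-in for GetValidRow()
  int validCol = 0;  // stand-in for GetValidCol()
};

// Compile-time part: both tiles are Vec, row-major, non-fractal,
// and share one element type.
template <typename Dst, typename Src>
constexpr bool static_constraints_ok() {
  return Dst::kind == TileKind::Vec && Src::kind == TileKind::Vec &&
         Dst::blayout == BLayoutKind::RowMajor &&
         Src::blayout == BLayoutKind::RowMajor &&
         Dst::slayout == SLayoutKind::NoneBox &&
         Src::slayout == SLayoutKind::NoneBox &&
         std::is_same<typename Dst::DType, typename Src::DType>::value;
}

// Runtime part: the valid column counts must match.
template <typename Dst, typename Src>
bool runtime_constraints_ok(const Dst &dst, const Src &src) {
  return src.validCol == dst.validCol;
}
```

In a sketch like this, the compile-time predicate would typically feed a `static_assert`, while the runtime predicate would guard the call site.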

Exceptions

  • Illegal operand tuples, unsupported types, invalid layout combinations, or unsupported target-profile modes are rejected by the verifier or by the selected backend instruction set.
  • Programs must not rely on behavior outside the documented legal domain of this operation, even if one backend currently accepts it.

Target-Profile Restrictions

  • If src.GetValidRow() == 0 or src.GetValidCol() == 0, the implementation returns early.
  • Supported element types: half, float, int16_t, int32_t.

No additional restriction is documented for this target.

Examples

Auto

#include <pto/pto-inst.hpp>

using namespace pto;

void example_auto() {
  using SrcT = Tile<TileType::Vec, float, 16, 16>;
  using DstT = Tile<TileType::Vec, float, 1, 16>;
  SrcT src;
  DstT dst;
  TCOLMIN(dst, src);
}

Manual

#include <pto/pto-inst.hpp>

using namespace pto;

void example_manual() {
  using SrcT = Tile<TileType::Vec, float, 16, 16>;
  using DstT = Tile<TileType::Vec, float, 1, 16>;
  SrcT src;
  DstT dst;
  TASSIGN(src, 0x1000);
  TASSIGN(dst, 0x2000);
  TCOLMIN(dst, src);
}

Auto Mode

# Auto mode: compiler/runtime-managed placement and scheduling.
%dst = pto.tcolmin %src : !pto.tile<...> -> !pto.tile<...>

Manual Mode

# Manual mode: bind resources explicitly before issuing the instruction.
# Optional for tile operands:
# pto.tassign %arg0, @tile(0x1000)
# pto.tassign %arg1, @tile(0x2000)
%dst = pto.tcolmin %src : !pto.tile<...> -> !pto.tile<...>

PTO Assembly Form

# Synchronous form
%dst = tcolmin %src : !pto.tile<...> -> !pto.tile<...>
# AS Level 2 (DPS)
pto.tcolmin ins(%src : !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)