pto.trowexpand¶
pto.trowexpand is part of the Reduce And Expand instruction set.
Summary¶
Broadcast the first element of each source row across the destination row.
Mechanism¶
Broadcast the first element of each source row across the destination row.
Let R = dst.GetValidRow() and C = dst.GetValidCol(). For 0 <= i < R and 0 <= j < C:
$$ \mathrm{dst}_{i,j} = \mathrm{src}_{i,0} $$
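The element-wise rule (every dst[i, j] in the valid region takes the value src[i, 0]) can be illustrated with a scalar host-side reference model. This is a minimal sketch, not the PTO implementation: `RefTile` and `ref_trowexpand` are hypothetical names introduced here, and the row-major `std::vector` storage is an assumption made only to keep the example self-contained.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tile: row-major storage plus a valid region.
// This is NOT the PTO Tile type; every name here is illustrative only.
struct RefTile {
    std::size_t rows, cols;          // physical shape
    std::size_t validRow, validCol;  // valid region (R x C)
    std::vector<float> data;         // row-major elements

    RefTile(std::size_t r, std::size_t c, std::size_t vr, std::size_t vc,
            float fill = 0.0f)
        : rows(r), cols(c), validRow(vr), validCol(vc), data(r * c, fill) {}

    float &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    float at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Scalar reference: dst[i, j] = src[i, 0] for 0 <= i < R and 0 <= j < C,
// where R and C are taken from dst's valid region.
void ref_trowexpand(RefTile &dst, const RefTile &src) {
    for (std::size_t i = 0; i < dst.validRow; ++i) {
        const float first = src.at(i, 0);  // first element of source row i
        for (std::size_t j = 0; j < dst.validCol; ++j) {
            dst.at(i, j) = first;
        }
    }
}
```

In this sketch, elements of dst outside its valid region are left untouched. That choice is an assumption of the model; the text above only specifies the result inside the valid region.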
Syntax¶
Textual spelling is defined by the PTO ISA syntax-and-operands pages.
Synchronous form:
%dst = trowexpand %src : !pto.tile<...> -> !pto.tile<...>
AS Level 1 (SSA)¶
%dst = pto.trowexpand %src : !pto.tile<...> -> !pto.tile<...>
AS Level 2 (DPS)¶
pto.trowexpand ins(%src : !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
C++ Intrinsic¶
Declared in include/pto/common/pto_instr.hpp:
template <typename TileDataDst, typename TileDataSrc, typename... WaitEvents>
PTO_INST RecordEvent TROWEXPAND(TileDataDst &dst, TileDataSrc &src, WaitEvents &... events);
Inputs¶
src is the source tile. dst names the destination tile. The operation iterates over dst's valid region.
Expected Outputs¶
dst holds the row-wise broadcast: each row i of dst is filled with src[i,0].
Side Effects¶
No architectural side effects beyond producing the destination tile. Does not implicitly fence unrelated traffic.
Constraints¶
- Tile Type: dst and src must be TileType::Vec.
- Tile layout: ND fractal (isRowMajor and SLayout::NoneBox) for both src and dst.
Exceptions¶
- Illegal operand tuples, unsupported types, invalid layout combinations, or unsupported target-profile modes are rejected by the verifier or by the selected backend instruction set.
- Programs must not rely on behavior outside the documented legal domain of this operation, even if one backend currently accepts it.
Target-Profile Restrictions¶
Implementation Checks (NPU)¶
- Data type: A2A3/A5 element types must be one of: int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, half, bfloat16_t, or float.
- Runtime valid-region checks:
  - A2A3: returns early if any of dstValidRow, dstValidCol, srcValidRow, srcValidCol is zero.
  - A5: asserts srcValidRow == dstValidRow and asserts srcValidRow != 0 && srcValidCol != 0.
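The difference between the two runtime valid-region checks may be easier to see as a small decision function. The following is a hedged sketch only: `Target`, `CheckResult`, and `check_valid_region` are hypothetical names, and an assertion failure is modeled as a return value rather than an actual abort so that the behavior can be inspected on the host.

```cpp
#include <cstddef>

// Hypothetical labels for the two target profiles named in the text.
enum class Target { A2A3, A5 };

// Hypothetical outcome of the runtime check (not a PTO type).
enum class CheckResult { Proceed, EarlyReturn, AssertFail };

// Encodes only the checks listed above:
//   A2A3: return early if any valid dimension is zero.
//   A5:   assert srcValidRow == dstValidRow, and
//         assert srcValidRow != 0 && srcValidCol != 0.
CheckResult check_valid_region(Target target,
                               std::size_t dstValidRow, std::size_t dstValidCol,
                               std::size_t srcValidRow, std::size_t srcValidCol) {
    if (target == Target::A2A3) {
        const bool anyZero = dstValidRow == 0 || dstValidCol == 0 ||
                             srcValidRow == 0 || srcValidCol == 0;
        return anyZero ? CheckResult::EarlyReturn : CheckResult::Proceed;
    }
    // Target::A5
    if (srcValidRow != dstValidRow) {
        return CheckResult::AssertFail;
    }
    if (srcValidRow == 0 || srcValidCol == 0) {
        return CheckResult::AssertFail;
    }
    return CheckResult::Proceed;
}
```

Under this model, a zero-sized source region is a silent no-op on A2A3 but an assertion failure on A5, so code intended to run on both profiles would need to guard against empty tiles before issuing the instruction.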
Examples¶
Auto¶
#include <pto/pto-inst.hpp>
using namespace pto;
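// Auto mode: tile placement and scheduling are compiler/runtime-managed.
void example_auto() {
    using SrcT = Tile<TileType::Vec, float, 16, 16>;
    using DstT = Tile<TileType::Vec, float, 16, 16>;
    SrcT src;
    DstT dst;
    // Fill each row i of dst with src[i, 0].
    TROWEXPAND(dst, src);
}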
Manual¶
#include <pto/pto-inst.hpp>
using namespace pto;
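// Manual mode: bind resources explicitly before issuing the instruction.
void example_manual() {
    using SrcT = Tile<TileType::Vec, float, 16, 16>;
    using DstT = Tile<TileType::Vec, float, 16, 16>;
    SrcT src;
    DstT dst;
    TASSIGN(src, 0x1000);
    TASSIGN(dst, 0x2000);
    // Fill each row i of dst with src[i, 0].
    TROWEXPAND(dst, src);
}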
Auto Mode¶
# Auto mode: compiler/runtime-managed placement and scheduling.
%dst = pto.trowexpand %src : !pto.tile<...> -> !pto.tile<...>
Manual Mode¶
# Manual mode: bind resources explicitly before issuing the instruction.
# Optional for tile operands:
# pto.tassign %arg0, @tile(0x1000)
# pto.tassign %arg1, @tile(0x2000)
%dst = pto.trowexpand %src : !pto.tile<...> -> !pto.tile<...>
PTO Assembly Form¶
%dst = trowexpand %src : !pto.tile<...> -> !pto.tile<...>
# AS Level 2 (DPS)
pto.trowexpand ins(%src : !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
Related Ops / Instruction Set Links¶
- Instruction set overview: Reduce And Expand
- Previous op in instruction set: pto.trowargmin
- Next op in instruction set: pto.trowexpanddiv