pto.tcolexpand¶
pto.tcolexpand is part of the Reduce And Expand instruction set.
Summary¶
Broadcast the first element of each source column across the destination column.
Mechanism¶
Each destination element takes the value at the top of its column in the source.
Let R = dst.GetValidRow() and C = dst.GetValidCol(). For 0 <= i < R and 0 <= j < C:
dst[i, j] = src[0, j]
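The broadcast over dst's valid region can be modelled on the host in plain C++. The sketch below is a scalar reference model only, not the PTO implementation: RefTile and colExpandReference are hypothetical names introduced for illustration, the row-major layout is an assumption, and leaving elements outside the valid region untouched is a choice of this model rather than documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for a tile: row-major storage plus a valid region.
// This is NOT the PTO Tile type; it exists only to illustrate the semantics.
struct RefTile {
    std::size_t rows, cols;          // physical shape
    std::size_t validRow, validCol;  // valid region (what GetValidRow()/GetValidCol() would report)
    std::vector<float> data;         // rows * cols elements, row-major

    RefTile(std::size_t r, std::size_t c, std::size_t vr, std::size_t vc, float fill = 0.0f)
        : rows(r), cols(c), validRow(vr), validCol(vc), data(r * c, fill) {}

    float &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    float at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Reference semantics: dst[i, j] = src[0, j] for 0 <= i < R and 0 <= j < C,
// with R and C taken from dst's valid region.
void colExpandReference(RefTile &dst, const RefTile &src) {
    const std::size_t R = dst.validRow;
    const std::size_t C = dst.validCol;
    // Assumptions of this model: the valid region fits in dst, and src has a row 0
    // that covers at least C columns.
    assert(R <= dst.rows && C <= dst.cols);
    assert(src.rows >= 1 && src.cols >= C);
    for (std::size_t i = 0; i < R; ++i) {
        for (std::size_t j = 0; j < C; ++j) {
            dst.at(i, j) = src.at(0, j);  // copy the top element of column j
        }
    }
}
```

Note that the loop bounds come from dst, not src, so only row 0 of the source is ever read.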
Syntax¶
Textual spelling is defined by the PTO ISA syntax-and-operands pages.
Synchronous form:
%dst = tcolexpand %src : !pto.tile<...> -> !pto.tile<...>
AS Level 1 (SSA)¶
%dst = pto.tcolexpand %src : !pto.tile<...> -> !pto.tile<...>
AS Level 2 (DPS)¶
pto.tcolexpand ins(%src : !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
C++ Intrinsic¶
Declared in include/pto/common/pto_instr.hpp:
template <typename TileDataDst, typename TileDataSrc, typename... WaitEvents>
PTO_INST RecordEvent TCOLEXPAND(TileDataDst &dst, TileDataSrc &src, WaitEvents &... events);
Inputs¶
- src is the source tile.
- dst names the destination tile. The operation iterates over dst's valid region.
Expected Outputs¶
dst holds the column-wise broadcast: each column j of dst is filled with src[0,j].
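When testing a kernel that uses this op, the stated postcondition can be checked on buffers copied back to the host. The following is a minimal sketch under stated assumptions: isColumnBroadcast is a hypothetical helper (not part of the PTO API), both buffers are assumed to be row-major, and stride is assumed to be the number of elements per physical row of dst.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the PTO API: returns true when every element
// dst[i, j] in the valid region equals src[0, j].
// Exact comparison is used because a broadcast copies values without arithmetic;
// note that NaN inputs would therefore compare as unequal.
bool isColumnBroadcast(const std::vector<float> &dst, const std::vector<float> &src,
                       std::size_t validRow, std::size_t validCol, std::size_t stride) {
    assert(validCol <= stride);
    assert(src.size() >= validCol);  // row 0 of src must cover the valid columns
    assert(validRow == 0 || dst.size() >= (validRow - 1) * stride + validCol);
    for (std::size_t i = 0; i < validRow; ++i) {
        for (std::size_t j = 0; j < validCol; ++j) {
            // In row-major order, src[j] is src[0, j].
            if (dst[i * stride + j] != src[j]) {
                return false;
            }
        }
    }
    return true;
}
```

A call such as isColumnBroadcast(dstHost, srcHost, R, C, stride) would then serve as the assertion in a unit test, with R and C taken from the destination's valid region.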
Side Effects¶
No architectural side effects beyond producing the destination tile. Does not implicitly fence unrelated traffic.
Constraints¶
- The op iterates over dst.GetValidRow() / dst.GetValidCol().
Exceptions¶
- Illegal operand tuples, unsupported types, invalid layout combinations, or unsupported target-profile modes are rejected by the verifier or by the selected backend instruction set.
- Programs must not rely on behavior outside the documented legal domain of this operation, even if one backend currently accepts it.
Target-Profile Restrictions¶
- pto.tcolexpand preserves PTO-visible semantics across CPU simulation, A2/A3-class targets, and A5-class targets, but concrete support subsets may differ by profile.
- Portable code must rely only on the documented type, layout, shape, and mode combinations that the selected target profile guarantees.
Examples¶
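#include <pto/pto-inst.hpp>
using namespace pto;

void example() {
    // 16x16 float tile with TileType::Vec, used for both operands.
    using TileT = Tile<TileType::Vec, float, 16, 16>;
    TileT src, dst;
    // Fill every row of dst's valid region with row 0 of src.
    TCOLEXPAND(dst, src);
}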
Auto Mode¶
# Auto mode: compiler/runtime-managed placement and scheduling.
%dst = pto.tcolexpand %src : !pto.tile<...> -> !pto.tile<...>
Manual Mode¶
# Manual mode: bind resources explicitly before issuing the instruction.
# Optional for tile operands:
# pto.tassign %arg0, @tile(0x1000)
# pto.tassign %arg1, @tile(0x2000)
%dst = pto.tcolexpand %src : !pto.tile<...> -> !pto.tile<...>
PTO Assembly Form¶
%dst = tcolexpand %src : !pto.tile<...> -> !pto.tile<...>
# AS Level 2 (DPS)
pto.tcolexpand ins(%src : !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
Related Ops / Instruction Set Links¶
- Instruction set overview: Reduce And Expand
- Previous op in instruction set: pto.tcolmin
- Next op in instruction set: pto.tcolexpanddiv