pto.texpands¶
pto.texpands is part of the Tile Scalar And Immediate instruction set.
Summary¶
Broadcast a scalar into a destination tile.
Mechanism¶
pto.texpands writes the same scalar value to every element of the destination tile. It operates on tile payloads rather than scalar control state, and its legality is constrained by tile shape, layout, valid region, and target-profile support.
For each element (i, j) in the valid region:
$$ \mathrm{dst}_{i,j} = \mathrm{scalar} $$
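The per-element rule, dst(i, j) = scalar over the valid region, can be modeled on the host with a minimal sketch. `RefTile` and `ref_texpands` are hypothetical names introduced only for illustration; they are not PTO types or intrinsics. The sketch leaves elements outside the valid region untouched, which is an assumption of this model rather than a documented guarantee.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for a tile; not a PTO type.
template <typename T>
struct RefTile {
    std::size_t rows, cols;          // static (allocated) shape
    std::size_t validRow, validCol;  // valid region, validRow <= rows, validCol <= cols
    std::vector<T> data;             // row-major payload

    RefTile(std::size_t r, std::size_t c, std::size_t vr, std::size_t vc, T init)
        : rows(r), cols(c), validRow(vr), validCol(vc), data(r * c, init) {}

    T &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
};

// Reference semantics: dst(i, j) = scalar for every (i, j) in the valid region.
// Elements outside the valid region are not written (assumption of this model).
template <typename T>
void ref_texpands(RefTile<T> &dst, T scalar) {
    for (std::size_t i = 0; i < dst.validRow; ++i) {
        for (std::size_t j = 0; j < dst.validCol; ++j) {
            dst.at(i, j) = scalar;
        }
    }
}
```

For example, calling `ref_texpands` on a 4x4 `RefTile<float>` with a 2x3 valid region writes six elements and leaves the remaining ten at their initial value.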
Syntax¶
Textual spelling is defined by the PTO ISA syntax-and-operands pages.
Synchronous form:
%dst = texpands %scalar : f32, !pto.tile<...>
AS Level 1 (SSA)¶
%dst = pto.texpands %scalar : dtype -> !pto.tile<...>
AS Level 2 (DPS)¶
pto.texpands ins(%scalar : dtype) outs(%dst : !pto.tile_buf<...>)
C++ Intrinsic¶
Declared in include/pto/common/pto_instr.hpp:
template <typename TileData, typename... WaitEvents>
PTO_INST RecordEvent TEXPANDS(TileData &dst, typename TileData::DType scalar, WaitEvents &... events);
Inputs¶
- scalar is the scalar value broadcast to all lanes.
- dst names the destination tile.
- The operation iterates over dst's valid region.
Expected Outputs¶
dst carries the result tile: every element in the filled region (see Constraints) holds the value of scalar.
Side Effects¶
No architectural side effects beyond producing the destination tile. Does not implicitly fence unrelated traffic.
Constraints¶
- Valid region:
  - For TileType::Vec: the op fills dst over dst.GetValidRow() / dst.GetValidCol().
  - For TileType::Mat:
    - For Tile: the op fills dst over TileData::Rows / TileData::Cols.
    - For ConvTile: the op fills dst over ConvTileData's shape.
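The difference between the Vec and Mat fill extents can be expressed as a small helper. This is a sketch under stated assumptions: `RefTileType`, `FillExtent`, and `fill_extent` are hypothetical names, and the ConvTile case is not modeled.

```cpp
#include <cstddef>

// Hypothetical tags mirroring the two tile kinds discussed above.
enum class RefTileType { Vec, Mat };

// Number of rows and columns the op writes.
struct FillExtent {
    std::size_t rows;
    std::size_t cols;
};

// Vec tiles are filled over the valid region;
// Mat tiles (plain Tile case) are filled over the full static shape.
constexpr FillExtent fill_extent(RefTileType type,
                                 std::size_t rows, std::size_t cols,
                                 std::size_t validRow, std::size_t validCol) {
    return type == RefTileType::Vec ? FillExtent{validRow, validCol}
                                    : FillExtent{rows, cols};
}
```

With a 16x16 tile whose valid region is 8x4, the helper reports an 8x4 extent for Vec and a 16x16 extent for Mat.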
Exceptions¶
- Illegal operand tuples, unsupported types, invalid layout combinations, or unsupported target-profile modes are rejected by the verifier or by the selected backend instruction set.
- Programs must not rely on behavior outside the documented legal domain of this operation, even if one backend currently accepts it.
Target-Profile Restrictions¶
- Implementation checks (A2A3):
  - For TileType::Vec:
    - TileData::DType must be one of: uint8_t, int8_t, uint16_t, int16_t, uint32_t, int32_t, half, bfloat16_t, float.
    - Static valid bounds: TileData::ValidRow <= TileData::Rows and TileData::ValidCol <= TileData::Cols.
  - For TileType::Mat:
    - TileData::DType must be one of: uint8_t, int8_t, uint16_t, int16_t, uint32_t, int32_t, half, bfloat16_t, float.
    - Static valid bounds: the range of TileData::Rows * TileData::Cols * sizeof(T) / 32 is [1, 32767].
- Implementation checks (A5):
  - For TileType::Vec:
    - TileData::DType must be one of: uint8_t, int8_t, uint16_t, int16_t, uint32_t, int32_t, half, float.
    - Tile layout must be row-major (TileData::isRowMajor).
    - Static valid bounds: TileData::ValidRow <= TileData::Rows and TileData::ValidCol <= TileData::Cols.
  - For TileType::Mat:
    - TileData::DType must be one of: uint8_t, int8_t, uint16_t, int16_t, uint32_t, int32_t, half, float.
    - For TileDataDst::layout == pto::Layout::NC1HWC0 || TileDataDst::layout == pto::Layout::FRACTAL_Z: the range of convtile's (shape0 * shape1 * shape2 * shape3) is [1, 32767].
    - For TileDataDst::layout == pto::Layout::NDC1HWC0 || TileDataDst::layout == pto::Layout::FRACTAL_Z_3D: the range of convtile's (shape0 * shape1 * shape2 * shape3 * shape4) is [1, 32767].
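Some of the checks listed above can be mirrored on the host as constexpr predicates, which may help when validating tile parameters before targeting a backend. This is only a sketch: the function names are hypothetical, the half and bfloat16_t types are target-specific and therefore omitted, and integer division in the Mat size bound is an assumption. The actual checks live in the PTO implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Host-checkable subset of the documented dtype list.
// half and bfloat16_t are target-specific and not covered here.
template <typename T>
constexpr bool is_host_checkable_dtype() {
    return std::is_same<T, std::uint8_t>::value  || std::is_same<T, std::int8_t>::value  ||
           std::is_same<T, std::uint16_t>::value || std::is_same<T, std::int16_t>::value ||
           std::is_same<T, std::uint32_t>::value || std::is_same<T, std::int32_t>::value ||
           std::is_same<T, float>::value;
}

// Rows * Cols * sizeof(T) / 32, using integer division (assumption).
template <typename T>
constexpr std::size_t mat_blocks(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(T) / 32;
}

// A2A3 Mat bound: the block count must lie in [1, 32767].
template <typename T>
constexpr bool mat_size_in_range(std::size_t rows, std::size_t cols) {
    return mat_blocks<T>(rows, cols) >= 1 && mat_blocks<T>(rows, cols) <= 32767;
}

// Static valid bounds for Vec tiles: ValidRow <= Rows and ValidCol <= Cols.
constexpr bool valid_bounds_ok(std::size_t validRow, std::size_t rows,
                               std::size_t validCol, std::size_t cols) {
    return validRow <= rows && validCol <= cols;
}
```

Because the predicates are constexpr, they can be used in a static_assert, for example `static_assert(mat_size_in_range<float>(16, 16), "Mat tile size out of range");`.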
Examples¶
Auto¶
#include <pto/pto-inst.hpp>
using namespace pto;
void example_auto() {
    using TileT = Tile<TileType::Vec, float, 16, 16>;
    TileT dst;
    // Broadcast 0.0f into dst; placement is left to the compiler/runtime.
    TEXPANDS(dst, 0.0f);
}
Manual¶
#include <pto/pto-inst.hpp>
using namespace pto;
void example_manual() {
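    using TileT = Tile<TileType::Vec, float, 16, 16>;
    TileT dst;
    // Manual mode: bind dst explicitly (here to 0x1000) before issuing the instruction.
    TASSIGN(dst, 0x1000);
    // Broadcast 0.0f into dst.
    TEXPANDS(dst, 0.0f);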
}
Auto Mode¶
# Auto mode: compiler/runtime-managed placement and scheduling.
%dst = pto.texpands %scalar : dtype -> !pto.tile<...>
Manual Mode¶
# Manual mode: bind resources explicitly before issuing the instruction.
# Optional for tile operands:
# pto.tassign %arg0, @tile(0x1000)
# pto.tassign %arg1, @tile(0x2000)
%dst = pto.texpands %scalar : dtype -> !pto.tile<...>
PTO Assembly Form¶
%dst = texpands %scalar : f32, !pto.tile<...>
# AS Level 2 (DPS)
pto.texpands ins(%scalar : dtype) outs(%dst : !pto.tile_buf<...>)
Related Ops / Instruction Set Links¶
- Instruction set overview: Tile Scalar And Immediate
- Next op in instruction set: pto.tcmps