Sync And Config Instruction Set
Sync-and-config operations manage tile-visible state: resource binding, event setup, mode control, and synchronization. These operations do not produce arithmetic payload — they change state that later tile instructions consume.
Operations
| Operation | Description | Category | C++ Intrinsic |
|---|---|---|---|
| `pto.tassign` | Bind tile register to a UB address | Resource | `TASSIGN(tile, addr)` |
| `pto.tsync` | Synchronize execution, wait on events, insert barrier | Sync | `TSYNC(events...)` |
| `pto.syncall` | Cross-core synchronization barrier | Sync | `SYNCALL()` |
| `pto.talias` | Create an alias view that shares tile storage | View | `TALIAS(dst, src)` |
| `pto.sethf32mode` | Set HF32 computation mode | Config | `SETHF32MODE(mode)` |
| `pto.settf32mode` | Set TF32 computation mode | Config | `SETTF32MODE(mode)` |
| `pto.setfmatrix` | Set FMATRIX engine mode and address | Config | `SETFMATRIX(tile)` |
| `pto.set_img2col_rpt` | Set img2col repetition count | Config | `SET_IMG2COL_RPT(rpt)` |
| `pto.set_img2col_padding` | Set img2col padding configuration | Config | `SET_IMG2COL_PADDING(pad)` |
| `pto.subview` | Create a sub-view of a tile | View | `SUBVIEW(tile, offsets, shape)` |
| `pto.get_scale_addr` | Get scale address for quantized matmul | Config | `GET_SCALE_ADDR(tile)` |
Mechanism
Sync-and-config operations change tile-visible state that later tile instructions consume:
- `TASSIGN`: binds a physical UB address to a tile register. Without `TASSIGN`, the compiler/runtime auto-assigns addresses. `TASSIGN` enables manual placement for performance tuning.
- `TSYNC`: waits on event tokens (`events...`) or inserts per-op pipeline barriers (`TSYNC<Op>()`). See Ordering and Synchronization for the full event model.
- `SYNCALL`: synchronizes selected core participants through the cross-core control plane. Hardware mode uses FFTS, and software mode uses a GM polling workspace.
- `TALIAS`: creates a second tile view over the same payload storage. It changes the visible tile view, not the underlying bytes.
- `SETHF32MODE` / `SETTF32MODE` / `SETFMATRIX` / `SET_IMG2COL_RPT` / `SET_IMG2COL_PADDING`: tile-local configuration for HF32/TF32 computation mode, FMATRIX engine binding, and IMG2COL parameters. These program tile-side registers consumed by subsequent compute and DMA operations.
- `SUBVIEW`: creates a logical view of a tile with adjusted offsets and/or reduced shape. The underlying storage is shared with the source tile.
- `GET_SCALE_ADDR`: computes a right-shifted address of a scale tensor used in quantized matmul operations.
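The shared-storage behavior of `TALIAS` and `SUBVIEW` can be illustrated with a small host-side model. This is only a sketch in plain C++: `TileView`, `makeAlias`, and `makeSubview` are hypothetical names invented for illustration and are not part of the PTO headers, and the row-major layout with an explicit stride is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for a tile view (not the PTO tile type).
struct TileView {
    float* base;         // first element visible through this view
    std::size_t rows;    // visible rows
    std::size_t cols;    // visible columns
    std::size_t stride;  // elements between consecutive rows in storage

    // Element access; indices outside the view's shape are out of contract.
    float& at(std::size_t r, std::size_t c) const {
        assert(r < rows && c < cols);
        return base[r * stride + c];
    }
};

// Models TALIAS: a second view over the same payload storage.
inline TileView makeAlias(const TileView& src) { return src; }

// Models SUBVIEW: adjusted offsets and reduced shape, same storage.
inline TileView makeSubview(const TileView& src, std::size_t rowOffset,
                            std::size_t colOffset, std::size_t newRows,
                            std::size_t newCols) {
    assert(rowOffset + newRows <= src.rows);
    assert(colOffset + newCols <= src.cols);
    return TileView{src.base + rowOffset * src.stride + colOffset,
                    newRows, newCols, src.stride};
}
```

In this model a write through the sub-view is visible through the source view and through any alias, because all three views address the same bytes; only the visible shape and origin differ.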
Sync Model
TSYNC operates at two levels:
- Event-wait form: `TSYNC(%e0, %e1)` blocks until the specified events have been recorded. Events are produced by preceding operations (e.g., `TLOAD` produces an event; `TSYNC` waits on it).
- Barrier form: `TSYNC<Op>()` inserts a pipeline barrier for the specified operation class. All operations of class `Op` that appear before the barrier complete before any operation of class `Op` that appears after the barrier begins.
See Producer-Consumer Ordering for the complete synchronization model.
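The barrier-form rule can be restated as a check over an execution trace. The following is a minimal host-side sketch, not PTO code: `TraceEntry` and `respectsBarriers` are hypothetical names, the op-class strings are illustrative, and the integer cycle stamps are an assumed timing model.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical trace record: one entry per instruction, in program order.
struct TraceEntry {
    std::string opClass;  // operation class, e.g. "TADD" (illustrative)
    bool isBarrier;       // true for a TSYNC<Op>()-style barrier on opClass
    int start;            // cycle the op began (ignored for barriers)
    int end;              // cycle the op completed (ignored for barriers)
};

// Returns true if, for every barrier, each earlier op of the barrier's class
// completed no later than each later op of that class began.
inline bool respectsBarriers(const std::vector<TraceEntry>& trace) {
    for (std::size_t b = 0; b < trace.size(); ++b) {
        if (!trace[b].isBarrier) continue;
        const std::string& cls = trace[b].opClass;
        for (std::size_t i = 0; i < b; ++i) {
            if (trace[i].isBarrier || trace[i].opClass != cls) continue;
            for (std::size_t j = b + 1; j < trace.size(); ++j) {
                if (trace[j].isBarrier || trace[j].opClass != cls) continue;
                if (trace[j].start < trace[i].end) return false;
            }
        }
    }
    return true;
}
```

Operations of other classes are deliberately left unconstrained by the check, which mirrors the per-class wording of the barrier form.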
Constraints
- `TASSIGN` binds an address; using the same address for two non-alias tiles simultaneously results in undefined behavior.
- `TSYNC` with no operands is a no-op.
- Tile-side configuration operations affect subsequent operations until the next mode-setting operation of the same kind.
- `SUBVIEW` creates a view with reduced shape; accessing elements outside the view's shape but within the underlying tile's shape is undefined behavior.
- `TALIAS` shares storage with its source; writes through either view are visible through the other view according to the alias contract.
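When addresses are placed manually with `TASSIGN`, a pre-flight check for conflicting bindings can catch the first constraint before it becomes undefined behavior. The sketch below is a host-side illustration under stated assumptions: `Binding`, `conflicts`, and `hasConflict` are hypothetical names, the `aliasGroup` field is an invented way to mark tiles that intentionally share storage, and treating any byte-range overlap (not only identical addresses) as a conflict is a conservative assumption rather than something this page specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one manual placement.
struct Binding {
    std::uint64_t addr;   // UB byte address passed to TASSIGN
    std::uint64_t bytes;  // payload size of the tile in bytes
    int aliasGroup;       // tiles in the same group intentionally share storage
};

// True if two bindings from different alias groups overlap in UB.
inline bool conflicts(const Binding& a, const Binding& b) {
    if (a.aliasGroup == b.aliasGroup) return false;
    return a.addr < b.addr + b.bytes && b.addr < a.addr + a.bytes;
}

// True if any pair of simultaneously live bindings conflicts.
inline bool hasConflict(const std::vector<Binding>& live) {
    for (std::size_t i = 0; i < live.size(); ++i) {
        for (std::size_t j = i + 1; j < live.size(); ++j) {
            if (conflicts(live[i], live[j])) return true;
        }
    }
    return false;
}
```

The quadratic pairwise scan keeps the sketch short; for the handful of tiles typically live at once this is unlikely to matter.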
Cases That Are Not Allowed
- MUST NOT use the same physical tile register for two non-alias tiles without an intervening `TSYNC`.
- MUST NOT wait on an event that has not been produced by a preceding operation.
- MUST NOT configure mode registers while dependent operations are in flight.
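The rule about waiting only on produced events lends itself to a simple linear scan over a program. This is a minimal sketch of such a check, assuming events can be identified by integer IDs; `Step` and `waitsAreProduced` are hypothetical names used for illustration and do not correspond to any PTO API.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical program step: either records an event or waits on one.
struct Step {
    enum Kind { Produce, Wait } kind;
    int event;  // illustrative integer event ID
};

// Returns true if every Wait refers to an event recorded by an earlier step.
inline bool waitsAreProduced(const std::vector<Step>& program) {
    std::set<int> produced;
    for (const Step& s : program) {
        if (s.kind == Step::Produce) {
            produced.insert(s.event);
        } else if (produced.count(s.event) == 0) {
            return false;  // wait on an event with no preceding producer
        }
    }
    return true;
}
```

A program that produces event 0 and then waits on it passes the check, while one that waits first fails it.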
C++ Intrinsic
```cpp
#include <pto/pto-inst.hpp>
using namespace pto;

// Assign tile to UB address
template <typename TileT>
PTO_INST void TASSIGN(TileT& tile, uint64_t addr);

// Synchronize on events
template <typename... EventTs>
PTO_INST RecordEvent TSYNC(EventTs&... events);

// Pipeline barrier for op class
template <typename OpTag>
PTO_INST void TSYNC();

// Cross-core synchronization barrier
template <SyncCoreType CoreType = SyncCoreType::AIVOnly>
PTO_INST void SYNCALL();

// Set computation modes
PTO_INST void SETHF32MODE(bool enable, RoundMode mode);
PTO_INST void SETTF32MODE(bool enable, RoundMode mode);
PTO_INST void SETFMATRIX(TileData& tile);

// Subview creation
template <typename TileT>
PTO_INST TileT SUBVIEW(TileT& src, int rowOffset, int colOffset,
                       int newRows, int newCols);

// Get scale address for quantized matmul
PTO_INST void GET_SCALE_ADDR(TileDataDst& dst, TileDataSrc& src);
```
See Also
- Tile instruction set — Instruction set overview
- Ordering and Synchronization — Event model
- Tile instruction set — Instruction Set description