TMOV

Tile Operation Diagram

[Figure: TMOV tile operation]

Introduction

TMOV moves or copies data between tiles, optionally applying an implementation-defined conversion mode selected through template parameters and overloads.

TMOV is used for:

  • Vec -> Vec moves (and Vec -> Mat moves on targets that support them)
  • Mat -> Left/Right/Bias/Scaling/Scale (microscaling) moves (target-dependent)
  • Acc -> Vec/Mat moves (target-dependent)

Math Interpretation

Conceptually, TMOV copies or transforms elements from src into dst over the valid region. The exact transformation depends on the selected mode and target.

For the pure copy case:

\[ \mathrm{dst}_{i,j} = \mathrm{src}_{i,j} \]
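The pure copy case can be modelled on the host with a few lines of plain C++. The sketch below is only a reference model of the formula above, not the PTO implementation: RefTile, ref_tmov_copy, and ref_tmov_demo are hypothetical names, and the choice to iterate over the source's valid region is an assumption made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical host-side stand-in for a tile: fixed storage plus a valid region.
// This is NOT pto::Tile; it only models the indexing used in the formula.
template <typename T, std::size_t Rows, std::size_t Cols>
struct RefTile {
    std::array<T, Rows * Cols> data{};  // zero-initialized, row-major storage
    std::size_t validRows = Rows;
    std::size_t validCols = Cols;

    T &at(std::size_t i, std::size_t j) { return data[i * Cols + j]; }
    const T &at(std::size_t i, std::size_t j) const { return data[i * Cols + j]; }
};

// Pure-copy reference: dst(i, j) = src(i, j) for every (i, j) inside the
// source's valid region (assumption). Elements outside it are left untouched.
template <typename T, std::size_t Rows, std::size_t Cols>
void ref_tmov_copy(RefTile<T, Rows, Cols> &dst, const RefTile<T, Rows, Cols> &src) {
    for (std::size_t i = 0; i < src.validRows; ++i) {
        for (std::size_t j = 0; j < src.validCols; ++j) {
            dst.at(i, j) = src.at(i, j);
        }
    }
}

// Self-check: fill a 4x4 source, restrict it to a 2x3 valid region, copy,
// and report whether only the valid elements reached the destination.
inline bool ref_tmov_demo() {
    RefTile<float, 4, 4> src, dst;
    for (std::size_t i = 0; i < 4; ++i) {
        for (std::size_t j = 0; j < 4; ++j) {
            src.at(i, j) = static_cast<float>(i * 4 + j + 1);
        }
    }
    src.validRows = 2;
    src.validCols = 3;
    ref_tmov_copy(dst, src);
    return dst.at(1, 2) == src.at(1, 2)  // inside the valid region: copied
           && dst.at(2, 0) == 0.0f       // row outside the valid region: untouched
           && dst.at(0, 3) == 0.0f;      // column outside the valid region: untouched
}
```

A model like this is mainly useful as a golden reference when comparing device output for the plain copy path; conversion modes are implementation-defined and are deliberately not modelled here.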

Assembly Syntax

PTO-AS form: see PTO-AS Specification.

The PTO-AS design recommends splitting TMOV into a family of ops:

%left  = tmov.m2l %mat  : !pto.tile<...> -> !pto.tile<...>
%right = tmov.m2r %mat  : !pto.tile<...> -> !pto.tile<...>
%bias  = tmov.m2b %mat  : !pto.tile<...> -> !pto.tile<...>
%scale = tmov.m2s %mat  : !pto.tile<...> -> !pto.tile<...>
%vec   = tmov.a2v %acc  : !pto.tile<...> -> !pto.tile<...>
%v1    = tmov.v2v %v0   : !pto.tile<...> -> !pto.tile<...>

AS Level 1 (SSA)

%dst = pto.tmov.s2d %src  : !pto.tile<...> -> !pto.tile<...>

AS Level 2 (DPS)

pto.tmov ins(%src : !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)

C++ Intrinsic

Declared in include/pto/common/pto_instr.hpp and include/pto/common/constants.hpp:

template <typename DstTileData, typename SrcTileData, typename... WaitEvents>
PTO_INST RecordEvent TMOV(DstTileData &dst, SrcTileData &src, WaitEvents &... events);

template <typename DstTileData, typename SrcTileData, ReluPreMode reluMode, typename... WaitEvents>
PTO_INST RecordEvent TMOV(DstTileData &dst, SrcTileData &src, WaitEvents &... events);

template <typename DstTileData, typename SrcTileData, AccToVecMode mode, ReluPreMode reluMode = ReluPreMode::NoRelu,
          typename... WaitEvents>
PTO_INST RecordEvent TMOV(DstTileData &dst, SrcTileData &src, WaitEvents &... events);

template <typename DstTileData, typename SrcTileData, typename FpTileData, AccToVecMode mode,
          ReluPreMode reluMode = ReluPreMode::NoRelu, typename... WaitEvents>
PTO_INST RecordEvent TMOV(DstTileData &dst, SrcTileData &src, FpTileData &fp, WaitEvents &... events);

template <typename DstTileData, typename SrcTileData, ReluPreMode reluMode = ReluPreMode::NoRelu,
          typename... WaitEvents>
PTO_INST RecordEvent TMOV(DstTileData &dst, SrcTileData &src, uint64_t preQuantScalar, WaitEvents &... events);

template <typename DstTileData, typename SrcTileData, AccToVecMode mode, ReluPreMode reluMode = ReluPreMode::NoRelu,
          typename... WaitEvents>
PTO_INST RecordEvent TMOV(DstTileData &dst, SrcTileData &src, uint64_t preQuantScalar, WaitEvents &... events);

Constraints

  • Implementation checks (A2A3):
    • Shape rules:
      • Shapes must match: SrcTileData::Rows == DstTileData::Rows and SrcTileData::Cols == DstTileData::Cols.
    • Supported location pairs (compile-time checked):
      • Mat -> Left/Right/Bias/Scaling
      • Vec -> Vec
      • Acc -> Mat
    • Additional checks by path:
      • Acc -> Mat: additional fractal and dtype constraints are enforced (for example, Acc uses an NZ-like fractal, Mat uses a 512B fractal, and only specific dtype conversions are allowed).
  • Implementation checks (A5):
    • Shape rules:
      • For Mat -> Left/Right/Bias/Scaling/Scale, shapes must match.
      • For Vec -> Vec and Vec -> Mat, the effective copy region may be determined by the valid rows/cols of source and destination.
    • Supported location pairs include (target-dependent):
      • Mat -> Left/Right/Bias/Scaling/Scale
      • Vec -> Vec/Mat
      • Acc -> Vec/Mat
    • Acc -> Vec supports additional AccToVecMode forms; some forms also take FpTileData or preQuantScalar.
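To make the compile-time nature of these checks concrete, the A2A3 shape and location-pair rules listed above can be sketched as a constexpr predicate. This is an illustrative model only: Loc, TileDesc, a2a3_pair_ok, and a2a3_tmov_ok are hypothetical names that do not come from the PTO headers, and the fractal and dtype constraints on the Acc -> Mat path are not modelled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical location tags mirroring the names used in the constraint list.
enum class Loc { Vec, Mat, Left, Right, Bias, Scaling, Acc };

// Hypothetical tile descriptor: only the location and static shape matter here.
template <Loc L, std::size_t R, std::size_t C>
struct TileDesc {
    static constexpr Loc loc = L;
    static constexpr std::size_t Rows = R;
    static constexpr std::size_t Cols = C;
};

// Location pairs listed for A2A3:
// Mat -> Left/Right/Bias/Scaling, Vec -> Vec, Acc -> Mat.
constexpr bool a2a3_pair_ok(Loc src, Loc dst) {
    return (src == Loc::Mat && (dst == Loc::Left || dst == Loc::Right ||
                                dst == Loc::Bias || dst == Loc::Scaling)) ||
           (src == Loc::Vec && dst == Loc::Vec) ||
           (src == Loc::Acc && dst == Loc::Mat);
}

// Combined rule: shapes must match and the location pair must be listed.
template <typename Dst, typename Src>
constexpr bool a2a3_tmov_ok() {
    return Src::Rows == Dst::Rows && Src::Cols == Dst::Cols &&
           a2a3_pair_ok(Src::loc, Dst::loc);
}

// Accepted: Mat -> Left with identical shapes.
static_assert(a2a3_tmov_ok<TileDesc<Loc::Left, 16, 16>, TileDesc<Loc::Mat, 16, 16>>(),
              "Mat -> Left with matching shapes is listed for A2A3");
// Rejected: Acc -> Vec is not in the A2A3 list.
static_assert(!a2a3_tmov_ok<TileDesc<Loc::Vec, 16, 16>, TileDesc<Loc::Acc, 16, 16>>(),
              "Acc -> Vec is not listed for A2A3");
// Rejected: Vec -> Vec with mismatched column counts.
static_assert(!a2a3_tmov_ok<TileDesc<Loc::Vec, 16, 32>, TileDesc<Loc::Vec, 16, 16>>(),
              "shapes must match");
```

Expressing the rules as a predicate keeps an unsupported pair a build-time error rather than a runtime surprise. An A5 variant would need a different pair table and, for the Vec paths, a rule based on valid rows and columns rather than a strict shape match.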

Examples

Auto

#include <pto/pto-inst.hpp>

using namespace pto;

void example_auto() {
  using TileT = Tile<TileType::Vec, float, 16, 16>;
  TileT src, dst;
  TMOV(dst, src);
}

Manual

#include <pto/pto-inst.hpp>

using namespace pto;

void example_manual() {
  using SrcT = Tile<TileType::Mat, float, 16, 16, BLayout::RowMajor, 16, 16, SLayout::ColMajor>;
  using DstT = TileLeft<float, 16, 16>;
  SrcT mat;
  DstT left;
  TASSIGN(mat, 0x1000);
  TASSIGN(left, 0x2000);
  TMOV(left, mat);
}

ASM Form Examples

Auto Mode

# Auto mode: compiler/runtime-managed placement and scheduling.
%dst = pto.tmov.s2d %src  : !pto.tile<...> -> !pto.tile<...>

Manual Mode

# Manual mode: bind resources explicitly before issuing the instruction.
# Optional for tile operands:
# pto.tassign %arg0, @tile(0x1000)
# pto.tassign %arg1, @tile(0x2000)
%dst = pto.tmov.s2d %src  : !pto.tile<...> -> !pto.tile<...>

PTO Assembly Form

# AS Level 1 (SSA)
%dst = pto.tmov.s2d %src  : !pto.tile<...> -> !pto.tile<...>
# AS Level 2 (DPS)
pto.tmov ins(%src : !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)