pto.mte_l1_l0a¶

pto.mte_l1_l0a is part of Cube Data Movement Ops.

Summary¶

Load a logical %m x %k left-operand tile from L1 (l1) into L0A (l0a), where pto.mad* ops consume it. The source must already be in cube-fractal NZ layout; this op does not convert arbitrary row-major matrices. To repack ND/DN source data, use pto.mte_gm_l1_frac first.

Mechanism¶

The op moves an L1 cube-fractal tile into the L0A operand domain. The destination layout follows NZ Fractal Layout for L0A (K1 M1 M0 K0, FRACTAL_NZ on A5 / FRACTAL_ZZ on A3).
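The K1 M1 M0 K0 dimension order can be read as an address computation. The following is a host-side Python sketch, not part of the PTO API. It assumes the four dimensions are stored row-major with K1 outermost and K0 innermost, and it takes the fractal block extents `m0` and `k0` as parameters because their values depend on the element type and target. The name `nz_offset` is hypothetical, and the sketch does not model the FRACTAL_NZ / FRACTAL_ZZ difference between A5 and A3.

```python
def nz_offset(m, k, m_extent, k_extent, m0, k0):
    """Element offset of logical (m, k) in a K1 M1 M0 K0 layout.

    Assumption: row-major storage over (K1, M1, M0, K0), and extents that
    are whole multiples of the fractal block size (m0 x k0).
    """
    if m_extent % m0 or k_extent % k0:
        raise ValueError("extents must be multiples of the fractal block")
    if not (0 <= m < m_extent and 0 <= k < k_extent):
        raise IndexError("logical index out of range")
    m1_count = m_extent // m0   # number of fractal blocks along M
    m1, m_in = divmod(m, m0)    # block index along M, row inside the block
    k1, k_in = divmod(k, k0)    # block index along K, column inside the block
    return ((k1 * m1_count + m1) * m0 + m_in) * k0 + k_in
```

For a 16 x 32 tile with an assumed 16 x 16 block, `nz_offset(0, 16, 16, 32, 16, 16)` returns 256: moving one block along K skips a whole 16 x 16 block of elements, while `nz_offset(1, 0, 16, 32, 16, 16)` returns 16 because rows inside a block are contiguous runs of `k0` elements.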

If transpose = true, the selected logical source tile is transposed before placement in the destination operand domain. Omitting the attribute means transpose = false.
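The effect of the attribute on the logical tile can be modeled in a few lines of Python. This is a purely logical sketch under stated assumptions: the tile is represented as a list of rows, the function name `logical_l0a_tile` is hypothetical, and nothing here reflects how the hardware rearranges fractal blocks or which shapes the transpose granularity allows.

```python
def logical_l0a_tile(src_tile, transpose=False):
    """Logical tile placed in the destination, given the selected source tile.

    src_tile is a list of equal-length rows. transpose defaults to False,
    mirroring the behavior when the attribute is omitted.
    """
    if transpose:
        # zip(*rows) yields the columns; each column becomes a result row.
        return [list(col) for col in zip(*src_tile)]
    return [list(row) for row in src_tile]
```

With a 2 x 3 source tile `[[1, 2, 3], [4, 5, 6]]`, the default call returns the tile unchanged, and `transpose=True` returns the 3 x 2 tile `[[1, 4], [2, 5], [3, 6]]`.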

Syntax¶

pto.mte_l1_l0a %src, %dst, %m, %k
  : !pto.ptr<T, l1>, !pto.ptr<T, l0a>, i64, i64

Inputs¶

Parameter	Type	Description
`%src`	ptr	L1 cube-fractal source tile in `l1`
`%dst`	ptr	Left operand destination in `l0a`
`%m`	i64	Logical M extent
`%k`	i64	Logical K extent
`transpose`	attr	Optional boolean; when true, the source tile is transposed before destination placement (default: false)

Expected Outputs¶

Result	Type	Description
None	`—`	Writes the L0A tile that subsequent `pto.mad*` will read.

Side Effects¶

Reads L1; writes L0A. Engages the AIC MTE1 pipe.

Constraints¶

%src must be in l1; %dst must be in l0a.
%src and %dst must satisfy the target alignment for Cube tile loads.
transpose = true requires a tile shape supported by the element-type transpose granularity.

Examples¶

pto.mte_l1_l0a %l1_a, %l0a, %c16_i64, %c32_i64
  : !pto.ptr<f16, l1>, !pto.ptr<f16, l0a>, i64, i64

Right operand load: pto.mte_l1_l0b
MX scale loader: pto.mte_l1_l0a_mx
Upstream repack: pto.mte_gm_l1_frac
MAD consumers: pto.mad, pto.mad_acc, pto.mad_bias