pto.mte_l1_l0a¶
pto.mte_l1_l0a is part of Cube Data Movement Ops.
Summary¶
Load a logical %m x %k left-operand tile from L1 (l1) into L0A (l0a) for pto.mad* consumption. The source must already be in cube-fractal NZ layout; this op does not convert arbitrary row-major matrices. To repack ND/DN source data, use pto.mte_gm_l1_frac first.
Mechanism¶
The op moves an L1 cube-fractal tile into the L0A operand domain. The destination layout follows NZ Fractal Layout for L0A (K1 M1 M0 K0, FRACTAL_NZ on A5 / FRACTAL_ZZ on A3).
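The K1 M1 M0 K0 ordering can be read as a nested index, outermost to innermost. The Python sketch below computes the element offset of a logical (m, k) coordinate under that ordering. It is an illustrative model, not part of the PTO API: the function name nz_offset is hypothetical, and the fractal dimensions M0 = 16 and K0 = 16 are assumptions chosen for f16. Check the layout reference for the sizes that apply to your element type and target.

```python
def nz_offset(m, k, m_extent, k_extent, m0=16, k0=16):
    """Element offset of logical (m, k) in a K1 M1 M0 K0 tile.

    m0 and k0 are the fractal dimensions; 16 x 16 is an assumption
    here, not a value taken from the op specification.
    """
    if not (0 <= m < m_extent and 0 <= k < k_extent):
        raise IndexError("coordinate outside the logical tile")
    # Number of fractals along M, rounding a partial fractal up.
    m1_count = -(-m_extent // m0)
    # Split each coordinate into (fractal index, index inside fractal).
    m1, m_in = divmod(m, m0)
    k1, k_in = divmod(k, k0)
    # Nest the four indices in K1, M1, M0, K0 order (outermost first).
    return ((k1 * m1_count + m1) * m0 + m_in) * k0 + k_in
```

Under these assumptions, a 16 x 32 tile like the one in the Examples section spans two fractals along K, and this model places element (0, 16) at offset 256, the start of the second fractal.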
If transpose = true, the selected logical source tile is transposed before placement in the destination operand domain. Omitting the attribute means transpose = false.
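As a reference model of the transpose attribute, the sketch below treats the selected logical source tile as a list of rows and returns the logical tile that is placed in the destination. The function name logical_dst_tile is hypothetical, and the model deliberately ignores fractal packing and element-type transpose granularity. It only illustrates the logical element order and that omitting the attribute behaves like transpose = false.

```python
def logical_dst_tile(src_tile, transpose=False):
    """Return the logical tile placed in the destination.

    src_tile is a list of equal-length rows. This models only the
    logical element order, not the hardware fractal layout.
    """
    if not transpose:
        # Default: elements keep their logical (row, column) positions.
        return [list(row) for row in src_tile]
    # transpose = true: element (r, c) of the source lands at (c, r).
    return [list(col) for col in zip(*src_tile)]
```

For a 2 x 3 source tile, the default call returns the same 2 x 3 arrangement, while transpose=True returns a 3 x 2 tile.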
Syntax¶
pto.mte_l1_l0a %src, %dst, %m, %k
: !pto.ptr<T, l1>, !pto.ptr<T, l0a>, i64, i64
Inputs¶
| Parameter | Width | Description |
|---|---|---|
| %src | ptr | L1 cube-fractal source tile in l1 |
| %dst | ptr | Left operand destination in l0a |
| %m | i64 | Logical M extent |
| %k | i64 | Logical K extent |
| transpose | attr | Optional boolean source-tile transpose before destination placement |
Expected Outputs¶
| Result | Type | Description |
|---|---|---|
| None | - | Writes the L0A tile that subsequent pto.mad* will read. |
Side Effects¶
Reads L1; writes L0A. Engages the AIC MTE1 pipe.
Constraints¶
- %src must be in l1, and %dst must be in l0a.
- %src and %dst must satisfy the target alignment for Cube tile loads.
- transpose = true requires a tile shape supported by the element-type transpose granularity.
Examples¶
pto.mte_l1_l0a %l1_a, %l0a, %c16_i64, %c32_i64
: !pto.ptr<f16, l1>, !pto.ptr<f16, l0a>, i64, i64
Related Ops¶
- Right operand load: pto.mte_l1_l0b
- MX scale loader: pto.mte_l1_l0a_mx
- Upstream repack: pto.mte_gm_l1_frac
- MAD consumers: pto.mad, pto.mad_acc, pto.mad_bias