pto.mte_l0c_gm¶
pto.mte_l0c_gm is part of Cube Data Movement Ops. It is one of the three FIXPIPE writeback ops; see FIXPIPE Model for the shared writeback pipeline.
Summary¶
FIXPIPE writeback from L0C to GM. The data transform clauses match pto.mte_l0c_l1; GM-specific operands select the GM write path and optional atomic update behavior.
Syntax¶
pto.mte_l0c_gm %src, %dst, %m, %n, %src_stride, %dst_stride, %sid, %l2_cache_ctrl
[, unit_flag(check_only | check_and_clear)]?
[, pre_quant(%payload, mode = <quant_pre_mode>)]?
[, pre_relu([%payload, ]mode = <relu_pre_mode> [, clip = %clip])]?
[, nz2nd | nz2dn(%loop0_src_stride) | nz2nz(%split)?]
[, loop3(%count, %src_stride3, %dst_stride3)]?
[, sat | sat(preserve_nan) | nosat]?
[, atomic(type = <atomic_type>, op = <atomic_op>)]?
: ...
Inputs¶
| Parameter | Width | Description |
|---|---|---|
| %src, %m, %n, %src_stride | — | Same as pto.mte_l0c_l1 |
| %dst | buffer-like | GM destination |
| %dst_stride | i64 | GM destination stride in destination elements |
| %sid | i64 | GM stream/session hint; does not change written values |
| %l2_cache_ctrl | i64 | GM store cache hint; does not change written values |
| atomic(type = ..., op = ...) | clause | Optional GM read-modify-write |
| other optional clauses | — | Same as pto.mte_l0c_l1 |
%sid and %l2_cache_ctrl affect the memory path only — they do not change the logical result, destination layout, numeric conversion, or atomic operation. For target-profile GM writeback, constant %sid values must be in [0, 3] (use 0 unless the surrounding memory system deliberately assigns a different stream/session hint). Constant %l2_cache_ctrl values must fit in the target cache-control hint range [0, 15].
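The range rules for constant hints can be sketched as a small check. This is a minimal Python illustration, not part of the PTO toolchain: the name `check_gm_hints` and the two range constants are hypothetical, and only the bounds [0, 3] and [0, 15] come from the text above.

```python
# Hypothetical helper; only the two ranges are taken from the description above.
SID_RANGE = range(0, 4)             # constant %sid must be in [0, 3]
L2_CACHE_CTRL_RANGE = range(0, 16)  # constant %l2_cache_ctrl must be in [0, 15]


def check_gm_hints(sid: int = 0, l2_cache_ctrl: int = 0) -> None:
    """Raise ValueError if a constant hint is outside its target range."""
    if sid not in SID_RANGE:
        raise ValueError(f"%sid = {sid} is outside [0, 3]")
    if l2_cache_ctrl not in L2_CACHE_CTRL_RANGE:
        raise ValueError(f"%l2_cache_ctrl = {l2_cache_ctrl} is outside [0, 15]")


# Both hints default to 0, the recommended value for %sid.
check_gm_hints(sid=0, l2_cache_ctrl=0)
```

Because the hints never change the written values, a check like this only guards legality of the operands; it says nothing about the result of the writeback.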
atomic(type = T, op = add|max|min) performs an atomic read-modify-write at each GM destination element. add accumulates the converted value into the existing GM value. max and min compare using T and write the selected value. Supported atomic types: f32, f16, bf16, s32, s16, s8.
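The per-element update described above can be modeled in a few lines of Python. This is a simplified sketch under stated assumptions: `atomic_writeback` and `ATOMIC_OPS` are hypothetical names, GM is modeled as a flat list, and values are plain Python numbers, so the rounding, saturation, and hardware atomicity of the real element type T are not modeled.

```python
from typing import Callable, Dict, List

# Hypothetical reference model of the three atomic ops.
# Each entry maps (existing GM value, converted value) to the value written back.
ATOMIC_OPS: Dict[str, Callable[[float, float], float]] = {
    "add": lambda old, new: old + new,  # accumulate into the existing GM value
    "max": max,                         # keep the larger of old and new
    "min": min,                         # keep the smaller of old and new
}


def atomic_writeback(gm: List[float], converted: List[float], op: str) -> None:
    """Update gm in place, one read-modify-write per destination element."""
    combine = ATOMIC_OPS[op]
    for i, value in enumerate(converted):
        gm[i] = combine(gm[i], value)


gm = [1.0, 5.0, -2.0]
atomic_writeback(gm, [2.0, 3.0, -4.0], "add")
print(gm)  # [3.0, 8.0, -6.0]
```

With `"max"` the same inputs would leave `[2.0, 5.0, -2.0]` in GM, and with `"min"` they would leave `[1.0, 3.0, -4.0]`.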
Expected Outputs¶
| Result | Type | Description |
|---|---|---|
| None | — | Writes converted M x N result to GM. |
Side Effects¶
Reads L0C; writes GM. Engages the AIC FIXP pipe. If atomic(...) is present, the GM update is read-modify-write.
Constraints¶
- atomic(...) is valid only on pto.mte_l0c_gm.
- atomic requires both type and op.
- Atomic op values are add, max, and min.
- If %sid or %l2_cache_ctrl is a constant, it must be in the target range described above.
- Other constraints match pto.mte_l0c_l1.
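The atomic-clause rules can be expressed as a small verifier-style check. The Python below is an illustrative sketch only: `check_atomic_clause` is a hypothetical name, `None` stands in for a field omitted from the clause, and the supported type and op sets are copied from the Inputs section of this page.

```python
from typing import Optional

# Sets copied from the description of atomic(type = T, op = ...) on this page.
SUPPORTED_ATOMIC_TYPES = {"f32", "f16", "bf16", "s32", "s16", "s8"}
SUPPORTED_ATOMIC_OPS = {"add", "max", "min"}


def check_atomic_clause(op_name: str,
                        atomic_type: Optional[str],
                        atomic_op: Optional[str]) -> None:
    """Hypothetical check; None means the field was omitted from the clause."""
    if op_name != "pto.mte_l0c_gm":
        raise ValueError("atomic(...) is valid only on pto.mte_l0c_gm")
    if atomic_type is None or atomic_op is None:
        raise ValueError("atomic requires both type and op")
    if atomic_type not in SUPPORTED_ATOMIC_TYPES:
        raise ValueError(f"unsupported atomic type: {atomic_type}")
    if atomic_op not in SUPPORTED_ATOMIC_OPS:
        raise ValueError(f"unsupported atomic op: {atomic_op}")


# A clause such as atomic(type = f16, op = add) on pto.mte_l0c_gm passes.
check_atomic_clause("pto.mte_l0c_gm", "f16", "add")
```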
Examples¶
pto.mte_l0c_gm %l0c, %out, %c16_i64, %c32_i64, %c16_i64, %c32_i64,
%c0_i64, %c0_i64,
pre_quant(%c1_f32, mode = qf322f16_pre_scalar),
nz2nd,
atomic(type = f16, op = add)
: !pto.ptr<f32, l0c>, !pto.ptr<f16, gm>, i64, i64, i64, i64, i64, i64, f32
Related Ops¶
- FIXPIPE writeback siblings: pto.mte_l0c_l1, pto.mte_l0c_ub
- Parameter payload loader: pto.mte_l1_fb
- MAD producers: pto.mad and variants