DMA Copy
These pto.* forms configure and execute scalar-side DMA movement between GM, UB, and L1. They are part of the scalar and control instructions because they describe DMA configuration and copy behavior, not vector-register compute.
What This Instruction Set Covers
- Grouped GM↔UB transfers with inline burst / loop / pad clauses
- Grouped UB↔UB and UB→L1 copies
- (Pre-v0.6) standalone loop-size and loop-stride configuration registers
v0.6 Grouped Transfer Ops
These are the four public grouped DMA interfaces in the PTO ISA v0.6 micro-instruction surface. Each instruction expresses its repetition structure via inline nburst(...) / loop(...) clauses on the op itself; standalone loop / stride configuration registers are no longer required.
- pto.mte_gm_ub: GM → UB, with optional pad(...) for 32B-aligned row padding
- pto.mte_ub_gm: UB → GM, strips padding added during load
- pto.mte_ub_ub: intra-UB copy in 32B-unit bursts with gap fields
- pto.mte_ub_l1: UB → L1 (cube CBUF), 32B-unit bursts with gap fields
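The 32B row padding on load and its removal on store can be pictured with a small host-side model. This is a minimal sketch, not the PTO API: the function names, the zero pad value, and the tightly packed row-major GM layout are all assumptions made for illustration. Only the 32-byte alignment unit comes from the list above.

```python
ALIGN = 32  # bytes; the 32B unit named in the op list above


def padded_len(row_bytes: int, align: int = ALIGN) -> int:
    """Round a row length up to the next multiple of `align`."""
    return (row_bytes + align - 1) // align * align


def load_rows(gm: bytes, rows: int, row_bytes: int, pad_value: int = 0) -> bytes:
    """Model a GM -> UB load: copy each row, then pad it to a 32B boundary.

    Hypothetical helper; the pad value of 0 is an assumption.
    """
    stride = padded_len(row_bytes)
    ub = bytearray()
    for r in range(rows):
        row = gm[r * row_bytes:(r + 1) * row_bytes]
        ub += row + bytes([pad_value]) * (stride - row_bytes)
    return bytes(ub)


def store_rows(ub: bytes, rows: int, row_bytes: int) -> bytes:
    """Model a UB -> GM store: drop the per-row padding added on load."""
    stride = padded_len(row_bytes)
    return b"".join(ub[r * stride:r * stride + row_bytes] for r in range(rows))


# Two 20-byte rows occupy two 32-byte slots in the modeled UB.
gm = bytes(range(40))
ub = load_rows(gm, rows=2, row_bytes=20)
restored = store_rows(ub, rows=2, row_bytes=20)
```

In this model a load followed by a store returns the original GM bytes unchanged, which is the round-trip behavior the pad and strip descriptions suggest.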
Deprecated Pre-v0.6 Configuration Ops
These ops correspond to the older surface where loop counts and per-level strides were programmed via standalone configuration registers and then consumed by a separate copy op. In v0.6 the same information lives inline on the grouped transfer op (nburst(...) and outer loop(...) clauses). The pages below are retained for historical reference and pre-v0.6 ports.
- pto.set_loop_size_outtoub
- pto.set_loop2_stride_outtoub
- pto.set_loop1_stride_outtoub
- pto.set_loop_size_ubtoout
- pto.set_loop2_stride_ubtoout
- pto.set_loop1_stride_ubtoout
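To make "loop counts and per-level strides" concrete, the sketch below enumerates the start offset of each burst for a nest of loops. It is a generic strided-transfer model, not the PTO encoding: the `Loop` type, the `burst_offsets` helper, the byte units, and the outermost-first ordering are assumptions, and the source does not state which of loop1 and loop2 is the inner level.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Loop:
    """One loop level (hypothetical type, not a PTO structure)."""
    count: int   # number of iterations at this level
    stride: int  # byte distance between consecutive iterations


def burst_offsets(base: int, loops: list[Loop]) -> Iterator[int]:
    """Yield the start offset of every burst; `loops` is outermost first."""
    if not loops:
        yield base
        return
    outer, rest = loops[0], loops[1:]
    for i in range(outer.count):
        yield from burst_offsets(base + i * outer.stride, rest)


# Two outer iterations 1024 bytes apart, each covering three bursts 64 bytes apart.
offsets = list(burst_offsets(0, [Loop(2, 1024), Loop(3, 64)]))
# offsets == [0, 64, 128, 1024, 1088, 1152]
```

Whether the counts and strides are written to configuration registers ahead of the copy or carried inline on the transfer op, the set of addresses described is the same. The difference is only where that description lives.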
The legacy execution ops pto.copy_gm_to_ubuf / pto.copy_ubuf_to_gm / pto.copy_ubuf_to_ubuf have been replaced by the v0.6 grouped forms pto.mte_gm_ub / pto.mte_ub_gm / pto.mte_ub_ub linked above. Their per-op pages (URL slugs preserved) now document the v0.6 surface.