Scalar And Control Instruction Set: Shared Structured Control Flow

PTO source programs use shared MLIR scf operations to express loops, branches, and loop-carried state around PTO regions. These are part of the documented source instruction set, but they are not PTO mnemonic instruction sets.

Summary

Shared structured control flow gives PTO a control shell that stays analyzable and explicit. It avoids inventing custom PTO branch syntax for logic that is already represented clearly by scf.for, scf.if, scf.while, scf.condition, and scf.yield.

Mechanism

scf surrounds PTO regions rather than replacing them. It is used to:

  • express counted loops around repeated tile or vector work
  • carry scalar or tile state across iterations
  • model structured conditional execution
  • keep control flow visible to analyses and lowerings

This matters especially for the vector instructions, where __VEC_SCOPE__ is modeled using structured control rather than an opaque launch node.

Inputs

Shared structured control flow consumes:

  • scalar predicates
  • loop bounds and step values
  • region-carried SSA state
  • yielded values from nested branches or loops

Expected Outputs

It produces:

  • well-structured control regions
  • explicit loop-carried values
  • branch-selected scalar or tile state

Side Effects

scf itself does not create DMA, synchronization, or payload effects. Those effects come from the PTO instructions inside the structured regions.

Constraints

Constraints

  • PTO control flow SHOULD stay in structured scf form unless a more specific architecture-visible mechanism is required.
  • Region-carried values and branch results MUST be explicit through scf.yield.
  • Predicate construction for scf control SHOULD come from the shared scalar instructions, not from undocumented control side channels.

Exceptions

Exceptions

The following are ILLEGAL:

  • pretending scf is a PTO mnemonic instruction set
  • hiding loop-carried state that later affects PTO legality
  • collapsing structured control into vague prose instead of documenting the carried values and branch conditions

Target-Profile Restrictions

Target-Profile Restrictions

The scf instruction set is largely target-neutral. Restrictions appear when a region contains target-profile-specific PTO instructions or when a backend imposes extra structure on a vector-execution scope.

Examples

Counted Loop Around Vector Work

scf.for %i = %c0 to %tile_count step %c1 {
  %offset = arith.muli %i, %tile_stride : index
  %mask = pto.pset_b32 "PAT_ALL" : !pto.mask<G>
  %v = pto.vlds %ub[%offset] : !pto.ptr<f32, ub> -> !pto.vreg<64xf32>
  %abs = pto.vabs %v, %mask : !pto.vreg<64xf32>, !pto.mask<b32> -> !pto.vreg<64xf32>
  pto.vsts %abs, %ub_out[%offset], %mask : !pto.vreg<64xf32>, !pto.ptr<f32, ub>, !pto.mask<b32>
}

Structured Conditional Around Tile Update

%need_tail = arith.cmpi slt, %valid_cols, %tile_cols : index
scf.if %need_tail {
  pto.tsubs ins(%tile, %bias : !pto.tile_buf<...>, f32) outs(%tile : !pto.tile_buf<...>)
} else {
  pto.tadds ins(%tile, %bias : !pto.tile_buf<...>, f32) outs(%tile : !pto.tile_buf<...>)
}