Predicate Generation And Algebra
Predicate generation and algebra operations belong to the scalar and control instructions. They create, combine, pack, unpack, and interleave !pto.mask<G> values. The !pto.mask<G> type is the lane-masking mechanism that pto.v* vector operations consume.
The !pto.mask<G> Type
!pto.mask<G> is a predicate mask type whose width is tied to the active element type rather than being a fixed number of bits:
| Element Type | Vector Width N | Predicate Width |
|---|---|---|
| f32 | 64 | 64 bits |
| f16 / bf16 | 128 | 128 bits |
| i8 / u8 | 256 | 256 bits |
A predicate mask with bit value 1 at position i means lane i is active; bit value 0 means lane i is inactive. Vector operations execute on active lanes only; inactive lanes produce zero on A2/A3 and A5, and may produce zero or undefined values on the CPU simulator.
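The lane semantics above can be modelled in a few lines of Python. This is only an illustrative sketch: `LANES` and `masked_add` are hypothetical names, not PTO operations, and the zeroing of inactive lanes follows the behaviour described for A2/A3 and A5.

```python
# Lanes per vector for each element type, copied from the table above.
LANES = {"f32": 64, "f16": 128, "bf16": 128, "i8": 256, "u8": 256}


def masked_add(a, b, mask):
    """Lane-wise add. Active lanes (mask bit 1) compute a + b;
    inactive lanes (mask bit 0) yield 0 in this model."""
    return [x + y if m else 0 for x, y, m in zip(a, b, mask)]


n = LANES["f32"]                                 # 64 lanes for f32
mask = [1 if i < 4 else 0 for i in range(n)]     # only lanes 0..3 active
a = [1.0] * n
b = [2.0] * n
out = masked_add(a, b, mask)
```

Here only the first four lanes of `out` carry a computed sum; every other lane is zero.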
Sub-category Overview
| Sub-category | Operations | Description | Static / Dynamic |
|---|---|---|---|
| Pattern-based construction | pset_b8, pset_b16, pset_b32 | Build mask from named pattern | Static (compile-time pattern) |
| Comparison generation (≥) | pge_b8, pge_b16, pge_b32 | Generate mask: i < scalar | Dynamic (runtime scalar) |
| Comparison generation (<) | plt_b8, plt_b16, plt_b32 | Generate mask: i < scalar; also updates scalar | Dynamic (runtime scalar) |
| Predicate pack | ppack | Narrow: pack two N-bit masks into one 2N-bit mask | Static (partition token) |
| Predicate unpack | punpack | Widen: extract half from a 2N-bit mask | Static (partition token) |
| Boolean algebra | pand, por, pxor, pnot | AND / OR / XOR / NOT | Dynamic (runtime operands) |
| Predicate select | psel | mask0 ? mask1 : mask2 | Dynamic (runtime operands) |
| Deinterleave | pdintlv_b8, pdintlv_b16, pdintlv_b32 | Deinterleave two predicate sources into two predicate outputs at the matching granularity | Static |
| Interleave | pintlv_b8, pintlv_b16, pintlv_b32 | Interleave two predicate sources into two predicate outputs at the matching granularity | Static |
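The boolean algebra and select rows can be sketched by treating a mask as a Python integer whose bit i is lane i. The function names mirror the operation names for readability only. The explicit width argument `n` is an artifact of this model, and the per-lane reading of psel (take mask1 where mask0 is set, otherwise mask2) is an assumption based on the `mask0 ? mask1 : mask2` description.

```python
def pand(a, b):
    return a & b


def por(a, b):
    return a | b


def pxor(a, b):
    return a ^ b


def pnot(a, n):
    # Python integers are unbounded, so trim the result to n lanes.
    return ~a & ((1 << n) - 1)


def psel(mask0, mask1, mask2, n):
    # Assumed per-lane select: mask1 where mask0 is 1, mask2 where it is 0.
    return ((mask0 & mask1) | (~mask0 & mask2)) & ((1 << n) - 1)


a = 0b00001111
b = 0b00111100
print(format(pand(a, b), "08b"))              # 00001100
print(format(pnot(a, 8), "08b"))              # 11110000
print(format(psel(a, b, 0b11111111, 8), "08b"))  # 11111100
```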
Pattern Tokens
pset_* operations accept pattern tokens that encode compile-time-known mask shapes:
| Pattern | Predicate Width | Meaning |
|---|---|---|
| PAT_ALL | All N | All lanes active |
| PAT_ALLF | All N | All lanes inactive |
| PAT_H | N/2 | High half active (upper N/2 lanes) |
| PAT_Q | N/4 | Upper quarter active |
| PAT_VL1 … PAT_VL128 | N | First N lanes active |
| PAT_M3 | N | Modular pattern: repeat every 3 lanes |
| PAT_M4 | N | Modular pattern: repeat every 4 lanes |
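A rough Python model of a few pattern tokens follows. The helper `pset` is hypothetical and returns a list of lane bits. It assumes that the numeric suffix of a PAT_VL token is the number of leading active lanes and that PAT_H activates lanes N/2 through N-1. PAT_Q, PAT_M3, and PAT_M4 are left out because the table does not pin down their exact lane layout.

```python
def pset(pattern, n):
    """Return a list of n lane bits for a subset of the pattern tokens."""
    if pattern == "PAT_ALL":
        return [1] * n
    if pattern == "PAT_ALLF":
        return [0] * n
    if pattern == "PAT_H":
        # Assumed layout: the upper n/2 lanes are active.
        return [1 if i >= n // 2 else 0 for i in range(n)]
    if pattern.startswith("PAT_VL"):
        # Assumed meaning: the suffix k selects the first k lanes.
        k = int(pattern[len("PAT_VL"):])
        if k > n:
            raise ValueError(f"{pattern} exceeds the {n}-lane width")
        return [1 if i < k else 0 for i in range(n)]
    raise ValueError(f"pattern {pattern} is not modelled in this sketch")


first_eight = pset("PAT_VL8", 64)   # lanes 0..7 active under the assumption above
upper_half = pset("PAT_H", 64)      # lanes 32..63 active under the assumption above
```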
Partition Tokens
ppack and punpack use partition tokens to specify which half of the predicate register is accessed:
| Token | Meaning |
|---|---|
| LOWER | Lower N bits of the 2N-bit predicate register |
| HIGHER | Upper N bits of the 2N-bit predicate register |
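The partition tokens amount to bit-range selection, which a small Python model can illustrate. `select_half` and `join_halves` are hypothetical helpers, not the ppack or punpack operations themselves; they only show which bits LOWER and HIGHER address when a 2N-bit register is held as an integer.

```python
def select_half(reg, n, token):
    """Return the N-bit half of a 2N-bit register named by the token."""
    half_mask = (1 << n) - 1
    if token == "LOWER":
        return reg & half_mask           # bits 0 .. N-1
    if token == "HIGHER":
        return (reg >> n) & half_mask    # bits N .. 2N-1
    raise ValueError(f"unknown partition token: {token}")


def join_halves(lower, higher, n):
    """Build a 2N-bit register from two N-bit halves."""
    half_mask = (1 << n) - 1
    return ((higher & half_mask) << n) | (lower & half_mask)


reg = join_halves(0b0011, 0b1010, 4)     # 0b10100011 with N = 4
low = select_half(reg, 4, "LOWER")       # 0b0011
high = select_half(reg, 4, "HIGHER")     # 0b1010
```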
Shared Constraints
All predicate generation and algebra operations MUST satisfy:
- Operand type: All predicate operands MUST be !pto.mask<G>. Mixing predicate operands with scalar or vector register operands is illegal.
- Predicate width consistency: All operands in a single operation MUST share the same predicate width. Operations that mix N-bit and 2N-bit predicates MUST use explicit pack/unpack.
- Pattern token validity: Pattern tokens MUST be supported by the target profile. Using a pattern token outside its supported width context is illegal.
- Scalar operand type: For pge_* and plt_* operations, the scalar operand type MUST match the variant suffix (_b8 → i8, _b16 → i16, _b32 → i32).
- Side effect: No predicate generation or algebra operation writes to UB or modifies architectural state beyond producing its results (a predicate and, for plt_*, the updated scalar).
Relationship Between pset, pge, and plt
- pset_* → static mask, fully determined at compile time from the pattern token
- pge_* → dynamic mask, depends on a runtime scalar value; predicate lane i is active iff i < scalar
- plt_* → dynamic mask AND scalar update; predicate lane i is active iff i < scalar, and scalar_out = scalar - N
plt_* operations are designed for software-pipelined remainder loops where the scalar counter is decremented by the vector length each iteration.
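The remainder-loop pattern can be simulated in Python. In this sketch `plt` is a plain function that returns the lane mask together with `scalar - N`, following the description above; `masked_sum` is a hypothetical driver loop, not part of PTO.

```python
def plt(scalar, n):
    """Model of plt_*: lane i is active iff i < scalar; also return scalar - n."""
    mask = [1 if i < scalar else 0 for i in range(n)]
    return mask, scalar - n


def masked_sum(data, n):
    """Sum data in chunks of n lanes, letting plt mask off the tail lanes."""
    total = 0
    base = 0
    remaining = len(data)
    while remaining > 0:
        mask, remaining = plt(remaining, n)   # counter drops by n each iteration
        for lane, active in enumerate(mask):
            if active:
                total += data[base + lane]
        base += n
    return total


data = list(range(150))
result = masked_sum(data, 64)   # iterations cover 64, 64, then 22 active lanes
```

With 150 elements and N = 64, the counter goes 150, 86, 22, and finally -42, so the last iteration runs with only 22 active lanes and no separate scalar tail loop is needed.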