pto.pld¶
pto.pld is part of the Predicate Load Store instruction set.
Summary¶
Load the full predicate register from a UB location with a register-relative address offset.
Mechanism¶
pto.pld reads a predicate word from a UB address computed as base + areg * sizeof(predicate), then materializes it as !pto.mask<G>. The offset is sourced from a scalar register, making the effective address data-dependent.
For predicate width Pw, UB base base, and offset register areg:
\[ \mathrm{addr} = base + areg \times 8 $$
$$ \mathrm{mask} = \mathrm{READ\_UB}_{64}(\mathrm{addr}) \]
The offset register value is interpreted as a byte displacement in units of 8 bytes (64 bits). The register must contain a value such that the resulting effective address is 64-bit aligned.
Syntax¶
PTO Assembly Form¶
%mask = pto.pld %ub_ptr, %areg, "DIST" : !pto.ptr<i64, ub>, i32 -> !pto.mask<G>
AS Level 1 (SSA)¶
%mask = pto.pld %ub_ptr, %areg, "DIST" : !pto.ptr<i64, ub>, i32 -> !pto.mask<G>
AS Level 2 (DPS)¶
pto.pld ins(%ub_ptr, %areg, "DIST" : !pto.ptr<i64, ub>, i32) outs(%mask : !pto.mask<G>)
C++ Intrinsic¶
vector_bool dst;
__ubuf__ uint32_t *base;
vector_address offset;
pld(dst, base, offset, __cce_simd::NORM);
Inputs¶
| Operand | Type | Description |
|---|---|---|
%ub_ptr |
!pto.ptr<i64, ub> |
UB base address |
%areg |
i32 |
Scalar register holding the byte offset in 8-byte units |
"DIST" |
string attribute | Distribution mode: "NORM", "US", or "DS" |
Expected Outputs¶
| Result | Type | Description |
|---|---|---|
%mask |
!pto.mask<G> |
Loaded predicate register |
Side Effects¶
None.
Constraints¶
Constraints
- UB address space:
%ub_ptrMUST have address spaceub. - Offset alignment: The offset register MUST be set such that
base + areg * 8is 64-bit aligned. Misaligned effective addresses are illegal. - Distribution mode: The
distattribute MUST be one of"NORM","US", or"DS". Other modes are illegal for this form. - Predicate width: The load transfers exactly 64 bits, which MUST match the active element type context.
- Single active predicate: Loading a new predicate does not implicitly save a prior predicate. Programs that need to preserve predicate state MUST save it first.
Exceptions¶
Exceptions
- Illegal if
%ub_ptris not a UB-space pointer. - Illegal if the effective address (base + areg * 8) is not 64-bit aligned.
- Illegal if
distattribute is not a supported distribution mode.
Target-Profile Restrictions¶
Target-Profile Restrictions
| Aspect | CPU Sim | A2/A3 | A5 |
|---|---|---|---|
| Register-offset predicate load | Simulated | Supported | Supported |
"NORM" distribution mode |
Supported | Supported | Supported |
"US" / "DS" distribution modes |
Simulated | Supported | Supported |
Examples¶
Load predicate with register offset¶
#include <pto/pto-inst.hpp>
using namespace pto;
void load_with_offset(RegBuf<predicate_t>& dst,
Ptr<ub_space_t, ub_t> base,
int32_t slot) {
// slot is in units of 8 bytes (one predicate word per slot)
PLD(dst, base, slot, "NORM");
}
SSA form¶
// UB base at %ub_base; %c1 holds slot index (in 8-byte units)
%mask = pto.pld %ub_base, %c1, "NORM" : !pto.ptr<i64, ub>, i32 -> !pto.mask<G>
// Use predicate in predicated vector operation
%result = pto.vsel %v_a, %v_b, %mask : !pto.vreg<64xf32>, !pto.vreg<64xf32>, !pto.mask<b32> -> !pto.vreg<64xf32>
Related Ops / Instruction Set Links¶
- Instruction set overview: Predicate Load Store
- Previous op in instruction set: pto.plds
- Next op in instruction set: pto.pldi
- Control-shell overview: Control and configuration