pto.psts

pto.psts is part of the Predicate Load Store instruction set.

Summary

Store the full predicate register to a contiguous UB location.

Mechanism

pto.psts writes a predicate word from !pto.mask<G> to a UB address. The operation covers the full predicate width for the active element type (64 bits for f32, 128 bits for f16/bf16, 256 bits for i8/u8).

For predicate width Pw and UB address base:

\[ \mathrm{WRITE\_UB}_{64}(base, mask) \]

The predicate register is read atomically. On A2/A3 and A5 only the low N bits of the predicate register (matching the current element type) are written to UB; bits outside are discarded. On CPU simulator all 64 bits are written.

Syntax

PTO Assembly Form

pto.psts %mask, %ub_ptr : !pto.mask<G>, !pto.ptr<i64, ub>

AS Level 1 (SSA)

pto.psts %mask, %ub_ptr : !pto.mask<G>, !pto.ptr<i64, ub>

AS Level 2 (DPS)

pto.psts ins(%mask, %ub_ptr : !pto.mask<G>, !pto.ptr<i64, ub>)

C++ Intrinsic

vector_bool src;
__ubuf__ uint32_t *base;
int32_t offset = 0;
psts(src, base, offset, __cce_simd::NORM);
psts(src, base, offset, __cce_simd::NORM, __cce_simd::POST_UPDATE);

Inputs

Operand Type Description
%mask !pto.mask<G> Predicate register to store
%ub_ptr !pto.ptr<i64, ub> UB destination address (must be 64-bit aligned)

Expected Outputs

None. This form is defined by its side effect on UB memory.

Side Effects

  • Writes the predicate register value to the UB location.
  • UB memory at the target address is modified.

Constraints

Constraints

  • UB address space: %ub_ptr MUST have address space ub. Pointers to other spaces are illegal.
  • Alignment: The effective address MUST be 64-bit aligned. Misaligned addresses are illegal.
  • Predicate width: The store transfers exactly 64 bits. The caller MUST ensure this matches the active element type context.
  • Write atomicity: The 64-bit predicate word is written atomically.

Exceptions

Exceptions

  • Illegal if %ub_ptr is not a UB-space pointer.
  • Illegal if the effective address is not 64-bit aligned.
  • Illegal if predicate width does not match the active element type context.

Target-Profile Restrictions

Target-Profile Restrictions
Aspect CPU Sim A2/A3 A5
Contiguous store Simulated Supported Supported
64-bit alignment requirement Enforced Enforced Enforced
Predicate width (f32 / f16,bf16 / i8) N=64/128/256 N=64/128/256 N=64/128/256

Examples

Store predicate to UB

#include <pto/pto-inst.hpp>
using namespace pto;

void save_mask(RegBuf<predicate_t>& src, Ptr<ub_space_t, ub_t> dst) {
    PSTS(src, dst);
}

SSA form

// Generate comparison mask
%mask = pto.vcmp %v0, %v1, %seed, "lt" : !pto.vreg<64xf32>, !pto.vreg<64xf32>, !pto.mask<b32> -> !pto.mask<b32>

// Store predicate to UB for later reuse
pto.psts %mask, %ub_mask_slot0 : !pto.mask<G>, !pto.ptr<i64, ub>