pto.punpack

pto.punpack is part of the Predicate Generation And Algebra instruction set.

Summary

Widening unpack: extract one N-bit half (lower or upper) of a 2N-bit predicate register. The non-selected half is treated as zero and does not contribute to the result.

Mechanism

pto.punpack takes a 2N-bit predicate register and a partition token, and produces an N-bit predicate from the half that the token selects. The non-selected half is zero-filled, so it contributes nothing to the result. It is the inverse of ppack.

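For source predicate src with 2N bits and partition token P, where LOWER(·) and UPPER(·) denote the two N-bit halves of the source (UPPER corresponds to the "HIGHER" token):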

\[ \mathrm{dst}_N = \begin{cases} \mathrm{LOWER}(\mathrm{src}_{2N}) & \text{if } P = \text{LOWER} \\ \mathrm{UPPER}(\mathrm{src}_{2N}) & \text{if } P = \text{HIGHER} \end{cases} \]
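The selection rule above can be sketched as a small host-side reference model. This is only an illustration under stated assumptions: the predicate is held in the low 2N bits of a 64-bit integer, "LOWER" is taken to mean the least-significant N bits, and the names `Part` and `punpack_model` are hypothetical, not part of the PTO API. The mask granularity G is not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not part of the PTO API.
enum class Part { LOWER, HIGHER };

// Reference model of the selection rule.
// Assumptions: src keeps the 2N-bit predicate in its low 2N bits,
// LOWER selects bits [0, N) and HIGHER selects bits [N, 2N).
inline std::uint64_t punpack_model(std::uint64_t src, unsigned n, Part part) {
    assert(n >= 1 && n <= 32);  // 2N must fit in 64 bits
    const std::uint64_t half_mask = (std::uint64_t{1} << n) - 1;
    // Shift the upper half down when it is selected, then keep N bits.
    const std::uint64_t selected = (part == Part::LOWER) ? src : (src >> n);
    return selected & half_mask;
}
```

With N = 16 and a source of 0xABCD1234, this model yields 0x1234 for `Part::LOWER` and 0xABCD for `Part::HIGHER`. In both cases the other half is discarded, not merged into the result.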

Syntax

PTO Assembly Form

%dst = pto.punpack %src, "PART" : !pto.mask<G> -> !pto.mask<G>

AS Level 1 (SSA)

%dst = pto.punpack %src, "PART" : !pto.mask<G> -> !pto.mask<G>

AS Level 2 (DPS)

pto.punpack ins(%src, "PART" : !pto.mask<G>) outs(%dst : !pto.mask<G>)

C++ Intrinsic

vector_bool dst;
vector_bool src;
punpack(dst, src, __cce_simd::LOWER);

Inputs

Operand | Type | Description
%src | !pto.mask<G> | Source 2N-bit predicate
"PART" | string attribute | Partition token: "LOWER" or "HIGHER"

Expected Outputs

Result | Type | Description
%dst | !pto.mask<G> | N-bit predicate extracted from the selected half

Side Effects

None.

Constraints

  • Partition token: MUST be "LOWER" or "HIGHER". Other tokens are illegal.
  • Source width: The source predicate MUST be 2N bits wide. Programs are responsible for supplying a predicate of that width.
  • Destination width: The destination predicate is always N bits. Programs that need a 2N-bit result after extraction MUST use ppack to reconstruct it.
  • Zero-fill behavior: The non-selected half of the source is treated as zero and contributes nothing to the result; the destination does NOT contain a concatenation or merge of both halves.

Exceptions

  • Illegal if the partition token is not "LOWER" or "HIGHER".
  • Illegal if source and destination predicate widths are not in a 2:1 ratio.
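The two legality rules above can be written as a small checker, for example in a front-end or test harness. This is a minimal sketch, not the toolchain's verifier: the helper names are hypothetical, widths are passed as plain bit counts, and token matching is assumed to be exact and case-sensitive.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper names; they are not part of the PTO toolchain.

// Assumption: token matching is exact and case-sensitive.
inline bool is_legal_part_token(const std::string& token) {
    return token == "LOWER" || token == "HIGHER";
}

// True when both legality rules hold: a legal token and a 2:1 width ratio.
inline bool punpack_is_legal(const std::string& token,
                             unsigned src_bits, unsigned dst_bits) {
    return is_legal_part_token(token)
        && dst_bits > 0
        && src_bits == 2 * dst_bits;
}

// A few spot checks that double as usage examples.
inline void punpack_legality_examples() {
    assert(punpack_is_legal("LOWER", 64, 32));
    assert(!punpack_is_legal("UPPER", 64, 32));   // not a legal token
    assert(!punpack_is_legal("HIGHER", 64, 64));  // widths are not 2:1
}
```

Note that "UPPER" is rejected: the upper half is selected with the "HIGHER" token.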

Target-Profile Restrictions

Aspect | CPU Sim | A2/A3 | A5
Unpack operation | Simulated | Supported | Supported
LOWER / HIGHER tokens | Supported | Supported | Supported

Examples

Extract upper half of a 64-bit predicate

#include <pto/pto-inst.hpp>
using namespace pto;

void extract_upper(RegBuf<predicate_t>& dst,
                   const RegBuf<predicate_t>& src_64) {
    PUNPACK(dst, src_64, "HIGHER");
}

Extract and re-pack with modification

// %full_64: 64-bit predicate from a comparison

// Extract lower half
%lo = pto.punpack %full_64, "LOWER" : !pto.mask<G> -> !pto.mask<G>

// Extract upper half
%hi = pto.punpack %full_64, "HIGHER" : !pto.mask<G> -> !pto.mask<G>

// Modify lower half (e.g., invert)
%lo_inv = pto.pnot %lo, %lo : !pto.mask<G>, !pto.mask<G> -> !pto.mask<G>

// Re-pack into 64-bit predicate
%new_lo = pto.ppack %lo_inv, "LOWER" : !pto.mask<G> -> !pto.mask<G>
%new_hi = pto.ppack %hi, "HIGHER" : !pto.mask<G> -> !pto.mask<G>
%new_full = pto.por %new_lo, %new_hi, %new_lo : !pto.mask<G>, !pto.mask<G>, !pto.mask<G> -> !pto.mask<G>
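To check the intended result of the sequence above on the host, the same steps can be modeled on plain integers. This is a sketch under assumptions that the text does not confirm: the 64-bit predicate is held in a `uint64_t`, "LOWER" is the least-significant half, ppack places a half into its slot and zeroes the other, and the governing-mask operands of pnot and por are ignored. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not part of the PTO API.
enum class Half { LOWER, HIGHER };

// Model of punpack for N = 32 (assumes LOWER = least-significant half).
inline std::uint32_t unpack_half(std::uint64_t full, Half h) {
    return static_cast<std::uint32_t>(h == Half::LOWER ? full : full >> 32);
}

// Assumed model of ppack: place a 32-bit mask into the selected half
// of a 64-bit mask and leave the other half zero.
inline std::uint64_t pack_half(std::uint32_t half, Half h) {
    const std::uint64_t wide = half;
    return h == Half::LOWER ? wide : wide << 32;
}

// Mirrors the assembly sequence: unpack both halves, invert the lower
// one, re-pack each half, then OR the two packed values together.
inline std::uint64_t invert_lower_half(std::uint64_t full) {
    const std::uint32_t lo = unpack_half(full, Half::LOWER);
    const std::uint32_t hi = unpack_half(full, Half::HIGHER);
    const std::uint32_t lo_inv = ~lo;
    const std::uint64_t result =
        pack_half(lo_inv, Half::LOWER) | pack_half(hi, Half::HIGHER);
    assert(unpack_half(result, Half::HIGHER) == hi);  // upper half untouched
    return result;
}
```

For an input of 0x123456780000FFFF, this model returns 0x12345678FFFF0000: the upper half is preserved and only the lower half is inverted.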