pto.punpack¶
pto.punpack is part of the Predicate Generation And Algebra instruction set.
Summary¶
Widening unpack: extract one N-bit segment from a 2N-bit predicate register, zero-filling the non-selected half of the source.
Mechanism¶
pto.punpack takes a 2N-bit predicate register and a partition token, and produces an N-bit predicate by selecting one half and zero-filling the other. It is the inverse of ppack.
For source predicate src with 2N bits and partition token P:
\[ \mathrm{dst}_N = \begin{cases} \mathrm{LOWER}(\mathrm{src}_{2N}) & \text{if } P = \text{LOWER} \\ \mathrm{UPPER}(\mathrm{src}_{2N}) & \text{if } P = \text{HIGHER} \end{cases} \]
Syntax¶
PTO Assembly Form¶
%dst = pto.punpack %src, "PART" : !pto.mask<G> -> !pto.mask<G>
AS Level 1 (SSA)¶
%dst = pto.punpack %src, "PART" : !pto.mask<G> -> !pto.mask<G>
AS Level 2 (DPS)¶
pto.punpack ins(%src, "PART" : !pto.mask<G>) outs(%dst : !pto.mask<G>)
C++ Intrinsic¶
vector_bool dst;
vector_bool src;
punpack(dst, src, __cce_simd::LOWER);
Inputs¶
| Operand | Type | Description |
|---|---|---|
%src |
!pto.mask<G> |
Source 2N-bit predicate |
"PART" |
string attribute | Partition token: "LOWER" or "HIGHER" |
Expected Outputs¶
| Result | Type | Description |
|---|---|---|
%dst |
!pto.mask<G> |
N-bit predicate extracted from the selected half |
Side Effects¶
None.
Constraints¶
Constraints
- Partition token: MUST be
"LOWER"or"HIGHER". Other tokens are illegal. - Source width: The source predicate MUST be 2N bits. Programs MUST ensure the source context provides a 2N-bit predicate.
- Destination width: The destination predicate is always N bits. Programs that need a 2N-bit result after extraction MUST use
ppackto reconstruct it. - Zero-fill behavior: The non-selected half of the source is ignored (zero-filled); the destination does NOT contain a concatenation or merge of both halves.
Exceptions¶
Exceptions
- Illegal if the partition token is not
"LOWER"or"HIGHER". - Illegal if source and destination predicate widths are not in a 2:1 ratio.
Target-Profile Restrictions¶
Target-Profile Restrictions
| Aspect | CPU Sim | A2/A3 | A5 |
|---|---|---|---|
| Unpack operation | Simulated | Supported | Supported |
LOWER / HIGHER tokens |
Supported | Supported | Supported |
Examples¶
Extract upper half of a 64-bit predicate¶
#include <pto/pto-inst.hpp>
using namespace pto;
void extract_upper(RegBuf<predicate_t>& dst,
const RegBuf<predicate_t>& src_64) {
PUNPACK(dst, src_64, "HIGHER");
}
Extract and re-pack with modification¶
// %full_64: 64-bit predicate from a comparison
// Extract lower half
%lo = pto.punpack %full_64, "LOWER" : !pto.mask<G> -> !pto.mask<G>
// Extract upper half
%hi = pto.punpack %full_64, "HIGHER" : !pto.mask<G> -> !pto.mask<G>
// Modify lower half (e.g., invert)
%lo_inv = pto.pnot %lo, %lo : !pto.mask<G>, !pto.mask<G> -> !pto.mask<G>
// Re-pack into 64-bit predicate
%new_lo = pto.ppack %lo_inv, "LOWER" : !pto.mask<G> -> !pto.mask<G>
%new_hi = pto.ppack %hi, "HIGHER" : !pto.mask<G> -> !pto.mask<G>
%new_full = pto.por %new_lo, %new_hi, %new_lo : !pto.mask<G>, !pto.mask<G>, !pto.mask<G> -> !pto.mask<G>
Related Ops / Instruction Set Links¶
- Instruction set overview: Predicate Generation And Algebra
- Previous op in instruction set: pto.ppack
- Next op in instruction set: pto.pand
- Control-shell overview: Control and configuration