pto.wait_intra_core¶
pto.wait_intra_core is part of the Pipeline Sync instruction set.
Summary¶
Block a specific pipeline within a cluster (A5) until a subblock signals an event. Only the named pipeline stalls; other pipelines on the same core continue executing.
Mechanism¶
pto.wait_intra_core blocks a specific pipeline on the calling core until the named event is signaled by a remote subblock.
Per-pipeline blocking (A5):
- Unlike A2A3's
wait_flag_devwhich stalls the entire SU (all pipelines),wait_intra_coreonly stalls the named pipeline. - Other pipelines on the same core continue executing while one pipeline is blocked.
- The semaphore pool uses 16 physical IDs with a 32-ID address space: IDs 0–15 target AIV0; IDs 16–31 target AIV1.
Key advantage over A2A3: A5's wait_intra_core enables finer-grained parallelism where multiple pipelines can be in different synchronization states simultaneously.
Syntax¶
PTO Assembly Form¶
pto.wait_intra_core %pipe, %sem_id : i64, i64
AS Level 1 (SSA)¶
pto.wait_intra_core %pipe, %sem_id : i64, i64
C++ Intrinsic¶
The installed 3510 public CCE headers spell this operation as wait_intra_block(...). The PTO page keeps the wait_intra_core ISA name, but the call surface below is the public Bisheng entry point that is actually shipped.
pipe_t pipe = PIPE_V;
uint64_t syncId = 0;
wait_intra_block(pipe, syncId);
Inputs¶
| Operand | Type | Description |
|---|---|---|
%pipe |
pipe_t |
The pipeline that should wait (only this pipeline stalls) |
%sem_id |
i64 |
Semaphore ID: 0–15 for AIV0, base+15 for AIV1 |
Expected Outputs¶
None. This form is defined by its side effect (blocking) on the named pipeline.
Side Effects¶
- Blocks only the named pipeline. Other pipelines on the same core continue.
- Decrements the semaphore counter when the event is signaled.
Constraints¶
Constraints
- A5 only:
wait_intra_coreis only available on the A5 profile. - Per-pipeline blocking: Only the named pipeline is blocked. All other pipelines continue. This differs fundamentally from A2A3's SU-level blocking.
- Semaphore ID mapping: IDs 0–15 target AIV0; IDs 16–31 target AIV1.
- Event must be set: Waiting on an event that was never set is illegal.
- Semaphore pool: 16 physical IDs, 32-ID address space. IDs outside 0–31 are illegal.
Exceptions¶
Exceptions
- Illegal on non-A5 profiles.
- Illegal if
%sem_idis outside the range 0–31. - Illegal if the event was never set by a remote subblock.
Target-Profile Restrictions¶
Target-Profile Restrictions
| Aspect | CPU Sim | A2/A3 | A5 |
|---|---|---|---|
wait_intra_core |
Not available | Use wait_flag_dev |
Supported |
| Blocking scope | Not applicable | Entire SU blocked | Only named pipe blocked |
| Other pipes during wait | Not applicable | All stalled | Continue executing |
| Semaphore pool | Not applicable | 16 IDs | 16 IDs, 32-ID address space |
Examples¶
A5: per-pipeline blocking vs. A2A3 SU-level blocking¶
#include <pto/pto-inst.hpp>
using namespace pto;
// A2A3 (wait_flag_dev): entire core stalls
// WAIT_FLAG_DEV(/* event_id */ 0); // ALL pipelines blocked
// A5 (wait_intra_core): only PIPE_V stalls
WAIT_INTRA_CORE(PIPE_V, /* sem_id */ 0); // Only Vector pipe stalls
// PIPE_MTE2 and PIPE_MTE3 continue executing
SSA form — A5 C→V signaling with per-pipeline waits¶
// === AIC signals both AIV subblocks (no broadcast — separate calls) ===
pto.set_intra_block "PIPE_MTE2", %c0_i64 : i64, i64 // → AIV0
pto.set_intra_block "PIPE_MTE2", %c16_i64 : i64, i64 // → AIV1
// === AIV0: Vector pipe waits (only Vector stalls; MTE2/MTE3 continue) ===
pto.wait_intra_core "PIPE_V", %c0_i64 : i64, i64
// [AIV0 Vector processes data while AIV0 MTE2/MTE3 continue]
// === AIV1: Vector pipe waits ===
pto.wait_intra_core "PIPE_V", %c16_i64 : i64, i64
// === AIV0 signals back to AIC ===
pto.set_intra_block "PIPE_V", %c1_i64 : i64, i64
// === AIV1 signals back to AIC ===
pto.set_intra_block "PIPE_V", %c1_i64 : i64, i64
// === AIC: waits for both AIV subblocks (separate waits) ===
pto.wait_intra_core "PIPE_MTE2", %c1_i64 : i64, i64 // wait for AIV0
pto.wait_intra_core "PIPE_MTE2", %c17_i64 : i64, i64 // wait for AIV1
Related Ops / Instruction Set Links¶
- Instruction set overview: Pipeline Sync
- Previous op in instruction set: pto.set_intra_block
- Next op in instruction set: (none — last in instruction set)
- Control-shell overview: Control and configuration