pto.vlrelu¶
pto.vlrelu is part of the Vector-Scalar Instructions instruction set.
Summary¶
Lane-wise leaky ReLU with a broadcast slope scalar.
Mechanism¶
For each active lane i, dst[i] = (src[i] >= 0) ? src[i] : scalar * src[i]. The scalar operand supplies the negative-path slope for every active lane. Inactive lanes do not participate in the computation.
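The per-lane rule above can be sketched as a small scalar reference model. This is an illustration only: `vlrelu_reference` is a hypothetical helper, not part of any PTO toolchain, and it uses Python floats rather than f16 or f32 arithmetic. Because the text says only that inactive lanes do not participate, the sketch marks them with `None` instead of guessing what the destination holds.

```python
from typing import List, Optional, Sequence


def vlrelu_reference(
    src: Sequence[float], scalar: float, mask: Sequence[bool]
) -> List[Optional[float]]:
    """Hypothetical reference model of the documented per-lane rule."""
    if len(src) != len(mask):
        raise ValueError("mask must supply one bit per lane")
    result: List[Optional[float]] = []
    for value, active in zip(src, mask):
        if not active:
            # Assumption: the manual does not say what an inactive lane
            # holds afterwards, so None means "not specified here".
            result.append(None)
        elif value >= 0:
            # Non-negative inputs pass through unchanged.
            result.append(value)
        else:
            # Negative inputs are scaled by the broadcast slope.
            result.append(scalar * value)
    return result


lanes = vlrelu_reference([-2.0, -0.5, 0.0, 3.0], 0.25, [True, False, True, True])
print(lanes)  # [-0.5, None, 0.0, 3.0]
```

Lane 1 is masked off, so its negative input is never scaled; lane 0 is active and negative, so it becomes 0.25 * -2.0.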
Syntax¶
PTO Assembly Form¶
vlrelu %dst, %src, %slope, %mask : !pto.vreg<NxT>, T
AS Level 1 (SSA)¶
%result = pto.vlrelu %input, %scalar, %mask : !pto.vreg<NxT>, T, !pto.mask<G> -> !pto.vreg<NxT>
Inputs¶
| Operand | Type | Description |
|---|---|---|
| %input | !pto.vreg<NxT> | Source activation vector |
| %scalar | T | Negative-path slope broadcast to every active lane |
| %mask | !pto.mask<G> | Predicate mask; only lanes with mask bit 1 participate |
Expected Outputs¶
| Result | Type | Description |
|---|---|---|
| %result | !pto.vreg<NxT> | Lane-wise leaky-ReLU result on the active lanes |
Side Effects¶
This operation has no architectural side effect beyond producing its destination values. It does not implicitly reserve buffers, signal events, or establish memory fences.
Constraints¶
- The current manual documents floating-point forms for f16 and f32.
- %input and %result MUST have the same vector width N and element type T.
- The mask width MUST match N.
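The constraints above could be checked along the following lines. This is a sketch under stated assumptions, not the actual verifier: `VRegType` and `check_vlrelu_types` are hypothetical names, and the mask width is passed in as a plain integer because the manual does not say here how it is derived from `!pto.mask<G>`. The scalar check reflects the shared `T` in the op signature.

```python
from dataclasses import dataclass
from typing import List

# Element types the current manual documents for this op.
DOCUMENTED_ELEMENT_TYPES = {"f16", "f32"}


@dataclass(frozen=True)
class VRegType:
    """Hypothetical stand-in for !pto.vreg<NxT>."""
    lanes: int    # N
    element: str  # T


def check_vlrelu_types(
    input_type: VRegType,
    scalar_element: str,
    mask_width: int,  # assumption: supplied by the caller, derivation not modeled
    result_type: VRegType,
) -> List[str]:
    """Return a list of constraint violations; an empty list means none found."""
    errors: List[str] = []
    if input_type.element not in DOCUMENTED_ELEMENT_TYPES:
        errors.append(f"element type {input_type.element} is not a documented form")
    if result_type != input_type:
        errors.append("%input and %result must share vector width N and element type T")
    if scalar_element != input_type.element:
        errors.append("%scalar must have the vector's element type T")
    if mask_width != input_type.lanes:
        errors.append("mask width must match N")
    return errors


vec = VRegType(lanes=64, element="f32")
print(check_vlrelu_types(vec, "f32", 64, vec))  # []
print(check_vlrelu_types(vec, "f32", 32, vec))  # ['mask width must match N']
```

Returning every violation rather than stopping at the first makes the sketch easier to use when several operands are wrong at once.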
Exceptions¶
- The verifier rejects illegal operand shapes, unsupported element types, and attribute combinations that are not valid for the selected instruction set or target profile.
- Any additional illegality stated in the constraints section is also part of the contract.
Target-Profile Restrictions¶
- Documented floating-point forms: f16, f32.
- A5 is the most detailed concrete profile in the current manual; CPU simulation and A2/A3-class targets may support narrower subsets or emulate the behavior while preserving the visible PTO contract.
Examples¶
for (int i = 0; i < N; i++)
    if (mask[i])
        dst[i] = (src[i] >= 0) ? src[i] : scalar * src[i];
%result = pto.vlrelu %activations, %alpha, %mask : !pto.vreg<64xf32>, f32, !pto.mask<b32> -> !pto.vreg<64xf32>
Related Ops / Instruction Set Links¶
- Instruction set overview: Vector-Scalar Instructions
- Previous op in instruction set: pto.vshrs
- Next op in instruction set: pto.vaddcs