pto.vsubcs

pto.vsubcs is part of the Vector-Scalar Instructions set.

Summary

Lane-wise subtract with explicit borrow-in and borrow-out masks.

Mechanism

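For each active lane i, the operation computes diff = lhs[i] - rhs[i] - borrow_in[i], then sets result[i] = low_bits(diff), that is, diff truncated to the element width, and borrow[i] = borrow_out(diff), which is 1 when the subtraction underflows. The borrow chain is lane-local in the PTO surface: each lane consumes one incoming borrow bit and produces one outgoing borrow bit, and no borrow propagates between neighbouring lanes.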
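Because the borrow is lane-local, wider subtractions are built by feeding the borrow-out mask of one step into the borrow-in of the next. The following C sketch models that pattern for 64-bit values split into 32-bit limbs. It is an illustration only: LANES, vsubcs_model, and sub64_two_limbs are hypothetical names, masks are modelled as byte arrays, and leaving inactive lanes untouched is an assumption of the sketch rather than documented behaviour.

```c
#include <stdint.h>

enum { LANES = 4 };  /* hypothetical lane count, kept small for illustration */

/* Scalar model of one vsubcs step over LANES unsigned 32-bit elements.
 * Inactive lanes are skipped (assumption of this sketch). */
static void vsubcs_model(uint32_t *result, uint8_t *borrow_out,
                         const uint32_t *lhs, const uint32_t *rhs,
                         const uint8_t *borrow_in, const uint8_t *mask)
{
    for (int i = 0; i < LANES; i++) {
        if (!mask[i]) continue;
        /* Widen so that rhs + borrow_in cannot wrap in 32 bits. */
        uint64_t rhs_total = (uint64_t)rhs[i] + borrow_in[i];
        result[i] = (uint32_t)(lhs[i] - rhs_total);  /* low 32 bits of diff */
        borrow_out[i] = lhs[i] < rhs_total;          /* 1 on underflow */
    }
}

/* Lane-wise 64-bit subtraction built from two chained 32-bit steps:
 * the low-limb borrow-out becomes the high-limb borrow-in. */
static void sub64_two_limbs(uint64_t *diff, uint8_t *borrow_final,
                            const uint64_t *a, const uint64_t *b)
{
    uint32_t a_lo[LANES], a_hi[LANES], b_lo[LANES], b_hi[LANES];
    uint32_t d_lo[LANES], d_hi[LANES];
    uint8_t no_borrow[LANES] = {0}, all_active[LANES], borrow_lo[LANES];

    for (int i = 0; i < LANES; i++) {
        a_lo[i] = (uint32_t)a[i];
        a_hi[i] = (uint32_t)(a[i] >> 32);
        b_lo[i] = (uint32_t)b[i];
        b_hi[i] = (uint32_t)(b[i] >> 32);
        all_active[i] = 1;
    }

    vsubcs_model(d_lo, borrow_lo, a_lo, b_lo, no_borrow, all_active);
    vsubcs_model(d_hi, borrow_final, a_hi, b_hi, borrow_lo, all_active);

    for (int i = 0; i < LANES; i++)
        diff[i] = ((uint64_t)d_hi[i] << 32) | d_lo[i];
}
```

Each lane's chain runs across the two steps, never across lanes: lane 2 underflowing has no effect on lane 3.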

Syntax

PTO Assembly Form

vsubcs %dst, %borrow_out, %lhs, %rhs, %borrow_in, %mask : !pto.vreg<NxT>, !pto.mask<G>

AS Level 1 (SSA)

%result, %borrow = pto.vsubcs %lhs, %rhs, %borrow_in, %mask : !pto.vreg<NxT>, !pto.vreg<NxT>, !pto.mask<G>, !pto.mask<G> -> !pto.vreg<NxT>, !pto.mask<G>

Inputs

Operand      Type             Description
%lhs         !pto.vreg<NxT>   Minuend vector
%rhs         !pto.vreg<NxT>   Subtrahend vector
%borrow_in   !pto.mask<G>     Incoming borrow bit per lane
%mask        !pto.mask<G>     Predicate mask; only lanes with mask bit 1 participate

Expected Outputs

Result    Type             Description
%result   !pto.vreg<NxT>   Low bits of lhs[i] - rhs[i] - borrow_in[i] for each active lane
%borrow   !pto.mask<G>     Borrow-out bit produced for each active lane

Side Effects

This operation has no architectural side effect beyond producing its destination values. It does not implicitly reserve buffers, signal events, or establish memory fences.

Constraints

  • Borrow-chain forms are defined for integer element types.
  • %lhs, %rhs, and %result MUST have the same vector width N and element type T.
  • %borrow_in, %borrow, and %mask MUST all have width N.

Exceptions

  • The verifier rejects illegal operand shapes, unsupported element types, and attribute combinations that are not valid for the selected instruction set or target profile.
  • Any additional illegality stated in the constraints section is also part of the contract.

Target-Profile Restrictions

  • Treat this form as an unsigned integer borrow-chain unless a target profile documents a wider legal domain.
  • A5 is the most detailed concrete profile in the current manual; CPU simulation and A2/A3-class targets may support narrower subsets or emulate the behavior while preserving the visible PTO contract.

Examples

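C-style reference loop for the per-lane semantics:

for (int i = 0; i < N; i++) {
    if (!mask[i]) continue;  /* inactive lanes do not participate */
    /* widen so that rhs[i] + borrow_in[i] does not wrap for element widths up to 32 bits */
    uint64_t rhs_total = (uint64_t)rhs[i] + borrow_in[i];
    result[i] = lhs[i] - rhs_total;  /* the stored value keeps only the low bits */
    borrow[i] = lhs[i] < rhs_total;  /* 1 when the subtraction underflows */
}

AS Level 1 (SSA) form for 64 lanes of i32:

%result, %borrow = pto.vsubcs %lhs, %rhs, %borrow_in, %mask : !pto.vreg<64xi32>, !pto.vreg<64xi32>, !pto.mask<b32>, !pto.mask<b32> -> !pto.vreg<64xi32>, !pto.mask<b32>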
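
The reference loop in this section widens rhs[i] + borrow_in[i] to 64 bits before comparing. The C sketch below shows why that matters for 32-bit elements: when rhs is 0xFFFFFFFF and borrow_in is 1, a same-width sum wraps to 0 and the borrow-out is lost. The function names are hypothetical, and the second function is deliberately incorrect to show the pitfall.

```c
#include <stdint.h>

/* Borrow-out computed with a widened intermediate, as in the reference loop. */
static int borrow_widened(uint32_t lhs, uint32_t rhs, unsigned borrow_in)
{
    uint64_t rhs_total = (uint64_t)rhs + borrow_in;  /* can reach 2^32 without wrapping */
    return lhs < rhs_total;
}

/* Deliberately incorrect variant: the sum is kept in 32 bits, so it wraps
 * to 0 when rhs == 0xFFFFFFFF and borrow_in == 1, and the borrow is missed. */
static int borrow_narrow_wrong(uint32_t lhs, uint32_t rhs, unsigned borrow_in)
{
    uint32_t rhs_total = (uint32_t)(rhs + (uint32_t)borrow_in);
    return lhs < rhs_total;
}
```

For lhs = 5, rhs = 0xFFFFFFFF, borrow_in = 1, the widened form reports a borrow while the narrow form does not. The result lane is 5 in both cases, because subtracting 2^32 leaves the low 32 bits unchanged.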