pto.mad_bias
pto.mad_bias is part of the Cube MAD Ops.
Summary
Bias-init cube matrix multiply: dst[m, n] = sum_k(lhs[m, k] * rhs[k, n]) + bias[n].
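The formula can be read as a plain reference model. The following is a minimal Python sketch of the arithmetic only, assuming row-major nested lists. The function name `mad_bias_ref` is hypothetical and not part of PTO, and the sketch ignores hardware data layout, precision, and saturation.

```python
def mad_bias_ref(lhs, rhs, bias):
    """Reference arithmetic for dst[m][n] = sum_k lhs[m][k] * rhs[k][n] + bias[n].

    Hypothetical helper for illustration; not a PTO API.
    """
    m_dim, k_dim, n_dim = len(lhs), len(rhs), len(bias)
    dst = []
    for m in range(m_dim):
        # Seed each output row with the bias vector: bias[n] is broadcast across M.
        row = list(bias)
        for n in range(n_dim):
            for k in range(k_dim):
                row[n] += lhs[m][k] * rhs[k][n]
        dst.append(row)
    return dst


lhs = [[1, 2], [3, 4]]    # M x K = 2 x 2
rhs = [[1, 0], [0, 1]]    # K x N = 2 x 2 (identity)
bias = [10, 20]           # N values, one per output column
result = mad_bias_ref(lhs, rhs, bias)
print(result)  # [[11, 22], [13, 24]]
```

Because `rhs` is the identity here, the output is simply `lhs` with `bias[n]` added to every element of column `n`.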
Mechanism
Like pto.mad, but seeds the accumulator with a per-N bias vector instead of zero. Useful as the first MAD in a K-tiled sequence where the bias is known up front; subsequent partial sums can accumulate via pto.mad_acc.
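The K-tiled pattern described above can be sketched as a plain Python model: the first K tile seeds the accumulator with the bias, and later tiles add onto the existing accumulator. The helpers `mad_bias_model` and `mad_acc_model` are hypothetical stand-ins for the arithmetic of pto.mad_bias and pto.mad_acc, under the assumption that pto.mad_acc adds its partial product to the existing contents of dst. They do not model layout, precision, or the ops' real operands.

```python
def matmul_into(acc, lhs, rhs):
    """Add lhs @ rhs onto acc in place (acc is M x N, lhs is M x Kt, rhs is Kt x N)."""
    for m in range(len(lhs)):
        for n in range(len(rhs[0])):
            for k in range(len(rhs)):
                acc[m][n] += lhs[m][k] * rhs[k][n]
    return acc


def mad_bias_model(lhs, rhs, bias):
    # Bias-init: every output row starts as a copy of the N-element bias vector.
    acc = [list(bias) for _ in lhs]
    return matmul_into(acc, lhs, rhs)


def mad_acc_model(acc, lhs, rhs):
    # Accumulating form (assumed semantics): keep acc, add the new partial sum.
    return matmul_into(acc, lhs, rhs)


def k_tiled(lhs, rhs, bias, k_tile):
    """Split K into tiles; the bias is applied once, by the first tile only."""
    acc = None
    for k0 in range(0, len(rhs), k_tile):
        lhs_t = [row[k0:k0 + k_tile] for row in lhs]
        rhs_t = rhs[k0:k0 + k_tile]
        if acc is None:
            acc = mad_bias_model(lhs_t, rhs_t, bias)
        else:
            acc = mad_acc_model(acc, lhs_t, rhs_t)
    return acc


lhs = [[1, 2, 3, 4], [5, 6, 7, 8]]        # M x K = 2 x 4
rhs = [[1, 0], [0, 1], [1, 1], [2, 0]]    # K x N = 4 x 2
bias = [100, 200]

tiled = k_tiled(lhs, rhs, bias, k_tile=2)
untiled = mad_bias_model(lhs, rhs, bias)
print(tiled == untiled)  # True
```

Seeding with the bias in the first tile means no separate bias-add pass is needed after the final partial sum; applying the bias in more than one tile would add it multiple times.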
Syntax
pto.mad_bias %lhs, %rhs, %dst, %bias, %m, %n, %k
unit_flag(check_only | check_and_set)?
disable_gemv?
(sat | nosat)?
tf32_mode(round_even | round_away)?
n_dir?
: !pto.ptr<A, l0a>, !pto.ptr<B, l0b>, !pto.ptr<C, l0c>, !pto.ptr<C, bt>, i64, i64, i64
Inputs
| Parameter | Type | Description |
|---|---|---|
| %lhs, %rhs, %dst, %m, %n, %k | — | Same as pto.mad |
| %bias | !pto.ptr<C, bt> | Bias vector in BT, interpreted as N values broadcast across M |
See MAD Common Clauses for the optional clauses.
Expected Outputs
| Result | Type | Description |
|---|---|---|
| None | — | Writes the produced M x N tile to L0C with bias-init seed. |
Side Effects
Engages the CUBE pipe, reads %bias from BT, writes to L0C. The caller is responsible for staging %bias into BT via pto.mte_l1_bt prior to this op.
Constraints
- %bias must be in bt address space.
- %bias element type must match %dst element type.
- Only N bias values are consumed; %bias is not an M x N matrix.
- Other constraints match pto.mad.
Examples
pto.mad_bias %l0a, %l0b, %l0c, %bt, %c16_i64, %c16_i64, %c32_i64
: !pto.ptr<f16, l0a>, !pto.ptr<f16, l0b>, !pto.ptr<f32, l0c>, !pto.ptr<f32, bt>, i64, i64, i64
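To make the operand shapes in this example concrete, the following Python sketch computes logical element counts and byte sizes for M = 16, N = 16, K = 32, assuming %m, %n, %k are passed in that order as in the Syntax line. It counts logical payload only; any alignment or padding the hardware buffers require is not modeled, and the `operand_sizes` helper is hypothetical.

```python
# Byte widths for the element types used in the example.
ELEM_BYTES = {"f16": 2, "f32": 4}


def operand_sizes(m, n, k, in_type="f16", acc_type="f32"):
    """Return {operand: (element_count, byte_count)} for the logical payload.

    Hypothetical helper; ignores any hardware alignment or padding.
    """
    counts = {
        "lhs": (m * k, in_type),     # M x K in L0A
        "rhs": (k * n, in_type),     # K x N in L0B
        "dst": (m * n, acc_type),    # M x N in L0C
        "bias": (n, acc_type),       # N values in BT, same element type as dst
    }
    return {name: (count, count * ELEM_BYTES[t]) for name, (count, t) in counts.items()}


sizes = operand_sizes(m=16, n=16, k=32)
for name, (count, nbytes) in sizes.items():
    print(f"{name}: {count} elements, {nbytes} bytes")
```

The bias operand holds only N = 16 values (64 bytes as f32), compared with 256 values for the M x N destination tile.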
Related Ops
- Zero-init form: pto.mad
- Accumulating form: pto.mad_acc
- MX bias-init form: pto.mad_mx_bias
- Bias staging: pto.mte_l1_bt