pto.tneg

pto.tneg is part of the Elementwise Tile-Tile instruction set.

Summary

Elementwise negation of a tile.

Mechanism

For each element (i, j) in the valid region:

\[ \mathrm{dst}_{i,j} = -\mathrm{src}_{i,j} \]
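
The formula above can be sketched as a scalar reference loop on the host. This is only an illustration of the documented semantics, not the PTO implementation: `ref_tneg` and its flat row-major buffers are hypothetical names introduced here, and the sketch assumes float elements with no row padding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper, not part of the PTO headers.
// src and dst are row-major buffers holding a valid region of
// valid_rows x valid_cols elements with no row padding (an assumption).
inline void ref_tneg(std::vector<float> &dst, const std::vector<float> &src,
                     std::size_t valid_rows, std::size_t valid_cols) {
  assert(src.size() >= valid_rows * valid_cols);
  assert(dst.size() >= valid_rows * valid_cols);
  for (std::size_t i = 0; i < valid_rows; ++i) {
    for (std::size_t j = 0; j < valid_cols; ++j) {
      // dst(i, j) = -src(i, j)
      dst[i * valid_cols + j] = -src[i * valid_cols + j];
    }
  }
}
```

A loop like this can serve as a golden model when checking results produced by a backend, provided the comparison is restricted to the valid region.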

Syntax

Textual spelling is defined by the PTO ISA syntax-and-operands pages.

Synchronous form:

%dst = tneg %src : !pto.tile<...>

AS Level 1 (SSA)

%dst = pto.tneg %src : !pto.tile<...> -> !pto.tile<...>

AS Level 2 (DPS)

pto.tneg ins(%src : !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)

C++ Intrinsic

Declared in include/pto/common/pto_instr.hpp:

template <typename TileDataDst, typename TileDataSrc, typename... WaitEvents>
PTO_INST RecordEvent TNEG(TileDataDst &dst, TileDataSrc &src, WaitEvents &... events);

Inputs

Operand        Role                      Description
%src           Source tile               Source tile; read at (i, j) for each (i, j) in dst valid region
%dst           Destination tile          Destination tile receiving the result
WaitEvents...  Optional synchronisation  RecordEvent tokens to wait on before issuing the operation

Expected Outputs

Result  Type             Description
%dst    !pto.tile<...>   Destination tile; all (i, j) in its valid region contain -src[i,j] after the operation

Side Effects

No architectural side effects beyond producing the destination tile. Does not implicitly fence unrelated traffic.

Constraints

  • The op iterates over dst.GetValidRow() / dst.GetValidCol().
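
A minimal host-side sketch of this constraint follows, under stated assumptions. `RefTile` and `ref_tneg_valid` are hypothetical stand-ins, not PTO types; only the accessor names `GetValidRow` and `GetValidCol` come from the text above. In this sketch, elements of dst outside its valid region are left untouched, which is a choice of the sketch and not something this page specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for a tile; not a PTO type.
struct RefTile {
  std::size_t rows, cols;              // storage shape
  std::size_t valid_rows, valid_cols;  // valid region, at most rows x cols
  std::vector<float> data;             // row-major, rows * cols elements

  RefTile(std::size_t r, std::size_t c, std::size_t vr, std::size_t vc,
          float fill)
      : rows(r), cols(c), valid_rows(vr), valid_cols(vc), data(r * c, fill) {
    assert(vr <= r && vc <= c);
  }
  std::size_t GetValidRow() const { return valid_rows; }
  std::size_t GetValidCol() const { return valid_cols; }
  float &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
  float at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Loop bounds come from dst's valid region, not from src.
inline void ref_tneg_valid(RefTile &dst, const RefTile &src) {
  // Assumption of this sketch: src storage covers dst's valid region.
  assert(dst.GetValidRow() <= src.rows && dst.GetValidCol() <= src.cols);
  for (std::size_t i = 0; i < dst.GetValidRow(); ++i) {
    for (std::size_t j = 0; j < dst.GetValidCol(); ++j) {
      dst.at(i, j) = -src.at(i, j);
    }
  }
}
```

With a 4 x 4 dst whose valid region is 2 x 3, only those six elements are written, even if src has a larger valid region.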

Exceptions

  • Illegal operand tuples, unsupported types, invalid layout combinations, or unsupported target-profile modes are rejected by the verifier or by the selected backend instruction set.
  • Programs must not rely on behavior outside the documented legal domain of this operation, even if one backend currently accepts it.

Target-Profile Restrictions

  • pto.tneg preserves PTO-visible semantics across CPU simulation, A2/A3-class targets, and A5-class targets, but concrete support subsets may differ by profile.
  • Portable code must rely only on the documented type, layout, shape, and mode combinations that the selected target profile guarantees.

Performance

A2/A3 Throughput

TNEG compiles to CCE vector instructions. The TUnaryOp.hpp performance model gives the following figures:

Metric                 Value
Startup latency        13
Completion latency     26 (FP transcendental)
Per-repeat throughput  1
Pipeline interval      18

Examples

#include <pto/pto-inst.hpp>

using namespace pto;

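void example() {
  // Vec-type tile of float elements, shape 16 x 16.
  using TileT = Tile<TileType::Vec, float, 16, 16>;
  TileT x, out;
  // out(i, j) = -x(i, j) for every (i, j) in out's valid region.
  TNEG(out, x);
}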

Auto Mode

# Auto mode: compiler/runtime-managed placement and scheduling.
%dst = pto.tneg %src : !pto.tile<...> -> !pto.tile<...>

Manual Mode

# Manual mode: bind resources explicitly before issuing the instruction.
# Optional for tile operands:
# pto.tassign %arg0, @tile(0x1000)
# pto.tassign %arg1, @tile(0x2000)
%dst = pto.tneg %src : !pto.tile<...> -> !pto.tile<...>

PTO Assembly Form

%dst = tneg %src : !pto.tile<...>
# AS Level 2 (DPS)
pto.tneg ins(%src : !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)