TDIVS¶
Instruction Diagram¶
Introduction¶
Elementwise division with a scalar (tile/scalar or scalar/tile).
Math Semantics¶
For each element (i, j) in the valid region:
- Tile/scalar:
$$ \mathrm{dst}_{i,j} = \frac{\mathrm{src}_{i,j}}{\mathrm{scalar}} $$
- Scalar/tile:
$$ \mathrm{dst}_{i,j} = \frac{\mathrm{scalar}}{\mathrm{src}_{i,j}} $$
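The two formulas can be sketched as a host-side reference model. This is only an illustration under stated assumptions: RefTile and ref_tdivs are hypothetical names that are not part of PTO, the tile is modeled as a plain row-major array, and elements outside the valid region are simply left untouched (the real behavior there is not specified here).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical host-side stand-in for a tile: row-major storage plus a valid region.
template <typename T, std::size_t Rows, std::size_t Cols>
struct RefTile {
    std::array<T, Rows * Cols> data{};
    std::size_t valid_rows = Rows;
    std::size_t valid_cols = Cols;

    T &at(std::size_t i, std::size_t j) { return data[i * Cols + j]; }
    const T &at(std::size_t i, std::size_t j) const { return data[i * Cols + j]; }
};

// Tile/scalar form: dst(i, j) = src(i, j) / scalar over dst's valid region.
template <typename T, std::size_t R, std::size_t C>
void ref_tdivs(RefTile<T, R, C> &dst, const RefTile<T, R, C> &src, T scalar) {
    assert(dst.valid_rows <= R && dst.valid_cols <= C);
    for (std::size_t i = 0; i < dst.valid_rows; ++i) {
        for (std::size_t j = 0; j < dst.valid_cols; ++j) {
            dst.at(i, j) = src.at(i, j) / scalar;
        }
    }
}

// Scalar/tile form: dst(i, j) = scalar / src(i, j) over dst's valid region.
template <typename T, std::size_t R, std::size_t C>
void ref_tdivs(RefTile<T, R, C> &dst, T scalar, const RefTile<T, R, C> &src) {
    assert(dst.valid_rows <= R && dst.valid_cols <= C);
    for (std::size_t i = 0; i < dst.valid_rows; ++i) {
        for (std::size_t j = 0; j < dst.valid_cols; ++j) {
            dst.at(i, j) = scalar / src.at(i, j);
        }
    }
}
```

With a 2x2 float tile holding 8, 4, 2, 1, calling ref_tdivs(dst, src, 2.0f) yields 4, 2, 1, 0.5, while ref_tdivs(dst, 2.0f, src) yields 0.25, 0.5, 1, 2. The argument order alone selects the form, mirroring the two operand orders of the instruction.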
Assembly Syntax¶
PTO-AS form: see the PTO-AS Specification.
Tile/scalar form:
%dst = tdivs %src, %scalar : !pto.tile<...>, f32
Scalar/tile form:
%dst = tdivs %scalar, %src : f32, !pto.tile<...>
AS Level 1 (SSA)¶
%dst = pto.tdivs %src, %scalar : (!pto.tile<...>, dtype) -> !pto.tile<...>
%dst = pto.tdivs %scalar, %src : (dtype, !pto.tile<...>) -> !pto.tile<...>
AS Level 2 (DPS)¶
pto.tdivs ins(%src, %scalar : !pto.tile_buf<...>, dtype) outs(%dst : !pto.tile_buf<...>)
pto.tdivs ins(%scalar, %src : dtype, !pto.tile_buf<...>) outs(%dst : !pto.tile_buf<...>)
C++ Intrinsic¶
Declared in include/pto/common/pto_instr.hpp:
template <typename TileDataDst, typename TileDataSrc, typename... WaitEvents>
PTO_INST RecordEvent TDIVS(TileDataDst &dst, TileDataSrc &src0, typename TileDataSrc::DType scalar, WaitEvents &... events);
template <typename TileDataDst, typename TileDataSrc, typename... WaitEvents>
PTO_INST RecordEvent TDIVS(TileDataDst &dst, typename TileDataDst::DType scalar, TileDataSrc &src0, WaitEvents &... events);
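In both declarations the scalar parameter is spelled as a nested type (typename TileDataSrc::DType or typename TileDataDst::DType), which in C++ is a non-deduced context: the scalar's type is taken from the tile type, and the argument is implicitly converted to it. The sketch below illustrates that with hypothetical stand-ins (MockTile and mock_tdivs are not PTO names, and the WaitEvents parameter pack and RecordEvent return type are omitted).

```cpp
#include <cassert>

// Hypothetical one-element stand-in for a tile type; only the DType alias matters here.
template <typename T>
struct MockTile {
    using DType = T;
    T value{};
};

// Tile/scalar order. The scalar's type is not deduced from the argument;
// it is fixed by the source tile's DType.
template <typename TileDataDst, typename TileDataSrc>
void mock_tdivs(TileDataDst &dst, TileDataSrc &src, typename TileDataSrc::DType scalar) {
    dst.value = src.value / scalar;
}

// Scalar/tile order. The scalar's type is fixed by the destination tile's DType.
template <typename TileDataDst, typename TileDataSrc>
void mock_tdivs(TileDataDst &dst, typename TileDataDst::DType scalar, TileDataSrc &src) {
    dst.value = scalar / src.value;
}
```

For a MockTile<float> source holding 8, mock_tdivs(dst, src, 2) converts the int literal to float and stores 4, and mock_tdivs(dst, 2, src) stores 0.25. The position of the scalar argument is what selects the overload.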
Constraints¶
- Implementation checks (A2A3), both overloads:
  - TileData::DType must be one of: int32_t, int, int16_t, half, float16_t, float, float32_t.
  - Tile location must be vector (TileData::Loc == TileType::Vec).
  - Static valid bounds: TileData::ValidRow <= TileData::Rows and TileData::ValidCol <= TileData::Cols.
  - Runtime: src0.GetValidRow() == dst.GetValidRow() and src0.GetValidCol() == dst.GetValidCol().
  - Tile layout must be row-major (TileData::isRowMajor).
- Implementation checks (A5), both overloads:
  - TileData::DType must be one of: uint8_t, int8_t, uint16_t, int16_t, uint32_t, int32_t, half, float.
  - Tile location must be vector (TileData::Loc == TileType::Vec).
  - Static valid bounds: TileData::ValidRow <= TileData::Rows and TileData::ValidCol <= TileData::Cols.
  - Runtime: src0.GetValidRow() == dst.GetValidRow() and src0.GetValidCol() == dst.GetValidCol().
  - Tile layout must be row-major (TileData::isRowMajor).
- Valid region:
  - The op uses dst.GetValidRow() / dst.GetValidCol() as the iteration domain.
- Division by zero:
  - Behavior is target-defined; on A5 the tile/scalar form maps to multiply-by-reciprocal and uses 1/0 -> +inf for scalar == 0.
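To make the A5 division-by-zero note concrete, here is a minimal host-side sketch of a multiply-by-reciprocal path for float. It is an assumed model, not the hardware implementation: reciprocal_or_inf and div_by_reciprocal are hypothetical helpers, and IEEE 754 float arithmetic is assumed (checked with a static_assert).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

static_assert(std::numeric_limits<float>::is_iec559, "sketch assumes IEEE 754 floats");

// Assumed model: the reciprocal of a zero scalar is taken to be +inf.
inline float reciprocal_or_inf(float scalar) {
    return scalar == 0.0f ? std::numeric_limits<float>::infinity() : 1.0f / scalar;
}

// Tile/scalar division modeled as elementwise multiplication by 1/scalar.
inline std::vector<float> div_by_reciprocal(const std::vector<float> &src, float scalar) {
    const float recip = reciprocal_or_inf(scalar);
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = src[i] * recip;
    }
    return dst;
}
```

Under this model, dividing by a zero scalar sends positive elements to +inf, negative elements to -inf, and zero elements to NaN (0 * inf). For scalars whose reciprocal is not exactly representable, such as 3.0f, the product can differ from a true division in the last bit, so bit-exact comparisons against a division-based reference may need a tolerance.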
Examples¶
Auto¶
#include <pto/pto-inst.hpp>
using namespace pto;
void example_auto() {
    using TileT = Tile<TileType::Vec, float, 16, 16>;
    TileT src, dst;
    // Tile/scalar form: dst = src / 2.0f
    TDIVS(dst, src, 2.0f);
}
Manual¶
#include <pto/pto-inst.hpp>
using namespace pto;
void example_manual() {
    using TileT = Tile<TileType::Vec, float, 16, 16>;
    TileT src, dst;
    // Assign the tile addresses explicitly
    TASSIGN(src, 0x1000);
    TASSIGN(dst, 0x2000);
    // Scalar/tile form: dst = 2.0f / src
    TDIVS(dst, 2.0f, src);
}