BFloat16 floating-point dot product (vector, by element). This instruction delimits the source vectors into pairs of BFloat16 elements. The BFloat16 pair within the second source vector is specified using an immediate index. The index range is from 0 to 3 inclusive.
If FEAT_EBF16 is not implemented or FPCR.EBF is 0, this instruction:
If FEAT_EBF16 is implemented and FPCR.EBF is 1, then this instruction:
Irrespective of FEAT_EBF16 and FPCR.EBF, this instruction:
ID_AA64ISAR1_EL1.BF16 indicates whether this instruction is supported.
Bits | 31 | 30 | 29 | 28-24 | 23-22 | 21 | 20 | 19-16 | 15-12 | 11 | 10 | 9-5 | 4-0 |
Value | 0 | Q | 0 | 01111 | 01 | L | M | Rm | 1111 | H | 0 | Rn | Rd |
Field | | | U | | size | | | | opcode | | | | |
if !IsFeatureImplemented(FEAT_BF16) then UNDEFINED;
integer n = UInt(Rn);
integer m = UInt(M:Rm);
integer d = UInt(Rd);
integer i = UInt(H:L);
constant integer datasize = 64 << UInt(Q);
integer elements = datasize DIV 32;
<Vd> | Is the name of the SIMD&FP destination register, encoded in the "Rd" field. |
<Ta> | Is an arrangement specifier, encoded in "Q": 2S when Q = 0, 4S when Q = 1. |
<Vn> | Is the name of the first SIMD&FP source register, encoded in the "Rn" field. |
<Tb> | Is an arrangement specifier, encoded in "Q": 4H when Q = 0, 8H when Q = 1. |
<Vm> | Is the name of the second SIMD&FP source register, encoded in the "M:Rm" fields. |
<index> | Is the immediate index of a pair of 16-bit elements in the range 0 to 3, encoded in the "H:L" fields. |
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n, datasize];
bits(128) operand2 = V[m, 128];
bits(datasize) operand3 = V[d, datasize];
bits(datasize) result;
for e = 0 to elements-1
    bits(16) elt1_a = Elem[operand1, 2*e+0, 16];
    bits(16) elt1_b = Elem[operand1, 2*e+1, 16];
    bits(16) elt2_a = Elem[operand2, 2*i+0, 16];
    bits(16) elt2_b = Elem[operand2, 2*i+1, 16];
    bits(32) sum = Elem[operand3, e, 32];
    sum = BFDotAdd(sum, elt1_a, elt1_b, elt2_a, elt2_b, FPCR);
    Elem[result, e, 32] = sum;
V[d, datasize] = result;
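The lane and pair selection in the operation pseudocode can be illustrated with a rough Python model. This is only a sketch: the function names `bf16_to_float`, `to_f32` and `bfdot_by_element` are hypothetical, and the arithmetic is done in double precision and rounded once to single precision, which is an assumption made for simplicity. It does not reproduce the rounding, NaN, denormal, or exception behavior of BFDotAdd under FPCR.

```python
import math
import struct


def bf16_to_float(bits16: int) -> float:
    # A BFloat16 value has the same layout as the top 16 bits of a
    # single-precision value, so widen it by appending 16 zero bits.
    return struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))[0]


def to_f32(x: float) -> float:
    # Round a Python float (double) to the nearest single-precision value.
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def bfdot_by_element(acc, op1, op2, index):
    """Approximate model of BFDOT (by element).

    acc   -- 32-bit destination lanes as floats (2 or 4 of them)
    op1   -- BFloat16 bit patterns of the first source, two per lane
    op2   -- the 8 BFloat16 bit patterns of the full 128-bit second source
    index -- which pair of op2 to use, 0 to 3
    """
    if not 0 <= index <= 3:
        raise ValueError("index must be in the range 0 to 3")
    if len(op1) != 2 * len(acc) or len(op2) != 8:
        raise ValueError("operand sizes do not match the instruction layout")

    # The same pair from the second source is used for every lane.
    b0 = bf16_to_float(op2[2 * index + 0])
    b1 = bf16_to_float(op2[2 * index + 1])

    result = []
    for e, total in enumerate(acc):
        a0 = bf16_to_float(op1[2 * e + 0])
        a1 = bf16_to_float(op1[2 * e + 1])
        # Stand-in for BFDotAdd: accumulate the two products into the lane.
        result.append(to_f32(total + a0 * b0 + a1 * b1))
    return result
```

For example, with a 64-bit first source holding the BFloat16 values 1.0, 2.0, 3.0, 4.0 (bit patterns 0x3F80, 0x4000, 0x4040, 0x4080), pair 1 of the second source holding 2.0 and 0.5 (0x4000, 0x3F00), and accumulator lanes 10.0 and 20.0, the model returns 13.0 and 28.0.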
Internal version only: aarchmrs v2024-03_relA, pseudocode v2024-03_rel, sve v2024-03_rel ; Build timestamp: 2024-03-26T09:45
Copyright © 2010-2024 Arm Limited or its affiliates. All rights reserved. This document is Non-Confidential.