Floating-point maximum number recursive reduction of quadword vector segments
Floating-point maximum number of the same element numbers from each 128-bit source vector segment using a recursive pairwise reduction, placing each result into the corresponding element number of the 128-bit SIMD&FP destination register. Inactive elements in the source vector are treated as the default NaN.
Regardless of the value of FPCR.AH, the behavior is as follows:
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | size | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | Pg | Zn | Vd |
if !IsFeatureImplemented(FEAT_SVE2p1) && !IsFeatureImplemented(FEAT_SME2p1) then UNDEFINED; if size == '00' then UNDEFINED; constant integer esize = 8 << UInt(size); constant integer g = UInt(Pg); constant integer n = UInt(Zn); constant integer d = UInt(Vd);
<Vd> |
Is the name of the destination SIMD&FP register, encoded in the "Vd" field. |
<T> |
Is an arrangement specifier,
encoded in
|
<Pg> |
Is the name of the governing scalable predicate register P0-P7, encoded in the "Pg" field. |
<Zn> |
Is the name of the source scalable vector register, encoded in the "Zn" field. |
<Tb> |
Is the size specifier,
encoded in
|
CheckSVEEnabled(); constant integer VL = CurrentVL; constant integer PL = VL DIV 8; constant integer segments = VL DIV 128; constant integer elempersegment = 128 DIV esize; constant bits(PL) mask = P[g, PL]; constant bits(VL) operand = if AnyActiveElement(mask, esize) then Z[n, VL] else Zeros(VL); constant bits(esize) identity = FPDefaultNaN(FPCR, esize); bits(128) result = Zeros(128); constant integer p2bits = CeilPow2(segments*esize); constant integer p2elems = p2bits DIV esize; for e = 0 to elempersegment-1 bits(p2bits) stmp; bits(esize) dtmp; for s = 0 to p2elems-1 if s < segments && ActivePredicateElement(mask, s * elempersegment + e, esize) then Elem[stmp, s, esize] = Elem[operand, s * elempersegment + e, esize]; else Elem[stmp, s, esize] = identity; dtmp = FPReduce(ReduceOp_FMAXNUM, stmp, esize, FPCR); Elem[result, e, esize] = dtmp; V[d, 128] = result;
Internal version only: aarchmrs v2024-03_relA, pseudocode v2024-03_rel, sve v2024-03_rel ; Build timestamp: 2024-03-26T09:45
Copyright © 2010-2024 Arm Limited or its affiliates. All rights reserved. This document is Non-Confidential.