RFC ls017 Transcendental instructions for 3D and
- Funded by NLnet under the Privacy and Enhanced Trust Programme, EU Horizon2020 Grant 825310, and NGI0 Entrust No 101069594
- https://libre-soc.org/openpower/sv/rfc/ls017.transcendentals/
- https://bugs.libre-soc.org/show_bug.cgi?id=1196
Severity: Major
Status: New
Date: 29 Apr 2023
Target: v3.2B
Source: v3.1B
Books and Section affected:
Book I Floating-Point Instructions
Appendix E Power ISA sorted by opcode
Appendix F Power ISA sorted by version
Appendix G Power ISA sorted by Compliancy Subset
Appendix H Power ISA sorted by mnemonic
Summary
Instructions added: 43 transcendental mathematical instructions, added to SFFS
Submitter: Luke Leighton (Libre-SOC)
Requester: Libre-SOC
Impact on processor:
Addition of new Transcendental instructions
Impact on software:
Requires support for new instructions in assembler, debuggers, and
related tools. Greatly decreases instruction count in 3D and HPC.
Keywords:
Scientific Computing, HPC, 3D, OpenCL
Motivation
Allows 3D GPU Products to be commercially viable.
Changes
Add the following entries to:
- the Appendices of Book I
- Book I 4.6.6.4 Transcendental Instructions
- Book I 1.6.1 and 1.6.2
\newpage{}
Opcode Tables for PO=59/63 XO=1----011--
Power ISA v3.1B opcodes extracted from:
- Power ISA v3.1B Appendix D Table 23 sheet 2/3 of 4 page 1391/1392
- Power ISA v3.1B Appendix D Table 25 sheet 2/3 of 4 page 1399/1400
Parenthesized entries are not part of fptrans.
- Entries whose mnemonic ends in s are only in PO=59.
- Entries whose mnemonic does not end in s are only in PO=63.
- Entries whose mnemonic ends in (s) are in both PO=59 and PO=63.
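
The suffix rules above can be sketched as a small helper. This is only an illustration of how to read the tables: the function name `primary_opcodes` is hypothetical and not part of the RFC.

```python
def primary_opcodes(entry: str) -> set:
    """Return the primary opcodes (PO) that a table entry occupies.

    `entry` is a mnemonic as written in the opcode tables, for example
    "fsin(s)", "(fctid)" or "(fcfid(s))". Hypothetical helper.
    """
    name = entry.strip()
    # Outer parentheses only mark "not part of fptrans"; strip them first.
    if name.startswith("(") and name.endswith(")"):
        name = name[1:-1]
    if name.endswith("(s)"):
        return {59, 63}  # present in both PO=59 and PO=63
    if name.endswith("s"):
        return {59}      # present only in PO=59
    return {63}          # present only in PO=63


if __name__ == "__main__":
    for entry in ("fsin(s)", "(fctid)", "(fcfid(s))", "fminmax"):
        print(entry, sorted(primary_opcodes(entry)))
```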
XO LSB half → / XO MSB half ↓ | 01100 | 01101 | 01110 | 01111 |
---|---|---|---|---|
10000 | 10000 01100 fcbrt(s) (draft) | 10000 01101 fsinpi(s) (draft) | 10000 01110 fatan2pi(s) (draft) | 10000 01111 fasinpi(s) (draft) |
10001 | 10001 01100 fcospi(s) (draft) | 10001 01101 ftanpi(s) (draft) | 10001 01110 facospi(s) (draft) | 10001 01111 fatanpi(s) (draft) |
10010 | 10010 01100 frsqrt(s) (draft) | 10010 01101 fsin(s) (draft) | 10010 01110 fatan2(s) (draft) | 10010 01111 fasin(s) (draft) |
10011 | 10011 01100 fcos(s) (draft) | 10011 01101 ftan(s) (draft) | 10011 01110 facos(s) (draft) | 10011 01111 fatan(s) (draft) |
10100 | 10100 01100 frecip(s) (draft) | 10100 01101 fsinh(s) (draft) | 10100 01110 fhypot(s) (draft) | 10100 01111 fasinh(s) (draft) |
10101 | 10101 01100 fcosh(s) (draft) | 10101 01101 ftanh(s) (draft) | 10101 01110 facosh(s) (draft) | 10101 01111 fatanh(s) (draft) |
10110 | 10110 01100 | 10110 01101 | 10110 01110 | 10110 01111 |
10111 | 10111 01100 | 10111 01101 | 10111 01110 | 10111 01111 |
XO LSB half → / XO MSB half ↓ | 01100 | 01101 | 01110 | 01111 |
---|---|---|---|---|
11000 | 11000 01100 fexp2m1(s) (draft) | 11000 01101 flog2p1(s) (draft) | 11000 01110 (cffpro) (draft) | 11000 01111 (ctfpr(s)) (draft) |
11001 | 11001 01100 fexpm1(s) (draft) | 11001 01101 flogp1(s) (draft) | 11001 01110 (fctid) | 11001 01111 (fctidz) |
11010 | 11010 01100 fexp10m1(s) (draft) | 11010 01101 flog10p1(s) (draft) | 11010 01110 (fcfid(s)) | 11010 01111 fmod(s) (draft) |
11011 | 11011 01100 fpown(s) (draft) | 11011 01101 frootn(s) (draft) | 11011 01110 | 11011 01111 |
11100 | 11100 01100 fexp2(s) (draft) | 11100 01101 flog2(s) (draft) | 11100 01110 (mffpr(s)) (draft) | 11100 01111 (mtfpr(s)) (draft) |
11101 | 11101 01100 fexp(s) (draft) | 11101 01101 flog(s) (draft) | 11101 01110 (fctidu) | 11101 01111 (fctiduz) |
11110 | 11110 01100 fexp10(s) (draft) | 11110 01101 flog10(s) (draft) | 11110 01110 (fcfidu(s)) | 11110 01111 fremainder(s) (draft) |
11111 | 11111 01100 fpowr(s) (draft) | 11111 01101 fpow(s) (draft) | 11111 01110 | 11111 01111 |
XO LSB half → / XO MSB half ↓ | 10000 | 10001 | 10010 | 10011 |
---|---|---|---|---|
////0 | ....0 10000 fminmax (draft) | ////0 10001 | ////0 10010 (fdiv(s)) | ////0 10011 |
////1 | ////1 10000 | ////1 10001 | ////1 10010 (fdiv(s)) | ////1 10011 |
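
To make the table layout concrete, the sketch below splits a 10-bit XO into the MSB and LSB halves used as row and column labels, and checks an XO against the `1xxxx011xx` pattern used in the X-Form tables of this RFC. The helper names are hypothetical, and `x` is treated as "don't care".

```python
def split_xo(xo: int) -> tuple:
    """Split a 10-bit XO into (MSB half, LSB half), 5 bits each."""
    if not 0 <= xo < (1 << 10):
        raise ValueError("XO must fit in 10 bits")
    return xo >> 5, xo & 0b11111


def matches(xo: int, pattern: str) -> bool:
    """Check a 10-bit XO against a pattern of '0', '1' and 'x' (don't care)."""
    if len(pattern) != 10:
        raise ValueError("pattern must be 10 characters long")
    bits = format(xo, "010b")
    return all(p == "x" or p == b for p, b in zip(pattern, bits))


FSIN_XO = 0b10010_01101  # row 10010, column 01101 in the first table

if __name__ == "__main__":
    msb, lsb = split_xo(FSIN_XO)
    print(format(msb, "05b"), format(lsb, "05b"))
    print(matches(FSIN_XO, "1xxxx011xx"))
```

An XO in the fdiv row, such as `0b00000_10010`, does not match `1xxxx011xx`, which is consistent with the third table covering a separate opcode range.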
DRAFT List of 2-arg opcodes
These are X-Form, recommended Major Opcode 63 for full-width and 59 for half-width (ending in s).
0..5 | 6..10 | 11..15 | 16..20 | 21..30 | 31 | name | Form |
---|---|---|---|---|---|---|---|
NN | FRT | FRA | FRB | 1xxxx011xx | Rc | transcendental | X-Form |
NN | FRT | FRA | RB | 1xxxx011xx | Rc | transcendental | X-Form |
NN | FRT | FRA | FRB | xxxxx10000 | Rc | transcendental | X-Form |
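
As an illustration of the field layout above, the following sketch packs the X-Form fields into a 32-bit instruction word. Power ISA numbers bits from the most significant end (bit 0 is the MSB), so fields are concatenated left to right. The function name is hypothetical, the register choices are arbitrary, and the XO used in the example is the draft value recommended for fatan2 in this RFC, so the resulting encoding is subject to change.

```python
def encode_x_form(po: int, frt: int, fra: int, frb: int, xo: int, rc: int = 0) -> int:
    """Pack X-Form fields into a 32-bit word (MSB0 bit numbering).

    Field widths follow the table: PO 0..5, FRT 6..10, FRA 11..15,
    FRB (or RB) 16..20, XO 21..30, Rc 31. Hypothetical helper.
    """
    fields = [("PO", po, 6), ("FRT", frt, 5), ("FRA", fra, 5),
              ("FRB", frb, 5), ("XO", xo, 10), ("Rc", rc, 1)]
    word = 0
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        # Shift what has been packed so far and append the next field.
        word = (word << width) | value
    return word


if __name__ == "__main__":
    # Draft fatan2 with FRT=3, FRA=1, FRB=2, PO=63, Rc=0.
    word = encode_x_form(po=63, frt=3, fra=1, frb=2, xo=0b10010_01110)
    print(hex(word))  # 0xfc61149c
```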
Recommended 10-bit XO assignments:
opcode | Description | Major 59 and 63 | bits 16..20 |
---|---|---|---|
fatan2(s) | atan2 arc tangent | 10010 01110 | FRB |
fatan2pi(s) | atan2 arc tangent / π | 10000 01110 | FRB |
fpow(s) | xʸ | 11111 01101 | FRB |
fpown(s) | xⁿ (n ∈ ℤ) | 11011 01100 | RB |
fpowr(s) | xʸ (x >= 0) | 11111 01100 | FRB |
frootn(s) | ⁿ√x (n ∈ ℤ) | 11011 01101 | RB |
fhypot(s) | √(x² + y²) | 10100 01110 | FRB |
fminmax | min/max | ....0 10000 | FRB |
fmod(s) | modulus | 11010 01111 | FRB |
fremainder(s) | IEEE 754 remainder | 11110 01111 | FRB |
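
As a rough guide to the intended mathematics, some of the two-argument operations can be compared against Python's `math` module. This is a sketch only: the mapping from mnemonic to library call and the operand order are assumptions made for illustration, and the RFC does not define rounding or exception behaviour in terms of these functions.

```python
import math

# Assumed reference behaviour, keyed by draft mnemonic (illustrative only).
TWO_ARG_REFERENCE = {
    "fatan2":     lambda y, x: math.atan2(y, x),
    "fatan2pi":   lambda y, x: math.atan2(y, x) / math.pi,
    "fhypot":     lambda x, y: math.hypot(x, y),
    "fmod":       lambda x, y: math.fmod(x, y),       # result has the sign of x
    "fremainder": lambda x, y: math.remainder(x, y),  # IEEE 754 remainder
}

if __name__ == "__main__":
    # fmod truncates the quotient; the IEEE 754 remainder rounds it to nearest.
    print(TWO_ARG_REFERENCE["fmod"](5.0, 3.0))        # 2.0
    print(TWO_ARG_REFERENCE["fremainder"](5.0, 3.0))  # -1.0
    print(TWO_ARG_REFERENCE["fhypot"](3.0, 4.0))
    print(TWO_ARG_REFERENCE["fatan2pi"](1.0, 1.0))
```

The fmod and fremainder results differ for the same inputs, which is why the two operations have separate XO assignments.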
DRAFT List of 1-arg transcendental opcodes
These are X-Form, and are mostly identical in Special Registers Altered to fsqrt (the exact FP exceptions they can produce differ).
Recommended Major Opcode 63 for full-width and 59 for half-width (ending in s).
Special Registers Altered (FIXME: come up with correct list):
FPRF FR FI FX OX UX XX
VXSNAN VXIMZ VXZDZ
CR1 (if Rc=1)
0..5 | 6..10 | 11..15 | 16..20 | 21..30 | 31 | name | Form |
---|---|---|---|---|---|---|---|
NN | FRT | /// | FRB | 1xxxx011xx | Rc | transcendental | X-Form |
Recommended 10-bit XO assignments:
opcode | Description | Major 59 and 63 |
---|---|---|
frsqrt(s) | 1 / √x | 10010 01100 |
fcbrt(s) | ∛x | 10000 01100 |
frecip(s) | 1 / x | 10100 01100 |
fexp2m1(s) | 2ˣ - 1 | 11000 01100 |
flog2p1(s) | log₂ (x + 1) | 11000 01101 |
fexp2(s) | 2ˣ | 11100 01100 |
flog2(s) | log₂ x | 11100 01101 |
fexpm1(s) | eˣ - 1 | 11001 01100 |
flogp1(s) | logₑ (x + 1) | 11001 01101 |
fexp(s) | eˣ | 11101 01100 |
flog(s) | logₑ x | 11101 01101 |
fexp10m1(s) | 10ˣ - 1 | 11010 01100 |
flog10p1(s) | log₁₀ (x + 1) | 11010 01101 |
fexp10(s) | 10ˣ | 11110 01100 |
flog10(s) | log₁₀ x | 11110 01101 |
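
The list includes dedicated "minus 1" and "plus 1" variants alongside the plain exponentials and logarithms. The sketch below is not taken from the RFC; it uses Python's `math.expm1` and `math.log1p` as stand-ins for the base-e forms to illustrate the loss of precision near zero that such variants avoid.

```python
import math

x = 1e-10

# exp(x) is rounded to a double close to 1.0 before the subtraction,
# so most of the significant digits of the small result are lost.
naive_expm1 = math.exp(x) - 1.0
# expm1 computes e**x - 1 directly and keeps full precision.
direct_expm1 = math.expm1(x)

# 1.0 + 1e-20 rounds to exactly 1.0, so the naive logarithm returns 0.0.
naive_log1p = math.log(1.0 + 1e-20)
direct_log1p = math.log1p(1e-20)

if __name__ == "__main__":
    print(naive_expm1, direct_expm1)
    print(naive_log1p, direct_log1p)
```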
DRAFT List of 1-arg trigonometric opcodes
These are X-Form, and are mostly identical in Special Registers Altered to fsqrt (the exact FP exceptions they can produce differ).
Recommended Major Opcode 63 for full-width and 59 for half-width (ending in s).
Special Registers Altered:
FPRF FR FI FX OX UX XX
VXSNAN VXIMZ VXZDZ
CR1 (if Rc=1)
0..5 | 6..10 | 11..15 | 16..20 | 21..30 | 31 | name | Form |
---|---|---|---|---|---|---|---|
NN | FRT | /// | FRB | 1xxxx011xx | Rc | trigonometric | X-Form |
Recommended 10-bit XO assignments:
opcode | Description | Major 59 and 63 |
---|---|---|
fsin(s) | sin (radians) | 10010 01101 |
fcos(s) | cos (radians) | 10011 01100 |
ftan(s) | tan (radians) | 10011 01101 |
fasin(s) | arcsin (radians) | 10010 01111 |
facos(s) | arccos (radians) | 10011 01110 |
fatan(s) | arctan (radians) | 10011 01111 |
fsinpi(s) | sin(π * x) | 10000 01101 |
fcospi(s) | cos(π * x) | 10001 01100 |
ftanpi(s) | tan(π * x) | 10001 01101 |
fasinpi(s) | arcsin(x) / π | 10000 01111 |
facospi(s) | arccos(x) / π | 10001 01110 |
fatanpi(s) | arctan(x) / π | 10001 01111 |
fsinh(s) | hyperbolic sin | 10100 01101 |
fcosh(s) | hyperbolic cos | 10101 01100 |
ftanh(s) | hyperbolic tan | 10101 01101 |
fasinh(s) | inverse hyperbolic sin | 10100 01111 |
facosh(s) | inverse hyperbolic cos | 10101 01110 |
fatanh(s) | inverse hyperbolic tan | 10101 01111 |
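
The "pi" variants take their argument in units of π, which allows argument reduction to be performed exactly. The sketch below illustrates the idea for sin(π * x) in software. It is an assumption-laden illustration, not the RFC's definition of fsinpi: the function name `sinpi` is hypothetical and the result is not guaranteed to be correctly rounded.

```python
import math


def sinpi(x: float) -> float:
    """Approximate sin(pi * x) using exact argument reduction (illustrative)."""
    # math.fmod is exact, so reducing modulo the period 2 adds no error.
    r = math.fmod(x, 2.0)          # r is in (-2, 2)
    if r > 1.0:
        r -= 2.0                   # exact subtraction
    elif r < -1.0:
        r += 2.0
    # Fold into [-0.5, 0.5] using sin(pi * (1 - r)) == sin(pi * r).
    if r > 0.5:
        r = 1.0 - r
    elif r < -0.5:
        r = -1.0 - r
    return math.sin(math.pi * r)


if __name__ == "__main__":
    # Multiplying by a rounded pi first leaves a small non-zero residue.
    print(math.sin(math.pi * 1.0))
    # Reducing the argument before scaling by pi gives an exact zero.
    print(sinpi(1.0))
```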
\newpage{}
Appendices
Appendix E Power ISA sorted by opcode
Appendix F Power ISA sorted by version
Appendix G Power ISA sorted by Compliancy Subset
Appendix H Power ISA sorted by mnemonic
Form | Book | Page | Version | Mnemonic | Description |
---|---|---|---|---|---|
A | I | # | 3.2B | maddsubrs | Integer DCT/FFT Twin-Butterfly |
X | I | # | 3.2B | fdmadds | FP DCT Twin-Butterfly Single |
X | I | # | 3.2B | ffmadds | FP FFT Twin-Butterfly Single |
X | I | # | 3.2B | fdmadds | FP DCT Twin-Butterfly Double |
X | I | # | 3.2B | ffmadds | FP FFT Twin-Butterfly Double |
X | I | # | 3.2B | ffadds | FP FFT Twin-Butterfly Single |
X | I | # | 3.2B | ffadd | FP FFT Twin-Butterfly Double |
X | I | # | 3.2B | ffsubs | FP FFT Twin-Butterfly Single |
X | I | # | 3.2B | ffsub | FP FFT Twin-Butterfly Double |