RFC ls002.fmi v2 Floating-Point Load-Immediate
- Funded by NLnet under the Privacy and Enhanced Trust Programme, EU Horizon2020 Grant 825310, and NGI0 Entrust No 101069594
- https://libre-soc.org/openpower/sv/int_fp_mv/#fmvis
- https://libre-soc.org/openpower/sv/rfc/ls002.fmi/
- https://bugs.libre-soc.org/show_bug.cgi?id=1092
- https://git.openpower.foundation/isa/PowerISA/issues/87
Severity: Major
Status: New
Date: 05 Oct 2022 v3 TODO
Target: v3.2B
Source: v3.0B
Books and Section affected:
Book I Scalar Floating-Point 4.6.2.1
Appendix E Power ISA sorted by opcode
Appendix F Power ISA sorted by version
Appendix G Power ISA sorted by Compliancy Subset
Appendix H Power ISA sorted by mnemonic
Summary
Instructions added
fmvis - Floating-Point Move Immediate, Shifted
fishmv - Floating-Point Immediate, Second-half Move
Submitter: Luke Leighton (Libre-SOC)
Requester: Libre-SOC
Impact on processor:
Addition of two new FPR-based instructions
Impact on software:
Requires support for new instructions in assembler, debuggers,
and related tools.
Keywords:
FPR, Floating-point, Load-immediate, BF16, bfloat16, BFP32
Motivation
Similar to lxvkq, but extended to load a bfloat16 with one
32-bit instruction and a full FP32 with two 32-bit instructions.
These instructions always save a Data Load and the associated L1
and TLB lookup. Even quickly clearing an FPR to zero presently needs a Load.
Notes and Observations:
- There is no need for an Rc=1 variant because this is an immediate
loading instruction (an FPR equivalent to li).
- There is no need for Special Registers (FP Flags) because this
is an immediate loading instruction. No FPR Load Operations alter
FPSCR, neither does lxvkq, and on that basis neither should these
instructions.
- fishmv as an FRT-only Read-Modify-Write (instead of an unnecessary
FRT,FRA pair) saves five potential bits, making the difference between
a 5-bit XO (VA/DX-Form) and requiring an entire Primary Opcode.
Changes
Add the following entries to:
- the Appendices of Book I
- Instructions of Book I as a new Section 4.6.2.1
- DX-Form of Book I Section 1.6.1.6 and 1.6.2
- Floating-Point Data Format of Book I Section 4.3.1
\newpage{}
Floating-Point Move Immediate
fmvis FRT, D
| 0-5 | 6-10 | 11-15 | 16-25 | 26-30 | 31 | Form |
|---|---|---|---|---|---|---|
| Major | FRT | d1 | d0 | XO | d2 | DX-Form |
Pseudocode:
bf16 <- d0 || d1 || d2 # create bfloat16 immediate
bfp32 <- bf16 || [0]*16 # convert bfloat16 to BFP32
FRT <- DOUBLE(bfp32) # convert BFP32 to BFP64
Special registers altered:
None
The value D << 16 is interpreted as a 32-bit float, converted to a
64-bit float and written to FRT. This is equivalent to reinterpreting
D as a bfloat16 and converting to 64-bit float.
Examples:
fmvis f4, 0 # writes +0.0 to f4 (clears an FPR)
fmvis f4, 0x8000 # writes -0.0 to f4
fmvis f4, 0x3F80 # writes +1.0 to f4
fmvis f4, 0xBFC0 # writes -1.5 to f4
fmvis f4, 0x7FC0 # writes +qNaN to f4
fmvis f4, 0x7F80 # writes +Infinity to f4
fmvis f4, 0xFF80 # writes -Infinity to f4
fmvis f4, 0x3FFF # writes +1.9921875 to f4
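The examples above can be checked with a small software model. The following is a minimal Python sketch, not part of the proposed ISA text: the function name `fmvis` is reused purely for illustration, and the widening to BFP64 relies on Python floats being 64-bit. It assumes the host preserves the value exactly when unpacking a 32-bit float, which holds for all non-NaN inputs; NaN payload handling is host-dependent and is not modelled.

```python
import struct

def fmvis(d: int) -> float:
    """Illustrative model of fmvis: D is a 16-bit bfloat16 immediate."""
    if not 0 <= d <= 0xFFFF:
        raise ValueError("D must be a 16-bit unsigned immediate")
    bfp32_bits = d << 16  # bf16 || [0]*16
    # Reinterpret the 32-bit pattern as a BFP32. Python floats are BFP64,
    # so unpacking also performs the DOUBLE() widening step.
    (value,) = struct.unpack(">f", struct.pack(">I", bfp32_bits))
    return value

if __name__ == "__main__":
    for imm in (0x0000, 0x8000, 0x3F80, 0xBFC0, 0x7F80, 0xFF80, 0x3FFF):
        print(f"fmvis f4, 0x{imm:04X} -> {fmvis(imm)!r}")
```

Because every bfloat16 value is exactly representable in BFP32 and BFP64, the model needs no rounding step.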
Floating-Point Immediate Second-Half Move
fishmv FRT, D
DX-Form:
| 0-5 | 6-10 | 11-15 | 16-25 | 26-30 | 31 | Form |
|---|---|---|---|---|---|---|
| Major | FRT | d1 | d0 | XO | d2 | DX-Form |
Pseudocode:
n <- (FRT) # read FRT
bfp32 <- SINGLE(n) # convert to BFP32
bfp32[16:31] <- d0 || d1 || d2 # replace LSB half
FRT <- DOUBLE(bfp32) # convert back to BFP64
Special registers altered:
None
An additional 16 bits of immediate are inserted into the low-order half of the single-format value corresponding to the contents of FRT.
This instruction performs a Read-Modify-Write on FRT.
In hardware, fishmv may be macro-op-fused with fmvis.
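The Read-Modify-Write behaviour described above can be sketched in Python. This is an illustrative model only: the helper names `single`, `double` and `fishmv` are hypothetical stand-ins for the pseudocode's SINGLE() and DOUBLE() steps, and the sketch assumes FRT already holds a value exactly representable in single format (as it would after a prior fmvis). Python's `struct` rounds other values, which may differ from the ISA's SINGLE() behaviour.

```python
import struct

def single(x: float) -> int:
    """BFP64 value -> 32-bit BFP32 bit pattern (assumes exact representability)."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def double(bits32: int) -> float:
    """32-bit BFP32 bit pattern -> BFP64 value."""
    return struct.unpack(">f", struct.pack(">I", bits32))[0]

def fishmv(frt: float, d: int) -> float:
    """Illustrative model of fishmv: replace the low 16 bits of SINGLE(FRT)."""
    if not 0 <= d <= 0xFFFF:
        raise ValueError("D must be a 16-bit unsigned immediate")
    bfp32 = single(frt)                  # read FRT, convert to BFP32
    bfp32 = (bfp32 & 0xFFFF0000) | d     # bfp32[16:31] <- D
    return double(bfp32)                 # convert back to BFP64

if __name__ == "__main__":
    result = fishmv(1.0, 0x8000)
    print(result, struct.pack(">d", result).hex())
```

The upper 16 bits (sign, exponent and the top seven fraction bits) are left untouched, so fishmv can only refine the value set by an earlier instruction.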
Programmer's note:
The use of these two instructions is strategically similar to
how li combined with oris may be used to construct 32-bit Integers.
If a prior fmvis instruction had been used to
set the upper 16 bits of a BFP32 value, fishmv may be used
to set the lower 16 bits.
Example:
# these two combined instructions write 0x3f808000
# into f4 as a BFP32 to be converted to a BFP64.
# actual contents in f4 after conversion: 0x3ff0_1000_0000_0000
# first the upper bits, happens to be +1.0
fmvis f4, 0x3F80 # writes +1.0 to f4
# now write the lower 16 bits of a BFP32
fishmv f4, 0x8000 # writes +1.00390625 to f4
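An assembler or compiler wanting to materialise an arbitrary FP32 constant would need to split it into the two 16-bit immediates. The following Python sketch shows one way this might be done. The function names `fp32_immediates` and `emit` are hypothetical, and omitting fishmv when the low half is zero is an optimisation assumed here, not something this RFC mandates. Values not exactly representable in FP32 are rounded by `struct.pack`.

```python
import struct

def fp32_immediates(value: float):
    """Split the BFP32 encoding of value into (upper, lower) 16-bit halves."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    return bits >> 16, bits & 0xFFFF

def emit(reg: str, value: float):
    """Return the assembly lines that load value into reg (illustrative)."""
    hi, lo = fp32_immediates(value)
    lines = [f"fmvis {reg}, 0x{hi:04X}"]
    if lo:  # the low half is already zero after fmvis
        lines.append(f"fishmv {reg}, 0x{lo:04X}")
    return lines

if __name__ == "__main__":
    for v in (1.0, -1.5, 1.00390625):
        print(v, emit("f4", v))
```

For 1.00390625 this reproduces the instruction pair in the example above.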
\newpage{}
DX-Form
Add the following to Book I, 1.6.1.6, DX-Form
|0 |6 |11 |16 |26 |31
| PO | FRT| d1| d0| XO|d2
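The 16-bit immediate D is scattered across three fields, with D = d0 || d1 || d2 as in the instruction pseudocode. The Python sketch below illustrates the split and the placement of each field in the 32-bit word, using the bit positions from the layout above (MSB0 numbering, so bit 31 is the least significant). The function names are hypothetical, and `po` and `xo` are left as parameters because this RFC does not assign their values.

```python
def split_d(d: int):
    """Split a 16-bit immediate into (d0, d1, d2) where D = d0 || d1 || d2."""
    d0 = (d >> 6) & 0x3FF   # 10 bits, instruction bits 16:25
    d1 = (d >> 1) & 0x1F    # 5 bits, instruction bits 11:15
    d2 = d & 0x1            # 1 bit, instruction bit 31
    return d0, d1, d2

def join_d(d0: int, d1: int, d2: int) -> int:
    """Reassemble D from the three DX-Form immediate fields."""
    return (d0 << 6) | (d1 << 1) | d2

def encode_dx(po: int, frt: int, d: int, xo: int) -> int:
    """Pack a DX-Form word; po and xo are placeholders, not assigned values."""
    d0, d1, d2 = split_d(d)
    return (po << 26) | (frt << 21) | (d1 << 16) | (d0 << 6) | (xo << 1) | d2

if __name__ == "__main__":
    print(split_d(0x3F80))
    print(hex(encode_dx(po=0, frt=4, d=0xFFFF, xo=0)))
```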
Add DX to FRT Field in Book I, 1.6.2
FRT (6:10)
Field used to specify an FPR to be used as a
target.
Formats: D, X, DX
bfloat16 definition
Add the following to Book I, 4.3.1:
The format may be a 16-bit bfloat16, 32-bit single format for a single-precision value...
The bfloat16 format is used as an immediate.
The structure of the bfloat16, single and double formats is shown below.
|S |EXP| FRACTION|
|0 |1 8|9 15|
Figure #. Binary floating-point half-precision format (bfloat16)
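The field layout in the figure (1 sign bit, 8 exponent bits, 7 fraction bits) can be illustrated with a short Python decoder. This is a sketch for explanation only: the function names are hypothetical, and the exponent bias of 127 is assumed to match the single format, as implied by the fmvis conversion of appending sixteen zero bits.

```python
def bf16_fields(d: int):
    """Return (sign, exponent, fraction) of a 16-bit bfloat16 pattern."""
    sign = (d >> 15) & 0x1   # bit 0 in MSB0 numbering
    exp = (d >> 7) & 0xFF    # bits 1:8
    frac = d & 0x7F          # bits 9:15
    return sign, exp, frac

def bf16_value(d: int) -> float:
    """Decode a bfloat16 pattern to a Python float (assumes bias 127)."""
    sign, exp, frac = bf16_fields(d)
    s = -1.0 if sign else 1.0
    if exp == 0xFF:                       # infinity or NaN
        return s * float("inf") if frac == 0 else float("nan")
    if exp == 0:                          # zero or subnormal
        return s * (frac / 128.0) * 2.0 ** -126
    return s * (1.0 + frac / 128.0) * 2.0 ** (exp - 127)

if __name__ == "__main__":
    for imm in (0x3F80, 0xBFC0, 0x3FFF):
        print(f"0x{imm:04X}", bf16_fields(imm), bf16_value(imm))
```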
Appendices
Appendix E Power ISA sorted by opcode
Appendix F Power ISA sorted by version
Appendix G Power ISA sorted by Compliancy Subset
Appendix H Power ISA sorted by mnemonic
| Form | Book | Page | Version | mnemonic | Description |
|---|---|---|---|---|---|
| DX | I | # | 3.0B | fmvis | Floating-point Move Immediate, Shifted |
| DX | I | # | 3.0B | fishmv | Floating-point Immediate, Second-half Move |