RFC ls004 Shift-And-Add
URLs:
- https://libre-soc.org/openpower/sv/biginteger/analysis/
- https://libre-soc.org/openpower/sv/rfc/ls004/
- bigint: https://bugs.libre-soc.org/show_bug.cgi?id=960 TODO: maybe remove this link due to confusion and irrelevance?
- https://git.openpower.foundation/isa/PowerISA/issues/91
- shift-and-add https://bugs.libre-soc.org/show_bug.cgi?id=968
- add shaddw: https://bugs.libre-soc.org/show_bug.cgi?id=996
Severity: Major
Status: New
Date: 31 Oct 2022
Target: v3.2B
Source: v3.0B
Books and Section affected:
Book I Fixed-Point Shift Instructions 3.3.14.2
Appendix E Power ISA sorted by opcode
Appendix F Power ISA sorted by version
Appendix G Power ISA sorted by Compliancy Subset
Appendix H Power ISA sorted by mnemonic
Summary
Instructions added
shadd - Shift and Add
shadduw - Shift and Add Unsigned Word
Submitter: Luke Leighton (Libre-SOC)
Requester: Libre-SOC
Impact on processor:
Addition of two new GPR-based instructions
Impact on software:
Requires support for new instructions in assembler, debuggers,
and related tools.
Keywords:
GPR, Bit-manip, Shift, Arithmetic
Motivation
Power ISA lacks LD/ST-with-shift instructions, which are present in both ARM and x86. Adding more LD/ST instructions is too complex, so shift-and-add is proposed as a compromise: it replaces a pair of explicit instructions (a shift followed by an add) in hot loops.
Notes and Observations:
- shadd and shadduw operate on unsigned integers.
- shadduw is intended for performing address offsets, as the second operand is constrained to the lower 32 bits and zero-extended.
- Both are 2-in 1-out instructions.
TODO: a signed 32-bit shift-and-add should be added; this needs to be addressed before submitting the RFC: https://bugs.libre-soc.org/show_bug.cgi?id=996
Changes
Add the following entries to:
- the Appendices of Book I
- Instructions of Book I added to Section 3.3.14.2
\newpage{}
Shift-and-Add
shadd RT, RA, RB
0-5 | 6-10 | 11-15 | 16-20 | 21-22 | 23-30 | 31 | Form |
---|---|---|---|---|---|---|---|
PO | RT | RA | RB | sm | XO | Rc | Z23-Form |
Pseudocode:
shift <- sm + 1 # Shift is between 1-4
sum[0:63] <- ((RB) << shift) + (RA) # Shift RB, add RA
RT <- sum # Result stored in RT
When sm is zero, the contents of register RB are multiplied by 2, added to the contents of register RA, and the result stored in RT.
sm is a 2-bit bitfield, and allows multiplication of RB by 2, 4, 8, 16.
Operands RA and RB, and the result RT are all 64-bit, unsigned integers.
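The pseudocode above can be modelled in a few lines of Python. This is only an illustrative sketch, not part of the proposed specification text: the function name `shadd`, the `MASK64` helper, and the range check on `sm` are assumptions made for the example, and the result is truncated to 64 bits on the assumption that `sum[0:63]` wraps modulo 2^64.

```python
MASK64 = (1 << 64) - 1  # helper for 64-bit wraparound (name is an assumption)


def shadd(ra: int, rb: int, sm: int) -> int:
    """Sketch of shadd: RT <- ((RB) << (sm + 1)) + (RA), truncated to 64 bits."""
    if not 0 <= sm <= 3:
        raise ValueError("sm is a 2-bit field, so it must be in 0..3")
    shift = sm + 1                      # shift is between 1 and 4
    total = ((rb & MASK64) << shift) + (ra & MASK64)
    return total & MASK64               # keep only sum[0:63]


if __name__ == "__main__":
    # sm = 0, 1, 2, 3 multiply RB by 2, 4, 8, 16 respectively
    for sm in range(4):
        print(sm, shadd(0, 1, sm))
    # RA + RB*8 with sm = 2
    print(hex(shadd(0x1000, 5, 2)))
```

With `sm = 2` the model computes RA + RB*8, the usual scaling for indexing an array of 64-bit elements.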
NEED EXAMPLES (not sure how to embed sm)!!! Examples:
# adds r1 to (r2*16) if the fourth operand encodes sm directly (sm=3 gives shift=4); r2*8 would need sm=2
shadd r4, r1, r2, 3
Shift-and-Add Unsigned Word
shadduw RT, RA, RB
0-5 | 6-10 | 11-15 | 16-20 | 21-22 | 23-30 | 31 | Form |
---|---|---|---|---|---|---|---|
PO | RT | RA | RB | sm | XO | Rc | Z23-Form |
Pseudocode:
shift <- sm + 1 # Shift is between 1-4
n <- (RB)[32:63] # Only use lower 32-bits of RB
sum[0:63] <- (n << shift) + (RA) # Shift n, add RA
RT <- sum # Result stored in RT
When sm is zero, the lower word contents of register RB are multiplied by 2, added to the contents of register RA, and the result stored in RT.
sm is a 2-bit bitfield, and allows multiplication of the lower word of RB by 2, 4, 8, 16.
Operands RA and RB, and the result RT are all 64-bit, unsigned integers.
Programmer's Note: The advantage of this instruction is in computing address offsets. RA is the 64-bit base address. RB is the offset into the data structure, limited to 32 bits.
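As a rough illustration of the address-offset use case, the following Python sketch models the shadduw pseudocode and applies it to a base-plus-scaled-index calculation. The function name `shadduw`, the mask constants, and the sample register values are assumptions chosen for the example; truncation of the result to 64 bits is likewise assumed from `sum[0:63]`.

```python
MASK64 = (1 << 64) - 1  # helper names are assumptions for this sketch
MASK32 = (1 << 32) - 1


def shadduw(ra: int, rb: int, sm: int) -> int:
    """Sketch of shadduw: RT <- (zero-extended low word of RB << (sm + 1)) + (RA)."""
    if not 0 <= sm <= 3:
        raise ValueError("sm is a 2-bit field, so it must be in 0..3")
    shift = sm + 1                  # shift is between 1 and 4
    n = rb & MASK32                 # (RB)[32:63]: only the lower 32 bits of RB
    return ((n << shift) + (ra & MASK64)) & MASK64


if __name__ == "__main__":
    base = 0x0000_7FFF_0000_0000    # hypothetical 64-bit base address in RA
    index = 0xFFFF_FFFF_0000_0003   # RB: index 3 in the low word, junk in the high word
    # sm = 2 scales the index by 8 (e.g. an array of 64-bit elements)
    print(hex(shadduw(base, index, 2)))
```

The upper 32 bits of RB do not affect the result, so a 32-bit unsigned index can be used without a separate zero-extension instruction.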
Examples:
#
shadduw r4, r1, r2
Appendices
Appendix E Power ISA sorted by opcode
Appendix F Power ISA sorted by version
Appendix G Power ISA sorted by Compliancy Subset
Appendix H Power ISA sorted by mnemonic
Form | Book | Page | Version | mnemonic | Description |
---|---|---|---|---|---|
Z23 | I | # | 3.0B | shadd | Shift-and-Add |
Z23 | I | # | 3.0B | shadduw | Shift-and-Add Unsigned Word |