RFC ls004 Shift-And-Add

URLs:

Severity: Major

Status: New

Date: 31 Oct 2022

Target: v3.2B

Source: v3.0B

Books and Section affected:

    Book I Fixed-Point Shift Instructions 3.3.14.2
    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic

Summary

    Instructions added
    shadd - Shift and Add
    shadduw - Shift and Add Unsigned Word 

Submitter: Luke Leighton (Libre-SOC)

Requester: Libre-SOC

Impact on processor:

    Addition of two new GPR-based instructions

Impact on software:

    Requires support for new instructions in assembler, debuggers,
    and related tools.

Keywords:

    GPR, Bit-manip, Shift, Arithmetic

Motivation

Power ISA is missing LD/ST with shift, which is present in both ARM and x86. Adding more LD/ST instructions is too complex, so the compromise is to add shift-and-add. This replaces a pair of explicit instructions (a shift followed by an add) in hot loops.

Notes and Observations:

  1. shadd and shadduw operate on unsigned integers.
  2. shadduw is intended for computing address offsets, as the second operand is constrained to the lower 32 bits and zero-extended.
  3. Both are 2-in 1-out instructions.

TODO: a signed 32-bit shift-and-add should be added. This needs to be addressed before submitting the RFC: https://bugs.libre-soc.org/show_bug.cgi?id=996

Changes

Add the following entries to:

  • the Appendices of Book I
  • Book I, Section 3.3.14.2 (the new instructions)

Shift-and-Add

shadd RT, RA, RB

0-5  6-10  11-15  16-20  21-22  23-30  31  Form
PO   RT    RA     RB     sm     XO     Rc  Z23-Form

Pseudocode:

shift <- sm + 1                     # Shift is between 1-4
sum[0:63] <- ((RB) << shift) + (RA) # Shift RB, add RA
RT <- sum                           # Result stored in RT

When sm is zero, the contents of register RB are multiplied by 2, added to the contents of register RA, and the result stored in RT.

sm is a 2-bit bitfield, and allows multiplication of RB by 2, 4, 8, 16.

Operands RA and RB, and the result RT are all 64-bit, unsigned integers.
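
The pseudocode above can be modelled in a few lines of Python. This is only a sketch for checking the arithmetic: the function name `shadd` and the constant `MASK64` are illustrative and not part of the RFC, registers are modelled as plain integers, the result is assumed to wrap modulo 2^64 (as implied by `sum[0:63]`), and Rc is ignored.

```python
MASK64 = (1 << 64) - 1  # registers are modelled as 64-bit unsigned values


def shadd(ra: int, rb: int, sm: int) -> int:
    """Illustrative model of shadd: RT <- ((RB) << (sm + 1)) + (RA)."""
    if not 0 <= sm <= 3:
        raise ValueError("sm is a 2-bit field, so it must be in 0..3")
    shift = sm + 1                         # shift amount is 1..4
    return ((rb << shift) + ra) & MASK64   # assumed truncation to 64 bits


# sm = 0, 1, 2, 3 scales RB by 2, 4, 8, 16 before RA is added
for sm in range(4):
    print(f"sm={sm}: RT={shadd(0x1000, 3, sm):#x}")
```

Under this model a carry out of bit 0 (the most significant bit in Power ISA numbering) is simply discarded.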

TODO: more examples are needed (it is not yet clear how to embed sm in the assembler syntax).

Examples:

# assuming the fourth operand is the raw sm field: sm=3 gives a shift of 4, so this adds r1 to (r2*16)
shadd r4, r1, r2, 3

Shift-and-Add Unsigned Word

shadduw RT, RA, RB

0-5  6-10  11-15  16-20  21-22  23-30  31  Form
PO   RT    RA     RB     sm     XO     Rc  Z23-Form

Pseudocode:

shift <- sm + 1                     # Shift is between 1-4
n <- (RB)[32:63]                    # Only use lower 32-bits of RB
sum[0:63] <- (n << shift) + (RA)    # Shift n, add RA
RT <- sum                           # Result stored in RT

When sm is zero, the lower word contents of register RB are multiplied by 2, added to the contents of register RA, and the result stored in RT.

sm is a 2-bit bitfield, and allows multiplication of the lower word of RB by 2, 4, 8, 16.

Operand RA and the result RT are 64-bit, unsigned integers. Only the lower 32 bits of RB are used, zero-extended to 64 bits.
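
As with shadd, the pseudocode can be checked against a small Python model. This is a sketch under assumptions: the names `shadduw`, `MASK32` and `MASK64` are illustrative, `(RB)[32:63]` is taken to be the low-order word (Power ISA numbers bit 0 as the most significant), the result is assumed to wrap modulo 2^64, and Rc is ignored.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def shadduw(ra: int, rb: int, sm: int) -> int:
    """Illustrative model of shadduw: RT <- ((RB)[32:63] << (sm + 1)) + (RA)."""
    if not 0 <= sm <= 3:
        raise ValueError("sm is a 2-bit field, so it must be in 0..3")
    shift = sm + 1                        # shift amount is 1..4
    n = rb & MASK32                       # low word of RB, zero-extended
    return ((n << shift) + ra) & MASK64   # assumed truncation to 64 bits


# The upper word of RB has no effect on the result
print(hex(shadduw(0x1000, 0xFFFF_FFFF_0000_0002, 2)))
print(hex(shadduw(0x1000, 0x0000_0000_0000_0002, 2)))
```

Both calls print the same value, since only the low word of RB (here 2) is shifted and added.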

Programmer's Note: The advantage of this instruction is in computing address offsets. RA is the 64-bit base address. RB is the offset into the data structure, limited to 32 bits.
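
To make the address-offset use concrete, the sketch below indexes an array of 8-byte elements, which corresponds to a shift of 3 (sm=2). Everything here is hypothetical illustration: the helper name `shadduw_model`, the base address and the register values are invented for the example, and the helper simply restates the shadduw pseudocode.

```python
def shadduw_model(ra: int, rb: int, sm: int) -> int:
    """Restates the shadduw pseudocode; the name is illustrative only."""
    n = rb & 0xFFFF_FFFF                            # low 32 bits of RB
    return ((n << (sm + 1)) + ra) & ((1 << 64) - 1)


base = 0x0000_7FFF_0000_0000        # hypothetical 64-bit base address (RA)
index_reg = 0xDEAD_BEEF_0000_0005   # index 5 in the low word, stale upper word (RB)

# 8-byte elements: scale the index by 8, i.e. shift = 3, sm = 2
addr = shadduw_model(base, index_reg, sm=2)
assert addr == base + 5 * 8

# Zero-extension means a low word of 0xFFFFFFFF is a large positive index,
# not -1, so the result is not base - 8.
wrapped = shadduw_model(base, 0xFFFF_FFFF, sm=2)
assert wrapped == base + 0xFFFF_FFFF * 8
```

The second assertion shows why a 32-bit index is always treated as unsigned here; a negative index would need a sign-extending variant.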

Examples:

# 
shadduw r4, r1, r2

Appendices

Appendix E Power ISA sorted by opcode
Appendix F Power ISA sorted by version
Appendix G Power ISA sorted by Compliancy Subset
Appendix H Power ISA sorted by mnemonic
Form  Book  Page  Version  Mnemonic  Description
Z23   I     #     3.0B     shadd     Shift-and-Add
Z23   I     #     3.0B     shadduw   Shift-and-Add Unsigned Word