links

potential opcode allocations

discussion on ways to allocate scalar and svp64 opcodes. the first requirement is:

  • 75% of one major opcode for SVP64 (25%) SVP64-Single (25%) SVP64-Reserved (25%)
  • 75% of one major opcode for grevluti crternlogi ternlogi (again each 25%)
  • 75% of one major opcode for xpermi fmvis fishmv bmrevi mv.swizzle etc.

(see also: idea for reducing the amount of opcode space needed by around 50%. that idea is out of scope for this discussion)

the additional requirements are:

  • all of the scalar operations must be Vectorizable
  • all of the scalar operations must be in a 32-bit encoding (not prefixed-prefixed)

use 75% of QTY 3 MAJOR ops

(shown for completeness: this idea uses too much opcode space)

there are a number of areas as candidates:

  • EXT006 (75%)
  • EXT005 (100%)
  • EXT009 (100%)

Unfortunately this would use up all of the available 32-bit Major opcodes, so it is not viable.

major old/new scalar/vec

following a similar scheme to EXT001 in Power ISA Public v3.1, one bit indicates "this is an entire new 32-bit scalar space". although the "penalty" is that any such "escape-sequenced" 32-bit instructions require a prefix-marker bit, it does effectively double the entirety of the 32-bit Major Opcode space.

Section 1.6.3:

Prefix bits 6:7 are used to identify one of four prefix format types. When bit 6 is set to 0 (prefix types 00 and 01), the suffix is not a Defined Word-instruction (i.e., requires the prefix to identify the alternate opcode space the suffix is assigned to as well as additional or extended operand and/or control fields); when bit 6 is set to 1 (prefix types 10 and 11), the prefix is modifying the behavior of a Defined Word-instruction in the suffix.

thus, we have:

0-5     6  7  8-31  Description
EXT001  0  0  nnnn  load/store
EXT001  1  0  nnnn  load/store suffix=defined-word
EXT001  0  1  nnnn  reg-to-reg
EXT001  1  1  nnnn  reg-to-reg suffix=defined-word
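
as a rough illustration of the quoted Section 1.6.3 text, the sketch below pulls bits 6 and 7 out of a 32-bit EXT001 prefix word. it assumes Power ISA MSB0 numbering (bit 0 is the most significant bit) and that bit 7 distinguishes load/store from reg-to-reg. the function names and the returned strings are made up for this example and are not taken from any specification.

```python
def bit_msb0(word, n, width=32):
    """Return bit n of word, counting from the most significant bit (bit 0)."""
    return (word >> (width - 1 - n)) & 1


def ext001_prefix_type(prefix):
    """Classify an EXT001 prefix word by its bits 6 and 7 (hypothetical helper)."""
    po = (prefix >> 26) & 0x3F          # bits 0-5: Primary Opcode
    if po != 1:
        raise ValueError("not an EXT001 prefix")
    bit6 = bit_msb0(prefix, 6)          # 1: suffix is a Defined Word-instruction
    bit7 = bit_msb0(prefix, 7)          # assumed: 0 = load/store, 1 = reg-to-reg
    kind = "reg-to-reg" if bit7 else "load/store"
    suffix = "defined-word" if bit6 else "new-encoding"
    return kind, suffix


EXT001 = 1 << 26                        # PO=1 placed in bits 0-5
print(ext001_prefix_type(EXT001))
print(ext001_prefix_type(EXT001 | (1 << 25)))   # bit 6 set
```

only the two type bits are examined here. bits 8-31 are ignored.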

and so when bit 6=0 there is space to create an entirely new suite of encodings including new 32-bit instructions.

this "doubling" is already public and part of EXT001. the idea here is to mirror that use of bit 6 but, unlike EXT001, to use bit 7 to mark whether the instruction is SVP64-vector or SVP64-single.

0-5  6  7  8-31  Description
PO   0  0  nnnn  new, scalar (SVP64Single)
PO   1  0  nnnn  old, scalar (SVP64Single)
PO   0  1  nnnn  new, vector (SVP64)
PO   1  1  nnnn  old, vector (SVP64)
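
the field layout in the table above can be sketched as a pack/unpack pair. this is only an illustration under assumptions: MSB0 bit numbering, a 24-bit field for bits 8-31, and helper names (`make_prefix`, `split_prefix`) invented here. the PO value 9 in the usage lines is purely illustrative, as the discussion does not fix which Major opcode would be used.

```python
def make_prefix(po, new_suffix, vector, rest):
    """Pack the proposed prefix fields into a 32-bit word (hypothetical helper)."""
    if not 0 <= po < 64:
        raise ValueError("PO is a 6-bit field")
    if not 0 <= rest < (1 << 24):
        raise ValueError("bits 8-31 hold 24 bits")
    bit6 = 0 if new_suffix else 1       # 0 = new suffix space, 1 = old (existing)
    bit7 = 1 if vector else 0           # 1 = SVP64 (vector), 0 = SVP64Single
    return (po << 26) | (bit6 << 25) | (bit7 << 24) | rest


def split_prefix(word):
    """Unpack a 32-bit prefix word into (PO, bit 6, bit 7, bits 8-31)."""
    return ((word >> 26) & 0x3F, (word >> 25) & 1, (word >> 24) & 1,
            word & 0xFFFFFF)


word = make_prefix(9, new_suffix=True, vector=True, rest=0x123)
print(hex(word), split_prefix(word))
```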

there are some special cases here involving bits 8-31, but they are degenerate. setting bits 8-31 to zero (Scalar Identity Behaviour) gives:

0-5  6  7  8-31  Description
PO   0  0  0000  new, scalar (SVP64Single)
PO   1  0  0000  old, scalar (SVP64Single)
PO   0  1  0000  new, vector (SVP64)
PO   1  1  0000  old, vector (SVP64)

there is one set of encodings here which is redundant:

  • bit 6=1
  • bit 7=0
  • bits 8-31=0000

this is a duplication of the existing v3.0B 32-bit Scalar operations. whether it is worth special-casing these encodings as "Reserved" is honestly unclear. if so, it would be:

0-5  6  7  8-31   Description
PO   0  0  0000   RESERVED1
PO   0  0  !zero  new-suffix, scalar (SVP64Single)
PO   1  0  0000   RESERVED2
PO   1  0  !zero  old-suffix, scalar (SVP64Single)
PO   0  1  nnnn   new-suffix, vector (SVP64)
PO   1  1  nnnn   old-suffix, vector (SVP64)
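
a minimal sketch of how a decoder might apply the table above, including the two RESERVED special cases. it assumes MSB0 numbering (bit 6 at shift 25, bit 7 at shift 24) and does not check the PO field, since the Major opcode is not fixed in this discussion. `classify_prefix` is a name invented for this example.

```python
def classify_prefix(word):
    """Map a 32-bit prefix word to its row in the table (hypothetical helper)."""
    bit6 = (word >> 25) & 1             # 0 = new-suffix, 1 = old-suffix
    bit7 = (word >> 24) & 1             # 0 = scalar, 1 = vector
    rest = word & 0xFFFFFF              # bits 8-31
    suffix = "old-suffix" if bit6 else "new-suffix"
    if bit7:
        return f"{suffix}, vector (SVP64)"
    if rest == 0:
        # scalar with all-zero bits 8-31: the special-cased encodings
        return "RESERVED2" if bit6 else "RESERVED1"
    return f"{suffix}, scalar (SVP64Single)"


for w in (0x24000000, 0x26000000, 0x24000001, 0x27000000):
    print(hex(w), "->", classify_prefix(w))
```

the extra `rest == 0` test on the scalar path is the part that sits "in the middle of the space": the vector path needs no such check.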

having this RESERVED encoding in the middle of the space does complicate multi-issue decoding somewhat, but it does provide an entire new (independent, non-vectorizable) 32-bit opcode space. two separate RESERVED Major opcode areas can be provided: numbering them EXT200-263 and EXT300-363 respectively seems sane. EXT300-363 for RESERVED1 comes with a caveat that it can never be SVP64-Augmented.

the only downside of the additional PO Group (EXT300-363) is that there are now four PO Groups to distinguish right at the Decode Phase: EXT0nn, EXT1nn, EXT2nn and EXT3nn. this is incredibly expensive. however we don't actually need EXT3nn, it's just a nice-to-have.