links
- https://bugs.libre-soc.org/show_bug.cgi?id=924
- https://lists.libre-soc.org/pipermail/libre-soc-dev/2022-October/005344.html
- https://lists.libre-soc.org/pipermail/libre-soc-dev/2022-September/005272.html
potential opcode allocations
discussion on ways to allocate scalar and svp64 opcodes. the first requirement is:
- 75% of one major opcode for SVP64 (25%) SVP64-Single (25%) SVP64-Reserved (25%)
- 75% of one major opcode for grevluti crternlogi ternlogi (again each 25%)
- 75% of one major opcode for xpermi fmvis fishmv bmrevi mv.swizzle etc.
(see also: idea for reducing amount of opcode space needed by around 50% which is irrelevant to the topic at hand, and should be entirely ignored 100% for the purposes of this discussion)
the additional requirements are:
- all of the scalar operations must be Vectorizeable
- all of the scalar operations must be in a 32-bit encoding (not prefixed-prefixed)
use 75% of QTY 3 MAJOR ops
(for completeness: this idea is too much)
there are a number of areas as candidates:
- EXT006 (75%)
- EXT005 (100%)
- EXT009 (100%)
Unfortunately, this would use up all of the available 32-bit Major opcodes, so it is not viable.
major old/new scalar/vec
following a similar scheme to EXT001 in Power ISA Public v3.1, one bit indicates "this is an entire new 32-bit scalar space". although the "penalty" is that any such "escape-sequenced" 32-bit instructions require a prefix-marker bit, it does effectively double the entirety of the 32-bit Major Opcode space.
Section 1.6.3:
Prefix bits 6:7 are used to identify one of four prefix format types. When bit 6 is set to 0 (prefix types 00 and 01), the suffix is not a Defined Word-instruction (i.e., requires the prefix to identify the alternate opcode space the suffix is assigned to as well as additional or extended operand and/or control fields); when bit 6 is set to 1 (prefix types 10 and 11), the prefix is modifying the behavior of a Defined Word-instruction in the suffix.
thus, we have:
0-5 | 6 | 7 | 8-31 | Description |
---|---|---|---|---|
EXT001 | 0 | 0 | nnnn | load/store |
EXT001 | 0 | 1 | nnnn | reg-to-reg |
EXT001 | 1 | 0 | nnnn | load/store suffix=defined-word |
EXT001 | 1 | 1 | nnnn | reg-to-reg suffix=defined-word |
and so when bit 6=0 there is space to create an entirely new suite of encodings including new 32-bit instructions.
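
as a rough illustration of the Section 1.6.3 text quoted above, the following sketch pulls the prefix-type field out of an EXT001 prefix word. it assumes Power ISA MSB0 bit numbering (bit 0 is the most significant bit of the 32-bit word); the helper names `msb0_bits` and `decode_ext001_prefix` are made up for this example and are not part of any existing tool.

```python
def msb0_bits(word, start, end, width=32):
    """Return bits start..end (inclusive) of word, using MSB0 numbering
    (bit 0 is the most significant bit), as the Power ISA does."""
    length = end - start + 1
    shift = width - 1 - end
    return (word >> shift) & ((1 << length) - 1)


def decode_ext001_prefix(prefix):
    """Split a 32-bit EXT001 prefix into (prefix_type, suffix_is_defined_word).

    Hypothetical helper for illustration only.
    """
    if msb0_bits(prefix, 0, 5) != 1:
        raise ValueError("primary opcode is not EXT001")
    prefix_type = msb0_bits(prefix, 6, 7)
    # Section 1.6.3: bit 6 = 1 (types 0b10 and 0b11) means the prefix modifies
    # a Defined Word-instruction; bit 6 = 0 selects an alternate opcode space.
    suffix_is_defined_word = msb0_bits(prefix, 6, 6) == 1
    return prefix_type, suffix_is_defined_word


if __name__ == "__main__":
    # bits 8-31 are left as zero here; only bits 0-7 matter for this decode
    for word in (0x04000000, 0x05000000, 0x06000000, 0x07000000):
        print(hex(word), decode_ext001_prefix(word))
```

only bit 6 is needed to tell whether the suffix lives in the alternate (new) opcode space, which is the property the rest of this page relies on.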
this "doubling" is already public and part of EXT001. the idea here is to mirror that (bit 6) but, unlike EXT001, to use bit 7 to mark whether the instruction is SVP64-vector or SVP64-single.
0-5 | 6 | 7 | 8-31 | Description |
---|---|---|---|---|
PO | 0 | 0 | nnnn | new, scalar (SVP64Single) |
PO | 1 | 0 | nnnn | old, scalar (SVP64Single) |
PO | 0 | 1 | nnnn | new, vector (SVP64) |
PO | 1 | 1 | nnnn | old, vector (SVP64) |
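
the four rows of the table above can be sketched as a two-bit classifier. this is only an illustration under stated assumptions: MSB0 bit numbering, an unspecified Major Opcode in bits 0-5 (ignored here because the page leaves PO open), and a hypothetical function name `classify_proposed_prefix`.

```python
def classify_proposed_prefix(prefix):
    """Classify a 32-bit prefix word using only bits 6 and 7 (MSB0).

    Hypothetical helper: bits 0-5 (PO) are not checked because the
    Major Opcode has not been allocated.
    """
    bit6 = (prefix >> 25) & 1  # MSB0 bit 6: 0 = new suffix space, 1 = old
    bit7 = (prefix >> 24) & 1  # MSB0 bit 7: 0 = SVP64Single, 1 = SVP64
    suffix_space = "old" if bit6 else "new"
    mode = "vector (SVP64)" if bit7 else "scalar (SVP64Single)"
    return f"{suffix_space}, {mode}"


if __name__ == "__main__":
    for word in (0x00123456, 0x02123456, 0x01123456, 0x03123456):
        print(hex(word), "->", classify_proposed_prefix(word))
```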
there are some special cases here, involving bits 8-31, but they are degenerate. let us set Scalar Identity Behaviour:
0-5 | 6 | 7 | 8-31 | Description |
---|---|---|---|---|
PO | 0 | 0 | 0000 | new, scalar (SVP64Single) |
PO | 1 | 0 | 0000 | old, scalar (SVP64Single) |
PO | 0 | 1 | 0000 | new, vector (SVP64) |
PO | 1 | 1 | 0000 | old, vector (SVP64) |
there is one set of encodings here which are redundant:
- bit 6=1
- bit 7=0
- bits 8-31=0000
this is a duplication of the existing v3.0B 32-bit Scalar operations. is it worth special-casing as "Reserved"? honestly i do not know. it would be:
0-5 | 6 | 7 | 8-31 | Description |
---|---|---|---|---|
PO | 0 | 0 | 0000 | RESERVED1 |
PO | 0 | 0 | !zero | new-suffix, scalar (SVP64Single) |
PO | 1 | 0 | 0000 | RESERVED2 |
PO | 1 | 0 | !zero | old-suffix, scalar (SVP64Single) |
PO | 0 | 1 | nnnn | new-suffix, vector (SVP64) |
PO | 1 | 1 | nnnn | old-suffix, vector (SVP64) |
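
to show what the extra special-casing costs in a decoder, here is a minimal sketch of the table above. it assumes MSB0 numbering, ignores bits 0-5 (PO is not allocated), and uses a made-up function name `classify_with_reserved`; the returned strings simply mirror the Description column.

```python
def classify_with_reserved(prefix):
    """Classify a 32-bit prefix word, including the two RESERVED cases.

    Hypothetical helper for illustration; PO (bits 0-5) is not checked.
    """
    bit6 = (prefix >> 25) & 1   # MSB0 bit 6: 0 = new suffix, 1 = old suffix
    bit7 = (prefix >> 24) & 1   # MSB0 bit 7: 0 = scalar, 1 = vector
    rest = prefix & 0xFFFFFF    # MSB0 bits 8-31
    suffix = "old-suffix" if bit6 else "new-suffix"
    if bit7:
        # vector rows have no special case: any value of bits 8-31
        return f"{suffix}, vector (SVP64)"
    if rest == 0:
        # scalar with all-zero bits 8-31 is carved out as RESERVED
        return "RESERVED2" if bit6 else "RESERVED1"
    return f"{suffix}, scalar (SVP64Single)"


if __name__ == "__main__":
    for word in (0x00000000, 0x00000001, 0x02000000, 0x02000001,
                 0x01000000, 0x03000000):
        print(hex(word), "->", classify_with_reserved(word))
```

note that the RESERVED test needs a full 24-bit zero-compare on bits 8-31 in addition to bits 6 and 7, where the unreserved scheme needs only the two bits.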
having this RESERVED encoding in the middle of the space does complexify multi-issue decoding somewhat, but it does provide an entire new (independent, non-vectorizable) 32-bit opcode space. two separate RESERVED Major opcode areas can be provided: numbering them EXT200-263 and EXT300-363 respectively seems sane. EXT300-363 for RESERVED1 comes with a caveat that it can never be SVP64-Augmented.
the only downside of the third PO Group (EXT300-363) is that there are now four PO Groups right at the Decode Phase: EXT0nn, EXT1nn, EXT2nn and EXT3nn. this is incredibly expensive. however we don't actually need EXT3nn: it's just a nice-to-have.