GRev/GOrC combination instruction design

The design is derived from a circuit for GRev made with muxes:
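As a behavioural reference for the mux circuit, here is a minimal Python sketch. It assumes the usual grev structure: log2(width) layers, where layer k either passes each bit straight through or swaps it with the bit whose index differs by 2^k, selected by bit k of SH. The function name `grev_mux` and the `width` parameter are illustrative only, not from any specification.

```python
def grev_mux(x, sh, width=64):
    """Model grev as layers of 2:1 muxes (width assumed to be a power of 2)."""
    bits = [(x >> j) & 1 for j in range(width)]
    for layer in range(width.bit_length() - 1):
        step = 1 << layer
        sel = (sh >> layer) & 1
        # One 2:1 mux per bit: sel picks the partner bit, otherwise pass through.
        bits = [bits[j ^ step] if sel else bits[j] for j in range(width)]
    return sum(b << j for j, b in enumerate(bits))


# SH = 63 enables every layer, giving a full 64-bit bit reversal.
assert grev_mux(1, 63) == 1 << 63
# SH = 56 enables only the 8/16/32 layers, giving a byte reversal.
assert grev_mux(0x0102030405060708, 56) == 0x0807060504030201
```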

First, we convert that circuit to use And-Or-Invert gates, since that is an efficient way to implement the muxes:

Notice that each And-Or-Invert gate has both a bit of SH and its complement ~SH as inputs. Those two inputs can be split into separate signals, each still controlled by the corresponding bit of SH, by using the instruction's immediate as a pair of 2-bit look-up tables indexed by that bit. This requires 4 bits of immediate.
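The look-up-table idea can be sketched behaviourally as follows. Several things here are assumptions rather than part of the design as written: the bit layout of the 4-bit immediate (low two bits gate the straight-through input, high two bits gate the swapped input), the names `grevlut_core`, `GREV` and `GORC`, and the choice to model each stage as a plain AND-OR, ignoring the per-stage inversion of real AOI gates.

```python
GREV = 0b1001  # straight enable = ~SH bit, swapped enable = SH bit (assumed layout)
GORC = 0b1011  # straight enable = 1,       swapped enable = SH bit (assumed layout)


def grevlut_core(x, sh, imm4, width=64):
    """AND-OR model of the LUT-controlled network (width a power of 2)."""
    bits = [(x >> j) & 1 for j in range(width)]
    for layer in range(width.bit_length() - 1):
        step = 1 << layer
        s = (sh >> layer) & 1
        # Each 2-bit LUT is indexed by this layer's SH bit.
        en_straight = (imm4 >> s) & 1
        en_swapped = (imm4 >> (2 + s)) & 1
        bits = [(bits[j] & en_straight) | (bits[j ^ step] & en_swapped)
                for j in range(width)]
    return sum(b << j for j, b in enumerate(bits))


assert grevlut_core(1, 63, GREV) == 1 << 63   # behaves as grev
assert grevlut_core(1, 7, GORC) == 0xFF       # behaves as gorc within a byte
```

With this layout the other 14 immediate values select operations that are neither grev nor gorc, which is where the extra flexibility comes from.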

This gives us our final design:

Notice that this design still has an overall circuit latency essentially equivalent to grev's latency (or that of shift/rotate). The circuit can also express much more than just grev or gorc operations. Layers of XOR gates can be added at the input and output, allowing it to function as a gandc instruction too, for a total of 6 bits of immediate (1 bit for inverting the input, 1 bit for inverting the output, 4 bits for the look-up tables).
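To illustrate what the XOR layers buy, here is a hedged sketch of the 6-bit-immediate variant. The immediate layout (bits 0-3 for the two look-up tables, bit 4 for input inversion, bit 5 for output inversion) and the name `grevlut6` are assumptions for illustration. The core is again modelled as plain AND-OR stages. Inverting both sides of a gorc-style network yields, by De Morgan, an AND-combining operation.

```python
def grevlut6(x, sh, imm6, width=64):
    """LUT network with optional input/output inversion (assumed imm6 layout)."""
    mask = (1 << width) - 1
    inv_in = (imm6 >> 4) & 1    # assumed: bit 4 inverts the input
    inv_out = (imm6 >> 5) & 1   # assumed: bit 5 inverts the output
    x &= mask
    if inv_in:
        x ^= mask               # input XOR layer
    bits = [(x >> j) & 1 for j in range(width)]
    for layer in range(width.bit_length() - 1):
        step = 1 << layer
        s = (sh >> layer) & 1
        en_straight = (imm6 >> s) & 1        # LUT in imm6[1:0]
        en_swapped = (imm6 >> (2 + s)) & 1   # LUT in imm6[3:2]
        bits = [(bits[j] & en_straight) | (bits[j ^ step] & en_swapped)
                for j in range(width)]
    y = sum(b << j for j, b in enumerate(bits))
    if inv_out:
        y ^= mask               # output XOR layer
    return y


# gorc LUTs plus both inversions: each byte becomes the AND of its bits.
assert grevlut6(0xFFFFFFFFFFFFFFFE, 7, 0b111011) == 0xFFFFFFFFFFFFFF00
```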

We will also want versions of grev where the shift amount is an immediate (needed for bitwise reverse, byte reversal, and other similar instructions). The immediate-shift-amount version can be specified to always do a grev (or maybe only grev/gorc) operation to save encoding space, since I'd guess grev is much more common than any of the other immediate-shift variants.
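For reference, these are the shift amounts that give the common reversal operations on 64-bit values. This sketch uses the classic mask-and-shift formulation of grev rather than a gate-level model. The constant names are hypothetical and do not correspond to any defined mnemonic.

```python
MASKS = [
    0x5555555555555555, 0x3333333333333333, 0x0F0F0F0F0F0F0F0F,
    0x00FF00FF00FF00FF, 0x0000FFFF0000FFFF, 0x00000000FFFFFFFF,
]

# Hypothetical names for useful immediate shift amounts.
BIT_REVERSE = 63           # reverse all 64 bits
BYTE_REVERSE = 56          # reverse byte order, keep bits within each byte
BIT_REVERSE_IN_BYTES = 7   # reverse bits within each byte


def grev64(x, sh):
    """64-bit grev via mask-and-shift swaps, one stage per SH bit."""
    for k, m in enumerate(MASKS):
        if (sh >> k) & 1:
            s = 1 << k
            # Swap adjacent groups of s bits.
            x = ((x & m) << s) | ((x >> s) & m)
    return x & 0xFFFFFFFFFFFFFFFF


assert grev64(0x0102030405060708, BYTE_REVERSE) == 0x0807060504030201
assert grev64(0x01, BIT_REVERSE_IN_BYTES) == 0x80
assert grev64(1, BIT_REVERSE) == 1 << 63
```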

Twin LUT4s

The gate saving from the AND/OR (AOI) conversion can also be applied to grevlut. TODO: version of diagram in SVG/DIA.