Resolving ISA conflicts and providing a pain-free RISC-V Standards Upgrade Path

Note: out-of-date as of review 31apr2018, requires updating to reflect "mvendorid-marchid-isamux" concept. Recent discussion 10jun2019 Now updated with its own spec isamux isans.

Executive Summary (old, still relevant for compilers)

A non-invasive backwards-compatible change to make mvendorid and marchid being read-only to be a formal declaration of an architecture having no Custom Extensions, and being permitted to be WARL in order to support multiple simultaneous architectures on the same processor (or per hart or harts) permits not only backwards and forwards compatibility with existing implementations of the RISC-V Standard, not only permits seamless transitions to future versions of the RISC-V Standard (something that is not possible at the moment), but fixes the problem of clashes in Custom Extension opcodes on a global worldwide permanent and ongoing basis.

Summary of impact and benefits:

  • Implementation impact for existing implementations (even though the Standard is not finalised) is zero.
  • Impact for future implementations compliant with (only one) version of the RISC-V Standard is zero.
  • Benefits for implementations complying with (one or more) versions of the RISC-V Standard is: increased customer acceptance due to a smooth upgrade path at the customer's pace and initiative vis-a-vis legacy proprietary software.
  • Benefits for implementations deploying multiple Custom Extensions are a massive reduction in NREs and the hugely reduced ongoing software toolchain maintenance costs plus the benefit of having security updates from upstream software sources due to globally unique identifying information resulting in zero binary encoding conflicts in the toolchains and resultant binaries even for Custom Extensions.


In a lengthy thread that ironically was full of conflict indicative of the future direction in which RISC-V will go if left unresolved, multiple Custom Extensions were noted to be permitted free rein to introduce global binary-encoding conflict with no means of resolution described or endorsed by the RISC-V Standard: a practice that has known disastrous and irreversible consequences for any architecture that permits such practices (1).

Much later on in the discussion it was realised that there is also no way within the current RISC-V Specification to transition to improved versions of the standard, regardless of whether the fixes are absolutely critical show-stoppers or whether they are just keeping the standard up-to-date (2).

With no transition path there is guaranteed to be tension and conflict within the RISC-V Community over whether revisions should be made: should existing legacy designs be prioritised, mutually-exclusively over future designs (and what happens during the transition period is absolute chaos, with the compiler toolchain, software ecosystem and ultimately the end-users bearing the full brunt of the impact). If several overlapping revisions are required that have not yet transitioned out of use (which could take well over two decades to occur) the situation becomes disastrous for the credibility of the entire RISC-V ecosystem.

It was also pointed out that Compliance is an extremely important factor to take into consideration, and that Custom Extensions (as being optional) effectively and quite reasonably fall entirely outside of the scope of Compliance Testing. At this point in the discussion however it was not yet noted the stark problem that the mandatory RISC-V Specification also faces, by virtue of there being no transitional way to bring in show-stopping critical alterations.

To put this into perspective, just taking into account hardware costs alone: with production mask charges for 28nm being around USD $1.5m, engineering development costs and licensing of RTLs for peripherals being of a similar magnitude, no manufacturer is going to back away from selling a "flawed" or "legacy" product (whether it complies with the RISC-V Specification or not) without a bitter fight.

It was also pointed out that there will be significant software tool maintenance costs for manufacturers, meaning that the probability will be extremely high that they will refuse to shoulder such costs, and will publish and continue to publish (and use) hopelessly out-of-date unpatched tools. This practice is well-known to result in security flaws going unpatched, with one of many immediate undesirable consequences being that product in extremely large volume gets discarded into landfill.

All and any of the issues that were discussed, and all of those that were not, can be avoided by providing a hardware-level runtime-enabled forwards and backwards compatible transition path between all parts (mandatory or not) of current and future revisions of the RISC-V ISA Standard.

The rest of the discussion - indicative as it was of the stark mutually exclusive gap being faced by the RISC-V ISA Standard given that it does not cope with the problem - was an effort by two groups in two clear camps: one that wanted things to remain as they are, and another that made efforts to point out that the consequences of not taking action are clearly extreme and irreversible (which, unfortunately, given the severity, some of the first group were unable to believe, despite there being clear historical precedent for the exact same mistake being made in other architectures, and the consequences on the same being absolutely clear).

However after a significant amount of time, certain clear requirements came out of the discussion:

  • Any proposal must be a minimal change with minimal (or zero) impact
  • Any proposal should place no restriction on existing or future ISA encoding space
  • Any proposal should take into account that there are existing implementors of the (yet to be finalised but still "partly frozen") Standard who may resist, for financial investment reasons, efforts to make any change (at all) that could cost them immediate short-term profits.

Several proposals were put forward (and some are still under discussion)

  • "Do nothing": problem is not severe: no action needed.
  • "Do nothing": problem is out-of-scope for RISC-V Foundation.
  • "Do nothing": problem complicates Compliance Testing (and is out of scope)
  • "MISA": the MISA CSR enables and disables extensions already: use that
  • "MISA-like": a new CSR which switches in and out new encodings (without destroying state)
  • "mvendorid/marchid WARL": switching the entire "identity" of a machine
  • "ioctl-like": a OO proposal based around the linux kernel "ioctl" system.

Each of these will be discussed below in their own sections.

Do nothing (no problem exists)

(Summary: not an option)

There were several solutions offered that fell into this category. A few of them are listed in the introduction; more are listed below, and it was exhaustively (and exhaustingly) established that none of them are workable.

Initially it was pointed out that Fabless Semiconductor companies could simply license multiple Custom Extensions and a suitable RISC-V core, and modify them accordingly. The Fabless Semi Company would be responsible for paying the NREs on re-developing the test vectors (as the extension licensers would be extremely unlikely to do that without payment), and given that said Companies have an "integration" job to do, it would be reasonable to expect them to have such additional costs as well.

The costs of this approach were outlined and discussed as being disproportionate and extreme compared to the actual likely cost of licensing the Custom Extensions in the first place. Additionally it was pointed out that not only hardware NREs would be involved but custom software tools (compilers and more) would also be required (and maintained separately, on the basis that upstream would not accept them except under extreme pressure, and then only with prejudice).

All similar schemes involving customisation of the custom extensions were likewise rejected, but not before the customisation process was mistakenly conflated with tne normal integration process of developing a custom processor (Bus Architectures, Cache layouts, peripheral layouts).

The most compelling hardware-related reason (excluding the severe impact on the software ecosystem) for rejecting the customisation-of-customisation approach was the case where Extensions were using an instruction encoding space (48-bit, 64-bit) greater than that which the chosen core could cope with (32-bit, 48-bit).

Overall, none of the options presented were feasible, and, in addition, with no clear leadership from the RISC-V Foundation on how to avoid global world-wide encoding conflict, even if they were followed through, still would result in the failure of the RISC-V ecosystem due to irreversible global conflicting ISA binary-encoding meanings (POWERPC's Altivec / SPE nightmare).

This in addition to the case where the RISC-V Foundation wishes to fix a critical show-stopping update to the Standard, post-release, where billions of dollars have been spent on deploying RISC-V in the field.

Do nothing (out of scope)

(Summary: may not be RV Foundation's "scope", still results in problem, so not an option)

This was one of the first arguments presented: The RISC-V Foundation considers Custom Extensions to be "out of scope"; that "it's not their problem, therefore there isn't a problem".

The logical errors in this argument were quickly enumerated: namely that the RISC-V Foundation is not in control of the uses to which RISC-V is put, such that public global conflicts in binary-encoding are a hundred percent guaranteed to occur (outside of the control and remit of the RISC-V Foundation), and a hundred percent guaranteed to occur in commodity hardware where Debian, Fedora, SUSE and other distros will be hardest hit by the resultant chaos, and that will just be the more "visible" aspect of the underlying problem.

Do nothing (Compliance too complex, therefore out of scope)

(Summary: may not be RV Foundation's "scope", still results in problem, so not an option)

The summary here was that Compliance testing of Custom Extensions is not just out-of-scope, but even if it was taken into account that binary-encoding meanings could change, it would still be out-of-scope.

However at the time that this argument was made, it had not yet been appreciated fully the impact that revisions to the Standard would have, when billions of dollars worth of (older, legacy) RISC-V hardware had already been deployed.

Two interestingly diametrically-opposed equally valid arguments exist here:

  • Whilst Compliance testing of Custom Extensions is definitely legitimately out of scope, Compliance testing of simultaneous legacy (old revisions of ISA Standards) and current (new revisions of ISA Standard) definitely is not. Efforts to reduce Compliance Testing complexity is therefore "Compliance Tail Wagging Standard Dog".
  • Beyond a certain threshold, complexity of Compliance Testing is so burdensome that it risks outright rejection of the entire Standard.

Meeting these two diametrically-opposed perspectives requires that the solution be very, very simple.


(Summary: MISA not suitable, leads to better idea)

MISA permits extensions to be disabled by masking out the relevant bit. Hypothetically it could be used to disable one extension, then enable another that happens to use the same binary encoding.


  • MISA Extension disabling is permitted (optionally) to destroy the state information. Thus it is totally unsuitable for cases where instructions from different Custom extensions are needed in quick succession.
  • MISA was only designed to cover Standard Extensions.
  • There is nothing to prevent multiple Extensions being enabled that wish to simultaneously interpret the same binary encoding.
  • There is nothing in the MISA specification which permits future versions (bug-fixes) of the RISC-V ISA to be "switched in".

Overall, whilst the MISA concept is a step in the right direction it's a hundred percent unsuitable for solving the problem.


(Summary: basically same as mvend/march WARL except needs an extra CSR where mv/ma doesn't. Along right lines, doesn't meet full requirements)

Out of the MISA discussion came a "MISA-like" proposal, which would take into account the flaws pointed out by trying to use "MISA":

  • The MISA-like CSR's meaning would be identified by compilers using the mvendor-id/march-id tuple as a compiler target
  • Each custom-defined bit of the MISA-like CSR would (mutually-exclusively) redirect binary encoding(s) to specific encodings
  • No Extension would actually be disabled: its internal state would be left on (permanently) so that switching of ISA decoding could be done inside inner loops without adverse impact on performance.

Whilst it was the first "workable" solution it was also noted that the scheme is invasive: it requires an entirely new CSR to be added to the privileged spec (thus making existing implementations redundant). This does not fulfil the "minimum impact" requirement.

Also interesting around the same time an additional discussion was raised that covered the compiler side of the same equation. This revolved around using mvendorid-marchid tuples at the compiler level, to be put into assembly output (by gcc), preserving the required globally unique identifying information for binutils to successfully turn the custom instruction into an actual binary-encoding (plus binary-encoding of the context-switching information). (TBD, Jacob, separate page? review this para?)

mvendorid/marchid WARL

(Summary: the only idea that meets the full requirements. Needs toolchain backup, but only when the first chip is released)

This proposal has full details at the following page: mvendor march warl

Coming out of the software-related proposal by Jacob Bachmeyer, which hinged on the idea of a globally-maintained gcc / binutils database that kept and coordinated architectural encodings (curated by the Free Software Foundation), was to quite simply make the mvendorid and marchid CSRs have WARL (writeable) characteristics. Read-only is taken to mean a declaration of "Having no Custom Extensions" (a zero-impact change).

By making mvendorid-marchid tuples WARL the instruction decode phase may re-route mutually-exclusively to different engines, thus providing a controlled means and method of supporting multiple (future, past and present) versions of the Base ISA, Custom Extensions and even completely foreign ISAs in the same processor.

This incredibly simple non-invasive idea has some unique and distinct advantages over other proposals:

  • Existing designs - even though the specification is not finalised (but has "frozen" aspects) - would be completely unaffected: the change is to the "wording" of the specification to "retrospectively" fit reality.
  • Unlike with the MISA idea this is purely at the "decode" phase: no internal Extension state information is permitted to be disabled, altered or destroyed as a direct result of writing to the mvendor/march-id CSRs.
  • Compliance Testing may be carried out with a different vendorid/marchid tuple set prior to a test, allowing a vendor to claim Certified compatibility with both one (or more) legacy variants of the RISC-V Specification and with a present one.
  • With sufficient care taken in the implementation an implementor may have multiple interpretations of the same binary encoding within an inner loop, with a single instruction (to the WARL register) changing the meaning.

This is the only one of the proposals that meet the full requirements

Overloadable opcodes

See overloadable opcodes for full details, including a description in terms of C functions.

NOTE: under discussion.

==RB 2018-5-1 dropped IOCTL proposal for the much simpler overloadable opcodes proposal==

The overloadable opcode (or xext) proposal allows a non standard extension to use a documented 20 + 3 bit (or 52 + 3 bit on RV64) UUID identifier for an instruction for software to use. At runtime, a cpu translates the UUID to a small implementation defined 12 + 3 bit bit identifier for hardware to use. It also defines a fallback mechanism for the UUID's of instructions the cpu does not recognise.

The overloadable opcodes proposal defines 8 standardised R-type instructions xcmd0, xcmd1, ...xcmd7 preferably in the brownfield opcode space. Each xcmd takes in rs1 a 12 bit "logical unit" (lun) identifying a device on the cpu that implements some "extension interface" (xintf) together with some additional data. An xintf is a set of up to 8 commands with 2 input and 1 output port (i.e. like an R-type instruction), together with a description of the semantics of the commands. Calling e.g. xcmd3 routes its two inputs and one output ports to command 3 on the device determined by the lun bits in rs1. Thus, the 8 standard xcmd instructions are standard-designated overloadable opcodes, with the non standard semantics of the opcode determined by the lun.

Portable software, does not use luns directly. Instead, it goes through a level of indirection using a further instruction xext that translates a 20 bit globally unique identifier UUID of an xintf, to the lun of a device on the cpu that implements that xintf. The cpu can do this, because it knows (at manufacturing or boot time) which devices it has, and which xintfs they provide. This includes devices that would be described as non standard extension of the cpu if the designers had used custom opcodes instead of xintf as an interface. If the UUID of the xintf is not recognised at the current privilege level, the xext instruction returns the special lun = 0, causing any xcmd to trap. Minor variations of this scheme (requiring two more instructions) cause xcmd instructions to fallback to always return 0 or -1 instead of trapping.

The 20 bit provided by the UUID of the xintf is much more room than provided by the 2 custom 32 bit, or even 4 custom 64/48 bit opcode spaces. Thus the overloadable opcodes proposal avoids most of the need to put a claim on opcode space and the associated collisions when combining independent extensions. In this respect it is similar to POSIX ioctls, which obviate the need for defining new syscalls to control new and nonstandard hardware.

Remark1: the main difference with a previous "ioctl like proposal" is that UUID translation is stateless and does not use resources. The xext instruction neither initialises a device nor builds global state identified by a cookie. If a device needs initialisation it can do this using xcmds as init and deinit instructions. Likewise, it can hand out cookies (which can include the lun) as a return value .

Remark2: Implementing devices can respond to an (essentially) arbitrary number of xintfs. Hence an implementing device can respond to an arbitrary number of commands. Organising related commands in xintfs, helps avoid UUID space pollution, and allows to amortise the (small) cost of UUID to lun translation if related commands are used in combination.

==RB not sure if this is still correct and relevant==

The proposal is functionally similar to that of the mvendor/march-id except the non standard extension is explicit and restricted to a small set of well defined individual opcodes. Hence several extensions can be mixed and there is no state to be tracked over context switches. As such it could hypothetically be proposed as an independent Standard Extension.

Despite the proposal (which is still undergoing clarification) being worthwhile in its own right, and standing on its own merits and thus definitely worthwhile pursuing, it is non-trivial and more invasive than the mvendor/march-id WARL concept.


Dynamic runtime hardware-adjustable custom opcode encodings

Perhaps this is a misunderstanding, that what is being advocated below (see link for full context):

The best that can be done is to allow each custom extension to have its opcodes easily re positioned depending on what other custom extensions the user wants available in the same program (without mode switches).

It was suggested to use markers in the object files as a way to identify opcodes that can be "re-encoded". Contrast this with Jacob Bachmeyer's original idea where the assembly code (only) contains such markers (on a global world-wide unique basis, using mvendorid-marchid-isamux tuples to do so).

There are two possible interpretations of this:

  • (1) the Hardware RTL is reconfigureable (parameterisable) to allow easy selection of static moving (adjustment) of which opcodes a particular instruction uses. This runs into the same difficulties as outlined in other areas of this document.
  • (2) the Hardware RTL contains DYNAMIC and RUN-TIME CONFIGUREABLE opcodes (presumably using additional CSRs to move meanings)

This would help any implementation to adjust to whatever future (official) uses a particular encoding was selected. It would be particularly useful if an implementation used certain brownfield encodings.

The only downsides are:

  • (1) Compiler support for dynamic opcode reconfiguration would be... complex.
  • (2) The instruction decode phase is also made more complex, now involving reconfigureable lookup tables. Whilst major opcodes can be easily redirected, brownfield encodings are more involved.

Compared to a stark choice of having to move (exclusively) to 48-bit or 64-bit encodings, dynamic runtime opcode reconfiguration is comparatively much more palatable.

In effect, it is a much more advanced version of ISAMUX/NS (see isamux isans).

Comments, Discussion and analysis

TBD: placeholder as of 26apr2018

new (old) m-a-i tuple idea

actually that's a good point: where the user decides that they want to boot one and only one tuple (for the entire OS), forcing a HARDWARE level default m-a-i tuple at them actually prevents and prohibits them from doing that, Jacob.

so we have apps on one RV-Base ISA and apps on an INCOMPATIBLE (future) variant of RV-Base ISA.  with the approach that i was advocating (S-mode does NOT switch automatically), there are totally separate mtvec / stvec / bstvec traps.

would it be reasonable to assume the following:

(a) RV-Base ISA, particularly code-execution in the critical S-mode trap-handling, is EXTREMELY unlikely to ever be changed, even thinking 30 years into the future ?

(b) if the current M-mode (user app level) context is "RV Base ISA 1" then i would hazard a guess that S-mode is prettty much going to drop down into exactly the same mode / context, the majority of the time

thus the hypothesis is that not only is it the common code-path to not switch the ISA in the S-mode trap but that the instructions used are extremely unlikely to be changed between "RV Base Revisions".

foreign isa hardware-level execution

this is the one i've not really thought through so much, other than it would clearly be disadvantageous for S-mode to be arbitrarily restricted to running RV-Base code (of any variant).  a case could be made that by the time the m-a-i tuple is switched to the foreign isa it's "all bets off", foreign arch is "on its own", including having to devise a means and method to switch back (equivalent in its ISA of m-a-i switching).

conclusion / idea

the multi-base "user wants to run one and only one tuple" is the key case, here, that is a show-stopper to the idea of hard-wiring the default S-mode m-a-i.

now, if instead we were to say, "ok so there should be a default S-mode m-a-i tuple" and it was permitted to SET (choose) that tuple, that would solve that problem.  it could even be set to the foreign isa.  which would be hilarious.

jacob's idea: one hart, one configuration:

 (a) RV-Base ISA, particularly code-execution in the critical S-mode trap-handling, is EXTREMELY unlikely to ever be changed, even thinking 30 years into the future ?

Oddly enough, due to the minimalism of RISC-V, I believe that this is actually quite likely.  :-)

 thus the hypothesis is that not only is it the common code-path to not switch the ISA in the S-mode trap but that the instructions used are extremely unlikely to be changed between "RV Base Revisions".

Correct.  I argue that S-mode should not be able to switch the selected ISA on multi-arch processors. 

that would produce an artificial limitation which would prevent and prohibit implementors from making a single-core (single-hart) multi-configuration processor.

Summary and Conclusion

In the early sections (those in the category "no action") it was established in each case that the problem is not solved. Avoidance of responsibility, or conflation of "not our problem" with "no problem" does not make "problem" go away. Even "making it the Fabless Semiconductor's design problem" resulted in a chip being more costly to engineer as hardware and more costly from a software-support perspective to maintain... without actually fixing the problem.

The first idea considered which could fix the problem was to just use the pre-existing MISA CSR, however this was determined not to have the right coverage (Standard Extensions only), and also crucially it destroyed state. Whilst unworkable it did lead to the first "workable" solution, "MISA-like".

The "MISA-like" proposal, whilst meeting most of the requirements, led to a better idea: "mvendor/march-id WARL", which, in combination with an offshoot idea related to gcc and binutils, is the only proposal that fully meets the requirements.

The "ioctl-like" idea also solves the problem, but, unlike the WARL idea does not meet the full requirements to be "non-invasive" and "backwards compatible" with pre-existing (pre-Standards-finalised) implementations. It does however stand on its own merit as a way to extend the extremely small Custom Extension opcode space, even if it itself implemented as a Custom Extension into which other Custom Extensions are subsequently shoe-horned. This approach has the advantage that it requires no "approval" from the RISC-V Foundation... but without the RISC-V Standard "approval" guaranteeing no binary-encoding conflicts, still does not actually solve the problem (if deployed as a Custom Extension for extending Custom Extensions).

Overall the mvendor/march-id WARL idea meets the three requirements, and is the only idea that meets the three requirements:

  • Any proposal must be a minimal change with minimal (or zero) impact (met through being purely a single backwards-compatible change to the wording of the specification: mvendor/march-id changes from read-only to WARL)
  • Any proposal should place no restriction on existing or future ISA encoding space (met because it is just a change to one pre-existing CSR, as opposed to requiring additional CSRs or requiring extra opcodes or changes to existing opcodes)
  • Any proposal should take into account that there are existing implementors of the (yet to be finalised but still "partly frozen") Standard who may resist, for financial investment reasons, efforts to make any change (at all) that could cost them immediate short-term profits. (met because existing implementations, with the exception of those that have Custom Extensions, come under the "vendor/arch-id read only is a formal declaration of an implementation having no Custom Extensions" fall-back category)

So to summarise:

  • The consequences of not tackling this are severe: the RISC-V Foundation cannot take a back seat. If it does, clear historical precedent shows 100% what the outcome will be (1).
  • Making the mvendorid and marchid CSRs WARL solves the problem in a minimal to zero-disruptive backwards-compatible fashion that provides indefinite transparent forwards-compatibility.
  • The retro-fitting cost onto existing implementations (even though the specification has not been finalised) is zero to negligeable (only changes to words in the specification required at this time: no vendor need discard existing designs, either being designed, taped out, or actually in production).
  • The benefits are clear (pain-free transition path for vendors to safely upgrade over time; no fights over Custom opcode space; no hassle for software toolchain; no hassle for GNU/Linux Distros)
  • The implementation details are clear (and problem-free except for vendors who insist on deploying dozens of conflicting Custom Extensions: an extreme unlikely outlier).
  • Compliance Testing is straightforward and allows vendors to seek and obtain multiple Compliance Certificates with past, present and future variants of the RISC-V Standard (in the exact same processor, simultaneously), in order to support end-customer legacy scenarios and provide the same with a way to avoid "impossible-to-make" decisions that throw out ultra-costly multi-decade-investment in proprietary legacy software at the same as the (legacy) hardware.

Conversation Exerpts

The following conversation exerpts are taken from the ISA-dev discussion

(1) Albert Calahan on SPE / Altiven conflict in POWERPC

Yes. Well, it should be blocked via legal means. Incompatibility is a disaster for an architecture.

The viability of PowerPC was badly damaged when SPE was introduced. This was a vector instruction set that was incompatible with the AltiVec instruction set. Software vendors had to choose, and typically the choice was "neither". Nobody wants to put in the effort when there is uncertainty and a market fragmented into small bits.

Note how Intel did not screw up. When SSE was added, MMX remained. Software vendors could trust that instructions would be supported. Both MMX and SSE remain today, in all shipping processors. With very few exceptions, Intel does not ship chips with missing functionality. There is a unified software ecosystem.

This goes beyond the instruction set. MMU functionality also matters. You can add stuff, but then it must be implemented in every future CPU. You can not take stuff away without harming the architecture.

(2) Luke Kenneth Casson Leighton on Standards backwards-compatibility

For the case where "legacy" variants of the RISC-V Standard are backwards-forwards-compatibly supported over a 10-20 year period in Industrial and Military/Goverment-procurement scenarios (so that the impossible-to-achieve pressure is off to get the spec ABSOLUTELY correct, RIGHT now), nobody would expect a seriously heavy-duty amount of instruction-by-instruction switching: it'd be used pretty much once and only once at boot-up (or once in a Hypervisor Virtual Machine client) and that's it.

(3) Allen Baum on Standards Compliance

Putting my compliance chair hat on: One point that was made quite clear to me is that compliance will only test that an implementation correctly implements the portions of the spec that are mandatory, and the portions of the spec that are optional and the implementor claims it is implementing. It will test nothing in the custom extension space, and doesn't monitor or care what is in that space.

(4) Jacob Bachmeyer on explaining disambiguation of opcode space

...have different harts with different sets of encodings.)  Adding a "select" CSR as has been proposed does not escape this fundamental truth that instruction decode must be unambiguous, it merely expands every opcode with extra bits from a "select" CSR.

(5) Krste Asanovic on clarification of use of opcode space

A CPU is even free to reuse some standard extension encoding space for non-standard extensions provided it does not claim to implement that standard extension.

(6) Clarification of difference between assembler and encodings

The extensible assembler database I proposed assumes that each processor will have one and only one set of recognized instructions.  (The "hidden prefix" is the immutable vendor/arch/impl tuple in my proposals.) 

 ah this is an extremely important thing to clarify, the difference between the recognised instruction assembly mnemonic (which must be globally world-wide accepted as canonical) and the binary-level encodings of that mnemonic used different vendor implementations which will most definitely not be unique but require "registration" in the form of atomic acceptance as a patch by the FSF to gcc and binutils [and other compiler tools].