  • idea 1: modify cmp (and other CR generators?) with qualifiers that write a single-bit prefix vector into an int reg
  • idea 2: in vector form, override the CR SO field to be the predicate bit for each element
  • idea 3: predicates are read from the bits of an int reg
  • idea 4: the CR SO field no longer holds summary overflow; it contains a copy of the int reg predicate element bit (passed through). open question: when OE is set?
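
The four ideas above can be modelled in a few lines of Python. This is only a behavioural sketch under stated assumptions, not a proposed encoding: it reads "single bit prefix vector" as one predicate bit per element, it maps element i to bit i of the integer register (LSB0 numbering), and the names `vector_cmp_to_mask`, `predicated_add`, `vector_cmp_crs` and `CRField` are hypothetical helpers invented for illustration. The OE question from idea 4 is not modelled.

```python
from typing import List, NamedTuple

MAX_ELEMENTS = 64  # one predicate bit per element of a 64-bit int reg


class CRField(NamedTuple):
    lt: bool
    gt: bool
    eq: bool
    so: bool  # idea 4: carries the predicate bit, not summary overflow


def vector_cmp_to_mask(a: List[int], b: List[int], qualifier: str = "lt") -> int:
    """Idea 1: compare element-wise and pack one result bit per element."""
    assert len(a) == len(b) <= MAX_ELEMENTS
    mask = 0
    for i, (x, y) in enumerate(zip(a, b)):
        result = {"lt": x < y, "gt": x > y, "eq": x == y}[qualifier]
        if result:
            mask |= 1 << i  # assumption: element i maps to bit i
    return mask


def predicated_add(a: List[int], b: List[int], dest: List[int], mask: int) -> List[int]:
    """Idea 3: the predicate for element i is read from bit i of the int reg."""
    out = list(dest)
    for i in range(len(a)):
        if (mask >> i) & 1:
            out[i] = a[i] + b[i]  # masked-out elements keep their old value
    return out


def vector_cmp_crs(a: List[int], b: List[int], mask: int) -> List[CRField]:
    """Ideas 2 and 4: one CR field per element, SO is the passed-through predicate bit."""
    return [
        CRField(x < y, x > y, x == y, bool((mask >> i) & 1))
        for i, (x, y) in enumerate(zip(a, b))
    ]


a = [1, 5, 3, 7]
b = [2, 2, 3, 9]
mask = vector_cmp_to_mask(a, b, "lt")          # elements 0 and 3 compare less-than
result = predicated_add(a, b, [0, 0, 0, 0], mask)
crs = vector_cmp_crs(a, b, mask)
```

With these inputs the mask is `0b1001`, only elements 0 and 3 of the destination are written, and the SO bit of each per-element CR field mirrors the corresponding mask bit.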


  • must be easily implementable in any microarchitecture including out-of-order
  • must not compromise or penalise any microarchitectural performance
  • must cover up to 64 elements