# Vector mv operations

In the SIMD VSX set, section 6.8.1 and 6.8.2 p254 of v3.0B has a series of pack and unpack operations. This page covers those and more. svp64 provides the Vector Context to also add saturation as well as predication.

See https://bugs.libre-soc.org/show_bug.cgi?id=230#c30

Note that some of these may be covered by remap which is described in propagation

# move to/from vec2/3/4

Basic idea: mv operations where either the src or dest is specifically marked as having SUBVL apply to it, but, crucially, the *other* argument does *not*. Note that this is highly unusual in SimpleV, which normally only allows SUBVL to be applied uniformly across all dest and all src.

```
mv.srcvec r3, r4.vec2
mv.destvec r2.vec4, r5
```

TODO: evaluate whether this will fit with mv.swizzle involved as well (yes it probably will)

- M=0 is mv.srcvec
- M=1 is mv.destvec

mv.srcvec (leaving out elwidths and chop):

```
for i in range(VL):
regs[rd+i] = regs[rs+i*SUBVL]
```

mv.destvec (leaving out elwidths and chop):

```
for i in range(VL):
regs[rd+i*SUBVL] = regs[rs+i]
```

Note that these mv operations only become significant when elwidth is set on the vector to a small value. SUBVL=4, src elwidth=8, dest elwidth=32 for example.

intended to cover:

```
rd = (rs >> 0 * 8) & (2^8 - 1)
rd+1 = (rs >> 1 * 8) & (2^8 - 1)
rd+2 = (rs >> 2 * 8) & (2^8 - 1)
rd+3 = (rs >> 3 * 8) & (2^8 - 1)
```

and variants involving vec3 into 32 bit (4th byte set to zero). TODO: include this pseudocode which shows how the vecN can do that. in this example RA elwidth=32 and RB elwidth=8, RB is a vec4.

```
for i in range(VL):
if predicate_bit_not_set(i) continue
uint8_t *start_point = (uint8_t*)(int_regfile[RA].i[i])
for j in range(SUBVL): # vec4
start_point[j] = some_op(int_regfile[RB].b[i*SUBVL + j])
```

## Twin Predication, saturation, swizzle, and elwidth overrides

Note that mv is a twin-predicated operation, and is swizzlable. This implies that from the vec2, vec3 or vec4, 1 to 8 bytes may be selected and re-ordered (XYZW), mixed with 0 and 1 constants, skipped by way of twin predicate pack and unpack, and a huge amount besides.

Also saturation can be applied to individual elements, including the elements within a vec2/3/4.

# mv.zip and unzip

0.5 | 6.10 | 11.15 | 16..20 | 21..25 | 26.....30 | 31 | name |
---|---|---|---|---|---|---|---|

19 | RT | RC | RB/0 | RA/0 | XO[5:9] | Rc | mv.zip |

19 | RT | RC | RS/0 | RA/0 | XO[5:9] | Rc | mv.unzip |

these are specialist operations that zip or unzip to/from multiple regs to/from one vector including vec2/3/4. when SUBVL!=1 the vec2/3/4 is the contiguous unit that is copied (as if one register). different elwidths result in zero-extension or truncation except if saturation is enabled, where signed/unsigned may be applied as usual.

mv.zip, RA=0, RB=0

```
for i in range(VL):
regs[rt+i] = regs[rc+i]
```

mv.zip, RA=0, RB!=0

```
for i in range(VL):
regs[rt+i*2 ] = regs[rb+i]
regs[rt+i*2+1] = regs[rc+i]
```

mv.zip, RA!=0, RB!=0

```
for i in range(VL):
regs[rt+i*3 ] = regs[rb+i]
regs[rt+i*3+1] = regs[rc+i]
regs[rt+i*3+2] = regs[ra+i]
```