# PartitionedSignal nmigen-aware eq (assign)

Copying (assigning) a PartitionedSignal to a PartitionedSignal
of equal width is straightforward. However, if the source is
wider than the target, *partition-aware* truncation
must occur. In the opposite case, sign or zero extension must occur.
Finally, a Signal or Const source must be duplicated across all
partitions, again following the sign- or zero-extension rules
(signed or unsigned source).

Take two PartitionedSignals (source a, dest b) of 32 bits each:

```
partition: p p p (3 bits)
a : AAA3 AAA2 AAA1 AAA0 (32 bits)
b : BBB3 BBB2 BBB1 BBB0 (32 bits)
```

For all partition settings this copies verbatim. However, when A is shorter or longer than B, the behaviour differs. If A is shorter than B:

```
partition: p p p (3 bits)
a : A7A6 A5A4 A3A2 A1A0 (8 bits)
b : BBB3 BBB2 BBB1 BBB0 (16 bits)
```

then it matters what the partition settings are:

partition | o3 | o2 | o1 | o0 |
---|---|---|---|---|
000 | [A7A7A7A7] | [A7A7A7A7] | A7A6A5A4 | A3A2A1A0 |
001 | [A7A7A7A7] | [A7A7]A7A6 | A5A4A3A2 | [A1A1]A1A0 |
010 | [A7A7A7A7] | A7A6A5A4 | [A3A3A3A3] | A3A2A1A0 |
011 | [A7A7A7A7] | A7A6A5A4 | [A3A3]A3A2 | [A1A1]A1A0 |
100 | [A7A7]A7A6 | [A5A5A5A5] | [A5A5]A5A4 | A3A2A1A0 |
101 | [A7A7]A7A6 | [A5A5A5A5] | A5A4A3A2 | [A1A1]A1A0 |
110 | [A7A7]A7A6 | [A5A5]A5A4 | [A3A3A3A3] | A3A2A1A0 |
111 | [A7A7]A7A6 | [A5A5]A5A4 | [A3A3]A3A2 | [A1A1]A1A0 |

where the bits in square brackets are zero if A is unsigned, and
contain the specified bits if A is signed. Here, each partition
copies the smaller value (A) into the larger partition (B) then,
depending on whether A is signed or unsigned, sign-extends or
zero-extends *on a per-partition basis*.
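The table above can be reproduced with a small behavioural model. This is only a sketch in plain Python, not the PartitionedSignal implementation: the names `lanes`, `partitioned_extend` and `show` are hypothetical, and it assumes (as the table suggests) that bit 0 of the partition mask is the partition point between o0 and o1.

```python
def lanes(mask, nparts=4):
    """Group the nparts elements into lanes.

    Assumption: bit i of mask set means the partition point between
    element i and element i+1 is closed (the lanes split there).
    """
    out, start = [], 0
    for i in range(nparts - 1):
        if (mask >> i) & 1:
            out.append((start, i + 1))
            start = i + 1
    out.append((start, nparts))
    return out


def partitioned_extend(mask, a_width=8, b_width=16, signed=True, nparts=4):
    """Return the bits of B as labels, LSB first (hypothetical helper)."""
    a_el = a_width // nparts  # bits per element of A
    b_el = b_width // nparts  # bits per element of B
    out = []
    for lo, hi in lanes(mask, nparts):
        src = ["A%d" % i for i in range(lo * a_el, hi * a_el)]
        # sign extension repeats the lane's top bit, zero extension pads "0"
        fill = src[-1] if signed else "0"
        out += src + [fill] * ((hi - lo) * (b_el - a_el))
    return out


def show(bits, b_el=4):
    """Format as the table columns o3..o0, each written MSB first."""
    groups = [bits[i:i + b_el] for i in range(0, len(bits), b_el)]
    return ["".join(reversed(g)) for g in reversed(groups)]


print(show(partitioned_extend(0b001)))
# ['A7A7A7A7', 'A7A7A7A6', 'A5A4A3A2', 'A1A1A1A0']
```

With `signed=False` the bracketed positions become `0`, for example `00A1A0` for o0 under mask `0b111`.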

For A longer than B:

```
partition: p p p (3 bits)
a : AAAA AAAA AAAA AAAA (16 bits)
b : B7B6 B5B4 B3B2 B1B0 (8 bits)
```

truncation occurs at different points depending on partitions:

partition | o3 | o2 | o1 | o0 |
---|---|---|---|---|
000 | A7A6 | A5A4 | A3A2 | A1A0 |
001 | A9A8 | A7A6 | A5A4 | A1A0 |
010 | A11A10 | A9A8 | A3A2 | A1A0 |
011 | A11A10 | A9A8 | A5A4 | A1A0 |
100 | A13A12 | A5A4 | A3A2 | A1A0 |
101 | A13A12 | A7A6 | A5A4 | A1A0 |
110 | A13A12 | A9A8 | A3A2 | A1A0 |
111 | A13A12 | A9A8 | A5A4 | A1A0 |

In effect, copying starts from the beginning of a partition, ending when a closed partition point is found.
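The truncation rule can be modelled the same way. The following is a hedged plain-Python sketch, not the actual implementation: `lanes`, `partitioned_truncate` and `show` are hypothetical names, and bit 0 of the mask is assumed to be the partition point between o0 and o1.

```python
def lanes(mask, nparts=4):
    """Assumption: bit i of mask set closes the point between element i and i+1."""
    out, start = [], 0
    for i in range(nparts - 1):
        if (mask >> i) & 1:
            out.append((start, i + 1))
            start = i + 1
    out.append((start, nparts))
    return out


def partitioned_truncate(mask, a_width=16, b_width=8, nparts=4):
    """Return the bits of B as labels, LSB first (hypothetical helper)."""
    a_el = a_width // nparts  # bits per element of A
    b_el = b_width // nparts  # bits per element of B
    out = []
    for lo, hi in lanes(mask, nparts):
        start = lo * a_el         # copying starts at the lane's first bit in A
        keep = (hi - lo) * b_el   # and stops when the lane of B is full
        out += ["A%d" % i for i in range(start, start + keep)]
    return out


def show(bits, b_el=2):
    """Format as the table columns o3..o0, each written MSB first."""
    groups = [bits[i:i + b_el] for i in range(0, len(bits), b_el)]
    return ["".join(reversed(g)) for g in reversed(groups)]


print(show(partitioned_truncate(0b001)))
# ['A9A8', 'A7A6', 'A5A4', 'A1A0']
```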

# Scalar source

When the source A is a scalar of equal or greater width than the destination, it must be copied across multiple partitions:

```
partition: p p p (3 bits)
a : AAAA AAAA AAAA AAAA (16 bits)
b : B7B6 B5B4 B3B2 B1B0 (8 bits)
```

As the source is scalar, it must be copied (broadcast) into each partition of the output, B, starting at the beginning of each partition. With each partition being smaller than A (except in one case), truncation is guaranteed. The exception is when all partitions are open (1x) and A and B are the same length.

The partition options are:

partition | o3 | o2 | o1 | o0 |
---|---|---|---|---|
000 | A7A6 | A5A4 | A3A2 | A1A0 |
001 | A5A4 | A3A2 | A1A0 | A1A0 |
010 | A3A2 | A1A0 | A3A2 | A1A0 |
011 | A3A2 | A1A0 | A1A0 | A1A0 |
100 | A1A0 | A5A4 | A3A2 | A1A0 |
101 | A1A0 | A3A2 | A1A0 | A1A0 |
110 | A1A0 | A1A0 | A3A2 | A1A0 |
111 | A1A0 | A1A0 | A1A0 | A1A0 |

When the partitions are all open (1x) only the bits that will fit across the whole of the target are copied. In this example, B is 8 bits so only 8 bits of A are copied.

When the partitions are all closed (4x SIMD) each partition of B is
2 bits wide, therefore only the *first two* (lowest) bits of A are
copied into *each* of the four 2-bit partitions in B.
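As a rough illustration of the broadcast-and-truncate behaviour, here is a plain-Python sketch. It is not the real implementation: `lanes`, `scalar_broadcast_truncate` and `show` are hypothetical names, bit 0 of the mask is assumed to be the partition point between o0 and o1, and A is assumed to be at least as wide as the widest lane of B.

```python
def lanes(mask, nparts=4):
    """Assumption: bit i of mask set closes the point between element i and i+1."""
    out, start = [], 0
    for i in range(nparts - 1):
        if (mask >> i) & 1:
            out.append((start, i + 1))
            start = i + 1
    out.append((start, nparts))
    return out


def scalar_broadcast_truncate(mask, b_width=8, nparts=4):
    """Return the bits of B as labels, LSB first (hypothetical helper)."""
    b_el = b_width // nparts  # bits per element of B
    out = []
    for lo, hi in lanes(mask, nparts):
        # every lane restarts from bit 0 of the scalar A and takes
        # as many bits as the lane is wide
        out += ["A%d" % i for i in range((hi - lo) * b_el)]
    return out


def show(bits, b_el=2):
    """Format as the table columns o3..o0, each written MSB first."""
    groups = [bits[i:i + b_el] for i in range(0, len(bits), b_el)]
    return ["".join(reversed(g)) for g in reversed(groups)]


print(show(scalar_broadcast_truncate(0b010)))
# ['A3A2', 'A1A0', 'A3A2', 'A1A0']
```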

For the case where A is shorter than B output, sign or zero extension is required. Here we assume A is 8 bits, B is 16. This is similar to the parallel case except A is repeated (broadcast) across all of B.

partition | o3 | o2 | o1 | o0 |
---|---|---|---|---|
000 | [A7A7A7A7] | [A7A7A7A7] | A7A6A5A4 | A3A2A1A0 |
001 | [A7A7A7A7] | A7A6A5A4 | A3A2A1A0 | A3A2A1A0 |
010 | A7A6A5A4 | A3A2A1A0 | A7A6A5A4 | A3A2A1A0 |
011 | A7A6A5A4 | A3A2A1A0 | A3A2A1A0 | A3A2A1A0 |
100 | A3A2A1A0 | [A7A7A7A7] | A7A6A5A4 | A3A2A1A0 |
101 | A3A2A1A0 | A7A6A5A4 | A3A2A1A0 | A3A2A1A0 |
110 | A3A2A1A0 | A3A2A1A0 | A7A6A5A4 | A3A2A1A0 |
111 | A3A2A1A0 | A3A2A1A0 | A3A2A1A0 | A3A2A1A0 |

Note that when the entire partition set is open (1x 16-bit output), all of A is copied out and either zero or sign extended into the top half of the output. At the other extreme, each of the 4x 4-bit output partitions receives its own copy of A, truncated to the first 4 bits of A.

Unlike the parallel case, A is not itself partitioned, so is copied
over as much as is possible. In some cases such as `1x 4-bit, 1x 12-bit`
(partition mask = `0b100`, above) when copying the 8-bit scalar source
into the highest part of B (o3) it is truncated to 4 bits (because
each partition of B is only 4 bits) but for copying to the 12-bit partition
(o2-o1-o0) the 8-bit scalar source, A, will need sign or zero extending.
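The mixed truncate-and-extend behaviour of a scalar source can be sketched with integers. This is an illustrative model only, in plain Python rather than nmigen: `lanes` and `scalar_assign` are hypothetical names, and bit 0 of the mask is assumed to be the partition point between o0 and o1.

```python
def lanes(mask, nparts=4):
    """Assumption: bit i of mask set closes the point between element i and i+1."""
    out, start = [], 0
    for i in range(nparts - 1):
        if (mask >> i) & 1:
            out.append((start, i + 1))
            start = i + 1
    out.append((start, nparts))
    return out


def scalar_assign(mask, a, a_width=8, b_width=16, signed=False, nparts=4):
    """Broadcast scalar a into every lane of B, returning B as an integer."""
    b_el = b_width // nparts  # bits per element of B
    a_mask = (1 << a_width) - 1
    result = 0
    for lo, hi in lanes(mask, nparts):
        lane_width = (hi - lo) * b_el
        lane_mask = (1 << lane_width) - 1
        value = a & a_mask
        # lane wider than A: sign-extend if A is signed and negative,
        # otherwise the upper bits stay zero (zero extension)
        if signed and lane_width > a_width and (value >> (a_width - 1)) & 1:
            value |= lane_mask & ~a_mask
        # lane narrower than A: keep only the low lane_width bits
        value &= lane_mask
        result |= value << (lo * b_el)
    return result


# mask 0b100: o3 gets the low 4 bits of A, o2-o1-o0 get A extended to 12 bits
print(hex(scalar_assign(0b100, 0xA5, signed=True)))   # 0x5fa5
print(hex(scalar_assign(0b100, 0xA5, signed=False)))  # 0x50a5
```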