PartitionedSignal nmigen-aware eq (assign)
For copying (assigning) PartitionedSignal to PartitionedSignal of equal size there is no issue. However if the source has a greater width than the target, partition-aware truncation must occur. For the opposite, sign/zero extension must occur. Finally for a Signal or Const, duplication across all Partitions must occur, again, following the rules of zero, sign or unsigned.
Take two PartitionedSignals (source a, dest b) of 32 bit:
partition: p p p (3 bits)
a : AAA3 AAA2 AAA1 AAA0 (32 bits)
b : BBB3 BBB2 BBB1 BBB0 (32 bits)
For all partition settings this copies verbatim. However when A is either shorter or longer, different behaviour occurs. If A is shorter than B:
partition: p p p (3 bits)
a : A7A6 A5A4 A3A2 A1A0 (8 bits)
b : BBB3 BBB2 BBB1 BBB0 (16 bits)
then it matters what the partition settings are:
partition | o3 | o2 | o1 | o0 |
---|---|---|---|---|
000 | [A7A7A7A7] | [A7A7A7A7] | A7A6A5A4 | A3A2A1A0 |
001 | [A7A7A7A7] | [A7A7]A7A6 | A5A4A3A2 | [A1A1]A1A0 |
010 | [A7A7A7A7] | A7A6A5A4 | [A3A3A3A3] | A3A2A1A0 |
011 | [A7A7A7A7] | A7A6A5A4 | [A3A3]A3A2 | [A1A1]A1A0 |
100 | [A7A7]A7A6 | [A5A5A5A5] | [A5A5]A5A4 | A3A2A1A0 |
101 | [A7A7]A7A6 | [A5A5A5A5] | A5A4A3A2 | [A1A1]A1A0 |
110 | [A7A7]A7A6 | [A5A5]A5A4 | [A3A3A3A3] | A3A2A1A0 |
111 | [A7A7]A7A6 | [A5A5]A5A4 | [A3A3]A3A2 | [A1A1]A1A0 |
where square brackets are zero if A is unsigned, and contains the specified bits if signed. Here, each partition copies the smaller value (A) into the larger partition (B) then, depending on whether A is signed or unsigned, sign-extends or zero-extends on a per-partition basis.
For A longer than B:
partition: p p p (3 bits)
a : AAAA AAAA AAAA AAAA (16 bits)
b : B7B6 B5B4 B3B2 B1B0 (8 bits)
truncation occurs at different points depending on partitions:
partition | o3 | o2 | o1 | o0 |
---|---|---|---|---|
000 | A7A6 | A5A4 | A3A2 | A1A0 |
001 | A9A8 | A7A6 | A5A4 | A1A0 |
010 | A11A10 | A9A8 | A3A2 | A1A0 |
011 | A11A10 | A9A8 | A5A4 | A1A0 |
100 | A13A12 | A5A4 | A3A2 | A1A0 |
101 | A13A12 | A7A6 | A5A4 | A1A0 |
110 | A13A12 | A9A8 | A3A2 | A1A0 |
111 | A13A12 | A9A8 | A5A4 | A1A0 |
In effect, copying starts from the beginning of a partition, ending when a closed partition point is found.
Scalar source
When the source A is scalar and is equal or larger than the destination it requires copying across multiple partitions:
partition: p p p (3 bits)
a : AAAA AAAA AAAA AAAA (16 bits)
b : B7B6 B5B4 B3B2 B1B0 (8 bits)
As the source is Scalar, it must be copied (broadcast) into each partition of the output, B, starting at the beginning of each partition. With each partition being smaller than A (except in one case) truncation is guaranteed. The exception is when all pattitions are open (1x) and the length of A and B are the same.
The partition options are:
partition | o3 | o2 | o1 | o0 |
---|---|---|---|---|
000 | A7A6 | A5A4 | A3A2 | A1A0 |
001 | A5A4 | A3A2 | A1A0 | A1A0 |
010 | A3A2 | A1A0 | A3A2 | A1A0 |
011 | A3A2 | A1A0 | A1A0 | A1A0 |
100 | A1A0 | A5A4 | A3A2 | A1A0 |
101 | A1A0 | A3A2 | A1A0 | A1A0 |
110 | A1A0 | A1A0 | A3A2 | A1A0 |
111 | A1A0 | A1A0 | A1A0 | A1A0 |
When the partitions are all open (1x) only the bits that will fit across the whole of the target are copied. In this example, B is 8 bits so only 8 bits of A are copied.
When the partitions are all closed (4x SIMD) each partition of B is 2 bits wide, therefore only the first two bits of A are copied into each of the four 2-bit partitions in B.
For the case where A is shorter than B output, sign or zero extension is required. Here we assume A is 8 bits, B is 16. This is similar to the parallel case except A is repeated (broadcast) across all of B.
partition | o3 | o2 | o1 | o0 |
---|---|---|---|---|
000 | [A7A7A7A7] | [A7A7A7A7] | A7A6A5A4 | A3A2A1A0 |
001 | [A7A7A7A7] | A7A6A5A4 | A3A2A1A0 | A3A2A1A0 |
010 | A7A6A5A4 | A3A2A1A0 | A7A6A5A4 | A3A2A1A0 |
011 | A7A6A5A4 | A3A2A1A0 | A3A2A1A0 | A3A2A1A0 |
100 | A3A2A1A0 | [A7A7A7A7] | A7A6A5A4 | A3A2A1A0 |
101 | A3A2A1A0 | A7A6A5A4 | A3A2A1A0 | A3A2A1A0 |
110 | A3A2A1A0 | A3A2A1A0 | A7A6A5A4 | A3A2A1A0 |
111 | A3A2A1A0 | A3A2A1A0 | A3A2A1A0 | A3A2A1A0 |
Note how when the entire partition set is open (1x 16-bit output) that all of A is copied out, and either zero or sign extended in the top half of the output. At the other extreme is the 4x 4-bit output partitions, which have four copies of A, truncated from the first 4 bits of A.
Unlike the parallel case, A is not itself partitioned, so is copied
over as much as is possible. In some cases such as 1x 4-bit, 1x 12-bit
(partition mask = 0b100
, above) when copying the 8-bit scalar source
into the highest part of B (o3) it is truncated to 4 bis (because
each partition of B is only 4 bits) but for copying to the 12-bit partition
(o2-o1-00) the 8-bit scalar source, A, will need sign or zero extending.