Partitioned Add
this principle also applies to subtract and negate (-)
the basic principle is: the partition bits, when inverted, can actually be inserted into an (expanded) add, and, if the bit is set, it has the side-effect of "rolling through" the carry bit of the MSB from the previous partition.
this is a really neat trick, basically, that allows the use of a straight "add" (DSP in an FPGA, add in a simulator) where otherwise it would be extraordinarily complex, CPU-intensive and take up large resources.
partition: P P P (3 bits)
a : .... .... .... .... (32 bits)
b : .... .... .... .... (32 bits)
exp-a : ....P....P....P.... (32+3 bits, P=0 if no partition)
exp-b : ....0....0....0.... (32 bits plus 3 zeros)
exp-o : ....xN...xN...xN... (32+3 bits - x to be discarded)
o : .... N... N... N... (32 bits - x ignored, N is carry-over)
new version:
partition: p p p (3 bits)
carry-in : c c c c (4 bits)
C = c & P: C C C c (4 bits)
I = P=>c : I I I c (4 bits)
a : AAAA AAAA AAAA AAAA (32 bits)
b : BBBB BBBB BBBB BBBB (32 bits)
exp-a : 0AAAACAAAACAAAACAAAAc (32+3+2 bits, P=0 if no partition)
exp-b : 0BBBBIBBBBIBBBBIBBBBc (32+2 bits plus 3 zeros)
exp-o : o....oN...oN...oN...x (32+3+2 bits - x to be discarded)
o : .... N... N... N... (32 bits - x ignored, N is carry-over)
carry-out: o o o o (4 bits)
the new version
- brings in the carry-in (C) bits which, in combination with the Partition bits, are ANDed to create "C & p".
- C is positioned twice (in both A and B) intermediates, which has the effect of preserving carry-out, yet only performing a carry-over if the carry-in bit (c) is set and this is part of a partition
- o (carry-out) must be "cascaded" down to the relevant partition start-point. this can be done with a Mux-cascade.
carry-out-cascade example:
partition: 1 0 0 1 (4 bits)
actual : <--->|<------------>|<---> actual numbers
carryotmp: o4 o3 o2 o1 o0 (5 bits)
cascade : | | x x | o2 and o1 ignored
carry-out: o4 \-> --> o3 o0 (5 bits)
because the partitions subdivide the 5-wide input into 8-24-8, o4 is already in "both" the MSB-and-LSB position for the top 8-bit result; o3 is the carry-out for the 24-bit result and must be cascaded down to the beginning of the 24-bit partitioned result (the LSB), and o0, like o4, is already in position because the partition is only 1 wide.