Shape is 32-bits. When SHAPE is set entirely to zeros, remapping is disabled: the register's elements are a linear (1D) vector.

31..30 29..28 27..24 23..21 20..18 17..12 11..6 5..0
0b00 skip offset invxyz permute zdimsz ydimsz xdimsz
0b01 submode offset invxyz submode2 rsvd rsvd xdimsz

mode sets different behaviours (straight matrix multiply, FFT, DCT).

  • mode=0b00 sets straight Matrix Mode
  • mode=0b01 sets "FFT/DCT" mode and activates submodes

When submode2 is 0, for FFT submode the following schedules may be selected:

  • submode=0b00 selects the j offset of the innermost for-loop of Tukey-Cooley
  • submode=0b10 selects the j+halfsize offset of the innermost for-loop of Tukey-Cooley
  • submode=0b11 selects the k of exptable (which coefficient)

When submode2 is 1 or 2, for DCT inner butterfly submode the following schedules may be selected. When submode2 is 1, additional bit-reversing is also performed.

  • submode=0b00 selects the j offset of the innermost for-loop, in-place
  • submode=0b010 selects the j+halfsize offset of the innermost for-loop, in reverse-order, in-place
  • submode=0b10 selects the ci count of the innermost for-loop, useful for calculating the cosine coefficient
  • submode=0b11 selects the size offset of the outermost for-loop, useful for the cosine coefficient cos(ci + 0.5) * pi / size

When submode2 is 3 or 4, for DCT outer butterfly submode the following schedules may be selected. When submode is 3, additional bit-reversing is also performed.

  • submode=0b00 selects the j offset of the innermost for-loop,
  • submode=0b01 selects the j+1 offset of the innermost for-loop,

in Matrix Mode, skip allows dimensions to be skipped from being included in the resultant output index. this allows sequences to be repeated: 0 0 0 1 1 1 2 2 2 ... or in the case of skip=0b11 this results in modulo 0 1 2 0 1 2 ...

  • skip=0b00 indicates no dimensions to be skipped
  • skip=0b01 sets "skip 1st dimension"
  • skip=0b10 sets "skip 2nd dimension"
  • skip=0b11 sets "skip 3rd dimension"

invxyz will invert the start index of each of x, y or z. If invxyz[0] is zero then x-dimensional counting begins from 0 and increments, otherwise it begins from xdimsz-1 and iterates down to zero. Likewise for y and z.

offset will have the effect of offsetting the result by offset elements:

for i in 0..VL-1:
    GPR(RT + remap(i) + SVSHAPE.offset) = ....

this appears redundant because the register RT could simply be changed by a compiler, until element width overrides are introduced. also bear in mind that unlike a static compiler SVSHAPE.offset may be set dynamically at runtime.

xdimsz, ydimsz and zdimsz are offset by 1, such that a value of 0 indicates that the array dimensionality for that dimension is 1. any dimension not intended to be used must have its value set to 0 (dimensionality of 1). A value of xdimsz=2 would indicate that in the first dimension there are 3 elements in the array. For example, to create a 2D array X,Y of dimensionality X=3 and Y=2, set xdimsz=2, ydimsz=1 and zdimsz=0

The format of the array is therefore as follows:

array[xdimsz+1][ydimsz+1][zdimsz+1]

However whilst illustrative of the dimensionality, that does not take the "permute" setting into account. "permute" may be any one of six values (0-5, with values of 6 and 7 being reserved, and not legal). The table below shows how the permutation dimensionality order works:

permute order array format
000 0,1,2 (xdim+1)(ydim+1)(zdim+1)
001 0,2,1 (xdim+1)(zdim+1)(ydim+1)
010 1,0,2 (ydim+1)(xdim+1)(zdim+1)
011 1,2,0 (ydim+1)(zdim+1)(xdim+1)
100 2,0,1 (zdim+1)(xdim+1)(ydim+1)
101 2,1,0 (zdim+1)(ydim+1)(xdim+1)

In other words, the "permute" option changes the order in which nested for-loops over the array would be done. See executable python reference code for further details.