Wednesday, 2021-11-24

Veera[m]lkcl:  Is Libre-soc Talos machine POWER9 or POWER8?00:32
jnTalos II is POWER9; the Talos I was POWER8 but wasn't sold much00:33
programmerjakelibre-soc's talos server is power9 iirc00:38
sadoon_albader[mBtw, I couldn't get powerpc64-gdb to build on the talos for some reason, perhaps it assumes it is cross-debugging, I bet the gdb package in the repos should be enough? I might need symlinks perhaps00:50
Veera[m]sadoon_albader:  plain gdb in power system is enough03:08
sadoon_albader[mAwesome, just as I was expecting03:12
Veera[m]Need help with  Subtract From Immediate Carrying; subfic RT,RA,SI: RT = ¬ (RA) + EXTS(SI) + 107:03
Veera[m]Does it use the CA bit for adding, or does it just alter the CA bit after computing?07:03
programmerjakeit does not read CA, it just alters CA and CA32 after compute. see subfe for an instruction that *does* read CA, for comparison.07:16
programmerjakeVeera ^07:16
Veera[m]if i have to find out what CA it will set, how can that be done?07:18
Veera[m]I mean what CA value? In python script07:18
Veera[m]subfic 3, 1, imm07:20
Veera[m]carry = if imm < GPR[1] then CA = 107:21
programmerjakedo the addition in python, the carry out will be the first bit above the MSB, so counting from bit 0 at the lsb, for 64-bit the value will be in bit 64 cuz the msb is bit 63, for 32-bit the carry will be in bit 3207:21
programmerjakethat should apply for both signed and unsigned addition07:22
programmerjakeso, for example, 0x78+0x88==0x110 so the 8-bit sum is 0x10 and the 8-bit carry out is 1 cuz bit 0x100 is set07:23
Veera[m]"32-bit the carry will be in bit 32" sometimes this may be set 0 even if there is CA32=1 in 64bit mode07:24
programmerjakehmm, any examples?07:24
programmerjake0x78+0x88==0x100 oops, mis-added07:25
Veera[m]I am trying to do this for ALU test cases and subfic ¬ (RA) + EXTS(SI) + 1: is giving random results for CA bit07:27
Veera[m]Can you provide me a link for the file where subfic is implemented07:28
programmerjakeoh, wait, for N-bit carry out, the inputs need to be masked to N-bits unsigned, if not you'll get the wrong answer07:29
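The carry-out rule described above (mask both inputs to N bits unsigned, add, then read the first bit above the MSB) can be sketched in Python as follows. This is a minimal illustration, not code from any of the linked repositories; the name add_with_carry and its parameters are made up for the example.

```python
def add_with_carry(a, b, carry_in=0, width=64):
    """Return (sum, carry_out) of an unsigned width-bit addition."""
    mask = (1 << width) - 1
    # Python ints are unbounded and may be negative, so mask the inputs to
    # width bits unsigned first; otherwise the carry-out comes out wrong.
    total = (a & mask) + (b & mask) + carry_in
    # The carry-out is the first bit above the MSB: bit `width`.
    return total & mask, (total >> width) & 1


# 8-bit example from the discussion: 0x78 + 0x88 == 0x100
print(add_with_carry(0x78, 0x88, width=8))  # (0, 1)
```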
programmerjakesubfic in power-instruction-analyzer:
Veera[m]"need to be masked to N-bits unsigned" : yes07:31
programmerjakesubfic in soc.git (converted to a generic add):;a=blob;f=src/soc/fu/alu/;h=f4ad49183c1ffbd686644238a676d7dd807c64b6;hb=d40d5ded858bf09b7b46838d47410c9dc957167f#l14307:32
programmerjakeCA32 computation in openpower-isa.git:;a=blob;f=src/openpower/decoder/isa/;hb=e5d2a21bd25720f9267c7c8045df83163bc63a20#l85107:37
programmerjakehopefully you can figure it out from those, imho the power-instruction-analyzer one is probably the clearest07:41
programmerjaketoshywoshy: openpowerbot disconnected from oftc about 4hr ago07:43
Veera[m]I will try to understand the code. Isn't carry different in add versus subtract ops?08:13
programmerjakeno, carry in/out isn't all that different between add and subtract, subtract is just where one input is inverted and either CA or 1 is added, add adds either CA or 0.08:27
programmerjakeboth of them have carry out from the unsigned addition of the two inputs and the carry in (CA or 0/1) after the one input is optionally inverted08:27
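As a rough sketch of the point above, all four carrying variants can share one unsigned adder; only the optional inversion of one input and the choice of carry-in differ. The function names below are hypothetical helpers modelled on the mnemonics, not the simulator's API.

```python
MASK64 = (1 << 64) - 1


def adder(a, b, carry_in):
    """Shared unsigned 64-bit adder: returns (result, carry_out)."""
    total = (a & MASK64) + (b & MASK64) + carry_in
    return total & MASK64, (total >> 64) & 1


def addc(ra, rb):
    return adder(ra, rb, 0)      # RA + RB, carry-in fixed at 0


def adde(ra, rb, ca):
    return adder(ra, rb, ca)     # RA + RB + CA


def subfc(ra, rb):
    return adder(~ra, rb, 1)     # ~RA + RB + 1, carry-in fixed at 1


def subfe(ra, rb, ca):
    return adder(~ra, rb, ca)    # ~RA + RB + CA, this one *reads* CA
```

For example, subfc(5, 7) gives (2, 1): the result is 7 - 5 and the carry-out is 1 because no borrow was needed.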
Veera[m]        .checked_add(immediate as u32)10:38
Veera[m].and_then(|v| v.checked_add(1))10:39
Veera[m]what is .checked_add and |v| v.checked_add10:39
lkclVeera[m], basically, all add/subtract operations - and i do mean all - in the entirety of Power ISA use the exact same one internal piece of hardware10:50
lkcldo you know how to turn a number negative in binary?10:50
lkclyou invert all its bits then add one.10:51
lkclso that is how subtract is done.10:51
lkclsub(RA, RB) ==> ADD(  (~RA+1) + RB)10:51
lkcl*not* by doing an actual hardware-level subtract!10:52
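The "invert all bits then add one" rule is easy to check in Python. A small sketch, assuming 64-bit registers; negate64 and sub_via_add are illustrative names only.

```python
MASK64 = (1 << 64) - 1


def negate64(x):
    """Two's-complement negation: invert all 64 bits, then add one."""
    return (((x & MASK64) ^ MASK64) + 1) & MASK64


def sub_via_add(ra, rb):
    """Compute rb - ra modulo 2**64 using only inversion and addition."""
    return (negate64(ra) + (rb & MASK64)) & MASK64


print(hex(negate64(1)))     # 0xffffffffffffffff, i.e. -1 as a 64-bit pattern
print(sub_via_add(3, 10))   # 7
```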
lkclthen, to do carry-in and carry-out, the actual hardware-level adder is made not 64-bit, but *66* bit.10:53
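One way a 66-bit adder can provide both carry-in and carry-out is to put the carry-in in an extra bit below the LSB of both operands and leave a spare bit above the MSB for the carry-out. The Python model below is an assumption made for illustration; the log itself only says the adder is 66 bits wide, and add66 is a made-up name.

```python
MASK64 = (1 << 64) - 1


def add66(a, b, carry_in):
    """Model a 66-bit add: 1 carry-in bit + 64 data bits + 1 carry-out bit."""
    # Bit 0 of *both* operands holds the carry-in, so 1 + 1 in bit 0
    # produces a carry into bit 1, which is the real LSB of the data.
    x = ((a & MASK64) << 1) | carry_in
    y = ((b & MASK64) << 1) | carry_in
    s = x + y                      # fits in 66 bits
    result = (s >> 1) & MASK64     # bits 1..64 are the 64-bit result
    carry_out = (s >> 65) & 1      # bit 65 is the carry-out
    return result, carry_out
```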
lkclso, let's have a look here:11:09
lkclsubfic RT,RA,SI11:10
lkclis implemented as:11:10
lkcl     RT <- ¬(RA) + EXTS(SI) + 111:10
cesar" nosvp64 general" is hanging for me.11:10
lkclcesar, will take a look11:10
cesarStarted bisecting, but ran out of time.11:10
lkcli haven't run it in a while, but it doesn't surprise me11:11
lkclit's one that contains a loop11:11
lkcland i modified how the "end of program" is detected11:11
cesarGood commits for me are: 376ab6167e524f639587d054908f7cc18f9c427b in soc11:11
cesar... and d5f50879146ebd1de94d25137d732acbbb31868f in openpower-isa.11:12
lkclit is almost certainly a loop where the bc instruction is at the end11:15
cesar433556d1a3298d9d57820ae1087746d4170f9d0c in soc seems to introduce a regression, in combination with d5f50879146ebd1de94d25137d732acbbb31868f in openpower-isa.11:15
lkclthat's odd.  not what i expected.11:16
cesarAnd, with 376ab6167e524f639587d054908f7cc18f9c427b in soc, d5f50879146ebd1de94d25137d732acbbb31868f in openpower-isa works, but master in openpower-isa breaks.11:19
cesar(so a bisect in openpower-isa is needed as well)11:20
lkclrdmask on an addi instruction is all 1s. (0xf).  that should not be happening.11:27
lkclerr... actually... it's set *after* the instruction has completed!!11:30
lkclit's something unique to addi.  add is fine11:37
lkclohh hang on.  addi 9,9,-1 is a special type of hazard11:40
lkcladdi 9,0, 0x1011:41
lkclfollowed by11:41
lkcladdi 9,9, -111:41
lkclis a special type of hazard i'm currently debugging11:41
lkclallow_overlap=False should not be looking for it, at all11:42
lkcl1 sec i think i know how to stop that11:42
lkclerr... err.... ohhh.... addi 9, 0 is an (RA|0) instruction11:46
lkclthere *are* no read-hazards for that one because there's no operands read11:47
lkclahh got it.  the problem is the fact that the 2nd instruction - addi 9,9,-1 - is reading and writing to the same register.11:50
lkclthis is creating a hazard on itself11:50
lkclokaaay i think i have a workaround: disable hazard vectors entirely when doing the simple FSM11:54
lkclwhich was supposed to be... ok good, fixed11:54
lkclcesar, git pull11:55
lkcli'll run a complete (everything) and get some breakfast :) back in 20 mins with the results11:56
lkclokaaay deep joy, there's a couple of ld/st instructions that now barf.12:18
lkcli'll have a look at those12:18
lkclLD-st-with-update.  the update is going into the wrong register.  it's going into RT (3) rather than RA (4)12:27
lkclyep, i know why12:30
lkcli accidentally merged the RT and RA-as-update write info12:31
lkclcesar, ok all good again12:32
Veera[m]case_rand_imm: "subfic" 3, 1, {imm}": carry_out = result & (1<<64) is not giving correct values12:39
Veera[m]result = ~initial_regs[6] + imm + 112:39
Veera[m]programmerjake: need help12:42
lkclVeera[m], result = ~initial_regs[6] + imm + 112:52
lkclfollowed by12:52
lkclresult = result & (0xffffffffffffffff)12:52
lkclresult &= ((1<<64)-1)12:53
lkclbut the immediate also has to be sign-extended12:53
lkcl<lkcl> is implemented as:12:53
lkcl<lkcl>      RT <- ¬(RA) + EXTS(SI) + 112:53
lkclit's currently 5am in the United States so you will not get a reply from jacob for another 5-7 hours12:54
Veera[m]yeah totally forgot about EXTS12:54
Veera[m]"another 5-7 hours" oh12:55
Veera[m]EXTS(SI) sign extend by how much12:57
lkclthere is a function for it12:58
lkclbut, look again at the pseudocode12:58
lkclpage 68, v3.0C specification12:58
lkclRT --> 6..1012:59
lkclRA --> 11..1512:59
lkclSI --> 16..3112:59
lkcltherefore, SI is (31-16+1) bits long == 1612:59
lkclyou can use nmutil.extend13:00
lkclah no, it uses nmigen, sorry13:00
lkclit'll be something like:13:00
lkcl      if (imm & (1<<15)): imm |= 0xffff_ffff_ffff_000013:01
lkcltest *bit 15* of a 16-bit number to work out whether to sign-extend it13:02
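The sign-extension test just described, written out as a small Python helper. The name exts16 is chosen here for the example and is not the function lkcl refers to.

```python
def exts16(imm):
    """Sign-extend a 16-bit immediate to a 64-bit unsigned bit pattern."""
    imm &= 0xffff                       # keep only the 16-bit SI field
    if imm & (1 << 15):                 # bit 15 is the sign bit
        imm |= 0xffff_ffff_ffff_0000    # fill bits 16..63 with ones
    return imm


print(hex(exts16(0x7fff)))  # 0x7fff
print(hex(exts16(0x8000)))  # 0xffffffffffff8000
```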
Veera[m]do we have to sign extend SI to 64 bits?13:02
lkclof course13:26
lkclotherwise the 64-bit result will be corrupted.13:27
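Putting the pieces together, a sketch of how a test script might compute the expected RT, CA and CA32 for subfic from the pseudocode RT <- ¬(RA) + EXTS(SI) + 1. This is an illustration under the assumptions discussed above (inputs masked to unsigned, SI sign-extended from 16 bits, CA32 taken as the carry out of the low 32 bits); it is not the code from the test suite, and the helper names are invented.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def exts16(si):
    """Sign-extend a 16-bit immediate to a 64-bit unsigned bit pattern."""
    si &= 0xffff
    return si | 0xffff_ffff_ffff_0000 if si & (1 << 15) else si


def subfic(ra, si):
    """Return (RT, CA, CA32) for RT <- ~(RA) + EXTS(SI) + 1."""
    not_ra = ~ra & MASK64               # invert, then mask to 64 bits
    imm = exts16(si)
    total = not_ra + imm + 1            # unbounded Python int
    total32 = (not_ra & MASK32) + (imm & MASK32) + 1
    rt = total & MASK64
    ca = (total >> 64) & 1              # carry out of the 64-bit add
    ca32 = (total32 >> 32) & 1          # carry out of the low 32 bits
    return rt, ca, ca32
```

With RA = 1 << 32 and SI = 0 this returns CA = 0 but CA32 = 1, which shows the two carries really can differ in 64-bit mode.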
lkclVeera: this is shifting a 1-bit value down by 64-bits, and another 32-bit value down by 32-bits16:09
lkcl+   = (carry_out>>64) | (carry_out32>>31)16:09
lkclwhich is always guaranteed to be zero16:09
lkcl1>>64 is always zero16:09
lkcl0b100000000000000000000000000000000000 >> 64 (0b1 followed by 64 zeros) is going to be 116:09
lkclwhat's amusing is that this probably only works because adde is not supposed to set :)16:10
lkclif it was addeo. (the overflow version) it would be a different matter16:11
lkcl            carry_out = result & (1<<64) # detect 65th bit as carry-out?16:11
lkcl            carry_out32 = ((initial_regs[6] & 0xffff_ffff) + (initial_regs[7] & 0xffff_ffff)) & (1<<32)16:11
lkclahh ok16:11
lkclyou changed the code so it does actually test bit 6416:12
lkclby ANDing with (1<<64)16:12
lkcldo keep to under 80 chars btw16:12
lkcl            carry_out32 = ((initial_regs[6] & 0xffff_ffff) + (initial_regs[7] & 0xffff_ffff)) & (1<<32)16:12
lkclis around 13016:13
lkcli put carry_out back to the original code:16:14
lkclcarry_out = result & (1<<64) != 016:14
lkcli leave it to you to sort out / tidy up carry_out3216:14
lkclshifting down by 31 rather than 32, because the result is then carry_out | (carry_out32<<1), is not obvious at all16:15
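A tidier way to express what is being discussed here is to reduce each carry to a plain 0/1 value first and only then pack them, so that no shift distance needs explaining. The sketch below assumes the packed layout carry_out | (carry_out32 << 1) mentioned above; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def carry_flags(a, b, carry_in=0):
    """Return (carry_out, carry_out32), each as 0 or 1."""
    total = (a & MASK64) + (b & MASK64) + carry_in
    total32 = (a & MASK32) + (b & MASK32) + carry_in
    # Shift the *sum* down, not an already-reduced flag: (1 >> 64) is 0.
    return (total >> 64) & 1, (total32 >> 32) & 1


def pack_ca(carry_out, carry_out32):
    """Pack CA into bit 0 and CA32 into bit 1."""
    return carry_out | (carry_out32 << 1)
```

Keeping each expression on its own line also keeps the code under 80 characters.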
lkclcesar, hooray! write-after-write hazard detection works!16:18
lkclfrickin ell it's complicated16:24
lkclhmmm ok it works because it does too much :)16:27
lkclas in, the write-hazard is detected to be with the instruction itself, which then prevents *all* instructions from being issued until the current instruction is over16:27
programmerjakeVeera i'm assuming lkcl helped you figure it out16:30
lkclokay nooow we have working write-after-write hazard detection17:49
lkclit's still a little overactive.  this is marginally better than not kicking in at all though18:07
Veera[m]programmerjake: In subfic op what does .checked_add(immediate)21:55
Veera[m]programmerjake: .and_then(|v| v.checked_add(1))21:56
Veera[m]programmerjake: .is_none();21:56
programmerjakechecked_add adds two numbers of type T, returning an Option<T>, it returns Some(N) if the addition doesn't overflow (in this case >= 2^64 cuz T=u64), and None if it overflows21:59
programmerjakea.and_then(|v| b) evaluates b with v set to the N if a is Some(N), otherwise it returns None22:01
programmerjakeis_none just returns true if the input is None22:02
programmerjakeso, all together, `a.checked_add(b).and_then(|v| v.checked_add(c)).is_none()` returns true if `a + b + c` overflows.22:04
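For readers following along in Python rather than Rust, the same chain can be imitated with None standing in for Rust's None and a plain int standing in for Some(N). This is only an analogy for illustration, assuming T = u64 as stated above; the helper names mirror the Rust methods but are ordinary functions defined here.

```python
from typing import Callable, Optional

U64_MAX = (1 << 64) - 1


def checked_add(a: int, b: int) -> Optional[int]:
    """Return a + b, or None if the sum does not fit in 64 bits."""
    total = a + b
    return total if total <= U64_MAX else None


def and_then(value: Optional[int],
             func: Callable[[int], Optional[int]]) -> Optional[int]:
    """Apply func to value unless value is already None."""
    return None if value is None else func(value)


def overflows(a: int, b: int, c: int) -> bool:
    """True if a + b + c overflows 64 bits, i.e. there is a carry-out."""
    return and_then(checked_add(a, b), lambda v: checked_add(v, c)) is None
```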
Veera[m]programmerjake: thanks I made a working code for subfic23:18
Veera[m]lkcl: thanks I made a working code for subfic and also checked for adde(it is working)23:19
lkclwell done :)23:24
