Wednesday, 2021-11-24

Veera[m]	lkcl: Is Libre-soc Talos machine POWER9 or POWER8?	00:32
jn	Talos II is POWER9; the Talos I was POWER8 but wasn't sold much	00:33
programmerjake	libre-soc's talos server is power9 iirc	00:38
sadoon_albader[m	Btw, I couldn't get powerpc64-gdb to build on the talos for some reason, perhaps it assumes it is cross-debugging, I bet the gdb package in the repos should be enough? I might need symlinks perhaps	00:50
Veera[m]	sadoon_albader: plain gdb in power system is enough	03:08
sadoon_albader[m	Awesome, just as I was expecting	03:12
Veera[m]	Need help with Subtract From Immediate Carrying; subfic RT,RA,SI: RT = ¬ (RA) + EXTS(SI) + 1	07:03
Veera[m]	Does it use the CA bit for adding, or just alter the CA bit after the compute	07:03
programmerjake	it does not read CA, it just alters CA and CA32 after compute. see subfe for an instruction that does read CA, for comparison.	07:16
programmerjake	Veera ^	07:16
Veera[m]	if i have to find out what CA it will set, how that can be done	07:18
Veera[m]	I mean what CA value? In python script	07:18
Veera[m]	subfic 3, 1, imm	07:20
Veera[m]	carry = if imm < GPR[1] then CA = 1	07:21
programmerjake	do the addition in python, the carry out will be the first bit above the MSB, so counting from bit 0 at the lsb, for 64-bit the value will be in bit 64 cuz the msb is bit 63, for 32-bit the carry will be in bit 32	07:21
programmerjake	that should apply for both signed and unsigned addition	07:22
programmerjake	so, for example, 0x78+0x88==0x110 so the 8-bit sum is 0x10 and the 8-bit carry out is 1 cuz bit 0x100 is set	07:23
Veera[m]	"32-bit the carry will be in bit 32" sometimes this may be set 0 even if there is CA32=1 in 64bit mode	07:24
programmerjake	hmm, any examples?	07:24
programmerjake	0x78+0x88==0x100 oops, mis-added	07:25
Veera[m]	I am trying to do this for ALU test cases and subfic ¬ (RA) + EXTS(SI) + 1: is giving random results for CA bit	07:27
Veera[m]	Can you provide me a link for the file where subfic is implemented	07:28
programmerjake	oh, wait, for N-bit carry out, the inputs need to be masked to N-bits unsigned, if not you'll get the wrong answer	07:29
programmerjake	subfic in power-instruction-analyzer: https://salsa.debian.org/Kazan-team/power-instruction-analyzer/-/blob/95fdd1c4edbd91c0a02b772ba02aa2045101d2b0/src/instr_models.rs#L124	07:30
Veera[m]	"need to be masked to N-bits unsigned" : yes	07:31
programmerjake	subfic in soc.git (converted to a generic add): https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/fu/alu/main_stage.py;h=f4ad49183c1ffbd686644238a676d7dd807c64b6;hb=d40d5ded858bf09b7b46838d47410c9dc957167f#l143	07:32
programmerjake	CA32 computation in openpower-isa.git: https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/caller.py;hb=e5d2a21bd25720f9267c7c8045df83163bc63a20#l851	07:37
programmerjake	hopefully you can figure it out from those, imho the power-instruction-analyzer one is probably the clearest	07:41
programmerjake	toshywoshy: openpowerbot disconnected from oftc about 4hr ago	07:43
Veera[m]	I will try understanding the code, isn't carry different in add versus subtract ops	08:13
programmerjake	no, carry in/out isn't all that different between add and subtract, subtract is just where one input is inverted and either CA or 1 is added, add adds either CA or 0.	08:27
programmerjake	both of them have carry out from the unsigned addition of the two inputs and the carry in (CA or 0/1) after the one input is optionally inverted	08:27
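
A minimal Python sketch of the carry-out rule described above: mask both inputs to N bits unsigned, add them together with the carry-in, and read the carry from the first bit above the MSB. The helper name `add_with_carry` is hypothetical and is not taken from any of the linked repositories.

```python
def add_with_carry(a, b, carry_in, bits):
    """Add a and b as unsigned `bits`-wide values; return (sum, carry_out)."""
    mask = (1 << bits) - 1
    # Mask first: Python ints are unbounded and may be negative, so an
    # unmasked addition would not show the carry in bit `bits`.
    total = (a & mask) + (b & mask) + carry_in
    return total & mask, (total >> bits) & 1


print(add_with_carry(0x78, 0x88, 0, 8))   # the 8-bit example from the log
print(add_with_carry(-1, 1, 0, 64))       # -1 is masked to 64 one-bits first
```

Both calls give a sum of 0 with a carry out of 1. Without the masking step the second call would compute -1 + 1 == 0 and report no carry.
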
Veera[m]	.checked_add(immediate as u32)	10:38
Veera[m]	.and_then(|v| v.checked_add(1))	10:39
Veera[m]	what is .checked_add and |v| v.checked_add	10:39
lkcl	Veera[m], basically, all add/subtract operations - and i do mean all - in the entirety of Power ISA use the exact same one internal piece of hardware	10:50
lkcl	do you know how to turn a number negative in binary?	10:50
lkcl	you invert all its bits then add one.	10:51
lkcl	so that is how subtract is done.	10:51
lkcl	sub(RA, RB) ==> ADD( (~RA+1) + RB)	10:51
lkcl	not by doing an actual hardware-level subtract!	10:52
lkcl	then, to do carry-in and carry-out, the actual hardware-level adder is made not 64-bit, but 66 bit.	10:53
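
One way to picture the 66-bit adder, as a rough Python sketch. The wiring shown here (carry-in placed in an extra low bit of both operands, carry-out read from the top bit) is an assumption made for illustration, and the names `add66` and `subf` are hypothetical.

```python
MASK64 = (1 << 64) - 1


def add66(a, b, carry_in):
    """64-bit add with carry-in and carry-out on a 66-bit wide sum."""
    # Extra low bit on each operand: 1 + 1 there carries into bit 1,
    # which is how carry_in enters the real 64-bit addition.
    x = ((a & MASK64) << 1) | carry_in
    y = ((b & MASK64) << 1) | carry_in
    total = x + y                      # fits in 66 bits
    result = (total >> 1) & MASK64     # drop the extra low bit
    carry_out = (total >> 65) & 1      # top bit of the 66-bit sum
    return result, carry_out


def subf(ra, rb):
    """RB - RA computed as ~RA + RB + 1, with no hardware subtract."""
    return add66(~ra & MASK64, rb, 1)


print(subf(5, 7))   # 7 - 5
print(subf(7, 5))   # 5 - 7, wraps around modulo 2**64
```

Here `subf(5, 7)` gives 2 with carry out 1, and `subf(7, 5)` gives 0xffff_ffff_ffff_fffe with carry out 0.
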
lkcl	so, let's have a look here:	11:09
lkcl	https://libre-soc.org/openpower/isa/fixedarith/	11:10
lkcl	subfic RT,RA,SI	11:10
lkcl	is implemented as:	11:10
lkcl	RT <- ¬(RA) + EXTS(SI) + 1	11:10
cesar	"test_issuer.py nosvp64 general" is hanging for me.	11:10
lkcl	cesar, will take a look	11:10
cesar	Started bisecting, but ran out of time.	11:10
lkcl	i haven't run it in a while, but it doesn't surprise me	11:11
lkcl	it's one that contains a loop	11:11
lkcl	and i modified how the "end of program" is detected	11:11
cesar	Good commits for me are: 376ab6167e524f639587d054908f7cc18f9c427b in soc	11:11
cesar	... and d5f50879146ebd1de94d25137d732acbbb31868f in openpower-isa.	11:12
lkcl	thx	11:12
lkcl	it is almost certainly a loop where the bc instruction is at the end	11:15
cesar	433556d1a3298d9d57820ae1087746d4170f9d0c in soc seems to introduce a regression, in combination with d5f50879146ebd1de94d25137d732acbbb31868f in openpower-isa.	11:15
lkcl	that's odd. not what i expected.	11:16
cesar	And, with 376ab6167e524f639587d054908f7cc18f9c427b in soc, d5f50879146ebd1de94d25137d732acbbb31868f in openpower-isa works, but master in openpower-isa breaks.	11:19
cesar	(so a bisect in openpower-isa is needed as well)	11:20
lkcl	rdmask on an addi instruction is all 1s. (0xf). that should not be happening.	11:27
lkcl	err... actually... it's set after the instruction has completed!!	11:30
lkcl	ehn??	11:30
lkcl	it's something unique to addi. add is fine	11:37
lkcl	ohh hang on. addi 9,9,-1 is a special type of hazard	11:40
lkcl	wh	11:41
lkcl	addi 9,0, 0x10	11:41
lkcl	followed by	11:41
lkcl	addi 9,9, -1	11:41
lkcl	is a special type of hazard i'm currently debugging	11:41
lkcl	but	11:41
lkcl	allow_overlap=False should not be looking for it, at all	11:42
lkcl	1 sec i think i know how to stop that	11:42
lkcl	err... err.... ohhh.... addi 9, 0 is an (RA|0) instruction	11:46
lkcl	there are no read-hazards for that one because there's no operands read	11:47
lkcl	ahh got it. the problem is the fact that the 2nd instruction - addi 9,9,-1 - is reading and writing to the same register.	11:50
lkcl	this is creating a hazard on itself	11:50
lkcl	okaaay i think i have a workaround: disable hazard vectors entirely when doing the simple FSM	11:54
lkcl	which was supposed to be... ok good, fixed	11:54
lkcl	cesar, git pull	11:55
lkcl	i'll run a complete test_issuer.py (everything) and get some breakfast :) back in 20 mins with the results	11:56
lkcl	https://git.libre-soc.org/?p=soc.git;a=commitdiff;h=1a41b215f9b215a039327b81abb4dba2d97a1b80	11:56
lkcl	okaaay deep joy, there's a couple of ld/st instructions that now barf.	12:18
lkcl	i'll have a look at those	12:18
lkcl	LD-st-with-update. the update is going into the wrong register. it's going into RT (3) rather than RA (4)	12:27
lkcl	yep, i know why	12:30
lkcl	i accidentally merged the RT and RA-as-update write info	12:31
lkcl	fixed	12:31
lkcl	cesar, ok all good again	12:32
Veera[m]	case_rand_imm: "subfic" 3, 1, {imm}": carry_out = result & (1<<64) is not giving correct values	12:39
Veera[m]	result = ~initial_regs[6] + imm + 1	12:39
Veera[m]	programmerjake: need help	12:42
lkcl	Veera[m], result = ~initial_regs[6] + imm + 1	12:52
lkcl	followed by	12:52
lkcl	result = result & (0xffff_ffff_ffff_ffff)	12:52
lkcl	or	12:52
lkcl	result &= ((1<<64)-1)	12:53
lkcl	but the immediate also has to be sign-extended	12:53
lkcl	<lkcl> is implemented as:	12:53
lkcl	<lkcl> RT <- ¬(RA) + EXTS(SI) + 1	12:53
lkcl	^^^^^^	12:53
lkcl	EXTS(SI)	12:53
lkcl	^^^^^	12:53
lkcl	yes?	12:53
lkcl	it's currently 5am in the United States so you will not get a reply from jacob for another 5-7 hours	12:54
Veera[m]	yeah totally forgot about EXTS	12:54
Veera[m]	"another 5-7 hours" oh	12:55
Veera[m]	EXTS(SI) sign extend by how much	12:57
lkcl	there is a function for it	12:58
lkcl	but, look again at the pseudocode	12:58
lkcl	page 68, v3.0C specification	12:58
lkcl	RT --> 6..10	12:59
lkcl	RA --> 11..15	12:59
lkcl	SI --> 16..31	12:59
lkcl	therefore, SI is (31-16+1) bits long == 16	12:59
lkcl	you can use nmutil.extend	13:00
lkcl	ah no, it uses nmigen, sorry	13:00
lkcl	it'll be something like:	13:00
lkcl	if (imm & (1<<15)): imm |= 0xffff_ffff_ffff_0000	13:01
lkcl	test bit 15 of a 16-bit number to work out whether to sign-extend it	13:02
Veera[m]	do we have to sign extend SI to 64 bits?	13:02
lkcl	of course	13:26
lkcl	otherwise the 64-bit result will be corrupted.	13:27
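
Putting the pieces from this exchange together, a subfic reference model for a test script might look roughly like the sketch below. It follows the pseudocode RT <- ¬(RA) + EXTS(SI) + 1 with a 16-bit SI; the function names `exts16` and `subfic` are hypothetical, and the CA32 calculation assumes the "mask both inputs to 32 bits" approach discussed earlier in the log.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def exts16(si):
    """Sign-extend a 16-bit immediate to a 64-bit unsigned pattern."""
    si &= 0xffff
    if si & (1 << 15):              # bit 15 is the sign bit of SI
        si |= 0xffff_ffff_ffff_0000
    return si


def subfic(ra, si):
    """Return (RT, CA, CA32) for subfic RT,RA,SI."""
    not_ra = ~ra & MASK64           # invert, then mask to 64 bits
    imm = exts16(si)
    total = not_ra + imm + 1
    rt = total & MASK64
    ca = (total >> 64) & 1          # carry out of bit 63
    total32 = (not_ra & MASK32) + (imm & MASK32) + 1
    ca32 = (total32 >> 32) & 1      # carry out of bit 31
    return rt, ca, ca32


print(subfic(1, 5))        # 5 - 1
print(subfic(0, 0xffff))   # SI = -1, so -1 - 0
```

With these definitions `subfic(1, 5)` gives (4, 1, 1), and `subfic(0, 0xffff)` gives an RT of all ones with CA and CA32 both set.
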
lkcl	Veera: this is shifting a 1-bit value down by 64-bits, and another 32-bit value down by 32-bits	16:09
lkcl	+ e.ca = (carry_out>>64) | (carry_out32>>31)	16:09
lkcl	which is always guaranteed to be zero	16:09
lkcl	1>>64 is always zero	16:09
lkcl	(1<<64) >> 64 (0b1 followed by 64 zeros) is going to be 1	16:09
lkcl	what's amusing is that this probably only works because adde is not supposed to set e.ca :)	16:10
lkcl	if it was addeo. (the overflow version) it would be a different matter	16:11
lkcl	carry_out = result & (1<<64) # detect 65th bit as carry-out?	16:11
lkcl	carry_out32 = ((initial_regs[6] & 0xffff_ffff) + (initial_regs[7] & 0xffff_ffff)) & (1<<32)	16:11
lkcl	ahh ok	16:11
lkcl	you changed the code so it does actually test bit 64	16:12
lkcl	by ANDing with (1<<64)	16:12
lkcl	do keep to under 80 chars btw	16:12
lkcl	carry_out32 = ((initial_regs[6] & 0xffff_ffff) + (initial_regs[7] & 0xffff_ffff)) & (1<<32)	16:12
lkcl	is around 130	16:13
lkcl	i put carry_out back to the original code:	16:14
lkcl	carry_out = result & (1<<64) != 0	16:14
lkcl	i leave it to you to sort out / tidy up carry_out32	16:14
lkcl	shifting down by 31 rather than 32 because e.ca is carry_out | (carry_out32<<1) is not obvious at all	16:15
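
A more explicit way to build the expected e.ca value, as a sketch. It assumes the layout stated above (CA in bit 0, CA32 in bit 1); the helper name `pack_ca` is hypothetical.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def pack_ca(a, b, carry_in=0):
    """Return carry_out | (carry_out32 << 1) for a + b + carry_in."""
    total64 = (a & MASK64) + (b & MASK64) + carry_in
    total32 = (a & MASK32) + (b & MASK32) + carry_in
    carry_out = (total64 >> 64) & 1      # reduce each carry to 0 or 1 ...
    carry_out32 = (total32 >> 32) & 1
    return carry_out | (carry_out32 << 1)  # ... then place it explicitly


# The shift pitfalls from the discussion:
print(1 >> 64)               # a 0-or-1 value shifted down is always 0
print((1 << 64) >> 64)       # a masked bit 64 shifted down is 1
print((1 << 32) >> 31)       # a masked bit 32 shifted by 31 lands in bit 1
```

Reducing each carry to a single bit first means the final line states the bit positions directly, instead of relying on a shift by 31.
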
lkcl	cesar, hooray! write-after-write hazard detection works!	16:18
lkcl	frickin ell it's complicated	16:24
lkcl	hmmm ok it works because it does too much :)	16:27
lkcl	as in, the write-hazard is detected to be with the instruction itself, which then prevents all instructions from being issued until the current instruction is over	16:27
lkcl	sigh	16:27
programmerjake	Veera i'm assuming lkcl helped you figure it out	16:30
lkcl	okay nooow we have working write-after-write hazard detection	17:49
programmerjake	yay!	18:02
lkcl	it's still a little overactive. this is marginally better than not kicking in at all though	18:07
Veera[m]	programmerjake: In subfic op what does .checked_add(immediate)	21:55
Veera[m]	programmerjake: .and_then(|v| v.checked_add(1))	21:56
Veera[m]	programmerjake: .is_none();	21:56
programmerjake	checked_add adds two numbers of type T, returning an Option<T>, it returns Some(N) if the addition doesn't overflow, and None if it overflows (in this case, if the sum is >= 2^64 cuz T=u64)	21:59
programmerjake	a.and_then(|v| b) evaluates b with v set to the N if a is Some(N), otherwise it returns None	22:01
programmerjake	https://doc.rust-lang.org/std/primitive.u64.html#method.checked_add	22:01
programmerjake	https://doc.rust-lang.org/std/option/enum.Option.html#method.and_then	22:02
programmerjake	is_none just returns true if the input is None	22:02
programmerjake	so, all together, `a.checked_add(b).and_then(|v| v.checked_add(c)).is_none()` returns true if `a + b + c` overflows.	22:04
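
For readers coming from the Python test scripts, the same chain can be imitated roughly as below. This is only an analogy: the functions `checked_add` and `overflows` are hypothetical Python stand-ins, with None playing the role of Rust's None, and the inputs are assumed to be non-negative values that already fit in 64 bits.

```python
def checked_add(a, b, bits=64):
    """Return a + b, or None if the sum does not fit in `bits` bits."""
    total = a + b
    return total if total < (1 << bits) else None


def overflows(a, b, c):
    """Mimic a.checked_add(b).and_then(|v| v.checked_add(c)).is_none()."""
    v = checked_add(a, b)
    if v is not None:            # and_then: only continue on Some(v)
        v = checked_add(v, c)
    return v is None             # is_none


print(overflows(2**64 - 1, 0, 1))   # sum is exactly 2**64
print(overflows(1, 2, 3))
```

The first call reports an overflow (True) and the second does not (False), which is the unsigned carry out of the three-operand addition.
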
Veera[m]	programmerjake: thanks I made a working code for subfic	23:18
Veera[m]	lkcl: thanks I made a working code for subfic and also checked for adde(it is working)	23:19
lkcl	hooraaay	23:23
lkcl	well done :)	23:24
