Sunday, 2021-08-15

ghostmansd-pc	lkcl: wow, you changed the test, thank you!	13:54
ghostmansd-pc	if I got correctly, we have two things left: batching 32 instructions into the single command, and checking addg6s, right?	13:55
justinrestivo[m]	lkcl: I've gone through the installation scripts here: https://libre-soc.org/HDL_workflow/devscripts/. Is there documentation somewhere on how to run tests (or where the tests live)?	15:32
lkcl	ghostmansd-pc, yes	15:51
lkcl	justinrestivo[m], in theeoorryyy it's just "make test" or python3 setup.py test or "nosetests3"	15:51
lkcl	jn: you went through the list recently?	15:52
lkcl	basically, it's python, therefore it's unittests, therefore go figure	15:52
lkcl	however	15:52
lkcl	bear in mind that some of the unit tests will take several hours or even days to complete	15:52
lkcl	particularly the IEEE754 FP ones	15:52
lkcl	so it's not something you arbitrarily run as part of the actual installation "because that's what every other package does"	15:53
lkcl	you run the unit tests specifically to actually confirm the functionality, which may potentially take one week even on high-end hardware	15:54
Madan_Kartheessa	Good evening all	17:18
Madan_Kartheessa	This is Madan from Object Automation Software Solutions Private Limited	17:19
*** Madan_Kartheessa <Madan_Kartheessa!~Madan_Kar@45.251.35.104> has left #libre-soc		17:20
*** lx0 is now known as lxo		17:32
justinrestivo[m]	Gotcha. And this is on a per repo basis for the repos cloned into src?	17:35
ghostmansd-pc	I'm currently completing the refactoring. Based on what I can see, numbers like 400 in BCD convert into DPD incorrectly. I'm still checking the reason, but the code seems to be 1:1 to specs.	17:44
ghostmansd-pc	I'll also check the result for an alternate formula, just to be sure.	17:44
ghostmansd-pc	Both formulas act the same way. 400bcd should have been converted to 200dpd, but it's not, it becomes 0.	17:53
lkcl	hi Madan, great to hear from you	18:14
lkcl	justinrestivo[m]: yes, basically, the way python works, anything starting "test_" is auto-run by the standard python unit test infrastructure	18:14
justinrestivo[m]	Perfect, thanks	18:15
lkcl	thus, the questions you are asking are "the answer is the standard answer for all python projects"	18:15
lkcl	basically, however, because we use standardpython unit test infrastructure, it's strongly inadviseable to run the tests as part of the install process	18:16
lkcl	ghostmansd-pc, there's no guarantee that the v3.0B specification pseudocode is actually correct.	18:16
lkcl	if it isn't, that's actually really good because, once again we've found that the specification is wrong.	18:17
lkcl	HOWEVER	18:17
lkcl	bear in mind, we have those brackets missing	18:17
lkcl	it MIGHT be the case that pywriter, because it is not inserting brackets, gets the order wrong	18:18
programmerjake	it might also be the case that qemu is wrong	18:18
lkcl	programmerjake: this is testing against tables listed in v3.0B	18:18
programmerjake	ah, ok	18:19
lkcl	ghostmansd-pc, what i recommend is, create a "test_bcdblahblah_regression" which just has the "wrong" values	18:19
lkcl	and change "def test_" to "def notest_" as usual. this will save you a lot of time investigating	18:19
programmerjake	check against ieee 754 (ever since ieee 854 was merged), that's where the bcd from/to dpd comes from	18:20
ghostmansd-pc	https://pastebin.com/vhj95QR5	18:21
ghostmansd-pc	the output is: SelectableInt(value=0x400, bits=12) SelectableInt(value=0x0, bits=12) SelectableInt(value=0x400, bits=12)	18:21
ghostmansd-pc	What am I missing?	18:22
lkcl	urrrr...	18:22
ghostmansd-pc	oops, wrong paste	18:23
ghostmansd-pc	let me re-check it	18:23
lkcl	what is BCD set to?	18:23
ghostmansd-pc	SelectableInt(1024, 64)	18:23
ghostmansd-pc	I think I see the problem...	18:23
lkcl	that's a 64-bit int	18:24
ghostmansd-pc	for BCD, it takes `getitem 62 64 0x400 0`	18:24
lkcl	therefore BCD[0] is 1024 & (1<<63)	18:24
lkcl	BCD[1] is 1024 & (1<<62)	18:24
lkcl	...	18:24
ghostmansd-pc	and for 12-bit, it takes `getitem 10 12 0x400 1`	18:24
lkcl	...	18:24
lkcl	if BCD = SelectableInt(1024,12) you have what you are expecting	18:24
ghostmansd-pc	the thing is, in tests, it comes from the register...	18:25
lkcl	when BCD=SelectableInt(1024, >>>>>12 NOT 64<<<<<<), BCD[0] = ...	18:25
lkcl	ok then the offset will need to be (63-12-0)	18:25
ghostmansd-pc	yuck	18:25
lkcl	63-12-1	18:25
lkcl	or something like that	18:25
ghostmansd-pc	yep	18:25
ghostmansd-pc	seems like this	18:25
programmerjake	bitslice the register from 64 to 12 bits	18:25
lkcl	welcome to Power ISA spec.	18:25
lkcl	sigh	18:25
ghostmansd-pc	really an awesome sh*t, lost about two hours	18:26
ghostmansd-pc	and what funny, it broke only around 400bcd	18:26
lkcl	wait... hang on....	18:26
lkcl	RA[n+8:n+19 ] <- DPD_TO_BCD ( (RS)[n+12:n+21] )	18:26
programmerjake	afaict: (RA)[52:63]	18:26
lkcl	https://libre-soc.org/openpower/isa/bcd/	18:27
lkcl	63-52+1=21	18:27
lkcl	63-52+1=12	18:27
lkcl	the SelectableInt being passed in to DPD_TO_BCD will not be a 64-bit value, it will be a 12 bit SelectableInt	18:28
ghostmansd-pc	hm, I'll check it more	18:28
ghostmansd-pc	perhaps at the point where we make the call	18:28
lkcl	so, that's why i said: create a regression test "def test_bcd_regression_400()"	18:28
ghostmansd-pc	ok	18:28
lkcl	specifically testing exactly that one and only that one value	18:28
lkcl	then, also, put print() statements into BCD_TO_DPD	18:29
ghostmansd-pc	yep, I understand :-)	18:29
lkcl	(just edit the pyparser-generated python output file)	18:29
ghostmansd-pc	that's how I've been doing it, even though I copy-pasted it	18:29
lkcl	really, we need brackets inserted by the parser	18:30
ghostmansd-pc	you mean parentheses?	18:32
ghostmansd-pc	kinda clearing the meaning of `k = q & ~s & ~t & v & w \| t & v & ~w & x \| q & v & w & ~x \| x & ~v`, right?	18:32
programmerjake	hmm, that seems obvious to me... ~ has highest precedence, then & then \|	18:33
programmerjake	(if the spec designers did the precedence differently, they're annoying)	18:34
lkcl	formal descriptions of instructions. RTL notation not	18:36
lkcl	summarized here should be self-explanatory.	18:36
lkcl	v3.0B p6.	18:36
lkcl	sigh	18:36
programmerjake	so you get `k = (q & (~s) & (~t) & v & w) \| (t & v & (~w) & x) \| (q & v & w & (~x)) \| (x & (~v))`	18:37
ghostmansd-pc	I don't really think one should really think a lot of precedence upon reading.	18:37
ghostmansd-pc	Whenever things can be made explicit, IMHO they should.	18:37
lkcl	yeah i get operator precedence wrong.	18:39
lkcl	this needs checking.	18:40
lkcl	python AST it is the lexer which specifies the precedence	18:41
lkcl	sorry, LALR parser	18:41
programmerjake	well, no one I've met seems to think the arithmetic equivalent needs parenthesis: `k = q * -s * -t * v * w + t * v * -w * x + q * v * w * -x + x * -v`	18:41
lkcl	which then generates python AST	18:41
lkcl	programmerjake: that's hopelessly unclear to me	18:41
programmerjake	ok...	18:42
lkcl	without brackets i'm unable to visually parse it	18:42
lkcl	the -x no problem (because the "-" is right next to the "x")	18:42
lkcl	* and +, because they're separated, it's unreadable.	18:43
lkcl	if it was a shorter expression, no problem.	18:43
lkcl	that's very interesting	18:46
lkcl	+x <- (y * 5) + 3	18:47
lkcl	+y <- (z + 5) * 3	18:47
lkcl	$ python3 decoder/power_pseudo.py	18:47
lkcl	produces:	18:47
lkcl	x = y * 5 + 3	18:47
lkcl	y = (z + 5) * 3	18:47
lkcl	https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=f89e15664c9fdeb4ce257c9b72fcd151dc60dbd1	18:47
lkcl	ghostmansd-pc, &	18:47
lkcl	^	18:47
lkcl	so we may reeeeeasonably assume that astor is capable of outputting code that respects the operator precedence of its input	18:48
programmerjake	that's because your parser discards parenthesis, and the AST printer only inserts them where strictly necessary ... I'd expect you can add parenthesis ast nodes if you like	18:49
lkcl	and, on discovering something that does not match operator precedence [e.g. "(z + 5) * 3"] will correctly insert the necessary brackets	18:49
lkcl	bizarrely, you can't	18:49
lkcl	because python standard AST doesn't have any	18:49
lkcl	the google-written lib2to3 and 3to2 parsers - which were the other choice - do have them	18:49
lkcl	and the ability to parse (and insert) full and complete and accurate whitespace	18:50
lkcl	which was needed in order to preserve whitespace on conversion, so that "diff -u" was the absolute bare minimum	18:50
lkcl	it's much more complex so i went with standard python AST and astor for pywriter.	18:51
lkcl	ghostmansd-pc, bottom line, it's looking like there's a v3.0B spec error	18:51
programmerjake	a way to cheat on inserting parenthesis: translate the code `a * b + c` to `__PAREN__(a * b) + c` then delete the `__PAREN__` identifiers with a regex	18:53
lkcl	a dummy function. or just leave it in (the dummy function returns its input)	18:54
lkcl	nice idea	18:55
programmerjake	if you leave it in, name it `identity`	18:55
lkcl	ghostmansd: by printing out the output in both binary and hex when converting 400bcd it should be pretty obvious which patterns do not match	19:02
lkcl	of course, err, one other thing, check that the tables are the reverse of each other!	19:04
lkcl	ghostmansd, i leave it with you. do regularly commit quotes "even though it doesn't work"quotes when it comes to unit tests.	19:11
lkcl	i would like to try running the 400bcd test, to see if i can see anything	19:12
lkcl	but if you don't commit the test, i can't do that	19:12
lkcl	and, thus, the entire burden of triaging is on you	19:12
lkcl	if however you commit the test, others can run it and others can help investigate	19:13
ghostmansd	Ok, I'll commi it in 5 mins	19:14
ghostmansd	*commit	19:14
ghostmansd	It's mostly working	19:14
ghostmansd	Wanted to complete the debug, but OK	19:15
lkcl	you get the general principle, though	19:17
programmerjake	you can check against https://gcc.gnu.org/onlinedocs/gccint/Decimal-float-library-routines.html	19:17
programmerjake	they probably have dpd<->bcd/int tables/functions	19:18
ghostmansd	lkcl: committed, you might opt to @unittest.skip it	19:25
ghostmansd-pc	anyway, I'd like to debug it as well	19:25
programmerjake	posted links to gcc's conversion tables to bug report	19:46
programmerjake	hopefully they'll be useful	19:46
ghostmansd-pc	dammit	20:43
ghostmansd-pc	I finally got what happens	20:43
ghostmansd-pc	I'm not sure whether it's right, but, granted that we have things like "cbcdtd 0,0", i.e. when the register's the same, the dst register's bits are overridden with src register	20:44
ghostmansd-pc	Let's consider this code	20:44
ghostmansd-pc	do i = 0 to 1	20:45
ghostmansd-pc	n <- i * 32	20:45
ghostmansd-pc	RA[n+0:n+11] <- 0	20:45
ghostmansd-pc	RA[n+12:n+21] <- BCD_TO_DPD ( (RS)[n+8:n+19] )	20:45
ghostmansd-pc	RA[n+22:n+31] <- BCD_TO_DPD ( (RS)[n+20:n+31] )	20:45
ghostmansd-pc	I need to recheck it, though	20:48
ghostmansd-pc	https://pastebin.com/4JSEABjc	20:50
ghostmansd-pc	granted that both operands point to the same register, e.g. via "cbcdtd 0, 0", the logs I end up with are...	20:51
ghostmansd-pc	GMSD op_cbcdtd RS SelectableInt(value=0x400, bits=64)	20:52
ghostmansd-pc	GMSD op_cbcdtd RA SelectableInt(value=0x400, bits=64)	20:52
ghostmansd-pc	GMSD op_cbcdtd RS SelectableInt(value=0x0, bits=64)	20:52
ghostmansd-pc	GMSD op_cbcdtd RA SelectableInt(value=0x0, bits=64)	20:52
ghostmansd-pc	Is it what we want? I've been under impression these are passed as values, not references, that is, I'd have expected RS and RA point to different variables.	20:54
ghostmansd-pc	The id(RS) equals id(RA), and the variable is being modified in-place.	20:54
ghostmansd-pc	I kinda can realize the intention why they're passed by ref, though: they point to the same register.	21:01
ghostmansd-pc	However, I'd have expected that the semantics would've been as if they're passed by value.	21:02
programmerjake	the semantics should be as though they're passed by value...if not the simulator needs to be fixed	21:27
programmerjake	(assuming they behave like all the other instructions i've seen)	21:28
ghostmansd-pc	Yes, my feeling's the same.	21:31
ghostmansd-pc	This'd be the least suprise principle.	21:31
ghostmansd-pc	Modifying `def op_cdtbcd(self, RA, RS)` so that it starts with these...	21:31
ghostmansd-pc	RA = SelectableInt(RA, RA.bits)	21:31
ghostmansd-pc	RS = SelectableInt(RS, RS.bits)	21:31
ghostmansd-pc	...obviously fixes the tests, immediately.	21:32
ghostmansd-pc	I suspect this must be handled at the point where we collect all the inputs into the instruction wrapper, at the point we call `info.func(self, *inputs)` at ISACaller.call().	21:33
ghostmansd-pc	I mean, before we invoke the call() method.	21:33
ghostmansd-pc	However, I'd like to know if everyone agrees it's correct.	21:33
programmerjake	afaik SimpleV is the only ISA I've seen that doesn't just do the operation based on the input values then writes the outputs at the end only after reading all inputs	21:33
programmerjake	sounds good to me...though SV instructions will need to behave differently since they depend on if the inputs overlap the outputs	21:35
programmerjake	lkcl?	21:35
programmerjake	he may have gone to sleep for the night already...	21:36
ghostmansd-pc	that's OK, there's no hurry :-)	21:37
ghostmansd-pc	this was an interesting thing to track and debug	21:37
programmerjake	maybe just make that change as a separate commit that can be reverted if it's wrong?	21:37
ghostmansd-pc	The call() method is tricky. I can do it in kinda hacky way, e.g. iterate over all inputs and, whenever I meet SelectibleInt, copy its instance.	21:39
ghostmansd-pc	FWIW, I, again, think, that mutability is most oftenly evil.	21:39
programmerjake	that works! the simulator is basically all a pile of hacks held together using duct tape anyway	21:40
ghostmansd-pc	I get the rationale for allowing to modify standalone bits in SelectableInt, but still, the whole thing is caused by the fact that we modify the variable on-the-fly.	21:40
programmerjake	I think `a[3:5] = ...` should have been compiled to `a = a.replace_slice(3:5, ...)`	21:41
programmerjake	yay for SSA!	21:42
ghostmansd-pc	:-D	21:42
programmerjake	https://en.wikipedia.org/wiki/Static_single_assignment_form	21:43
programmerjake	> caused by the fact that we modify the variable on-the-fly.	21:46
programmerjake	s/variable/object/	21:46
programmerjake	the variable stays pointing to the exact same thing...	21:46
programmerjake	one thing Rust definitely got right...shared mutability is the bane of easy understandability and a cause of many bugs ... too bad Python's favorite mode of operation is shared mutability	21:47
ghostmansd-pc	other languages also drop mutability, IIRC	21:48
ghostmansd-pc	erlang?...	21:48
ghostmansd-pc	def pass_by_value(arg):	21:49
ghostmansd-pc	if isinstance(arg, SelectableInt):	21:49
ghostmansd-pc	return SelectableInt(arg, arg.bits)	21:49
ghostmansd-pc	return arg	21:49
ghostmansd-pc		21:49
ghostmansd-pc	inputs = list(map(pass_by_value, inputs))	21:49
ghostmansd-pc	This thing works; I'm not sure it's worth to commit it, but at least it does the trick	21:49
programmerjake	yup! Rust didn't drop it, it relegated it to its proper place .... explicitly opt-in or where unique access is guaranteed	21:50
ghostmansd-pc	I actually suspect there might be things other than selectable int	21:50
programmerjake	you mean like most the codebase?	21:50
ghostmansd-pc	(that's at caller.py ISACaller.call())	21:50
ghostmansd-pc	yep, I think there might be some other interesting side effects. In fact, I think even I found it occasionally, since I wanted to batch 32 instructions in the same listing to win some time.	21:51
programmerjake	luckily most of the attributes are only written once avoiding the mutability pitfalls	21:51
ghostmansd-pc	I suspect most tests around deal with different in/out registers.	21:51
ghostmansd-pc	And don't resort to poor man's batching like `cbcdtd 0, 0`...	21:52
programmerjake	well, I'm heading out to visit family, have fun!	21:52
ghostmansd-pc	`cbcdtd 1, 1`, `cbcdtd 2, 2`, and so on	21:52
ghostmansd-pc	OK, you too!	21:52
programmerjake	:)	21:53
ghostmansd-pc	I thought that the code above is a bit hacky, so I pushed it into isa-si-wa branch for now. I think the right place for this stuff should be where we access the registers. At the same time, I'm somewhat attracted to programmerjake idea that slice update should create a new instance.	22:06
ghostmansd-pc	Please, let me know what you think. See you later!	22:07
lkcl	ghostmansd: the entire GPR (and FPR) is accessible from pseudocode as GPR(x) <- {somevalue}	22:26
lkcl	so it has to be supported to be able to update the GPR and pass in values	22:27
lkcl	good find btw	22:32
lkcl	all other pseudocode returns a replacement	22:32
lkcl	by doing e.g. RT <- {somevalue}	22:32
lkcl	i'll have to run a full suite of tests, this is an extremely low-level change	22:34
lkcl	programmerjake: that's incorrect. it does... however it does them based on each element being its own fully-independent instruction with its own hazards.	22:36
lkcl	the reason why RVV or other Cray-style Vector ISAs cannot possibly do that is because the instructions are designed specifically based around the vector register as being "the atomically-updated type"	22:37
lkcl	whereas SVP64's Sub-Program-Counter, is exactly as it says: a Sub-Program-Counter, and therefore is literally sitting between issue and execute, blatting instructions out in Program Order	22:38
lkcl	there is literally no difference between (VL=2) "sv.addi r0.v, r4,v, 5"	22:40
lkcl	and	22:40
lkcl	addi r0, r4, 5	22:40
lkcl	addi r1, r5, 5	22:40
lkcl	the two are DIRECTLY EQUIVALENT AND EXACTLY THE SAME.	22:40
lkcl	you have a choice.	22:40
lkcl	1) use the two v3.0B 32-bit addi instructions	22:41
lkcl	2) use the sv.addi instruction	22:41
lkcl	the behaviour AAABBBSSSOOOLLUUUTEELLLLYYY MMMMMUUUUSSSSTTTTT be the same	22:41
lkcl	this is a hard, inviolate and absolute requirement of SV.	22:42
lkcl	the only thing that "looks" like it isn't following that rule is the twin in-place FFT/DCT instructions	22:42
lkcl	and the only reason it _looks_ like that hard inviolate rule is not being followed is because they are 2-operand-in 2-operand-out instructions	22:43
lkcl	if you do not understand this then you have fundamentally misunderstood the design principles of SV for three years.	22:44
lkcl	SVP64 loops issue... instructions... in.... Program... Order.	22:44
lkcl	you don't jump about with a Program Counter or "think it strange that instructions get executed with register hazards being respected" on a scalar processor?	22:45
lkcl	so why with a Sub-Program-Counter would it be any different?	22:46
lkcl	what you are finding "strange" is not in the least bit strange.	22:46
lkcl	you	22:46
lkcl	have	22:46
lkcl	to	22:46
lkcl	respect	22:46
lkcl	register	22:46
lkcl	hazards.	22:46
lkcl	what you are suggesting is, effectively, to completely ignore register hazards for VL loops.	22:47
lkcl	allowing outputs to overwrite (destroy) inputs in any arbitrary order	22:47
lkcl	that would be extremely strange: it fundamentally violates common sense of practical processor architectures	22:48
lkcl	which, sigh, some ISAs in the past actually allow.	22:48
lkcl	SPARC i believe had fixed-length pipelines for e.g. MUL, and zero register hazard protection on branches.	22:49
lkcl	early compilers "fixed" that by (sigh) padding with nops.	22:49
lkcl	tests run, all good	22:54
lkcl	bcd test runs in 104 seconds	23:06
lkcl	with the batching	23:06
lkcl	i added the "with self.subTest()" back in	23:07
lkcl	so, conclusion (whew), v3.0B spec is ok	23:10
lkcl	nice work	23:11

Generated by irclog2html.py 2.17.1 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!