Sunday, 2021-08-15

ghostmansd-pclkcl: wow, you changed the test, thank you!13:54
ghostmansd-pcif I got correctly, we have two things left: batching 32 instructions into the single command, and checking addg6s, right?13:55
justinrestivo[m]lkcl:  I've gone through the installation scripts here: https://libre-soc.org/HDL_workflow/devscripts/. Is there documentation somewhere on how to run tests (or where the tests live)?15:32
lkclghostmansd-pc, yes15:51
lkcljustinrestivo[m], in theeoorryyy it's just "make test" or python3 setup.py test or "nosetests3"15:51
lkcljn: you went through the list recently?15:52
lkclbasically, it's python, therefore it's unittests, therefore go figure15:52
lkcl*however*15:52
lkclbear in mind that some of the unit tests will take *several hours or even days to complete*15:52
lkclparticularly the IEEE754 FP ones15:52
lkclso it's not something you arbitrarily run as part of the actual installation "because that's what every other package does"15:53
lkclyou run the unit tests *specifically* to actually confirm the functionality, which may potentially take one week even on high-end hardware15:54
Madan_KartheessaGood evening all17:18
Madan_KartheessaThis is Madan from Object Automation Software Solutions Private Limited17:19
*** Madan_Kartheessa <Madan_Kartheessa!~Madan_Kar@45.251.35.104> has left #libre-soc17:20
*** lx0 is now known as lxo17:32
justinrestivo[m]Gotcha. And this is on a per repo basis for the repos cloned into src?17:35
ghostmansd-pcI'm currently completing the refactoring. Based on what I can see, numbers like 400 in BCD convert into DPD incorrectly. I'm still checking the reason, but the code seems to be 1:1 to specs.17:44
ghostmansd-pcI'll also check the result for an alternate formula, just to be sure.17:44
ghostmansd-pcBoth formulas act the same way. 400bcd should have been converted to 200dpd, but it's not, it becomes 0.17:53
lkclhi Madan, great to hear from you18:14
lkcljustinrestivo[m]: yes, basically, the way python works, anything starting "test_" is auto-run by the standard python unit test infrastructure18:14
justinrestivo[m]Perfect, thanks18:15
lkclthus, the questions you are asking are "the answer is the standard answer for all python projects"18:15
lkclbasically, however, because we use standardpython unit test infrastructure, it's strongly inadviseable to run the tests as part of the install process18:16
lkclghostmansd-pc, there's no guarantee that the v3.0B specification pseudocode is actually correct.18:16
lkclif it isn't, that's actually really good because, once again we've found that the specification is wrong.18:17
lkclHOWEVER18:17
lkclbear in mind, we have those brackets missing18:17
lkclit MIGHT be the case that pywriter, because it is not inserting brackets, gets the order wrong18:18
programmerjakeit might also be the case that qemu is wrong18:18
lkclprogrammerjake: this is testing against tables listed in v3.0B18:18
programmerjakeah, ok18:19
lkclghostmansd-pc, what i recommend is, create a "test_bcdblahblah_regression" which just has the "wrong" values18:19
lkcland change "def test_" to "def notest_" as usual. this will save you a lot of time investigating18:19
programmerjakecheck against ieee 754 (ever since ieee 854 was merged), that's where the bcd from/to dpd comes from18:20
ghostmansd-pchttps://pastebin.com/vhj95QR518:21
ghostmansd-pcthe output is: SelectableInt(value=0x400, bits=12) SelectableInt(value=0x0, bits=12) SelectableInt(value=0x400, bits=12)18:21
ghostmansd-pcWhat am I missing?18:22
lkclurrrr...18:22
ghostmansd-pcoops, wrong paste18:23
ghostmansd-pclet me re-check it18:23
lkclwhat is BCD set to?18:23
ghostmansd-pcSelectableInt(1024, 64)18:23
ghostmansd-pcI think I see the problem...18:23
lkclthat's a 64-bit int18:24
ghostmansd-pcfor BCD, it takes `getitem 62 64 0x400 0`18:24
lkcltherefore BCD[0] is 1024 & (1<<63)18:24
lkclBCD[1] is 1024 & (1<<62)18:24
lkcl...18:24
ghostmansd-pcand for 12-bit, it takes `getitem 10 12 0x400 1`18:24
lkcl...18:24
lkclif BCD = SelectableInt(1024,12) you have what you are expecting18:24
ghostmansd-pcthe thing is, in tests, it comes from the register...18:25
lkclwhen BCD=SelectableInt(1024, >>>>>12 NOT 64<<<<<<), BCD[0] = ...18:25
lkclok then the offset will need to be (63-12-0)18:25
ghostmansd-pcyuck18:25
lkcl63-12-118:25
lkclor something like that18:25
ghostmansd-pcyep18:25
ghostmansd-pcseems like this18:25
programmerjakebitslice the register from 64 to 12 bits18:25
lkclwelcome to Power ISA spec.18:25
lkclsigh18:25
ghostmansd-pcreally an awesome sh*t, lost about two hours18:26
ghostmansd-pcand what funny, it broke only around 400bcd18:26
lkclwait... hang on....18:26
lkcl      RA[n+8:n+19 ] <- DPD_TO_BCD ( (RS)[n+12:n+21] )18:26
programmerjakeafaict: (RA)[52:63]18:26
lkclhttps://libre-soc.org/openpower/isa/bcd/18:27
lkcl63-52+1=2118:27
lkcl63-52+1=1218:27
lkclthe SelectableInt being passed in to DPD_TO_BCD will *not* be a 64-bit value, it will be a *12* bit SelectableInt18:28
ghostmansd-pchm, I'll check it more18:28
ghostmansd-pcperhaps at the point where we make the call18:28
lkclso, that's why i said: create a regression test "def test_bcd_regression_400()"18:28
ghostmansd-pcok18:28
lkclspecifically testing exactly that one and only that one value18:28
lkclthen, also, put print() statements into BCD_TO_DPD18:29
ghostmansd-pcyep, I understand :-)18:29
lkcl(just edit the pyparser-generated python output file)18:29
ghostmansd-pcthat's how I've been doing it, even though I copy-pasted it18:29
lkclreally, we need brackets inserted by the parser18:30
ghostmansd-pcyou mean parentheses?18:32
ghostmansd-pckinda clearing the meaning of `k = q & ~s & ~t & v & w | t & v & ~w & x | q & v & w & ~x | x & ~v`, right?18:32
programmerjakehmm, that seems obvious to me... ~ has highest precedence, then & then |18:33
programmerjake(if the spec designers did the precedence differently, they're annoying)18:34
lkclformal descriptions of instructions. RTL notation not18:36
lkclsummarized here should be self-explanatory.18:36
lkclv3.0B p6.18:36
lkclsigh18:36
programmerjakeso you get `k = (q & (~s) & (~t) & v & w) | (t & v & (~w) & x) | (q & v & w & (~x)) | (x & (~v))`18:37
ghostmansd-pcI don't really think one should really think a lot of precedence upon reading.18:37
ghostmansd-pcWhenever things can be made explicit, IMHO they should.18:37
lkclyeah i get operator precedence wrong.18:39
lkclthis needs checking.18:40
lkclpython AST it is the lexer which specifies the precedence18:41
lkclsorry, LALR parser18:41
programmerjakewell, no one I've met seems to think the arithmetic equivalent needs parenthesis: `k = q * -s * -t * v * w + t * v * -w * x + q * v * w * -x + x * -v`18:41
lkclwhich then generates python AST18:41
lkclprogrammerjake: that's hopelessly unclear to me18:41
programmerjakeok...18:42
lkclwithout brackets i'm unable to visually parse it18:42
lkclthe -x no problem (because the "-" is right next to the "x")18:42
lkcl* and +, because they're separated, it's unreadable.18:43
lkclif it was a shorter expression, no problem.18:43
lkclthat's very interesting18:46
lkcl+x <- (y * 5) + 318:47
lkcl+y <- (z + 5) * 318:47
lkcl$ python3 decoder/power_pseudo.py18:47
lkclproduces:18:47
lkclx = y * 5 + 318:47
lkcly = (z + 5) * 318:47
lkclhttps://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=f89e15664c9fdeb4ce257c9b72fcd151dc60dbd118:47
lkclghostmansd-pc, &18:47
lkcl^18:47
lkclso we may reeeeeasonably assume that astor is capable of outputting code that respects the operator precedence of its input18:48
programmerjakethat's because your parser discards parenthesis, and the AST printer only inserts them where strictly necessary ... I'd expect you can add parenthesis ast nodes if you like18:49
lkcland, on discovering something that does *not* match operator precedence [e.g. "(z + 5) * 3"] will correctly insert the necessary brackets18:49
lkclbizarrely, you can't18:49
lkclbecause python standard AST doesn't have any18:49
lkclthe google-written lib2to3 and 3to2 parsers - which were the other choice - *do* have them18:49
lkcland the ability to parse (and insert) full and complete and accurate whitespace18:50
lkclwhich was needed in order to preserve whitespace on conversion, so that "diff -u" was the absolute bare minimum18:50
lkclit's much more complex so i went with standard python AST and astor for pywriter.18:51
lkclghostmansd-pc, bottom line, it's looking like there's a v3.0B spec error18:51
programmerjakea way to cheat on inserting parenthesis: translate the code `a * b + c` to `__PAREN__(a * b) + c` then delete the `__PAREN__` identifiers with a regex18:53
lkcla dummy function.  or just leave it in (the dummy function returns its input)18:54
lkclnice idea18:55
programmerjakeif you leave it in, name it `identity`18:55
lkclghostmansd: by printing out the output in both binary and hex when converting 400bcd it should be pretty obvious which patterns do not match19:02
lkclof course, err, one other thing, check that the tables are the reverse of each other!19:04
lkclghostmansd, i leave it with you. do regularly commit quotes "even though it doesn't work"quotes when it comes to unit tests.19:11
lkcli would like to try running the 400bcd test, to see if i can see anything19:12
lkclbut if you don't commit the test, i can't do that19:12
lkcland, thus, the entire burden of triaging is on *you*19:12
lkclif however you commit the test, others can run it and others can help investigate19:13
ghostmansdOk, I'll commi it in 5 mins19:14
ghostmansd*commit19:14
ghostmansdIt's mostly working19:14
ghostmansdWanted to complete the debug, but OK19:15
lkclyou get the general principle, though19:17
programmerjakeyou can check against https://gcc.gnu.org/onlinedocs/gccint/Decimal-float-library-routines.html19:17
programmerjakethey probably have dpd<->bcd/int tables/functions19:18
ghostmansdlkcl: committed, you might opt to @unittest.skip it19:25
ghostmansd-pcanyway, I'd like to debug it as well19:25
programmerjakeposted links to gcc's conversion tables to bug report19:46
programmerjakehopefully they'll be useful19:46
ghostmansd-pcdammit20:43
ghostmansd-pcI finally got what happens20:43
ghostmansd-pcI'm not sure whether it's right, but, granted that we have things like "cbcdtd 0,0", i.e. when the register's the same, the dst register's bits are overridden with src register20:44
ghostmansd-pcLet's consider this code20:44
ghostmansd-pc    do i = 0 to 120:45
ghostmansd-pc      n <- i * 3220:45
ghostmansd-pc      RA[n+0:n+11] <- 020:45
ghostmansd-pc      RA[n+12:n+21] <- BCD_TO_DPD ( (RS)[n+8:n+19] )20:45
ghostmansd-pc      RA[n+22:n+31] <- BCD_TO_DPD ( (RS)[n+20:n+31] )20:45
ghostmansd-pcI need to recheck it, though20:48
ghostmansd-pchttps://pastebin.com/4JSEABjc20:50
ghostmansd-pcgranted that both operands point to the same register, e.g. via "cbcdtd 0, 0", the logs I end up with are...20:51
ghostmansd-pcGMSD op_cbcdtd RS SelectableInt(value=0x400, bits=64)20:52
ghostmansd-pcGMSD op_cbcdtd RA SelectableInt(value=0x400, bits=64)20:52
ghostmansd-pcGMSD op_cbcdtd RS SelectableInt(value=0x0, bits=64)20:52
ghostmansd-pcGMSD op_cbcdtd RA SelectableInt(value=0x0, bits=64)20:52
ghostmansd-pcIs it what we want? I've been under impression these are passed as values, not references, that is, I'd have expected RS and RA point to different variables.20:54
ghostmansd-pcThe id(RS) equals id(RA), and the variable is being modified in-place.20:54
ghostmansd-pcI kinda can realize the intention why they're passed by ref, though: they point to the same register.21:01
ghostmansd-pcHowever, I'd have expected that the semantics would've been as if they're passed by value.21:02
programmerjakethe semantics should be as though they're passed by value...if not the simulator needs to be fixed21:27
programmerjake(assuming they behave like all the other instructions i've seen)21:28
ghostmansd-pcYes, my feeling's the same.21:31
ghostmansd-pcThis'd be the least suprise principle.21:31
ghostmansd-pcModifying `def op_cdtbcd(self, RA, RS)` so that it starts with these...21:31
ghostmansd-pcRA = SelectableInt(RA, RA.bits)21:31
ghostmansd-pcRS = SelectableInt(RS, RS.bits)21:31
ghostmansd-pc...obviously fixes the tests, immediately.21:32
ghostmansd-pcI suspect this must be handled at the point where we collect all the inputs into the instruction wrapper, at the point we call `info.func(self, *inputs)` at ISACaller.call().21:33
ghostmansd-pcI mean, before we invoke the call() method.21:33
ghostmansd-pcHowever, I'd like to know if everyone agrees it's correct.21:33
programmerjakeafaik SimpleV is the only ISA I've seen that doesn't just do the operation based on the input values then writes the outputs at the end only after reading all inputs21:33
programmerjakesounds good to me...though SV instructions will need to behave differently since they depend on if the inputs overlap the outputs21:35
programmerjakelkcl?21:35
programmerjakehe may have gone to sleep for the night already...21:36
ghostmansd-pcthat's OK, there's no hurry :-)21:37
ghostmansd-pcthis was an interesting thing to track and debug21:37
programmerjakemaybe just make that change as a separate commit that can be reverted if it's wrong?21:37
ghostmansd-pcThe call() method is tricky. I can do it in kinda hacky way, e.g. iterate over all inputs and, whenever I meet SelectibleInt, copy its instance.21:39
ghostmansd-pcFWIW, I, again, think, that mutability is most oftenly evil.21:39
programmerjakethat works! the simulator is basically all a pile of hacks held together using duct tape anyway21:40
ghostmansd-pcI get the rationale for allowing to modify standalone bits in SelectableInt, but still, the whole thing is caused by the fact that we modify the variable on-the-fly.21:40
programmerjakeI think `a[3:5] = ...` should have been compiled to `a = a.replace_slice(3:5, ...)`21:41
programmerjakeyay for SSA!21:42
ghostmansd-pc:-D21:42
programmerjakehttps://en.wikipedia.org/wiki/Static_single_assignment_form21:43
programmerjake> caused by the fact that we modify the variable on-the-fly.21:46
programmerjakes/variable/object/21:46
programmerjakethe variable stays pointing to the exact same thing...21:46
programmerjakeone thing Rust definitely got right...shared mutability is the bane of easy understandability and a cause of many bugs ... too bad Python's favorite mode of operation is shared mutability21:47
ghostmansd-pcother languages also drop mutability, IIRC21:48
ghostmansd-pcerlang?...21:48
ghostmansd-pc        def pass_by_value(arg):21:49
ghostmansd-pc            if isinstance(arg, SelectableInt):21:49
ghostmansd-pc                return SelectableInt(arg, arg.bits)21:49
ghostmansd-pc            return arg21:49
ghostmansd-pc21:49
ghostmansd-pc        inputs = list(map(pass_by_value, inputs))21:49
ghostmansd-pcThis thing works; I'm not sure it's worth to commit it, but at least it does the trick21:49
programmerjakeyup! Rust didn't drop it, it relegated it to its proper place .... explicitly opt-in or where unique access is guaranteed21:50
ghostmansd-pcI actually suspect there might be things other than selectable int21:50
programmerjakeyou mean like most the codebase?21:50
ghostmansd-pc(that's at caller.py ISACaller.call())21:50
ghostmansd-pcyep, I think there might be some other interesting side effects. In fact, I think even I found it occasionally, since I wanted to batch 32 instructions in the same listing to win some time.21:51
programmerjakeluckily most of the attributes are only written once avoiding the mutability pitfalls21:51
ghostmansd-pcI suspect most tests around deal with different in/out registers.21:51
ghostmansd-pcAnd don't resort to poor man's batching like `cbcdtd 0, 0`...21:52
programmerjakewell, I'm heading out to visit family, have fun!21:52
ghostmansd-pc`cbcdtd 1, 1`, `cbcdtd 2, 2`, and so on21:52
ghostmansd-pcOK, you too!21:52
programmerjake:)21:53
ghostmansd-pcI thought that the code above is a bit hacky, so I pushed it into isa-si-wa branch for now. I think the right place for this stuff should be where we access the registers. At the same time, I'm somewhat attracted to programmerjake idea that slice update should create a new instance.22:06
ghostmansd-pcPlease, let me know what you think. See you later!22:07
lkclghostmansd: the entire GPR (and FPR) is accessible from pseudocode as GPR(x) <- {somevalue}22:26
lkclso it has to be supported to be able to update the GPR *and* pass in values22:27
lkclgood find btw22:32
lkclall other pseudocode returns a *replacement*22:32
lkclby doing e.g. RT <- {somevalue}22:32
lkcli'll have to run a full suite of tests, this is an extremely low-level change22:34
lkclprogrammerjake: that's incorrect.  it does... however it does them based on each element being its own fully-independent instruction with its own hazards.22:36
lkclthe reason why RVV or other Cray-style Vector ISAs cannot possibly do that is because the instructions are designed specifically based around the *vector register* as being "the atomically-updated type"22:37
lkclwhereas SVP64's Sub-Program-Counter, is exactly as it says: a *Sub-Program-Counter*, and therefore is literally sitting between issue and execute, blatting instructions out *in Program Order*22:38
lkclthere is literally no difference between (VL=2) "sv.addi r0.v, r4,v, 5"22:40
lkcland22:40
lkcladdi r0, r4, 522:40
lkcladdi r1, r5, 522:40
lkclthe two are DIRECTLY EQUIVALENT AND EXACTLY THE SAME.22:40
lkclyou have a choice.22:40
lkcl1) use the two v3.0B 32-bit addi instructions22:41
lkcl2) use the sv.addi instruction22:41
lkclthe behaviour AAABBBSSSOOOLLUUUTEELLLLYYY MMMMMUUUUSSSSTTTTT be the same22:41
lkclthis is a hard, inviolate and absolute requirement of SV.22:42
lkclthe only thing that "looks" like it isn't following that rule is the twin in-place FFT/DCT instructions22:42
lkcland the only reason it _looks_ like that hard inviolate rule is not being followed is because they are 2-operand-in 2-operand-out instructions22:43
lkclif you do not understand this then you have fundamentally misunderstood the design principles of SV for three years.22:44
lkclSVP64 loops issue... instructions... in.... Program... Order.22:44
lkclyou don't jump about with a Program Counter or "think it strange that instructions get executed with register hazards being respected" on a scalar processor?22:45
lkclso why with a *Sub*-Program-Counter would it be any different?22:46
lkclwhat you are finding "strange" is not in the least bit strange.22:46
lkclyou22:46
lkclhave22:46
lkclto22:46
lkclrespect22:46
lkclregister22:46
lkclhazards.22:46
lkclwhat you are suggesting is, effectively, to completely ignore register hazards for VL loops.22:47
lkclallowing outputs to overwrite (destroy) inputs in any arbitrary order22:47
lkclthat *would* be extremely strange: it fundamentally violates common sense of practical processor architectures22:48
lkclwhich, sigh, some ISAs in the past actually allow.22:48
lkclSPARC i believe had fixed-length pipelines for e.g. MUL, and *zero* register hazard protection on branches.22:49
lkclearly compilers "fixed" that by (sigh) padding with nops.22:49
lkcltests run, all good22:54
lkclbcd test runs in 104 seconds23:06
lkclwith the batching23:06
lkcli added the "with self.subTest()" back in23:07
lkclso, conclusion (whew), v3.0B spec is ok23:10
lkclnice work23:11

Generated by irclog2html.py 2.17.1 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!