ghostmansd-pc | lkcl: wow, you changed the test, thank you! | 13:54 |
---|---|---|
ghostmansd-pc | if I got correctly, we have two things left: batching 32 instructions into the single command, and checking addg6s, right? | 13:55 |
justinrestivo[m] | lkcl: I've gone through the installation scripts here: https://libre-soc.org/HDL_workflow/devscripts/. Is there documentation somewhere on how to run tests (or where the tests live)? | 15:32 |
lkcl | ghostmansd-pc, yes | 15:51 |
lkcl | justinrestivo[m], in theeoorryyy it's just "make test" or python3 setup.py test or "nosetests3" | 15:51 |
lkcl | jn: you went through the list recently? | 15:52 |
lkcl | basically, it's python, therefore it's unittests, therefore go figure | 15:52 |
lkcl | *however* | 15:52 |
lkcl | bear in mind that some of the unit tests will take *several hours or even days to complete* | 15:52 |
lkcl | particularly the IEEE754 FP ones | 15:52 |
lkcl | so it's not something you arbitrarily run as part of the actual installation "because that's what every other package does" | 15:53 |
lkcl | you run the unit tests *specifically* to actually confirm the functionality, which may potentially take one week even on high-end hardware | 15:54 |
Madan_Kartheessa | Good evening all | 17:18 |
Madan_Kartheessa | This is Madan from Object Automation Software Solutions Private Limited | 17:19 |
*** Madan_Kartheessa <Madan_Kartheessa!~Madan_Kar@45.251.35.104> has left #libre-soc | 17:20 | |
*** lx0 is now known as lxo | 17:32 | |
justinrestivo[m] | Gotcha. And this is on a per repo basis for the repos cloned into src? | 17:35 |
ghostmansd-pc | I'm currently completing the refactoring. Based on what I can see, numbers like 400 in BCD convert into DPD incorrectly. I'm still checking the reason, but the code seems to be 1:1 to specs. | 17:44 |
ghostmansd-pc | I'll also check the result for an alternate formula, just to be sure. | 17:44 |
ghostmansd-pc | Both formulas act the same way. 400bcd should have been converted to 200dpd, but it's not, it becomes 0. | 17:53 |
lkcl | hi Madan, great to hear from you | 18:14 |
lkcl | justinrestivo[m]: yes, basically, the way python works, anything starting "test_" is auto-run by the standard python unit test infrastructure | 18:14 |
justinrestivo[m] | Perfect, thanks | 18:15 |
lkcl | thus, the questions you are asking are "the answer is the standard answer for all python projects" | 18:15 |
lkcl | basically, however, because we use standardpython unit test infrastructure, it's strongly inadviseable to run the tests as part of the install process | 18:16 |
lkcl | ghostmansd-pc, there's no guarantee that the v3.0B specification pseudocode is actually correct. | 18:16 |
lkcl | if it isn't, that's actually really good because, once again we've found that the specification is wrong. | 18:17 |
lkcl | HOWEVER | 18:17 |
lkcl | bear in mind, we have those brackets missing | 18:17 |
lkcl | it MIGHT be the case that pywriter, because it is not inserting brackets, gets the order wrong | 18:18 |
programmerjake | it might also be the case that qemu is wrong | 18:18 |
lkcl | programmerjake: this is testing against tables listed in v3.0B | 18:18 |
programmerjake | ah, ok | 18:19 |
lkcl | ghostmansd-pc, what i recommend is, create a "test_bcdblahblah_regression" which just has the "wrong" values | 18:19 |
lkcl | and change "def test_" to "def notest_" as usual. this will save you a lot of time investigating | 18:19 |
programmerjake | check against ieee 754 (ever since ieee 854 was merged), that's where the bcd from/to dpd comes from | 18:20 |
ghostmansd-pc | https://pastebin.com/vhj95QR5 | 18:21 |
ghostmansd-pc | the output is: SelectableInt(value=0x400, bits=12) SelectableInt(value=0x0, bits=12) SelectableInt(value=0x400, bits=12) | 18:21 |
ghostmansd-pc | What am I missing? | 18:22 |
lkcl | urrrr... | 18:22 |
ghostmansd-pc | oops, wrong paste | 18:23 |
ghostmansd-pc | let me re-check it | 18:23 |
lkcl | what is BCD set to? | 18:23 |
ghostmansd-pc | SelectableInt(1024, 64) | 18:23 |
ghostmansd-pc | I think I see the problem... | 18:23 |
lkcl | that's a 64-bit int | 18:24 |
ghostmansd-pc | for BCD, it takes `getitem 62 64 0x400 0` | 18:24 |
lkcl | therefore BCD[0] is 1024 & (1<<63) | 18:24 |
lkcl | BCD[1] is 1024 & (1<<62) | 18:24 |
lkcl | ... | 18:24 |
ghostmansd-pc | and for 12-bit, it takes `getitem 10 12 0x400 1` | 18:24 |
lkcl | ... | 18:24 |
lkcl | if BCD = SelectableInt(1024,12) you have what you are expecting | 18:24 |
ghostmansd-pc | the thing is, in tests, it comes from the register... | 18:25 |
lkcl | when BCD=SelectableInt(1024, >>>>>12 NOT 64<<<<<<), BCD[0] = ... | 18:25 |
lkcl | ok then the offset will need to be (63-12-0) | 18:25 |
ghostmansd-pc | yuck | 18:25 |
lkcl | 63-12-1 | 18:25 |
lkcl | or something like that | 18:25 |
ghostmansd-pc | yep | 18:25 |
ghostmansd-pc | seems like this | 18:25 |
programmerjake | bitslice the register from 64 to 12 bits | 18:25 |
lkcl | welcome to Power ISA spec. | 18:25 |
lkcl | sigh | 18:25 |
ghostmansd-pc | really an awesome sh*t, lost about two hours | 18:26 |
ghostmansd-pc | and what funny, it broke only around 400bcd | 18:26 |
lkcl | wait... hang on.... | 18:26 |
lkcl | RA[n+8:n+19 ] <- DPD_TO_BCD ( (RS)[n+12:n+21] ) | 18:26 |
programmerjake | afaict: (RA)[52:63] | 18:26 |
lkcl | https://libre-soc.org/openpower/isa/bcd/ | 18:27 |
lkcl | 63-52+1=21 | 18:27 |
lkcl | 63-52+1=12 | 18:27 |
lkcl | the SelectableInt being passed in to DPD_TO_BCD will *not* be a 64-bit value, it will be a *12* bit SelectableInt | 18:28 |
ghostmansd-pc | hm, I'll check it more | 18:28 |
ghostmansd-pc | perhaps at the point where we make the call | 18:28 |
lkcl | so, that's why i said: create a regression test "def test_bcd_regression_400()" | 18:28 |
ghostmansd-pc | ok | 18:28 |
lkcl | specifically testing exactly that one and only that one value | 18:28 |
lkcl | then, also, put print() statements into BCD_TO_DPD | 18:29 |
ghostmansd-pc | yep, I understand :-) | 18:29 |
lkcl | (just edit the pyparser-generated python output file) | 18:29 |
ghostmansd-pc | that's how I've been doing it, even though I copy-pasted it | 18:29 |
lkcl | really, we need brackets inserted by the parser | 18:30 |
ghostmansd-pc | you mean parentheses? | 18:32 |
ghostmansd-pc | kinda clearing the meaning of `k = q & ~s & ~t & v & w | t & v & ~w & x | q & v & w & ~x | x & ~v`, right? | 18:32 |
programmerjake | hmm, that seems obvious to me... ~ has highest precedence, then & then | | 18:33 |
programmerjake | (if the spec designers did the precedence differently, they're annoying) | 18:34 |
lkcl | formal descriptions of instructions. RTL notation not | 18:36 |
lkcl | summarized here should be self-explanatory. | 18:36 |
lkcl | v3.0B p6. | 18:36 |
lkcl | sigh | 18:36 |
programmerjake | so you get `k = (q & (~s) & (~t) & v & w) | (t & v & (~w) & x) | (q & v & w & (~x)) | (x & (~v))` | 18:37 |
ghostmansd-pc | I don't really think one should really think a lot of precedence upon reading. | 18:37 |
ghostmansd-pc | Whenever things can be made explicit, IMHO they should. | 18:37 |
lkcl | yeah i get operator precedence wrong. | 18:39 |
lkcl | this needs checking. | 18:40 |
lkcl | python AST it is the lexer which specifies the precedence | 18:41 |
lkcl | sorry, LALR parser | 18:41 |
programmerjake | well, no one I've met seems to think the arithmetic equivalent needs parenthesis: `k = q * -s * -t * v * w + t * v * -w * x + q * v * w * -x + x * -v` | 18:41 |
lkcl | which then generates python AST | 18:41 |
lkcl | programmerjake: that's hopelessly unclear to me | 18:41 |
programmerjake | ok... | 18:42 |
lkcl | without brackets i'm unable to visually parse it | 18:42 |
lkcl | the -x no problem (because the "-" is right next to the "x") | 18:42 |
lkcl | * and +, because they're separated, it's unreadable. | 18:43 |
lkcl | if it was a shorter expression, no problem. | 18:43 |
lkcl | that's very interesting | 18:46 |
lkcl | +x <- (y * 5) + 3 | 18:47 |
lkcl | +y <- (z + 5) * 3 | 18:47 |
lkcl | $ python3 decoder/power_pseudo.py | 18:47 |
lkcl | produces: | 18:47 |
lkcl | x = y * 5 + 3 | 18:47 |
lkcl | y = (z + 5) * 3 | 18:47 |
lkcl | https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=f89e15664c9fdeb4ce257c9b72fcd151dc60dbd1 | 18:47 |
lkcl | ghostmansd-pc, & | 18:47 |
lkcl | ^ | 18:47 |
lkcl | so we may reeeeeasonably assume that astor is capable of outputting code that respects the operator precedence of its input | 18:48 |
programmerjake | that's because your parser discards parenthesis, and the AST printer only inserts them where strictly necessary ... I'd expect you can add parenthesis ast nodes if you like | 18:49 |
lkcl | and, on discovering something that does *not* match operator precedence [e.g. "(z + 5) * 3"] will correctly insert the necessary brackets | 18:49 |
lkcl | bizarrely, you can't | 18:49 |
lkcl | because python standard AST doesn't have any | 18:49 |
lkcl | the google-written lib2to3 and 3to2 parsers - which were the other choice - *do* have them | 18:49 |
lkcl | and the ability to parse (and insert) full and complete and accurate whitespace | 18:50 |
lkcl | which was needed in order to preserve whitespace on conversion, so that "diff -u" was the absolute bare minimum | 18:50 |
lkcl | it's much more complex so i went with standard python AST and astor for pywriter. | 18:51 |
lkcl | ghostmansd-pc, bottom line, it's looking like there's a v3.0B spec error | 18:51 |
programmerjake | a way to cheat on inserting parenthesis: translate the code `a * b + c` to `__PAREN__(a * b) + c` then delete the `__PAREN__` identifiers with a regex | 18:53 |
lkcl | a dummy function. or just leave it in (the dummy function returns its input) | 18:54 |
lkcl | nice idea | 18:55 |
programmerjake | if you leave it in, name it `identity` | 18:55 |
lkcl | ghostmansd: by printing out the output in both binary and hex when converting 400bcd it should be pretty obvious which patterns do not match | 19:02 |
lkcl | of course, err, one other thing, check that the tables are the reverse of each other! | 19:04 |
lkcl | ghostmansd, i leave it with you. do regularly commit quotes "even though it doesn't work"quotes when it comes to unit tests. | 19:11 |
lkcl | i would like to try running the 400bcd test, to see if i can see anything | 19:12 |
lkcl | but if you don't commit the test, i can't do that | 19:12 |
lkcl | and, thus, the entire burden of triaging is on *you* | 19:12 |
lkcl | if however you commit the test, others can run it and others can help investigate | 19:13 |
ghostmansd | Ok, I'll commi it in 5 mins | 19:14 |
ghostmansd | *commit | 19:14 |
ghostmansd | It's mostly working | 19:14 |
ghostmansd | Wanted to complete the debug, but OK | 19:15 |
lkcl | you get the general principle, though | 19:17 |
programmerjake | you can check against https://gcc.gnu.org/onlinedocs/gccint/Decimal-float-library-routines.html | 19:17 |
programmerjake | they probably have dpd<->bcd/int tables/functions | 19:18 |
ghostmansd | lkcl: committed, you might opt to @unittest.skip it | 19:25 |
ghostmansd-pc | anyway, I'd like to debug it as well | 19:25 |
programmerjake | posted links to gcc's conversion tables to bug report | 19:46 |
programmerjake | hopefully they'll be useful | 19:46 |
ghostmansd-pc | dammit | 20:43 |
ghostmansd-pc | I finally got what happens | 20:43 |
ghostmansd-pc | I'm not sure whether it's right, but, granted that we have things like "cbcdtd 0,0", i.e. when the register's the same, the dst register's bits are overridden with src register | 20:44 |
ghostmansd-pc | Let's consider this code | 20:44 |
ghostmansd-pc | do i = 0 to 1 | 20:45 |
ghostmansd-pc | n <- i * 32 | 20:45 |
ghostmansd-pc | RA[n+0:n+11] <- 0 | 20:45 |
ghostmansd-pc | RA[n+12:n+21] <- BCD_TO_DPD ( (RS)[n+8:n+19] ) | 20:45 |
ghostmansd-pc | RA[n+22:n+31] <- BCD_TO_DPD ( (RS)[n+20:n+31] ) | 20:45 |
ghostmansd-pc | I need to recheck it, though | 20:48 |
ghostmansd-pc | https://pastebin.com/4JSEABjc | 20:50 |
ghostmansd-pc | granted that both operands point to the same register, e.g. via "cbcdtd 0, 0", the logs I end up with are... | 20:51 |
ghostmansd-pc | GMSD op_cbcdtd RS SelectableInt(value=0x400, bits=64) | 20:52 |
ghostmansd-pc | GMSD op_cbcdtd RA SelectableInt(value=0x400, bits=64) | 20:52 |
ghostmansd-pc | GMSD op_cbcdtd RS SelectableInt(value=0x0, bits=64) | 20:52 |
ghostmansd-pc | GMSD op_cbcdtd RA SelectableInt(value=0x0, bits=64) | 20:52 |
ghostmansd-pc | Is it what we want? I've been under impression these are passed as values, not references, that is, I'd have expected RS and RA point to different variables. | 20:54 |
ghostmansd-pc | The id(RS) equals id(RA), and the variable is being modified in-place. | 20:54 |
ghostmansd-pc | I kinda can realize the intention why they're passed by ref, though: they point to the same register. | 21:01 |
ghostmansd-pc | However, I'd have expected that the semantics would've been as if they're passed by value. | 21:02 |
programmerjake | the semantics should be as though they're passed by value...if not the simulator needs to be fixed | 21:27 |
programmerjake | (assuming they behave like all the other instructions i've seen) | 21:28 |
ghostmansd-pc | Yes, my feeling's the same. | 21:31 |
ghostmansd-pc | This'd be the least suprise principle. | 21:31 |
ghostmansd-pc | Modifying `def op_cdtbcd(self, RA, RS)` so that it starts with these... | 21:31 |
ghostmansd-pc | RA = SelectableInt(RA, RA.bits) | 21:31 |
ghostmansd-pc | RS = SelectableInt(RS, RS.bits) | 21:31 |
ghostmansd-pc | ...obviously fixes the tests, immediately. | 21:32 |
ghostmansd-pc | I suspect this must be handled at the point where we collect all the inputs into the instruction wrapper, at the point we call `info.func(self, *inputs)` at ISACaller.call(). | 21:33 |
ghostmansd-pc | I mean, before we invoke the call() method. | 21:33 |
ghostmansd-pc | However, I'd like to know if everyone agrees it's correct. | 21:33 |
programmerjake | afaik SimpleV is the only ISA I've seen that doesn't just do the operation based on the input values then writes the outputs at the end only after reading all inputs | 21:33 |
programmerjake | sounds good to me...though SV instructions will need to behave differently since they depend on if the inputs overlap the outputs | 21:35 |
programmerjake | lkcl? | 21:35 |
programmerjake | he may have gone to sleep for the night already... | 21:36 |
ghostmansd-pc | that's OK, there's no hurry :-) | 21:37 |
ghostmansd-pc | this was an interesting thing to track and debug | 21:37 |
programmerjake | maybe just make that change as a separate commit that can be reverted if it's wrong? | 21:37 |
ghostmansd-pc | The call() method is tricky. I can do it in kinda hacky way, e.g. iterate over all inputs and, whenever I meet SelectibleInt, copy its instance. | 21:39 |
ghostmansd-pc | FWIW, I, again, think, that mutability is most oftenly evil. | 21:39 |
programmerjake | that works! the simulator is basically all a pile of hacks held together using duct tape anyway | 21:40 |
ghostmansd-pc | I get the rationale for allowing to modify standalone bits in SelectableInt, but still, the whole thing is caused by the fact that we modify the variable on-the-fly. | 21:40 |
programmerjake | I think `a[3:5] = ...` should have been compiled to `a = a.replace_slice(3:5, ...)` | 21:41 |
programmerjake | yay for SSA! | 21:42 |
ghostmansd-pc | :-D | 21:42 |
programmerjake | https://en.wikipedia.org/wiki/Static_single_assignment_form | 21:43 |
programmerjake | > caused by the fact that we modify the variable on-the-fly. | 21:46 |
programmerjake | s/variable/object/ | 21:46 |
programmerjake | the variable stays pointing to the exact same thing... | 21:46 |
programmerjake | one thing Rust definitely got right...shared mutability is the bane of easy understandability and a cause of many bugs ... too bad Python's favorite mode of operation is shared mutability | 21:47 |
ghostmansd-pc | other languages also drop mutability, IIRC | 21:48 |
ghostmansd-pc | erlang?... | 21:48 |
ghostmansd-pc | def pass_by_value(arg): | 21:49 |
ghostmansd-pc | if isinstance(arg, SelectableInt): | 21:49 |
ghostmansd-pc | return SelectableInt(arg, arg.bits) | 21:49 |
ghostmansd-pc | return arg | 21:49 |
ghostmansd-pc | 21:49 | |
ghostmansd-pc | inputs = list(map(pass_by_value, inputs)) | 21:49 |
ghostmansd-pc | This thing works; I'm not sure it's worth to commit it, but at least it does the trick | 21:49 |
programmerjake | yup! Rust didn't drop it, it relegated it to its proper place .... explicitly opt-in or where unique access is guaranteed | 21:50 |
ghostmansd-pc | I actually suspect there might be things other than selectable int | 21:50 |
programmerjake | you mean like most the codebase? | 21:50 |
ghostmansd-pc | (that's at caller.py ISACaller.call()) | 21:50 |
ghostmansd-pc | yep, I think there might be some other interesting side effects. In fact, I think even I found it occasionally, since I wanted to batch 32 instructions in the same listing to win some time. | 21:51 |
programmerjake | luckily most of the attributes are only written once avoiding the mutability pitfalls | 21:51 |
ghostmansd-pc | I suspect most tests around deal with different in/out registers. | 21:51 |
ghostmansd-pc | And don't resort to poor man's batching like `cbcdtd 0, 0`... | 21:52 |
programmerjake | well, I'm heading out to visit family, have fun! | 21:52 |
ghostmansd-pc | `cbcdtd 1, 1`, `cbcdtd 2, 2`, and so on | 21:52 |
ghostmansd-pc | OK, you too! | 21:52 |
programmerjake | :) | 21:53 |
ghostmansd-pc | I thought that the code above is a bit hacky, so I pushed it into isa-si-wa branch for now. I think the right place for this stuff should be where we access the registers. At the same time, I'm somewhat attracted to programmerjake idea that slice update should create a new instance. | 22:06 |
ghostmansd-pc | Please, let me know what you think. See you later! | 22:07 |
lkcl | ghostmansd: the entire GPR (and FPR) is accessible from pseudocode as GPR(x) <- {somevalue} | 22:26 |
lkcl | so it has to be supported to be able to update the GPR *and* pass in values | 22:27 |
lkcl | good find btw | 22:32 |
lkcl | all other pseudocode returns a *replacement* | 22:32 |
lkcl | by doing e.g. RT <- {somevalue} | 22:32 |
lkcl | i'll have to run a full suite of tests, this is an extremely low-level change | 22:34 |
lkcl | programmerjake: that's incorrect. it does... however it does them based on each element being its own fully-independent instruction with its own hazards. | 22:36 |
lkcl | the reason why RVV or other Cray-style Vector ISAs cannot possibly do that is because the instructions are designed specifically based around the *vector register* as being "the atomically-updated type" | 22:37 |
lkcl | whereas SVP64's Sub-Program-Counter, is exactly as it says: a *Sub-Program-Counter*, and therefore is literally sitting between issue and execute, blatting instructions out *in Program Order* | 22:38 |
lkcl | there is literally no difference between (VL=2) "sv.addi r0.v, r4,v, 5" | 22:40 |
lkcl | and | 22:40 |
lkcl | addi r0, r4, 5 | 22:40 |
lkcl | addi r1, r5, 5 | 22:40 |
lkcl | the two are DIRECTLY EQUIVALENT AND EXACTLY THE SAME. | 22:40 |
lkcl | you have a choice. | 22:40 |
lkcl | 1) use the two v3.0B 32-bit addi instructions | 22:41 |
lkcl | 2) use the sv.addi instruction | 22:41 |
lkcl | the behaviour AAABBBSSSOOOLLUUUTEELLLLYYY MMMMMUUUUSSSSTTTTT be the same | 22:41 |
lkcl | this is a hard, inviolate and absolute requirement of SV. | 22:42 |
lkcl | the only thing that "looks" like it isn't following that rule is the twin in-place FFT/DCT instructions | 22:42 |
lkcl | and the only reason it _looks_ like that hard inviolate rule is not being followed is because they are 2-operand-in 2-operand-out instructions | 22:43 |
lkcl | if you do not understand this then you have fundamentally misunderstood the design principles of SV for three years. | 22:44 |
lkcl | SVP64 loops issue... instructions... in.... Program... Order. | 22:44 |
lkcl | you don't jump about with a Program Counter or "think it strange that instructions get executed with register hazards being respected" on a scalar processor? | 22:45 |
lkcl | so why with a *Sub*-Program-Counter would it be any different? | 22:46 |
lkcl | what you are finding "strange" is not in the least bit strange. | 22:46 |
lkcl | you | 22:46 |
lkcl | have | 22:46 |
lkcl | to | 22:46 |
lkcl | respect | 22:46 |
lkcl | register | 22:46 |
lkcl | hazards. | 22:46 |
lkcl | what you are suggesting is, effectively, to completely ignore register hazards for VL loops. | 22:47 |
lkcl | allowing outputs to overwrite (destroy) inputs in any arbitrary order | 22:47 |
lkcl | that *would* be extremely strange: it fundamentally violates common sense of practical processor architectures | 22:48 |
lkcl | which, sigh, some ISAs in the past actually allow. | 22:48 |
lkcl | SPARC i believe had fixed-length pipelines for e.g. MUL, and *zero* register hazard protection on branches. | 22:49 |
lkcl | early compilers "fixed" that by (sigh) padding with nops. | 22:49 |
lkcl | tests run, all good | 22:54 |
lkcl | bcd test runs in 104 seconds | 23:06 |
lkcl | with the batching | 23:06 |
lkcl | i added the "with self.subTest()" back in | 23:07 |
lkcl | so, conclusion (whew), v3.0B spec is ok | 23:10 |
lkcl | nice work | 23:11 |
Generated by irclog2html.py 2.17.1 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!