Wednesday, 2022-06-22

*** kylel1 is now known as kylel [07:08]
<lkcl> markos, mornin. i added those abs-diff-accumulate etc. scalar instructions we talked about a couple weeks ago (?)
<markos> lkcl, yes, I read, this is great, sorry for being afk, I should be more active soon :) [09:54]
<lkcl> markos, no problem. [10:50]
<lkcl> ghostmansd[m], for the description and links-to-spec that was suggested to be added i created a little blurb here
<lkcl> i'm going to do the parallel prefix iteration next [10:54]
<lkcl> which would allow to do "result = i.x * i.y * i.z * i.w" easily [10:54]
<lkcl> octavius, mornin [11:45]
<lkcl> octavius, after programmerjake kindly noted the bitmanip ops from x86 i'm going to redo those as a batch of *24* instructions in one hit (actually, just one) [11:46]
<lkcl> but there's one dead-easy one that you could do in the meantime whilst i'm doing that - cprop [11:47]
<lkcl> i've updated with everything you need. [11:47]
<lkcl> the pseudocode is a dead-simple one-liner (ok ok 3) [11:47]
<octavius> Morning lkcl, will do [11:48]
<lkcl> there's aaallmost no thought involved, here. this will leave you a leeetle... "out-of-place" right up until you actually add the unit test and run it [11:49]
<lkcl> at which point you should experience an "ah ha!" moment :) [11:50]
<lkcl> but up until that point it maaay feel a little "what am i doooiiiing", which i experience all the time on these [11:50]
<lkcl> you captured the chat logs from yesterday? [11:51]
* lkcl afk keeping an eye out on irclogs [11:51]
<octavius> Yes I have the chat log saved. It mentions PyWriter, do I need that right now? [12:02]
<octavius> Also I don't seem to have write rw permissions for openpower-isa [12:10]
<lkcl> octavius, ah brilliant, do drop them into the bugreport [12:52]
<lkcl> octavius, done, added [12:53]
<octavius> The chat from our call is quite long, do you just want the instructions at the end? [12:53]
<octavius> No, I don't have write permission [13:07]
<lkcl> repo openpower-isa [13:35]
<lkcl> -    RW+     =   admin adminbigmac addw addw2  cesar oliva jacob1 jacob2 vklr tobias lauri dmitry3mdeb klehman mikolaj markos tpearson ghostmansd [13:35]
<lkcl> +    RW+     =   admin adminbigmac addw addw2  cesar oliva jacob1 jacob2 vklr tobias lauri dmitry3mdeb klehman mikolaj markos tpearson ghostmansd andreym [13:35]
<lkcl> your key rohdo@firedragon has always been present and active [13:35]
<lkcl> sigh, helps if i do "git push", eh [13:36]
<octavius> PyWriter now breaks because I added OP_CPROP to minor_22.csv [13:40]
<octavius> Where do I need to add the definition? [13:41]
<lkcl> in power_enums.py [13:42]
<lkcl> look at the patch sets (diffs) from yesterday [13:43]
<lkcl> *all* of those things need to be done (equivalents-of) before it "works" [13:43]
<lkcl> class MicrOp(Enum): [13:44]
<lkcl>     OP_ABSDIFF = 91 [13:44]
<lkcl>     OP_ABSADD = 92 [13:44]
<lkcl>     OP_CPROP=..... [13:44]
<lkcl> # supported instructions: make sure to keep up-to-date with CSV files [13:44]
<lkcl> # just like everything else [13:44]
<lkcl> _insns = [ [13:44]
<lkcl>     "NONE", "add", "addc", "addco", "adde", "addeo", [13:44]
<lkcl>   "cprop", # AV bitmanip [13:44]
<lkcl> octavius, so the way that big-integer math works is [13:46]
<lkcl> * you do a sequence of 64-bit *non-carrying* adds [13:46]
<lkcl>   (in parallel) [13:46]
<lkcl> * whilst you're doing that you analyse A and B to see if they *would* carry [13:47]
<lkcl>   (because if A==0xffffffffff and B=0xfffffffff then that's going to "propagate" a carry-bit, isn't it?) [13:47]
<lkcl> * you drop all the "Propagate-Carry-to-next-64-bit element" and "Generate-Carry-to-next-64-bit element" bits into that funny-looking algorithm [13:48]
<lkcl>    (P|G+G)^P [13:48]
<lkcl> * that tells you whether you need to "add one" on a per-element basis [13:48]
<lkcl> you then go *back* to the parallel set of (independent) 64-bit element adds that you did earlier [13:49]
<lkcl> and you add 1 to each where the P/G-calculation says so [13:49]
<lkcl> therefore what we can do is: use that cprop instruction to compute a predicate mask [13:49]
<lkcl> then do [13:49]
<lkcl>     sv.addi/sm=r3 rt.v, ra.v, 1 [13:50]
<lkcl> and, ta-daaa, we've done a parallel-copy-propagate big-integer add [13:50]
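The steps lkcl lists above can be sketched in plain Python. This is only an illustration under stated assumptions, not the cprop implementation: limbs are 64-bit and stored least-significant first, the names `bigint_add_limbs` and `MASK64` are made up for the example, and G and P are taken as mutually exclusive per limb (G when the limb sum overflows, P when it is exactly all-ones), with the mask formula grouped as ((P|G)+G)^P.

```python
MASK64 = (1 << 64) - 1  # all-ones 64-bit limb (hypothetical helper name)

def bigint_add_limbs(a, b):
    """Add two equal-length lists of 64-bit limbs, least-significant first.

    Returns (result_limbs, carry_out). Illustrative sketch only.
    """
    n = len(a)
    # Step 1: independent (parallelisable) limb adds, carries ignored.
    full = [x + y for x, y in zip(a, b)]
    sums = [s & MASK64 for s in full]
    # Step 2: per-limb Generate / Propagate bits, one mask bit per limb.
    g = 0
    p = 0
    for i, s in enumerate(full):
        if s > MASK64:       # limb overflowed: it generates a carry
            g |= 1 << i
        elif s == MASK64:    # limb is all-ones: it passes a carry along
            p |= 1 << i
    # Step 3: the mask formula; bit i says "limb i receives a carry-in".
    carry_in = ((p | g) + g) ^ p
    # Step 4: add 1 to exactly the limbs the mask selects.
    result = [(s + ((carry_in >> i) & 1)) & MASK64 for i, s in enumerate(sums)]
    carry_out = (carry_in >> n) & 1
    return result, carry_out
```

Step 4 is the part that the predicated `sv.addi` in the log would perform, with the mask from step 3 as the predicate.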
<lkcl> here's an instruction which does actually use CA/CA32 [13:52]
<lkcl> you recall i said cut/paste-cookie-cut-style "maxs"? [13:52]
<lkcl> if you compare maxs pseudocode to addic pseudocode [13:52]
<lkcl> you'll see that maxs *doesn't* include "Special Registers CA CA32" [13:52]
<lkcl> and when i said "cookie-cut verbatim maxs" [13:53]
<lkcl> i didn't also say [13:53]
<lkcl> "cookie-cut verbatim maxs and then add CA/CA32 to Special Registers", did i? :) [13:53]
<lkcl> (if that's what was needed i would have said "cookie-cut addic not maxs") [13:53]
<lkcl> or, more probably, addc [13:54]
<octavius> I thought the pseudo-code I need is the one in the bug. How does maxs resemble cprop? [13:54]
<octavius> Yeah, I already copied the structure in the mdwn file [13:56]
<lkcl> this looks good [13:57]
<lkcl> 0110001 110 [13:58]
<lkcl>  0110001110 [13:58]
<lkcl> yep also good! [13:58]
<lkcl> btw the priority-order is to add to *first* [13:59]
<lkcl> do that as quickly as you can because you've just broken the repo [13:59]
<lkcl> hard rule: never push commits that cause the master branch to fail for absolutely everyone [13:59]
<lkcl> you should have waited until you'd added... [13:59]
<lkcl> ah, you did :) [14:00]
<lkcl> belay that. all good [14:00]
<octavius> I added cprop to all the places but the test cases, can you check that the params are correct? [14:07]
<octavius> In the, I just copied the numbers after the instr: 'cprop 3,12,5', however I don't know what they mean [14:09]
<octavius> Oooh, the test cases are making some sense, I'll give that a try [14:22]
<lkcl> ok so to make sense of the stuff in you need to look (in this case) at X-Form [14:23]
<lkcl> which, errr, i'll just cut/paste from fields.txt now.. [14:24]
* lkcl tum-te-tum... [14:24]
<lkcl> btw if you want to save yourself some time use this trick in
<lkcl> :%s/def case_/def cse_/g [14:27]
<lkcl> then enable *only* the one you want to see [14:28]
<lkcl> $ python3 decoder/isa/ >& /tmp/f [14:28]
<lkcl> and take a look at the output to make sure the answers are correct [14:28]
<lkcl> then *BEFORE* committing do [14:28]
<lkcl> $ git diff [14:28]
<lkcl> note that there's a bunch of case/cse stuff [14:28]
<lkcl> :%s/def cse_/def case_/g [14:28]
<lkcl> $ git diff again [14:28]
<lkcl> note that there's a lot less crap [14:29]
<lkcl> _then_ do a commit :) [14:29]
<octavius> Why are you finding and replacing? There are no "cse_" functions in av_test_cases.py [14:31]
<lkcl> there are if you do this, first: "<lkcl> :%s/def case_/def cse_/g" [14:33]
<lkcl> basically it's a rapid way to disable a bunch of tests you have absolutely no interest whatsoever in running [14:33]
<octavius> why though? [14:34]
<lkcl> to make (a) a quicker development cycle and [14:34]
<octavius> Ah, so the test suite searches for "case_" prefixed tests and runs them? [14:34]
<lkcl> (b) so you're not hunting through thousands of lines of output from the test desperately looking for the one you're actually interested in [14:34]
<octavius> Should've just said that XD [14:34]
<octavius> Will do [14:34]
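Why the rename trick works can be shown with a toy prefix-based collector. This is a hypothetical stand-in, not the actual openpower-isa test harness: `DemoCases`, `collect_cases` and `run_cases` are invented names, and the only behaviour assumed from the log is that methods are picked up by their "case_" prefix.

```python
class DemoCases:
    """Toy test-case holder (hypothetical, for illustration only)."""

    def case_cprop(self):
        # Still has the "case_" prefix, so it gets collected and run.
        return "cprop ran"

    def cse_maxs(self):
        # Renamed to "cse_": the collector no longer sees it.
        return "maxs ran"


def collect_cases(obj, prefix="case_"):
    """Return the names of callable attributes that start with the prefix."""
    return sorted(
        name for name in dir(obj)
        if name.startswith(prefix) and callable(getattr(obj, name))
    )


def run_cases(obj):
    """Run every collected case and map its name to its return value."""
    return {name: getattr(obj, name)() for name in collect_cases(obj)}
```

With this collector, `run_cases(DemoCases())` only runs `case_cprop`; renaming `cse_maxs` back to `case_maxs` would re-enable it, which is what the second `:%s` substitution does before committing.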
<lkcl> sorry :) [14:35]
<lkcl> btw "G" stands for "Generate a Carry" [14:35]
<lkcl> and "P" stands for "Propagate a Carry [to the next element]" [14:36]
<lkcl> so in the tests you do, you'll need to *begin* a carry-bit by setting a LSB in RB(?) [14:36]
<lkcl> then have a few bits (including the one where "G" started) set to 1 [14:36]
<lkcl> and you shouuuuld end up with that bit being thrown forward, matching the "P" bits, until a P bit goes zero [14:37]
<lkcl> *should* produce the output... err... [14:38]
<lkcl> something like that [14:38]
<lkcl> G = 0b000010 [14:38]
<lkcl> P = 0b000111 [14:38]
<lkcl> should produce [14:38]
<lkcl> C = 0b00110 [14:39]
<lkcl> C = 0b001110 [14:39]
<lkcl> because the carry *starts* from the G bit [14:39]
<lkcl> and *ends* when a P(ropagate) drops to zero [14:39]
<lkcl> you'll soon get the hang of it [14:39]
<lkcl> btw feel free to hand-edit src/openpower/decoder/isa/av.py [14:40]
<octavius> Is "^" the exclusive-OR operator? [14:40]
<lkcl> to put in print() statements [14:40]
<lkcl> whatever you need [14:40]
<lkcl> safe in the knowledge that when you re-generate "pywriter noall av" [14:40]
<lkcl> anything you added to will get totally destroyed [14:40]
<octavius> I just hand-calculated your example and got 0x001111 [14:40]
<lkcl> and not end up in the repo [14:40]
*** josuah is now known as cousin_mario [14:40]
<octavius> oops 0b001111 [14:41]
*** cousin_mario is now known as josuah [14:41]
<lkcl> ok so it's 1-over [14:41]
<octavius> So is that an expected result? [14:41]
<lkcl> P(ropagate) is Propagate-to-1-more-bit-than-where-it-went-zero [14:41]
<lkcl> i don't know! :) [14:41]
<lkcl> i'm blithely copying this s*** off the internet! [14:42]
<octavius> Ok, I'll just run some tests [14:42]
<lkcl> what would be interesting (hilarious) is if can generate this. [14:43]
<lkcl> i'm kiiinda hoping/expecting it will? [14:43]
<lkcl> but, pffh [14:43]
<octavius> Ah, didn't generate the correct code: RT = (P | G) + G ^ P (missed the brackets before the XOR) [14:45]
<lkcl> ah deep joy [14:45]
<octavius> Actually, no, the pseudo-code is right [14:46]
<lkcl> i'll have to fix that in the generator [14:46]
<octavius> but the auto-generated code missed [14:46]
<lkcl> that's down to pywriter.py [14:46]
<lkcl> 1 sec [14:46]
<lkcl> which is frickin complex code [14:47]
<octavius> Also has two op_cprop definitions, it seems to be doubling up somewhere (other av functions do this too) [14:50]
<lkcl> yes i know. ignore it [14:51]
<lkcl> nggh frick [14:56]
<lkcl> can you do it as a temporary variable instead for now? [14:57]
<lkcl> t <- (P|G)+G [14:57]
<lkcl> RT <- t ^ P [14:57]
<lkcl> and raise a bugreport about this [14:57]
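The temporary-variable form can be checked in plain Python with the G and P values from the example earlier in the log. This is a sketch only: it says nothing about what pywriter actually emitted, and `other_grouping` is just one hypothetical way the brackets could go wrong.

```python
G = 0b000010
P = 0b000111

# Two-step form from the log: the temporary pins the grouping down.
t = (P | G) + G
RT = t ^ P

# The same thing as a single expression with explicit brackets.
one_liner = ((P | G) + G) ^ P

# A hypothetical mis-grouping, where the XOR is applied before the add.
other_grouping = (P | G) + (G ^ P)

print(bin(RT), bin(one_liner), bin(other_grouping))
```

Here `RT` and `one_liner` are both 0b1110 while `other_grouping` is 0b1100. In Python itself, `+` binds tighter than `^`, so the unbracketed `(P | G) + G ^ P` also evaluates to 0b1110; whether the pseudocode language and the generated code agree on that precedence is what the bugreport would need to establish.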
<lkcl> hang on i've got it [14:59]
<lkcl> a massive cheat [14:59]
<lkcl> call a function with a function name of "" (blank) [14:59]
<octavius> Should I change the pseudo code? [15:03]
<lkcl> the hack doesn't work [15:03]
<lkcl> please do raise the bugreport, cross-reference this commit [15:04]
<octavius> I disabled all tests in, but the generated file is still massive. Are there other test files that are pulled in? [15:21]
<lkcl> octavius, yep, that's "normal" [15:22]
<lkcl> none of this matters because it's not like we're doing a server or a client application or {insert-normal-program} [15:23]
<lkcl> search for the function name of the test case you added (left enabled) [15:23]
<lkcl> and it helps to have up on-screen at the same time and to look for the log message text [15:24]
<lkcl> "reading reg RA" [15:25]
<lkcl> and "writing gpr 3" [15:25]
<lkcl> should be obvious neon-flashing signs [15:26]
<lkcl> another way is to just look for the string "Err" [15:26]
<lkcl> which, if there's an assertion, you should have "Assertion Error" in the output [15:27]
<lkcl> so you can kinda get an overall binary yes/no from that [15:27]
<octavius> The result is correct (0xf), but the test failed. [15:29]
<octavius> I'll go for a tea break then resume [15:29]
<octavius> Phew, finally pushed commit [15:38]
<octavius> chroot is having 403 error problems [15:39]
<octavius> The test is prefixed "cse_" so shouldn't cause failed regressions [15:39]
<octavius> I'll be back [15:39]
<lkcl> hastur la vista [15:39]
<lkcl> don't ever worry about regression problems [15:39]
<lkcl> the point of tests *is* that they fail [15:40]
<lkcl> cprop's frickin fascinating [15:46]
<lkcl> ExpectedState(pc=4) not pc=8 [15:47]
<lkcl> pc=8 is if you have 2 instructions in the lst [15:47]
<octavius> Back, do you mind if I turn the pc count into a var? Use len(lst)*4 to determine where to stop [16:36]
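octavius's suggestion amounts to something like the following. A minimal sketch, assuming every instruction in the list is an ordinary 4-byte (non-prefixed) one and execution starts at pc=0; `expected_pc` is a made-up helper name, not something in the test suite.

```python
def expected_pc(lst, insn_bytes=4):
    """Program counter after running every instruction in lst once.

    Assumes fixed-width instructions and a start address of 0.
    """
    return len(lst) * insn_bytes


one_insn = ["cprop 3, 12, 5"]
two_insns = ["cprop 3, 12, 5", "cprop 3, 12, 5"]

print(expected_pc(one_insn))   # 4, matching ExpectedState(pc=4)
print(expected_pc(two_insns))  # 8, the two-instruction case
```

Deriving the value from the list keeps the expected state in step with the test program when instructions are added or removed.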
<ghostmansd[m]> > RW+     =   admin adminbigmac addw addw2  cesar oliva jacob1 jacob2 vklr tobias lauri dmitry3mdeb klehman mikolaj markos tpearson ghostmansd [16:41]
<ghostmansd[m]> lkcl, you can kick dmitry3mdeb :-) [16:41]
<lkcl> octavius, knock yourself out [16:47]
<lkcl> ghostmansd[m], willdo [16:47]
<octavius> lkcl, are there any other ops I could add? [17:39]
<octavius> Actually, I guess I haven't finished cprop because ci is not being used? [17:42]
<octavius> I read the stackoverflow answer and understand the algorithm a bit more (haven't seen carry-lookahead adders), however how do you prepare the P and G ints? Will this require prep by the programmer, or are we going to make an instr for it? [18:02]
<octavius> * haven't seen carry-lookahead for a while I mean [18:02]
<lkcl> have a look at the algorithm at the bottom of the page
<lkcl>     At each id, compute C[id] = A[id]+B[id]+0 [18:04]
<lkcl>     Get G[id] = C[id] > radix -1 [18:04]
<lkcl>     Get P[id] = C[id] == radix-1 [18:04]
<lkcl> so if C[id] == 0x100000000 then G[id] = 1 [18:05]
<lkcl> if C[id] = 0xffffffff then P[id] = 1 [18:05]
<lkcl> ci - you mean "cin" - carry-in [18:06]
<lkcl> yyeah i'm wondering about that [18:06]
<lkcl> i honestly don't know how to deal with it, programmerjake may know [18:07]
<lkcl> but if added then it should be as a cpropo [18:07]
<lkcl> which enables OE=1 [18:07]
<lkcl> which activates CA-in *and* CA-out [18:07]
<lkcl> but, that has to be planned carefully, the XO has to change [18:08]
<programmerjake> lkcl, imho sv.adde is sufficient for biginteger add, cprop is rendered redundant because you can just do the trick of having your 256-bit simd unit do a 256-bit add and forward co from the previous clock cycle to ci in the current cycle to get full-speed bigint add [19:12]
<programmerjake> so imho we should remove cprop [19:12]
<lkcl> programmerjake, i was kinda thinking either well beyond 256, 512 or 1024, and also of other circumstances involving carry [19:26]
<lkcl> and, also, for other vector mask purposes, problem being it was 20 years ago i worked with the Aspex ASP [19:26]
<programmerjake> beyond 1024 bits? just use the CA register to hold carry between one vector add and the next. also, scalar adde can be used as a carry propagate instruction like cprop, but with the inputs encoded differently. [19:29]
<programmerjake> for adde RT, RA, RB: set the bit in RA when the element add produces >= 0xFFFF...FFFF, set the bit in RB when the element add overflows. [19:32]
<programmerjake> the same sv.adde 256-bit and carry forwarding tricks work for sv.subfe [19:35]
<programmerjake> so, imho cprop is still rendered unnecessary [19:35]
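For comparison, the adde-chain approach programmerjake describes can be modelled as below. This is a software sketch only, assuming 64-bit limbs stored least-significant first; `adde` here is a toy Python model of add-extended (RT = RA + RB + CA, with CA as carry-out), and `bigint_add_adde` is an invented name. It says nothing about how hardware would pipeline or widen the add.

```python
MASK64 = (1 << 64) - 1  # all-ones 64-bit limb (hypothetical helper name)

def adde(ra, rb, ca):
    """Toy model of a 64-bit add-extended: returns (RT, CA_out)."""
    total = ra + rb + ca
    return total & MASK64, total >> 64

def bigint_add_adde(a, b, ca=0):
    """Ripple the carry through the limbs, one adde per element."""
    out = []
    for x, y in zip(a, b):
        rt, ca = adde(x, y, ca)  # CA from this limb feeds the next one
        out.append(rt)
    return out, ca
```

The carry is threaded serially from limb to limb here, which is the dependency the P/G mask approach avoids in the adds themselves; the final `ca` is what the log suggests holding in the CA register between one vector add and the next.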
<lkcl> there's more uses than just carry-propagation (a lot). [19:41]
<lkcl> i just can't remember what they were [19:41]
<programmerjake> adde *can do* carry propagation, even if you're using that carry propagation for something other than arithmetic [19:42]
<octavius> Yeah Luke, thanks for the explanation [22:21]
<lkcl> just added a follow-up about the Subdecoder (which you could have eventually found if you'd grep -r'd for "minor_22.csv") [22:25]
<octavius> yeah ok, might understand that better tomorrow, my brain is done for today XD [22:26]
<lkcl> i got a couple hours nap so am sort-of-awake [22:27]
<octavius> Oh, so the NN actually means something? [22:28]
<octavius> Ah, bits 0-5 [22:29]
<octavius> I thought it indicated a 0.5 version or something [22:29]
<lkcl> NN is a stand-in for "we don't have an allocated major opcode [bits 0-5]" [22:41]
<octavius> I've almost gone through adding the bmask entry, but seems to lack the BM2 form [22:42]
<lkcl> (sotto voce so are picking EXT022 surreptitiously) [22:42]
<octavius> ah ok [22:42]
<lkcl> won't have it, correct, because i only added it 4 hours ago [22:42]
<lkcl> and it's the very first instruction to have it [22:42]
<lkcl> so you'll need to grab the entry from fields.txt and work things out [22:43]
<lkcl> the fields[] list corresponds with the arguments [22:43]
<lkcl>    bmask RT,RA,RB,mode,L [22:44]
<lkcl> so there will be *give* things to "insn |= fields[N] << (31-xxx) [22:44]
<lkcl> so there will be *five* things to "insn |= fields[N] << (31-xxx)" [22:44]
<lkcl> really that should be entirely automated [22:44]
<lkcl> that was the topic of discussion here with ghostmansd a few days ago [22:44]
<lkcl> btw, bonus points for documenting all these steps :) [22:46]
<octavius> is there a way to comment out minor_22.csv entries? [23:25]
<octavius> I made a bunch of edits, but have no more brain left to get it working [23:26]
