*** kylel1 is now known as kylel | 07:08 | |
lkcl | markos, mornin. i added those abs-diff-accumulate etc. scalar instructions we talked about a couple weeks ago (?) https://libre-soc.org/openpower/sv/av_opcodes/ | 09:52 |
---|---|---|
markos | lkcl, yes, I read, this is great, sorry for being afk, I should be more active soon :) | 09:54 |
lkcl | markos, no problem. | 10:50 |
lkcl | ghostmansd[m], for the description and links-to-spec that was suggested to be added i created a little blurb here https://bugs.libre-soc.org/show_bug.cgi?id=857#c10 | 10:51 |
lkcl | i'm going to do the parallel prefix iteration next | 10:54 |
lkcl | which would allow to do "result = i.x * i.y * i.z * i.w" easily | 10:54 |
lkcl | octavius, mornin | 11:45 |
lkcl | just | 11:45 |
lkcl | octavius, after programmerjake kindly noted the bitmanip ops from x86 i'm going to redo those as a batch of *24* instructions in one hit (actually, just one) | 11:46 |
lkcl | but there's one dead-easy one that you could do in the meantime whilst i'm doing that - cprop | 11:47 |
lkcl | https://bugs.libre-soc.org/show_bug.cgi?id=865 | 11:47 |
lkcl | i've updated https://libre-soc.org/openpower/sv/vector_ops/ with everything you need. | 11:47 |
lkcl | the pseudocode is a dead-simple one-liner (ok ok 3) | 11:47 |
octavius | Morning lkcl, will do | 11:48 |
lkcl | there's aaallmost no thought involved, here. this will leave you a leeetle... "out-of-place" right up until you actually add the unit test and run it | 11:49 |
lkcl | at which point you should experience an "ah ha!" moment :) | 11:50 |
lkcl | but up until that point it maaay feel a little "what am i doooiiiing", which i experience all the time on these | 11:50 |
lkcl | you captured the chat logs from yesterday? | 11:51 |
* lkcl afk keeping an eye out on irclogs | 11:51 | |
octavius | Yes I have the chat log saved. It mentions PyWriter, do I need that right now? | 12:02 |
octavius | Also I don't seem to have write (rw) permissions for openpower-isa | 12:10 |
lkcl | octavius, ah brilliant, do drop them into the bugreport | 12:52 |
lkcl | octavius, done, added | 12:53 |
octavius | The chat from our call is quite long, do you just want the instructions at the end? | 12:53 |
octavius | No, I don't have write permission | 13:07 |
lkcl | repo openpower-isa | 13:35 |
lkcl | - RW+ = admin adminbigmac addw addw2 cesar oliva jacob1 jacob2 vklr tobias lauri dmitry3mdeb klehman mikolaj markos tpearson ghostmansd | 13:35 |
lkcl | + RW+ = admin adminbigmac addw addw2 cesar oliva jacob1 jacob2 vklr tobias lauri dmitry3mdeb klehman mikolaj markos tpearson ghostmansd andreym | 13:35 |
lkcl | your key rohdo@firedragon has always been present and active | 13:35 |
lkcl | sigh, helps if i do "git push", eh | 13:36 |
octavius | XD | 13:37 |
octavius | PyWriter now breaks because I added OP_CPROP to minor_22.csv | 13:40 |
octavius | Where do I need to add the definition? | 13:41 |
lkcl | in power_enums.py | 13:42 |
lkcl | look at the patch sets (diffs) from yesterday | 13:43 |
lkcl | *all* of those things need to be done (equivalents-of) before it "works" | 13:43 |
lkcl | @unique | 13:44 |
lkcl | class MicrOp(Enum): | 13:44 |
lkcl | ..... | 13:44 |
lkcl | OP_ABSDIFF = 91 | 13:44 |
lkcl | OP_ABSADD = 92 | 13:44 |
lkcl | OP_CPROP=..... | 13:44 |
lkcl | # supported instructions: make sure to keep up-to-date with CSV files | 13:44 |
lkcl | # just like everything else | 13:44 |
lkcl | _insns = [ | 13:44 |
lkcl | "NONE", "add", "addc", "addco", "adde", "addeo", | 13:44 |
lkcl | ... | 13:44 |
lkcl | .... | 13:44 |
lkcl | "cprop", # AV bitmanip | 13:44 |
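A minimal sketch of why the `@unique` decorator on an enum such as `MicrOp` matters when a new entry is added. The class names and the `OP_EXAMPLE` entry below are hypothetical stand-ins, not the real contents of power_enums.py; only the values 91 and 92 come from the paste above.

```python
from enum import Enum, unique


@unique
class DemoMicrOp(Enum):  # hypothetical stand-in for MicrOp
    OP_ABSDIFF = 91
    OP_ABSADD = 92
    OP_EXAMPLE = 93  # illustrative new entry with a fresh value


def try_duplicate():
    """Return the error text raised when a new entry reuses a value."""
    try:
        @unique
        class Broken(Enum):
            OP_ABSADD = 92
            OP_EXAMPLE = 92  # same value as OP_ABSADD: rejected by @unique
    except ValueError as err:
        return str(err)
    return None


print(try_duplicate())
```

With `@unique` in place, a copy/paste slip on the number fails loudly at import time instead of silently aliasing two operations.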
lkcl | octavius, so the way that big-integer math works is | 13:46 |
lkcl | (normally) | 13:46 |
lkcl | * you do a sequence of 64-bit *non-carrying* adds | 13:46 |
lkcl | (in parallel) | 13:46 |
lkcl | * whilst you're doing that you analyse A and B to see if they *would* carry | 13:47 |
lkcl | (because if A==0xffffffffffffffff and B==0xffffffffffffffff then that's going to "propagate" a carry-bit, isn't it?) | 13:47 |
octavius | yep | 13:47 |
lkcl | * you drop all the "Propagate-Carry-to-next-64-bit element" and "Generate-Carry-to-next-64-bit element" bits into that funny-looking algorithm | 13:48 |
lkcl | ((P|G)+G)^P | 13:48 |
lkcl | * that tells you whether you need to "add one" on a per-element basis | 13:48 |
lkcl | you then go *back* to the parallel set of (independent) 64-bit element adds that you did earlier | 13:49 |
lkcl | and you add 1 to each where the P/G-calculation says so | 13:49 |
lkcl | therefore what we can do is: use that cprop instruction to compute a predicate mask | 13:49 |
lkcl | then do | 13:49 |
lkcl | sv.addi/sm=r3 rt.v, ra.v, 1 | 13:50 |
lkcl | and, ta-daaa, we've done a parallel-carry-propagate big-integer add | 13:50 |
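The steps described above can be sketched in plain Python. This is only a software model under stated assumptions: 64-bit elements stored least-significant first, P and G packed one bit per element, and the ((P|G)+G)^P formula from the chat used to compute the mask. The helper names (`split_limbs`, `cprop`, `bigint_add`) are hypothetical, and the final loop stands in for the predicated `sv.addi`.

```python
MASK64 = (1 << 64) - 1


def split_limbs(value, count):
    """Split an integer into `count` 64-bit elements, least significant first."""
    return [(value >> (64 * i)) & MASK64 for i in range(count)]


def cprop(p, g):
    """Carry-propagate mask: bit i is set when element i receives a carry-in."""
    return ((p | g) + g) ^ p


def bigint_add(a, b, count):
    """Add two big integers using independent element adds plus a carry mask."""
    sums, p, g = [], 0, 0
    for i, (x, y) in enumerate(zip(split_limbs(a, count), split_limbs(b, count))):
        full = x + y                  # independent, non-carrying element add
        sums.append(full & MASK64)
        if full > MASK64:             # element overflowed: Generate
            g |= 1 << i
        elif full == MASK64:          # element is all ones: Propagate
            p |= 1 << i
    carry_in = cprop(p, g)            # predicate mask, one bit per element
    result = 0
    for i, s in enumerate(sums):
        if (carry_in >> i) & 1:       # models the predicated add-immediate of 1
            s = (s + 1) & MASK64
        result |= s << (64 * i)
    carry_out = (carry_in >> count) & 1
    return result, carry_out


total, carry = bigint_add((1 << 256) - 1, 1, 4)
print(hex(total), carry)
```

In this model bit `count` of the mask is the carry out of the whole vector, so the result can be checked against Python's own arbitrary-precision addition.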
lkcl | here's an instruction which does actually use CA/CA32 | 13:52 |
lkcl | https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=openpower/isa/fixedarith.mdwn;h=46f635294a0ecda189782a11ae985d443a674578;hb=0fe33b758d6ba6db64b9021079d9410b3b749cfa#l88 | 13:52 |
lkcl | you recall i said cut/paste-cookie-cut-style "maxs"? | 13:52 |
lkcl | if you compare maxs pseudocode to addic pseudocode | 13:52 |
lkcl | you'll see that maxs *doesn't* include "Special Registers CA CA32" | 13:52 |
lkcl | and when i said "cookie-cut verbatim maxs" | 13:53 |
lkcl | i didn't also say | 13:53 |
lkcl | "cookie-cut verbatim maxs and then add CA/CA32 to Special Registers", did i? :) | 13:53 |
lkcl | (if that's what was needed i would have said "cookie-cut addic not maxs") | 13:53 |
lkcl | or, more probably, addc | 13:54 |
octavius | I thought the pseudo-code I need is the one in the bug. How does maxs resemble cprop? | 13:54 |
lkcl | everything-except-the-pseudo-code | 13:56 |
octavius | Yeah, I already copied the structure in the mdwn file | 13:56 |
lkcl | this looks good | 13:57 |
lkcl | https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=1273502a1d60b3772aa69376278b9d9eb79598b9 | 13:57 |
lkcl | 0110001 110 | 13:58 |
lkcl | +0110001110-,ALU,OP_CPROP,RA,RB,NONE,RT | 13:58 |
lkcl | 0110001110 | 13:58 |
lkcl | yep also good! | 13:58 |
lkcl | btw the priority-order is to add to power_enums.py *first* | 13:59 |
lkcl | do that as quickly as you can because you've just broken the repo | 13:59 |
lkcl | hard rule: never push commits that cause the master branch to fail for absolutely everyone | 13:59 |
lkcl | you should have waited until you'd added... | 13:59 |
lkcl | ah, you did :) | 14:00 |
lkcl | https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=0fe33b758d6ba6db64b9021079d9410b3b749cfa | 14:00 |
lkcl | belay that. all good | 14:00 |
octavius | I added cprop to all the places but the test cases, can you check that the params are correct? | 14:07 |
octavius | In the svp64.py, I just copied the numbers after the instr: 'cprop 3,12,5', however I don't know what they mean | 14:09 |
octavius | Oooh, the test cases are making some sense, I'll give that a try | 14:22 |
lkcl | :) | 14:22 |
lkcl | ok so to make sense of the stuff in svp64.py you need to look (in this case) at X-Form | 14:23 |
lkcl | which, errr, i'll just cut/paste from fields.text now.. | 14:24 |
* lkcl tum-te-tum... | 14:24 | |
lkcl | https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=openpower/isatables/fields.text;hb=HEAD#l47 | 14:24 |
lkcl | octavius, https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=3f4ed1f5c617b85ecac30ab60404cb2faa1f6f81 | 14:27 |
lkcl | btw if you want to save yourself some time use this trick in av_test_cases.py: | 14:27 |
lkcl | :%s/def case_/def cse_/g | 14:27 |
lkcl | then enable *only* the one you want to see | 14:28 |
lkcl | then | 14:28 |
lkcl | $ python3 decoder/isa/test_caller_bitmanip_av.py >& /tmp/f | 14:28 |
lkcl | and take a look at the output to make sure the answers are correct | 14:28 |
lkcl | then *BEFORE* committing do | 14:28 |
lkcl | $ git diff | 14:28 |
lkcl | note that there's a bunch of case/cse stuff | 14:28 |
lkcl | do | 14:28 |
lkcl | :%s/def cse_/def case_/g | 14:28 |
lkcl | then | 14:28 |
lkcl | $ git diff again | 14:28 |
lkcl | note that there's a lot less crap | 14:29 |
lkcl | _then_ do a commit :) | 14:29 |
octavius | Why are you finding and replacing? There are no "cse_" functions in av_test_cases.py | 14:31 |
lkcl | there are if you do this, first: "<lkcl> :%s/def case_/def cse_/g" | 14:33 |
lkcl | :) | 14:33 |
lkcl | basically it's a rapid way to disable a bunch of tests you have absolutely no interest whatsoever in running | 14:33 |
octavius | why though? | 14:34 |
lkcl | to make (a) a quicker development cycle and | 14:34 |
octavius | Ah, so the test suite searches for "case_" prefixed tests and runs them? | 14:34 |
lkcl | (b) so you're not hunting through thousands of lines of output from the test desperately looking for the one you're actually interested in | 14:34 |
lkcl | correct | 14:34 |
octavius | Should've just said that XD | 14:34 |
octavius | Will do | 14:34 |
lkcl | sorry :) | 14:35 |
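The rename trick works because the test suite picks up methods by name prefix. A minimal sketch of that idea, assuming discovery is a simple scan for `case_`; the real discovery code in openpower-isa may differ, and the class and method names here are hypothetical.

```python
class DemoCases:
    """Hypothetical test-case container with one enabled and two disabled cases."""

    def case_cprop(self):
        return "cprop ran"

    def cse_maxs(self):       # renamed from case_maxs, so it is skipped
        return "maxs ran"

    def cse_absdiff(self):    # renamed from case_absdiff, so it is skipped
        return "absdiff ran"


def enabled_cases(obj, prefix="case_"):
    """Return the names of callable attributes that start with the prefix."""
    return sorted(
        name for name in dir(obj)
        if name.startswith(prefix) and callable(getattr(obj, name))
    )


cases = DemoCases()
for name in enabled_cases(cases):
    print(name, "->", getattr(cases, name)())
```

Renaming back with the reverse substitution before committing restores every case, which is why the second `git diff` is so much shorter.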
lkcl | btw "G" stands for "Generate a Carry" | 14:35 |
lkcl | and "P" stands for "Propagate a Carry [to the next element]" | 14:36 |
lkcl | so in the tests you do, you'll need to *begin* a carry-bit by setting a LSB in RB(?) | 14:36 |
lkcl | then have a few bits (including the one where "G" started) set to 1 | 14:36 |
lkcl | and you shouuuuld end up with that bit being thrown forward, matching the "P" bits, until a P bit goes zero | 14:37 |
lkcl | G=0b000001 | 14:37 |
lkcl | P=0b00111 | 14:37 |
lkcl | whoops | 14:37 |
lkcl | G=0b000001 | 14:37 |
lkcl | P=0b000111 | 14:38 |
lkcl | *should* produce the output... err... | 14:38 |
lkcl | C=0b001111 | 14:38 |
lkcl | something like that | 14:38 |
lkcl | G = 0b000010 | 14:38 |
lkcl | P = 0b000111 | 14:38 |
lkcl | should produce | 14:38 |
lkcl | C = 0b00110 | 14:39 |
lkcl | sorry | 14:39 |
lkcl | C = 0b001110 | 14:39 |
lkcl | because the carry *starts* from the G bit | 14:39 |
lkcl | and *ends* when a P(ropagate) drops to zero | 14:39 |
lkcl | you'll soon get the hang of it | 14:39 |
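The two examples above can be checked by evaluating the formula directly. A small sketch, assuming the pseudocode is RT = ((P|G)+G)^P on plain integers; the function name `cprop` is used here only for illustration.

```python
def cprop(p, g):
    """Evaluate ((P | G) + G) ^ P on plain Python integers."""
    return ((p | g) + g) ^ p


for g, p in [(0b000001, 0b000111), (0b000010, 0b000111)]:
    print(f"G={g:06b} P={p:06b} -> C={cprop(p, g):06b}")
# G=000001 P=000111 -> C=001111
# G=000010 P=000111 -> C=001110
```

In both cases the run of ones starts at the G bit and extends one position past the last set P bit.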
lkcl | btw feel free to hand-edit src/openpower/decoder/isa/av.py | 14:40 |
octavius | Is "^" the exclusive-OR operator? | 14:40 |
lkcl | to put in print() statements | 14:40 |
lkcl | yes | 14:40 |
lkcl | whatever you need | 14:40 |
lkcl | safe in the knowledge that when you re-generate "pywriter noall av" | 14:40 |
lkcl | anything you added to av.py will get totally destroyed | 14:40 |
octavius | I just hand-calculated your example and got 0x001111 | 14:40 |
lkcl | and not end up in the repo | 14:40 |
*** josuah is now known as cousin_mario | 14:40 | |
octavius | oops 0b001111 | 14:41 |
*** cousin_mario is now known as josuah | 14:41 | |
lkcl | cool! | 14:41 |
lkcl | ok so it's 1-over | 14:41 |
octavius | So is that an expected result? | 14:41 |
lkcl | P(ropagate) is Propagate-to-1-more-bit-than-where-it-went-zero | 14:41 |
lkcl | i don't know! :) | 14:41 |
lkcl | i'm blithely copying this s*** off the internet! | 14:42 |
octavius | Ok, I'll just run some tests | 14:42 |
lkcl | what would be interesting (hilarious) is if bmask.py can generate this. | 14:43 |
lkcl | i'm kiiinda hoping/expecting it will? | 14:43 |
lkcl | but, pffh | 14:43 |
octavius | Ah, av.py didn't generate the correct code: RT = (P | G) + G ^ P (missed the brackets before the XOR) | 14:45 |
lkcl | ah deep joy | 14:45 |
lkcl | frick | 14:45 |
octavius | Actually, no, the pseudo-code is right | 14:46 |
lkcl | i'll have to fix that in the generator | 14:46 |
octavius | but the auto-generated code missed | 14:46 |
octavius | yeah | 14:46 |
lkcl | mooo | 14:46 |
lkcl | that's down to pywriter.py | 14:46 |
lkcl | 1 sec | 14:46 |
lkcl | nggggh | 14:46 |
lkcl | which is frickin complex code | 14:47 |
octavius | joy | 14:47 |
octavius | Also av.py has two op_cprop definitions, it seems to be doubling up somewhere (other av functions do this too) | 14:50 |
lkcl | yes i know. ignore it | 14:51 |
lkcl | nggh frick | 14:56 |
lkcl | can you do it as a temporary variable instead for now? | 14:57 |
lkcl | t <- (P|G)+G | 14:57 |
lkcl | RT <- t ^ P | 14:57 |
lkcl | and raise a bugreport about this | 14:57 |
lkcl | hang on i've got it | 14:59 |
lkcl | a massive cheat | 14:59 |
lkcl | call a function with a function name of "" (blank) | 14:59 |
octavius | Should I change the pseudo code? | 15:03 |
lkcl | yeees | 15:03 |
lkcl | sigh | 15:03 |
octavius | ok | 15:03 |
lkcl | the hack doesn't work | 15:03 |
lkcl | yet | 15:03 |
lkcl | please do raise the bugreport, cross-reference this commit | 15:04 |
lkcl | https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=ef967c61da13d8c80ff5424fe43c35756928b145 | 15:04 |
octavius | https://bugs.libre-soc.org/show_bug.cgi?id=866 | 15:17 |
octavius | I disabled all tests in av_cases.py, but the generated file is still massive. Are there other test files that are pulled in? | 15:21 |
lkcl | octavius, yep, that's "normal" | 15:22 |
lkcl | none of this matters because it's not like we're doing a server or a client application or {insert-normal-program} | 15:23 |
lkcl | search for the function name of the test case you added (left enabled) | 15:23 |
lkcl | and it helps to have caller.py up on-screen at the same time and to look for the log message text | 15:24 |
lkcl | "reading reg RA" | 15:25 |
lkcl | and "writing gpr 3" | 15:25 |
lkcl | should be obvious neon-flashing signs | 15:26 |
lkcl | another way is to just look for the string "Err" | 15:26 |
lkcl | which, if there's an assertion, you should have "AssertionError" in the output | 15:27 |
lkcl | duh | 15:27 |
lkcl | so you can kinda get an overall binary yes/no from that | 15:27 |
octavius | The result is correct (0xf), but the test failed. | 15:29 |
octavius | I'll go for a tea break then resume | 15:29 |
lkcl | cool | 15:38 |
octavius | Phew, finally pushed commit | 15:38 |
octavius | chroot is having 403 error problems | 15:39 |
octavius | The test is prefixed "cse_" so shouldn't cause failed regressions | 15:39 |
octavius | I'll be back | 15:39 |
lkcl | hastur la vista | 15:39 |
lkcl | don't ever worry about regression problems | 15:39 |
lkcl | the point of tests *is* that they fail | 15:40 |
lkcl | https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=09db8b5102785d129b268a712006e6638b978353 | 15:42 |
lkcl | cprop's frickin fascinating | 15:46 |
lkcl | ExpectedState(pc=4) not pc=8 | 15:47 |
lkcl | pc=8 is if you have 2 instructions in the lst | 15:47 |
octavius | Back, do you mind if I turn the pc count into a var? Use len(lst)*4 to determine where to stop | 16:36 |
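A sketch of the `len(lst)*4` idea, assuming every entry in `lst` is a plain 32-bit instruction. SVP64-prefixed instructions take 8 bytes, so a test that mixes them in would need a different count. The helper name is hypothetical.

```python
INSN_BYTES = 4  # size of one non-prefixed Power ISA instruction


def expected_pc(lst):
    """PC after running every instruction in lst once, starting from pc=0."""
    return len(lst) * INSN_BYTES


print(expected_pc(["cprop 3,12,5"]))             # one instruction
print(expected_pc(["addi 1,0,1", "cprop 3,12,5"]))  # two instructions
```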
ghostmansd[m] | > RW+ = admin adminbigmac addw addw2 cesar oliva jacob1 jacob2 vklr tobias lauri dmitry3mdeb klehman mikolaj markos tpearson ghostmansd | 16:41 |
ghostmansd[m] | lkcl, you can kick dmitry3mdeb :-) | 16:41 |
lkcl | octavius, knock yourself out | 16:47 |
lkcl | ghostmansd[m], willdo | 16:47 |
octavius | lkcl, are there any other ops I could add? | 17:39 |
octavius | Actually, I guess I haven't finished cprop because ci is not being used? | 17:42 |
octavius | I read the stackoverflow answer and understand the algorithm a bit more (haven't seen carry-lookahead adders), however how do you prepare the P and G ints? Will this require prep by the programmer, or are we going to make an instr for it? | 18:02 |
octavius | * haven't seen carry-lookahead for a while I mean | 18:02 |
lkcl | have a look at the algorithm at the bottom of the page https://libre-soc.org/openpower/sv/vector_ops | 18:04 |
lkcl | At each id, compute C[id] = A[id]+B[id]+0 | 18:04 |
lkcl | Get G[id] = C[id] > radix -1 | 18:04 |
lkcl | Get P[id] = C[id] == radix-1 | 18:04 |
lkcl | so if C[id] == 0x100000000 then G[id] = 1 | 18:05 |
lkcl | if C[id] = 0xffffffff then P[id] = 1 | 18:05 |
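The three rules quoted above can be sketched as a small helper that packs P and G into one bit per element. This assumes a radix of 2**32 (matching the 0xffffffff example), elements listed least-significant first, and a carry-in of 0; the function name `pg_masks` is hypothetical.

```python
def pg_masks(a, b, radix=1 << 32):
    """Return (P, G) bitmasks for element-wise sums of a and b."""
    p = g = 0
    for i, (x, y) in enumerate(zip(a, b)):
        c = x + y                 # C[id] = A[id] + B[id] + 0
        if c > radix - 1:         # sum no longer fits: Generate
            g |= 1 << i
        if c == radix - 1:        # sum is all ones: Propagate
            p |= 1 << i
    return p, g


p, g = pg_masks([0xFFFFFFFF, 0x80000000, 1], [1, 0x7FFFFFFF, 2])
print(f"P={p:03b} G={g:03b}")
```

Here element 0 sums to 0x100000000 and sets a G bit, element 1 sums to 0xffffffff and sets a P bit, and element 2 sets neither.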
lkcl | ci - you mean "cin" - carry-in | 18:06 |
lkcl | yyeah i'm wondering about that | 18:06 |
lkcl | i honestly don't know how to deal with it, programmerjake may know | 18:07 |
lkcl | but if added then it should be as a cpropo | 18:07 |
lkcl | which enables OE=1 | 18:07 |
lkcl | which activates CA-in *and* CA-out | 18:07 |
lkcl | but, that has to be planned carefully, the XO has to change | 18:08 |
programmerjake | lkcl, imho sv.adde is sufficient for biginteger add, cprop is rendered redundant because you can just do the trick of having your 256-bit simd unit do a 256-bit add and forward co from the previous clock cycle to ci in the current cycle to get full-speed bigint add | 19:12 |
programmerjake | so imho we should remove cprop | 19:12 |
lkcl | programmerjake, i was kinda thinking either well beyond 256, 512 or 1024, and also of other circumstances involving carry | 19:26 |
lkcl | and, also, for other vector mask purposes, problem being it was 20 years ago i worked with the Aspex ASP | 19:26 |
programmerjake | beyond 1024 bits? just use the CA register to hold carry between one vector add and the next. also, scalar adde can be used as a carry propagate instruction like cprop, but with the inputs encoded differently. | 19:29 |
programmerjake | for adde RT, RA, RB: set the bit in RA when the element add produces >= 0xFFFF...FFFF, set the bit in RB when the element add overflows. | 19:32 |
programmerjake | the same sv.adde 256-bit and carry forwarding tricks work for sv.subfe | 19:35 |
programmerjake | so, imho cprop is still rendered unnecessary | 19:35 |
lkcl | there's more uses than just carry-propagation (a lot). | 19:41 |
lkcl | i just can't remember what they were | 19:41 |
programmerjake | adde *can do* carry propagation, even if you're using that carry propagation for something other than arithmetic | 19:42 |
lkcl | octavius, https://bugs.libre-soc.org/show_bug.cgi?id=865#c13 | 22:20 |
octavius | Yeah Luke, thanks for the explanation | 22:21 |
lkcl | just added a follow-up about power_decode.py the Subdecoder (which you could have eventually found if you'd grep -r'd for "minor_22.csv") | 22:25 |
octavius | yeah ok, might understand that better tomorrow, my brain is done for today XD | 22:26 |
lkcl | :) | 22:26 |
lkcl | i got a couple hours nap so am sort-of-awake | 22:27 |
octavius | Oh, so the NN actually means something? | 22:28 |
octavius | Ah, bits 0-5 | 22:29 |
octavius | I thought it indicated a 0.5 version or something | 22:29 |
lkcl | NN is a stand-in for "we don't have an allocated major opcode [bits 0-5]" | 22:41 |
octavius | I've almost gone through adding the bmask entry, but svp64.py seems to lack the BM2 form | 22:42 |
lkcl | (sotto voce so are picking EXT022 surreptitiously) | 22:42 |
octavius | ah ok | 22:42 |
lkcl | svp64.py won't have it, correct, because i only added it 4 hours ago | 22:42 |
lkcl | and it's the very first instruction to have it | 22:42 |
lkcl | so you'll need to grab the entry from fields.text and work things out | 22:43 |
lkcl | the fields[] list corresponds with the arguments | 22:43 |
lkcl | bmask RT,RA,RB,mode,L | 22:44 |
lkcl | so there will be *five* things to "insn |= fields[N] << (31-xxx)" | 22:44 |
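The `insn |= fields[N] << (31-xxx)` pattern follows from Power ISA MSB0 bit numbering: a field whose last bit is numbered `xxx` is shifted left by `31-xxx`. The sketch below uses the standard X-Form layout (RT 6:10, RA 11:15, RB 16:20, XO 21:30, Rc 31) rather than BM2, whose bit positions are not given in this log. The primary opcode 22 and the XO value are taken from the EXT022 remark and the minor_22.csv line earlier in the chat; the function name is hypothetical and this is not the code in svp64.py.

```python
def encode_x_form(po, rt, ra, rb, xo, rc=0):
    """Pack X-Form fields into a 32-bit word using MSB0 bit numbers."""
    fields = [rt, ra, rb]
    insn = po << (31 - 5)            # primary opcode, bits 0-5
    insn |= fields[0] << (31 - 10)   # RT, bits 6-10
    insn |= fields[1] << (31 - 15)   # RA, bits 11-15
    insn |= fields[2] << (31 - 20)   # RB, bits 16-20
    insn |= xo << (31 - 30)          # XO, bits 21-30
    insn |= rc << (31 - 31)          # Rc, bit 31
    return insn


word = encode_x_form(22, 3, 12, 5, 0b0110001110)  # models "cprop 3,12,5"
print(hex(word))
```

A five-operand form would follow the same pattern with five `fields[N]` lines, each using the end bit listed for that field in fields.text.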
lkcl | really that should be entirely automated | 22:44 |
lkcl | that was the topic of discussion here with ghostmansd a few days ago | 22:44 |
lkcl | btw, bonus points for documenting all these steps :) | 22:46 |
octavius | is there a way to comment out minor_22.csv entries? | 23:25 |
octavius | I made a bunch of edits, but have no more brain left to get it working | 23:26 |