lkcl | markos, yes, that makes sense. | 01:37 |
---|---|---|
lkcl | programmerjake, the very first function from the Power ISA spec i started hand-converting the pseudo-code into python. | 01:37 |
lkcl | i quickly realised "this was nuts" (i could easily make a transcription mistake) and to instead add the fphelpers.mdwn | 01:38 |
lkcl | what i didn't do however was go back and redo (re-put) the pseudocode for SINGLE() from p144 4.6.3 into fphelpers.mdwn | 01:38 |
lkcl | (because time) | 01:38 |
programmerjake | DOUBLE2SINGLE does not occur anywhere in the PowerISA spec. v3.1B | 01:41 |
programmerjake | not by that name, at least | 01:41 |
lkcl | 6 <!-- Power ISA Book I Version 3.0B Section A.1 page 775-778 --> | 01:46 |
lkcl | they fail to give it a function name. | 01:46 |
lkcl | likewise A.2 | 01:46 |
programmerjake | k | 01:47 |
programmerjake | it's specifically only used for frsp, why not just stick it in frsp's pseudo-code? | 01:48 |
lkcl | because that's not what has been done in frsp. | 01:48 |
lkcl | the function is useful (and used, a lot) outside of frsp | 01:49 |
lkcl | see the unit tests that i wrote. | 01:50 |
programmerjake | frsp just doesn't have any pseudocode ... probably because it's just too long, so they just reference the appendix instead. | 01:50 |
programmerjake | where they put frsp's suggested pseudocode | 01:51 |
lkcl | agreed, likely. | 01:53 |
lkcl | turns out it can be used for other purposes | 01:53 |
lkcl | i was in a rush to get it working for lauri | 01:54 |
programmerjake | k | 01:54 |
programmerjake | it's still broken tho | 01:54 |
lkcl | as he was held up as the instructions were missing | 01:54 |
lkcl | it's entirely missing the rounding modes (and i don't care at this immediate time, it "does the job" of demoing mp3) | 01:55 |
lkcl | i needed to get lauri up-and-running as quickly as possible, and did a hell of a lot of instructions in a very short amount of time | 01:56 |
programmerjake | i can fix it, if you like | 01:58 |
programmerjake | shouldn't take long | 01:58 |
lkcl | sure go for it | 03:58 |
ghostmansd[m] | lkcl, markos, are you done with new insns? Are they stable to be added into binutils? | 06:45 |
markos | ghostmansd, I think so | 10:28 |
markos | I could add a few more tests tbh, to check for very large values, and inf/NaN values | 10:29 |
ghostmansd[m] | This has nothing to do with binutils, though | 10:29 |
ghostmansd[m] | I'm mostly worrying about the semantics, e.g. operands and XOP | 10:30 |
markos | ok then, in that case it should be considered done | 10:31 |
markos | syntax/semantics is not expected to change | 10:31 |
ghostmansd | markos, star! | 10:57 |
ghostmansd | https://bugs.libre-soc.org/show_bug.cgi?id=845#c2 | 10:57 |
ghostmansd | Also, a question: any tips on the longest instruction name (including all possible qualifiers)? Without macros, obviously, just plain constants or well-known symbols. | 10:58 |
lkcl | ghostmansd, those two fmvis and fishmv are "done" | 11:02 |
lkcl | eek. probably... ternlogi or crternlog | 11:03 |
lkcl | markos, did you appreciate the Iain Banks quote? | 11:05 |
lkcl | sv.ternlogi/sm=32/dm=32/pr=1<<%r3/vec2/sw=32/dw=16 *r120, *r100, 255 | 11:12 |
lkcl | grevlut *might* exceed that slightly | 11:14 |
programmerjake | well, I've had enough converting pseudo-code and fighting with the parser for one day...gn all | 11:15 |
lkcl | programmerjake, :) | 11:15 |
lkcl | yyeah considerable care has to be taken | 11:15 |
lkcl | https://libre-soc.org/openpower/isa/svfparith/ | 11:15 |
lkcl | it's a one-pass so you have to create temporary variables by declaring them as empty (full of zeros) | 11:16 |
lkcl | tmp = [0] * 32 | 11:16 |
lkcl | sorry | 11:16 |
lkcl | tmp <- [0] * 32 | 11:17 |
lkcl | gaah | 11:17 |
lkcl | then filling in any bits to be changed | 11:17 |
programmerjake | in my case, considerable head bashing...imho when we have time, we should just rewrite the parser as a recursive descent parser...it'll be waay easier to understand and give better error messages. | 11:17 |
lkcl | commit what you have, commented-out, i'll take a look | 11:17 |
programmerjake | one of the errors I was running into was an unexpected DEDENT at the end of the function, caused by using incorrect code near the beginning of the function...idk how it decided that was the error... | 11:18 |
programmerjake | what I committed is the final fixed code... | 11:18 |
lkcl | ahh ok | 11:18 |
programmerjake | imho using ply was a bad choice, LR parsers often have trouble giving good error messages | 11:19 |
lkcl | except you've just removed DOUBLE2SINGLE which terminates being able to run any of the unit tests | 11:19 |
programmerjake | plus, it's slow because it has to generate the parsing tables | 11:20 |
lkcl | 1 sec... | 11:20 |
programmerjake | no, it's added to helper.py | 11:20 |
lkcl | yes, ok i see a DOUBLE2SINGLE helper function | 11:20 |
lkcl | ok brilliant | 11:21 |
lkcl | hmmm i didn't realise there's a "return" statement. | 11:21 |
lkcl | that's not part of the IBM pseudocode. | 11:21 |
programmerjake | I had to fix the python2 ast usage for assigning to a tuple | 11:22 |
programmerjake | actually, the original doc used a Return statement in 1 func -- Done in the rest | 11:22 |
lkcl | bizarre. and then not documented. i get the general impression people sort-of randomly used stuff as they felt like it | 11:23 |
lkcl | on the basis "nobody is ever going to do this as real executable code" | 11:23 |
programmerjake | it is an appendix... | 11:23 |
markos | lkcl, sadly I am not familiar with Iain Banks :( | 11:23 |
lkcl | paul even said that some of the modifications we've done were rejected "because you just read the english, dummy". sigh | 11:23 |
lkcl | markos, a book called Excession, in which a [not-AI, but so advanced it exceeds the capacity of human intelligence by several orders of magnitude] | 11:24 |
lkcl | a machine-consciousness plans for decades how to do a "sleeper" on its hyperdrive engines | 11:25 |
programmerjake | ttyl | 11:25 |
lkcl | converting them on-the-fly in order to drop its pursuers | 11:25 |
lkcl | thx jacob | 11:25 |
programmerjake | yw | 11:25 |
lkcl | and over a period of a few hours, begins ramping up to a sustained speed of 238,000 times the speed of light | 11:25 |
markos | "times"? | 11:26 |
lkcl | the pursuer, its engines severely overtaxed and damaged, does the math and works out that it must have converted its entire 16x2x2 *miles* long cargo bay into hyperdrive engines | 11:26 |
lkcl | speed of light times 230,000. | 11:26 |
markos | ah hyperdrive | 11:26 |
lkcl | in Star Trek terminology that would be approximately... Warp 10,000 | 11:27 |
lkcl | :) | 11:27 |
markos | hahaha | 11:27 |
lkcl | at which point, it declares, "dear holy fucking shit, where's it thinking of going, Andromeda??" | 11:27 |
lkcl | so the quote is appropriate to apply to those intrinsic numbers. | 11:28 |
lkcl | RVV is *25,000* | 11:28 |
markos | lol | 11:28 |
lkcl | https://raw.githubusercontent.com/riscv-non-isa/rvv-intrinsic-doc/master/intrinsic_funcs.md | 11:28 |
lkcl | the joke's on us, though, because if done as 1-D we have 10^6 intrinsics | 11:29 |
lkcl | forrrrtunately.... | 11:29 |
markos | so much for RISC-V being "simple" | 11:29 |
markos | yeah, I'm with you on that | 11:29 |
lkcl | if done in any conceivable way as {Prefix-Intrinsic}{Suffix-Intrinsic} | 11:29 |
markos | it's ridiculous | 11:29 |
lkcl | that drops down to {10^4} *plus* {10^2} appx order-of-magnitude | 11:30 |
markos | it could/should be down to a prefix intrinsic to set the state for the next intrinsic | 11:30 |
markos | but as 2 different intrinsics used in combination | 11:31 |
lkcl | potentially a hell of a lot less than that if you can consider elwidth and predicate mask to be arguments of the intrinsics | 11:31 |
lkcl | yes basically | 11:31 |
lkcl | the only thing being that it is the prefix that specifies the register-augmentation, so there will be some interaction between the two | 11:31 |
markos | well, it will certainly be a very interesting challenge | 11:32 |
markos | but one that will change the SIMD programming paradigm | 11:33 |
lkcl | you mean, stamp up and down on it until the hole's 10 ft deep? | 11:34 |
markos | pretty much | 11:36 |
markos | or make all SIMD developers all over the world (incl. myself) shed tears of joy because their torment is over | 11:36 |
* lkcl rueful - yeah | 11:38 | |
lkcl | you will however sadly still see people banging their heads against the wall. | 11:38 |
lkcl | i witnessed that on comp.arch last week | 11:38 |
markos | change is hard | 11:39 |
markos | for me it's an easy choice | 11:39 |
octavius | "paul even said that some of the modifications we've done were rejected 'because you just read the english, dummy'", lkcl what did he mean by that? | 11:40 |
markos | I cannot see myself still coding SIMD in the next 10 years, mostly because it will be impossible for me to learn 100k instructions -by that time, at that rate SVE5, AVX16777216, will probably hold that many instructions | 11:41 |
lkcl | the people who wrote the ISA docs still consider it "theirs" | 11:41 |
lkcl | and they wrote it mostly as an aide-memoire to themselves, as a reminder of what they already know | 11:42 |
markos | you will find it easier to convince new developers | 11:42 |
lkcl | it's pretty clear that they have absolutely no idea or appreciation (or, they're beginning to learn) how much it takes for newcomers | 11:42 |
lkcl | example: | 11:42 |
markos | old dinosaurs will find it very difficult to leave their bread & butter for something better | 11:42 |
markos | which is actually understandable | 11:43 |
lkcl | markos, yes basically. that's exactly what happened | 11:43 |
markos | but good luck trying to convince new devs on how to program SIMD is extremely difficult | 11:43 |
lkcl | octavius, so there's an instruction which needs a signed-comparison | 11:43 |
lkcl | sorry, needs an unsigned-comparison | 11:43 |
lkcl | but in the pseudocode source code it uses "<" | 11:44 |
lkcl | and then says in the "english words" (below), "all operations in above pseudocode are unsigned" | 11:44 |
lkcl | which is bullshit | 11:44 |
lkcl | when we proposed fixing that, we were told, "but the pseudocode is not supposed to be executable, you're supposed to just read the english" | 11:44 |
octavius | '<' means 'less than' in this case? | 11:44 |
lkcl | which is definitely bullshit | 11:45 |
lkcl | yes. | 11:45 |
lkcl | there's also an "<u" operator | 11:45 |
octavius | Ah, they still have the mindset of "this is just an aide" | 11:45 |
lkcl | if you compare a negative number against a positive number and use each of those operators "<" and "<u" you get *different* answers | 11:45 |
markos | it's supposed to be just descriptive until it isn't | 11:45 |
markos | as I learned | 11:45 |
lkcl | :) | 11:46 |
octavius | The thing I noticed when first joining libre-soc is the software engineering methodology applied to hardware, I'm afraid it'll take a while for this view to become mainstream | 11:46 |
markos | in that aspect it's no different than eg. protobuf | 11:46 |
markos | and just as making a stupid syntax error in protobuf will end up in destroying your communications protocol | 11:47 |
lkcl | octavius, there are 10x more software engineers than hardware engineers in the world | 11:47 |
octavius | So are they not planning to fix the pseudocode? | 11:47 |
markos | you will have the same effect here | 11:47 |
lkcl | ao486 i believe actually compiles x86 spec pseudocode into verilog | 11:47 |
lkcl | octavius, actually their attitude runs afoul of the Anti-Trust provisions set down by the OPF. | 11:48 |
lkcl | strictly speaking | 11:48 |
octavius | Yeah, the hardware engineers who understand will switch | 11:48 |
lkcl | we're going to have to remind them, diplomatically and "appropriately assertively", that there are now other people working with Power ISA | 11:49 |
markos | I see a trend in the industry in general | 11:49 |
lkcl | markos, it helps that Anton, Ben Herrenschmidt, Joel Stanley (shenki), Mikey and Paul are all software-engineer-trained | 11:49 |
markos | today's technology is overly complicated | 11:49 |
lkcl | (those are the major contributors to microwatt) | 11:50 |
markos | and people have started to actually step back a few years and rethink some of the changes | 11:50 |
markos | s/changes/advances | 11:50 |
lkcl | yehyeh. both words work | 11:50 |
markos | hence the need for RISC-V, or SVP64 or even the sudden love for retro computing | 11:51 |
lkcl | heh | 11:51 |
markos | I know I prefer my retro computers much more than even my fastest Xeon | 11:51 |
octavius | does the '/s' notation only come from vim, or does it originate from somewhere else? | 11:51 |
lkcl | octavius, sed | 11:51 |
markos | because I can actually *understand* how they work | 11:51 |
markos | sadly I cannot work on those | 11:51 |
lkcl | sed, perl, ex, and i think perl | 11:52 |
octavius | ah ok | 11:52 |
lkcl | sorry, awk | 11:52 |
markos | but if you have a simple yet powerful architecture that skips all this complexity and just gives you raw speed | 11:52 |
markos | I honestly can't see how this can do anything but succeed | 11:52 |
lkcl | ex was - is - a line-based editor (for when you only had say a single-line screen, a serial console, or even a line-at-a-time printer) | 11:53 |
lkcl | it's still one of the modes of vi (try typing :ex) | 11:53 |
lkcl | i really really want to move ahead onto the coherent scheduled distributed computing thing | 11:54 |
lkcl | but there's still a loooong way to go yet | 11:54 |
octavius | Also lkcl, which program do you use to edit pdf's? | 11:54 |
octavius | Other than proprietary rubbish | 11:54 |
lkcl | octavius, i don't. never tried, so i don't know | 11:55 |
octavius | Ah ok | 11:55 |
lkcl | someone else might. libreoffice? | 11:55 |
lkcl | upload to google docs and convert it? | 11:55 |
octavius | that's no better XD | 11:56 |
lkcl | i sent you the docx | 11:56 |
octavius | Thanks | 11:56 |
lkcl | whatever the hell that is | 11:56 |
* lkcl need breakfast | 11:56 | |
lkcl | ghostmansd, there's a reply to you several lines up. in case you missed it. sorry saw what you wrote last night but it was a bit late | 13:17 |
ghostmansd[m] | lkcl, sorry, reply to what? About insns? | 14:00 |
ghostmansd[m] | If so, I haven't missed, just looking at the different stuff now though (prefix) | 14:01 |
lkcl | yes, esp. the length | 17:21 |
lkcl | basically any 32-bit scalar instructions they are completely stand-alone [but oh look need a corresponding sv_binutils-generated entry] | 17:22 |
lkcl | given that the opc-svp64.[ch] haven't been upstreamed yet i'd recommend considering waiting until they have | 17:23 |
lkcl | otherwise you have to replace the damn thing with a new patch-run | 17:23 |
ghostmansd[m] | lkcl, I'm kinda confused. Could you, please, send me a link to irclog entry? | 19:21 |
ghostmansd[m] | Usual 32-bit insns (w/o sv. prefix) do not need anything from sv_binutils. | 19:22 |
ghostmansd[m] | In fact, they are no different than any other insns which already exist in binutils. | 19:22 |
ghostmansd[m] | And, as such, go to the same table with mark "allow only when -mlibresoc is present". | 19:23 |
ghostmansd[m] | So, we don't have to wait for generated files to be committed, because these do not belong to generated files (and binutils folks were pretty clear they want to have new instructions in the same table but with different flags). | 19:25 |
ghostmansd[m] | Please, correct me if I'm wrong. I might be misinterpreting your messages. | 19:25 |
ghostmansd[m] | Wasted most of the day on prefixes, so my mind isn't particularly sharp now. :-) | 19:26 |
ghostmansd | It'd have been way better if IRC allowed replying to a specific message, as e.g. Telegram does. | 19:27 |
programmerjake | matrix allows replying to specific messages.... | 19:41 |
programmerjake | though iirc that's just rendered in irc as a `> <replied-to-msg>` line before your message | 19:42 |
programmerjake | ooh: https://www.phoronix.com/news/Google-SkyWater-90nm | 19:45 |
programmerjake | maybe libre-soc would fit now! | 19:45 |
tplaten | I guess that this is incomplete. some parts are missing and/or non-free | 19:53 |
programmerjake | hmm...maybe they haven't finished releasing everything? | 19:54 |
tplaten | Godot 4 will be released soon https://www.phoronix.com/news/Godot-4.0-Beta-Soon | 19:54 |
tplaten | I hope to be able to test the libre-soc vulkan driver with that version on my Talos II. Since the POWER9 does not have SVP64, I'll need a different backend here. | 19:56 |
tplaten | I've already ported embree to the POWER9 and I am currently porting other VR-related software too. | 19:58 |
programmerjake | my plan for kazan has always had support for traditional simd or just scalar cpus...that said kazan is nowhere close to ready to test | 19:58 |
programmerjake | llvm handles all of the vector translation stuff | 19:59 |
tplaten | there is also a mesa driver, but that one is incomplete too | 20:00 |
tplaten | So I first test using AMD Vulkan or maybe PanVK | 20:01 |
lkcl | ghostmansd, https://libre-soc.org/irclog/%23libre-soc.2022-07-28.log.html#t2022-07-28T11:12:48 | 21:04 |
lkcl | that was the original question i answered | 21:04 |
lkcl | yes, you are right in that 32-bit instructions without prefixing do not need sv_binutils, and would mark "allow only when -mlibresoc is present" | 21:05 |
lkcl | however the addition of say fmvis *has* also triggered an entry into SVP64 CSV files | 21:05 |
lkcl | so you *will* require a corresponding re-run of sv_binutils | 21:06 |
lkcl | actually, ha ha, irony, except there's a bug in sv_analysis.py, it *won't* end up in the SVP64 CSV files | 21:07 |
lkcl | fishmv on the other hand does | 21:08 |
lkcl | openpower/isatables/RM-2P-1S1D.csv:fishmv,NORMAL,,2P,EXTRA3,TODO,0,0,0,FRS,0,0,FRS,0,0,0 | 21:08 |
lkcl | because it's been identified as an RM-2P-1S1D | 21:08 |
lkcl | but fmvis hasn't yet got a register "profile identity type" | 21:09 |
lkcl | because it's only 1R (1-Read) | 21:10 |
lkcl | # mapping to old SVPrefix "Forms" | 21:10 |
lkcl | mapsto = { | 21:10 |
lkcl | '1R': 'non-SV', | 21:10 |
lkcl | } | 21:10 |
lkcl | wark-wark | 21:10 |
ghostmansd[m] | I'll check it, but likely after some experiments with prefixes. First, I want to move a bit further with that task, and second, this might give FSF some time to finish this damn copyright assignment. | 21:25 |
lkcl | ha. | 21:25 |
ghostmansd[m] | Not sure of the latter, though. It seems that it's never enough. :-) | 21:25 |
lkcl | last hoop was "sign-off by president" wasn't it? | 21:25 |
ghostmansd[m] | I guess they have to approach Vladimir Putin, otherwise I have no explanation why it takes that much time. | 21:26 |
lkcl | have you seen the series "Limitless"? | 21:27 |
ghostmansd[m] | Nope | 21:27 |
lkcl | https://en.wikipedia.org/wiki/Limitless_(TV_series)#Episodes | 21:28 |
lkcl | Episode 18 | 21:28 |
lkcl | it's a really funny series. not too serious about itself. totally breaks the 4th wall | 21:28 |
ghostmansd[m] | I just recalled an old Russian anecdote. | 21:30 |
ghostmansd[m] | What does it take to fix a roof? Two guys, three meters of ruberoid, one bucket of tar. The only issue is that the two guys must be sent by the President. | 21:30 |
ghostmansd[m] | Thanks Luke, will check! I like the genre of breaking the 4th wall. Wilfred, Deadpool, even Family Guy do it | 21:31 |
lkcl | for context there's the film as well, although the series works even if you haven't seen the film of the same name | 21:32 |
lkcl | ghostmansd[m], congratulations, you have a new SVP64 RM type RM-1P-1S | 21:33 |
lkcl | https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=9fc4f5fd4ec2e3a3e52acacaf699f18d324b9f2d | 21:33 |
ghostmansd[m] | I realized I'd seen the film | 21:34 |
lkcl | ha! | 21:34 |
ghostmansd[m] | Only realized when I saw the main actor's name | 21:34 |
ghostmansd[m] | Never knew the name of the film, saw three or four times on TV :-) | 21:35 |
lkcl | ok so the tv series is the continuation, a few weeks after that | 21:35 |
lkcl | doh | 21:35 |
lkcl | you have a TV?? | 21:35 |
markos | lkcl, just saw the email, could please explain why? | 21:35 |
lkcl | because it is an input. FRS is.... i am stupid | 21:35 |
lkcl | i am very very stupid | 21:36 |
lkcl | that's why | 21:36 |
markos | I'm even more confused now | 21:36 |
lkcl | i have a mild form of dyslexia | 21:36 |
ghostmansd[m] | > you have a TV?? | 21:37 |
ghostmansd[m] | Anytime I visit my parents | 21:37 |
markos | pretty sure that doesn't make you stupid :) | 21:37 |
markos | but in any case, do I have to fix it? | 21:38 |
lkcl | markos, nope | 21:38 |
* lkcl on it | 21:38 | |
lkcl | +++ b/openpower/isatables/RM-1P-1D.csv | 21:39 |
lkcl | @@ -1,2 +1,3 @@ | 21:39 |
lkcl | insn,mode,CONDITIONS,Ptype,Etype,0,1,2,3,in1,in2,in3,out,CR in,CR out,out2 | 21:39 |
lkcl | +fmvis,NORMAL,,1P,EXTRA3,d:FRS,0,0,0,0,0,0,FRS,0,0,0 | 21:39 |
lkcl | that's more like it | 21:39 |
lkcl | ghostmansd[m], nope. there's only a new entry in RM-1P-1D | 21:39 |
lkcl | https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=2ce11894505e4194904ef095ad6aa32ff36a3fbf | 21:42 |
markos | ah, right forgot to commit that file after running sv_analysis | 21:43 |
lkcl | markos, you'd not have had anything added because it was not recognised | 21:43 |
ghostmansd[m] | All these exceptions in process_csvs look suspicious | 21:44 |
ghostmansd[m] | Aren't there way too many of those? | 21:44 |
lkcl | the profiling initially creates a key "unit-num(inregs)-num(inCRs)-num(outregs)-num(outCRs)-immediate(yes/no)" | 21:45 |
lkcl | which then gets turned into a string | 21:45 |
lkcl | 1R-1W-imm | 21:45 |
lkcl | or | 21:45 |
lkcl | 2R-1W-CRo | 21:45 |
lkcl | the entry "1W-imm" which covers fmvis was entirely missing | 21:45 |
lkcl | well... i mean, they have to go somewhere | 21:46 |
lkcl | rfid for example it has a register profile but is completely inappropriate to be Vectorised | 21:47 |
ghostmansd[m] | Ok, fair enough | 21:59 |
ghostmansd[m] | Still I have to admit that the naming looks kinda cryptic | 22:00 |
lkcl | it needs documenting, for sure. sv_analysis.py is basically part of the specification. | 22:27 |
lkcl | markos, https://gist.github.com/zingaburga/805669eb891c820bd220418ee3f0d6bd#file-sve2-md | 22:27 |
lkcl | found ARM's SVE2 Matrix extension | 23:22 |
lkcl | https://developer.arm.com/documentation/ddi0602/2022-06/SME-Instructions/SMOPA--Signed-integer-sum-of-outer-products-and-accumulate-?lang=en | 23:22 |
lkcl | based on "outer-product-and-accumulate" | 23:22 |
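
The signed/unsigned point from the 11:43-11:45 exchange can be illustrated with a minimal Python sketch. The helper names (`to_signed`, `lt_signed`, `lt_unsigned`) and the 64-bit default width are assumptions made for illustration; this is not code from the Power ISA spec or from the openpower-isa simulator. It only shows that a "<"-style signed compare and a "<u"-style unsigned compare disagree when one operand is negative.

```python
def to_signed(value, bits=64):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    # If the top bit is set, the value is negative in two's complement.
    return value - (1 << bits) if value >> (bits - 1) else value


def lt_signed(a, b, bits=64):
    """Hypothetical stand-in for the pseudocode's signed "<"."""
    return to_signed(a, bits) < to_signed(b, bits)


def lt_unsigned(a, b, bits=64):
    """Hypothetical stand-in for the pseudocode's unsigned "<u"."""
    mask = (1 << bits) - 1
    return (a & mask) < (b & mask)


minus_one = 0xFFFF_FFFF_FFFF_FFFF  # bit pattern of -1 in a 64-bit register
plus_one = 1

print(lt_signed(minus_one, plus_one))    # signed view: -1 < 1
print(lt_unsigned(minus_one, plus_one))  # unsigned view: 2**64 - 1 < 1
```

The two calls return different answers for the same bit patterns, which is why an executable pseudocode has to pick the operator explicitly rather than rely on the accompanying English text.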
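
The register-profile strings mentioned at 21:45 ("1R-1W-imm", "2R-1W-CRo", and the missing "1W-imm" for fmvis) can be sketched roughly as below. This is a hypothetical reconstruction, not the actual sv_analysis.py logic: the function name `profile_name`, its parameters, the "CRi" label for CR inputs, and the ordering of the parts are all assumptions chosen so that the three strings quoted in the log come out as written.

```python
def profile_name(in_regs, out_regs, in_crs=0, out_crs=0, has_imm=False):
    """Build a register-profile string such as '1R-1W-imm' (hypothetical)."""
    parts = []
    if in_regs:
        parts.append(f"{in_regs}R")   # number of registers read
    if out_regs:
        parts.append(f"{out_regs}W")  # number of registers written
    if in_crs:
        parts.append("CRi")           # assumed label for a CR input
    if out_crs:
        parts.append("CRo")           # CR output, as in '2R-1W-CRo'
    if has_imm:
        parts.append("imm")           # instruction carries an immediate
    return "-".join(parts)


print(profile_name(1, 1, has_imm=True))   # 1R-1W-imm
print(profile_name(2, 1, out_crs=1))      # 2R-1W-CRo
print(profile_name(0, 1, has_imm=True))   # 1W-imm (the fmvis case)
```

Under these assumptions, an instruction like fmvis with no register inputs, one register output and an immediate yields "1W-imm", the entry described in the log as having been entirely missing.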