Friday, 2022-10-14

markoslkcl, ... and done, all values are reproduced, only thing that remains is getting the max and position, will commit now all work so far just in case my disk decides to die on me while I sleep :D01:15
markosref cost:01:15
markos04858917 05cf5742 021c7323 01c68c56 05931132 03de109a 02f8e489 00f02d4b01:15
markosreg 24 04858917 05cf5742 021c7323 01c68c56 05931132 03de109a 02f8e489 00f02d4b01:15
markosgn, ttyt01:17
programmerjakeyay!02:24
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc07:23
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC07:27
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC08:22
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.42.179> has joined #libre-soc08:22
lkclmarkos, sorry - here's the syntax examples https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/sv/trans/test_pysvp64dis.py;h=808b3b504a4a1192efce066ac4d16256a59b3896;hb=a65084c24742b43e79da714e5cd08f0d24a83eab#l31208:58
markosthanks, it will have to wait for maxs implementation first :)09:01
lkclalthough i'm not sure why RC1 is necessary (annoyingly), /ff=eq should be fine09:02
lkclack09:02
lkclno you don't need it.09:02
lkcli sent the sequence yesterday.09:02
lkclhttps://libre-soc.org/irclog/%23libre-soc.2022-10-13.log.html#t2022-10-13T23:05:4109:03
markosmaybe it's something in binutils09:03
lkclyou don't need failfirst09:04
lkclfailfirst is an optimisation to terminate at the first compare that's equal09:04
markosindeed09:04
* lkcl thinks do you need the mapreduce max?09:04
lkclyes. sorry09:05
lkclit's in sv/trans/svp64.py09:05
lkclhttps://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/sv/trans/svp64.py;hb=HEAD#l48409:06
lkcltry a vertical-first loop instead09:08
markosbinutils is giving me unrecognized opcode for maxs, I'll wait for ghostmansd[m]09:09
lkclhttps://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/test_caller_setvl.py;h=058a6be794a7416c0732b2551c92ad3a79bce45b;hb=a65084c24742b43e79da714e5cd08f0d24a83eab#l93409:09
markosI'm confused, vertical-first loop to replace what?09:10
lkclwhen i'm a little more awake i'll do a CTR-based thing.09:10
lkclnot replace.09:10
lkclget.09:10
markosto get the position of max?09:10
lkclyes09:10
markosack09:10
lkclyou can extract the index by conditionally using an svstep to extract the srcstep *in the middle* of the vertical-loop09:12
lkclyou remember how sv.svstep extracts a *vector* of srcsteps, to create a sequence 0,1,2,3,.... ?09:12
markosI remember how I can use svstep to generate the sequence to some registers yes09:13
lkclthere's nothing stopping you from extracting *the current* srcstep when using vertical-first mode, in exactly the same way09:13
lkcluse it exactly like you would a standard scalar loop09:15
lkcltmp_idx = -109:15
lkclmax_val = -109:15
lkclfor ....09:15
markosok, I have to see the code that does it, I am with very little sleep right now and my comprehension is limited, coffee hasn't kicked in yet :/09:15
lkcl   if array[i] > max_val: { tmp_idx = i; max_val = array[i] }09:15
lkcl:)09:15
lkclyeah i am not awake either09:15
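The scalar loop sketched at 09:15 can be written out in runnable form. This is only a minimal sketch: the function name `max_and_index` is an assumption for illustration, and the sample input reuses the eight "ref cost" words quoted at 01:15 on the assumption that they are the array being searched.

```python
def max_and_index(array):
    """Return (max_val, idx) for the first maximum, scalar-loop style."""
    tmp_idx = -1   # as in the log: -1 means "nothing found yet"
    max_val = -1   # as in the log; this assumes all elements are non-negative
    for i, value in enumerate(array):
        if value > max_val:   # strict ">" keeps the first of several equal maxima
            tmp_idx = i
            max_val = value
    return max_val, tmp_idx


# Sample input: the "ref cost" words from 01:15 (assumed to be the search array).
cost = [0x04858917, 0x05cf5742, 0x021c7323, 0x01c68c56,
        0x05931132, 0x03de109a, 0x02f8e489, 0x00f02d4b]
max_val, idx = max_and_index(cost)
print(hex(max_val), idx)   # prints: 0x5cf5742 1
```

An empty array leaves both results at -1, which is the "nothing found" convention implied by the initial values.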
lkclghostmansd[m], can we first rebase in the ldst-postinc branch?09:49
ghostmansd[m]lkcl, does it have many changes?09:51
ghostmansd[m]If so, rebase it with master, I'll rebase my local stuff too09:52
lkclack09:53
lkcldone.  no, not a lot09:53
ghostmansd[m]Ok just rebase it, I'll integrate these changes too10:16
lkclon the "reserved" area in ld/st-imm i added two new mode-bits10:17
lkclonly one is tested so far (/pi).10:17
lkcl /lf is still TODO but is low priority10:18
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.42.179> has quit IRC10:30
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.168.84> has joined #libre-soc10:30
markosthis is the snippet I've written for max/pos:11:05
markos        setvl                   0,0,8,0,1,1                     # Set VL to 8 elements11:05
markos        #sv.maxs/mr             max, max, *cost11:05
markos        sv.cmp/ff=eq            0, 1, *cost, max11:05
markos        svstep                  retval, 5, 011:05
markossv.maxs is commented out as it's not supported yet11:05
lkcli'm currently investigating, there's a bug in vertical-first mode when used with predication11:06
markossv.cmp/ff=eq also fails with binutils11:06
markosError: ffirst BO only possible when Rc=111:06
lkcli said don't use fail-first11:06
markosok, what happens if 2 values are equal to max?11:06
lkcllet me write the vertical-first loop11:07
lkcland fix the bug11:07
markosnot really relevant to this particular unit test but curious11:07
lkclyou want the index of the maximum element, i have to fix this bug11:07
markosok11:07
lkclwhen the predicate-bit is ok, things are fine.11:14
lkclwhen it's not ok, "skipping" occurs (which it should not), which triggers "end of loop"11:15
lkclngggh11:15
lkclthat's only supposed to happen in horizontal-mode11:15
lkcldamn11:23
lkclgoing to take a lot more to sort out11:23
lkclmarkos, nuts to it.  can i suggest simply putting in the RFP now.11:28
lkcli'm closing the bugreport now11:29
markosI'll submit the updated code11:30
lkclbugreport is now closed. task declared completed.  i'm putting in an RFP now.11:31
markos:D11:33
lkcldone11:34
*** octavius <octavius!~octavius@249.147.93.209.dyn.plus.net> has joined #libre-soc11:40
markoslkcl, just added a comment11:44
markoswith some explanation11:44
markoslet me know if I should add anything more11:44
markossent RFP as well11:48
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc11:50
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC11:54
lkclbrilliant, confirmed12:03
markosif we had one more week, I'd do the mp3 as well :)12:10
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.168.84> has quit IRC12:45
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc12:45
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC12:58
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.168.123> has joined #libre-soc13:01
lkclmarkos, i got it - the problem is using predication (at all) within vertical-first14:11
lkclby using isel as a substitute for sv.max i managed it14:11
lkclcommitting in 1 sec14:12
lkcl        lst = SVP64Asm(["setvl 0, 0, 5, 1, 1, 1",14:17
lkcl                        'sv.cmp 0, 1, *4, 14', # r8 contains the temp14:17
lkcl                        'sv.isel 14,*4,14,1', # copy if cmp was greater14:17
lkcl                        "svstep. 12, 6, 0", # get srcstep14:17
lkcl                        'sv.isel 10,12,10,1', # copy if cmp was greater14:17
lkcl                        "svstep. 0, 1, 0", # svstep (Rc=1)14:17
lkcl                        "bc 6, 3, -0x24" # branch to cmp14:17
octaviuslkcl, is there a way to set up an ubuntu chroot? I wanted to try setting up tasyagle using ubuntu instead14:22
octaviuswithin a host debian that is14:22
markosdebootstrap/vmdebootstrap14:24
markos... which is now vmdb214:24
markosbut for a plain chroot debootstrap should work just fine14:25
markoslkcl, why the branch, does it repeat until done?14:27
markoswhy not use sv.max?14:30
lkclyyep.14:39
lkclstep14:39
lkclcompare-branch14:39
lkclstep14:39
lkclcompare-branch14:39
lkclstep ... end-reached (srcstep == VL-1): EQ-bit set to 114:39
lkclcompare-branch fails, EXIT14:40
lkclbecause _you_ cannot use sv.max right now.14:40
lkclplus, the same sv.cmp can be used for both isel()s14:40
lkclusing the same trick, it avoids the need for a branch-jump over a copy of the current-max-index and the current-max-value14:41
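The step / compare-branch sequence described above, with one compare feeding two selects, can be modelled in plain Python. This is a behavioural sketch only, not the simulator's code: `select`, `vf_max_index` and the initial values of `max_val` and `max_idx` are assumptions made for illustration, and the input must hold at least one element.

```python
def select(cond, if_true, if_false):
    """Model of a branch-free select (the role isel plays in the log)."""
    return if_true if cond else if_false


def vf_max_index(elements, max_val=0, max_idx=0):
    """Walk the elements one step at a time, as a vertical-first loop would."""
    vl = len(elements)   # assumed to be >= 1
    srcstep = 0
    while True:
        # One compare per element ...
        greater = elements[srcstep] > max_val
        # ... whose result drives both selects, so no branch over the copies is needed.
        max_val = select(greater, elements[srcstep], max_val)
        max_idx = select(greater, srcstep, max_idx)
        # "end-reached (srcstep == VL-1)": the loop branch is not taken, EXIT.
        if srcstep == vl - 1:
            break
        srcstep += 1
    return max_val, max_idx


print(vf_max_index([3, 9, 4, 9, 1]))   # prints: (9, 1)
```

The point of the design is that the value copy and the index copy are both conditional on the same compare result, so the only branch left is the loop branch itself.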
lkclhm we need "max." to actually perform a cmp.14:42
lkcland maxs. and min. and mins.14:42
lkclyou did say you needed an _array_-based max-detection-and-index-of?14:43
markosyes14:46
lkclok then you can use that as a recipe15:01
lkclyou may have to tweak it a bit so it definitely uses ">" rather than ">="15:02
lkclotherwise if there are duplicates it'll get you the *last* element's index rather than the first15:03
lkclwhich might involve some "crands" because sv.isel can only do testing on one bit15:03
lkclcrands or crors15:04
lkclthe cmp produces a CR0 ("sv.cmp 0,..." targets CR0) of EQ,LT,GT15:05
lkclbut if you wanted GE you have to OR the EQ and GT bit together15:05
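The effect of ">" versus ">=" on duplicate maxima, and the need to OR two flags together to get GE, can be seen in a small sketch. The helper names `cr_bits` and `index_of_max` are hypothetical, and the dictionary is only a stand-in for the LT/GT/EQ flags mentioned above, not a model of the real CR field layout.

```python
def cr_bits(a, b):
    """Model a compare result as three flags, named as in the log."""
    return {"LT": a < b, "GT": a > b, "EQ": a == b}


def index_of_max(array, use_ge=False):
    """Index of the maximum; use_ge=True models testing GE instead of GT."""
    max_val, max_idx = array[0], 0
    for i in range(1, len(array)):
        cr = cr_bits(array[i], max_val)
        # GT is a single flag; GE has to be built by ORing EQ and GT together.
        take = (cr["GT"] or cr["EQ"]) if use_ge else cr["GT"]
        if take:
            max_val, max_idx = array[i], i
    return max_idx


data = [3, 9, 4, 9, 1]
print(index_of_max(data))                # prints: 1 (first maximum, strict ">")
print(index_of_max(data, use_ge=True))   # prints: 3 (last maximum, ">=")
```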
lkclmarkos, all those batches have me drumming my fingers :)15:30
lkcl        setvl                   0,0,4,0,1,1                     # Set VL to 4 elements15:30
lkcl        sv.add                  *psum_alt+0, *psum_alt+0, *img+015:30
lkcl        sv.add                  *psum_alt+1, *psum_alt+1, *img+415:30
markosyes, I know they are not optimal15:30
lkcllike, that's exactly what REMAP is supposed to *not* have to have :)15:31
markosI really should get a better grasp of remap15:31
lkclbut, it'll need some thought and evaluation as to what would be needed15:31
markosbut I can do that gradually and convert the function to perhaps half the size?15:31
lkclwell i want to put a specific task/budget on it15:31
lkclam making notes, now15:32
markoswhat I would really like to do is get a cookbook started, with SVP64 best practices, eg. when you have this C source, here's how you can do it in SVP64, etc15:33
lkclyehyeh15:33
lkclraise a bug about it, link it to #95215:33
lkclthen it goes on the list to get some EUR for doing it15:34
markoscomponent: source code or website?15:35
lkclwebsite15:36
lkclfor no particular reason :)15:37
lkclfrickinell that's 13 bugreports already linked to #952 in only 10 minutes of looking15:37
markoshow to link?15:38
markos#95315:38
markosblocks or depends on?15:38
lkclermm...ermermerm... blocks15:38
lkcloh btw, all of these?15:39
lkcl        setvl                   0,0,8,0,1,1                     # Set VL to 8 elements15:39
lkcl        sv.lha                  *img, 0(ptr_copy)               # Load 8 ints from (ptr_copy)15:39
lkcl        add                     ptr_copy, ptr_copy, stride      # Advance ptr_copy by stride15:39
lkclthere *is* a way to do those in a loop15:39
lkclbut it needs something called "hphint" in Vertical-First Mode to be implemented, first15:39
markosI understand setvl also provides a stride?15:39
lkcl"hphint" is like a hybrid Vfirst-Hfirst15:39
lkclno it doesn't15:39
lkclbut there's a planned "horizontal-parallelism hint" for VF mode15:40
markosok15:40
lkclbasically what that does is, it says15:40
lkcl"yes i know we're in Vertical-First Mode, but you're allowed to do up to N elements in *horizontal* batches"15:40
lkclit's primarily for when you have loops like this:15:40
markosyup, that's stride mode there15:41
markosused in pretty much *all* video codecs15:41
lkclfor i in range(VL): mem[i+2] += mem[i]15:41
lkclwhere a programmer knows that you can do up to *two* elements - safely - in parallel15:41
lkclwithout memory-corruption15:41
markoscool15:42
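Why two elements at a time is safe for `mem[i+2] += mem[i]`, while three is not, can be checked with a short model. This is a sketch under stated assumptions: "parallel" is modelled as reading every source in a batch before writing any result, and the function names and sample data are illustrative only.

```python
def run_sequential(mem, vl):
    """Reference behaviour: one element at a time, in order."""
    mem = list(mem)
    for i in range(vl):
        mem[i + 2] += mem[i]
    return mem


def run_batched(mem, vl, batch):
    """Process `batch` elements at once: read all sources, then write all results."""
    mem = list(mem)
    for start in range(0, vl, batch):
        idxs = range(start, min(start + batch, vl))
        sources = [mem[i] for i in idxs]     # all reads happen first
        for i, src in zip(idxs, sources):
            mem[i + 2] += src                # then all writes
    return mem


mem = [1, 2, 3, 4, 5, 6, 7, 8]
vl = 6
print(run_sequential(mem, vl) == run_batched(mem, vl, 2))   # prints: True
print(run_sequential(mem, vl) == run_batched(mem, vl, 3))   # prints: False
```

With a batch of two, no element written in a batch is also read in that batch. With a batch of three, `mem[i+2]` is read before the same batch has updated it, so a stale value is used.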
lkclbut... it's... complicated by the fact that if implemented naively, we'd need to store *even more* state in SVSTATE15:42
lkclyet more indices (yet more srcsteps, dststeps, and sub-steps)15:42
lkclso i really have to think about it, first15:43
markossure, I think now that the deadline is (almost) over, we can sit back and think things more carefully now15:43
lkclyes. thank goodness15:47
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.168.123> has quit IRC16:56
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc17:00
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC17:11
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.164.157> has joined #libre-soc17:13
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.164.157> has quit IRC17:23
*** octavius <octavius!~octavius@249.147.93.209.dyn.plus.net> has quit IRC17:29
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.164.121> has joined #libre-soc17:44
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.42.162> has joined #libre-soc17:45
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.42.162> has quit IRC17:53
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc17:57
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc17:57
ghostmansdmarkos, lkcl, https://bugs.libre-soc.org/show_bug.cgi?id=95418:06
ghostmansdWhilst we're here, I'll raise tasks for the rest of them, first being https://bugs.libre-soc.org/show_bug.cgi?id=95518:06
ghostmansdhttps://bugs.libre-soc.org/show_bug.cgi?id=95618:09
ghostmansdhttps://bugs.libre-soc.org/show_bug.cgi?id=95718:12
lkclghostmansd, ack started cross-referencing18:12
ghostmansdhang on18:12
ghostmansdI'm also creating a parent task :-)18:12
ghostmansd1 sec18:12
ghostmansdthis is meta task: https://bugs.libre-soc.org/show_bug.cgi?id=95818:14
ghostmansdshoulda started with it but only then found all these reside in av.mdwn18:15
ghostmansdlkcl, are these covered by cavatools? I remember you mentioned that we're able to reserve some budget for binutils from cavatools.18:15
lkclyehyeh18:16
ghostmansdOn one hand, I'm pissed by the fact that we still don't generate these; on the other, adding several instructions manually is way less work than rewriting the whole binutils logic with the tables (this is quite hard-coded).18:16
lkclirony...18:17
ghostmansdI'd like to finish binutils as quick as possible, since most of the time is needed for cavatools.18:19
ghostmansdI don't want us to end up in the situation when we lack time to complete cavatools.18:20
lkcli've 4 students from india considering helping18:20
lkclfrom the OpenPOWER Academic Working Group18:21
ghostmansdI mean, I get the benefits of generating this, but would rather prefer doing this in some separate way, this is a huge task, mostly consisting of boring cross-checking how much stuff is broken when we migrate to autogeneration (spoiler: I expect a lot).18:21
lkclthey are 3rd year UGs so have done a compiler course18:21
ghostmansdThat'd be great!18:21
ghostmansdI'll add min/max routines now, for markos.18:22
ghostmansdI hope it won't take much time, hopefully this is less complicated than svshape2. :-)18:22
lkclyehyeh :)18:22
markosghostmansd, thanks, I think we are now going to be tackling issues at a much leisurely pace -not lazy but we will not have to stay up till 3am to get it running :)18:23
markoss/much/much more18:23
markoslkcl, make sure they know their stuff18:24
ghostmansdDon't worry, coding at 3am is my natural modus operandi :-D18:24
markoslast year I hired an MSc from CompSci who claimed to know his C and Linux, turns out I had to teach him C data types and that C/C++ is NOT the IDE Visual Studio18:25
* lkcl facepalm18:25
markosafter teaching him 3 hours/day *every* day to get him at a good state, he told me I should double his salary or he would leave because he found another better job abroad18:26
lkclcheeky bugger18:26
markosturns out I was basically paying him to train him18:26
markosnever again18:26
lkcli hope you said "no"18:26
markosofc18:26
markoswas my fault, I was against technical tests on people18:27
markosbecause I hate them myself, find them wrong in principle18:27
markosso when someone claims on their CV that they have 3/5 skills in C/C++ I take that as the truth18:28
markosI mean 5/5 is for Stroupstroup -or what's his spelling anyway18:28
markosor for LLVM engineers18:28
markoswith modesty I might grade myself a 3/5, maybe 3.5/5 if I'm being extremely generous, definitely not a 5/518:29
markosso I took this guy on his honesty18:29
markosboy was I mistaken18:29
markosI mean, I know they don't teach C and such languages to such depth in university anyway18:30
markosnot like they used to18:30
markosso I expected it to be a case of some training18:30
markosbut the lack of knowledge in pretty much everything was obvious after the first couple of months18:31
markosproblem is I'm also a nice guy, I thought "oh but he's a good kid, I can just train him and he'll be fine"18:31
markosthat's why I'm not a good business man on that aspect, I'm too sentimental18:32
lkclfunny how you actually need to be quite pathological18:32
markosso after 9 months of training, he *finally* was able to do a simple task18:32
markosand just when I thought "finally, we can do business"18:33
markos"hey, I'm leaving for Germany found a job there as a consultant, unless you double my salary"18:33
markossorry for the rant18:33
lkclhey not a problem :)18:33
markosbut you touched a sensitive chord there18:34
markosthere is no fucking chance I'm hiring anyone in the future unless I pass them through a painstaking technical test myself, not just some random online crap18:34
markosthis guy just screwed the next ones18:34
lkcldoes anyone know how to specifically disable locking on glibc6?18:38
lkcli'm trying to get a simple systemcall on putchar18:38
lkclbut there's a massive stack of sys_futex syscalls wrapped on pretty-much-everything18:38
ghostmansd> turns out I was basically paying him to train him18:39
ghostmansdlol, best deal ever!18:39
ghostmansdIt's a deal, it's a steal, it's the sale of the fucking century!18:40
ghostmansdSorry markos, I could not resist to recall this Lock, Stock and Two Smoking Barrels quote18:42
ghostmansdhttps://www.youtube.com/watch?v=zxkfG7D42C818:43
markosis it a TV series or a movie? looks funny18:45
lkclit's a film.18:46
lkclif you can, get the uncensored version18:46
lkclthere were a *lot* of complaints about some of the scenes.18:46
lkcli watched the original at the cinema18:46
markosI always get the uncensored versions anyway18:46
markosseems like the censored are mostly for the US anyway right?18:47
lkclthe original had a greyhound-hare chase18:47
lkclnooo, this was a hare being killed by a greyhound18:47
markosah, a real chase?18:47
lkclyes18:47
markosoh that's a first18:47
lkclit was absolutely astounding slow-motion filming18:48
markosso not possible to claim "no animals were harmed during filming"18:48
lkclnnope18:48
markosI can imagine the outrage18:48
lkcland denial of reality, yes...18:48
markosand not accidental, ie like in a documentary or just happened to get into the camera FOV while we were filming :)18:49
markos"we were just filming this scene your honour and the f'cking hare just came out of nowhere and Bob's greyhound went berserk and started the chase!"18:50
lkcli can't find an original cinematic version of galaxy quest, either18:52
lkclsigourney weaver *actually* said "well f*** that!!" - which was overdubbed :)18:52
lkclif you look carefully and try lip-reading, you can tell what she was really saying :)18:53
* lkcl found what i was looking for, had to use write(STDOUT_FILENO) instead18:54
lkcli want a systemcall write()18:54
*** jab <jab!~jab@user/jab> has joined #libre-soc19:14
*** jab <jab!~jab@user/jab> has quit IRC19:42
*** jab <jab!~jab@user/jab> has joined #libre-soc20:04
ghostmansdSorry folks, I committed some patches into master which were not ready yet. Should be fine now, but, please, let me know if you have issues.20:15
*** octavius <octavius!~octavius@249.147.93.209.dyn.plus.net> has joined #libre-soc20:26
lkclghostmansd, whoopsie, unit-test-running time20:43
programmerjakewell, turns out ghostmansd's latest commit left nothing new broken:21:03
programmerjakeFAILED src/openpower/sv/trans/test_pysvp64dis.py::SVSTATETestCase::test_26_sv_stq_vector_name21:03
programmerjakeFAILED src/openpower/sv/trans/test_pysvp64dis.py::SVSTATETestCase::test_4_sv_crand21:03
ghostmansdThese failed before21:03
ghostmansdso relief!21:03
ghostmansdthanks programmerjake!21:04
programmerjake= 2 failed, 336 passed, 75 skipped, 19 xfailed, 748 warnings in 1793.30s (0:29:53) =21:04
programmerjakeci did all the hard work: https://salsa.debian.org/Kazan-team/mirrors/openpower-isa/-/jobs/337825821:05
programmerjakelkcl, since stdup is supposed to be post-increment, you need to revert the MEM(EA, 8) argument back to ea21:11
*** octavius <octavius!~octavius@249.147.93.209.dyn.plus.net> has quit IRC21:27
lkclprogrammerjake, ermermerm... that's a rebase error. good catch21:36
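The difference between a pre-increment (update-form) store and a post-increment store, which is what the `EA` versus `ea` remark turns on, can be sketched generically. This is not the project's pseudocode: the function names, the dictionary-based memory and register file, and the reading of `ea` as "the address before the offset is added" are all assumptions made for illustration.

```python
def store_pre_increment(mem, regs, ra, rs, offset):
    """Update-form store: compute the new address, store there, keep it in RA."""
    ea_new = regs[ra] + offset
    mem[ea_new] = regs[rs]
    regs[ra] = ea_new


def store_post_increment(mem, regs, ra, rs, offset):
    """Post-increment store: store at the old address, then advance RA."""
    ea_old = regs[ra]
    mem[ea_old] = regs[rs]          # the memory access uses the address before the update
    regs[ra] = ea_old + offset


regs = {"ra": 0x100, "rs": 42}
mem = {}
store_post_increment(mem, regs, "ra", "rs", 8)
print(mem, hex(regs["ra"]))   # prints: {256: 42} 0x108
```

In both forms the base register ends up at the same value; only the address used for the memory access differs.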
