Monday, 2022-09-26

programmerjakeand no it isn't quite RB|0, it does something completely different than just supply zero as an input00:00
programmerjakeI'll add it as a TODO00:12
ghostmansd[m]markos, predicates should work, just try it00:35
ghostmansd[m]You'll likely need -mregnames switch for stuff like m=r3 or dm=r3000:36
*** lxo <lxo!~lxo@linux-libre.fsfla.org> has quit IRC00:58
*** programmerjake <programmerjake!~programme@2001:470:69fc:105::172f> has quit IRC02:42
*** programmerjake <programmerjake!~programme@2001:470:69fc:105::172f> has joined #libre-soc03:00
markosghostmansd, I'm just getting Error: unrecognized mode: 'pred1' on this line:07:57
markosori             pred1, pred1, 0b000100010001000107:57
markos        sv.add/m=pred1  *op, *ip, *ip+307:57
markoswhere ip=10, op=30, pred1=608:05
lkclmarkos, use the predicate by regname explicitly09:22
lkclr3/r10/r31/eq/lt etc.09:22
lkclit's not macro-substitutable09:23
lkclif you want macro-substitution on predicate masks use "gcc -E" or sed09:23
lkcland you can't have sv.add/m=609:23
lkclthis is the list of options:09:25
lkclhttps://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/sv/trans/svp64.py;h=a285d1a317f43d8aad5d3abc703fb12094569102;hb=e8b9ea79376d1fe69ee747e026a366d03ad5d5e6#l44309:25
*** octavius <octavius!~octavius@202.147.93.209.dyn.plus.net> has joined #libre-soc09:26
*** jn <jn!~quassel@user/jn/x-3390946> has quit IRC09:27
*** jn <jn!~quassel@2a02:908:1065:960:20d:b9ff:fe49:15fc> has joined #libre-soc09:28
*** jn <jn!~quassel@user/jn/x-3390946> has joined #libre-soc09:28
markossame error09:32
markosunrecognized mode: '6'09:33
markostried sm= also, same result09:38
*** octavius <octavius!~octavius@202.147.93.209.dyn.plus.net> has quit IRC09:57
lkclmarkos: again, you *cannot* put *anything* other than the exact and precise list of qualifiers listed in svp64.py line 443 source code10:05
lkclm=r310:05
lkclm=~r310:05
lkclm=r1010:05
lkclm=~r1010:05
lkclm=1<<310:05
lkclm=r3110:05
lkclm=~r3110:05
lkclm=eq10:05
lkclm=ne10:05
lkclm=so10:06
lkclm=lt10:06
lkclit is not a register10:06
markosso predicate *needs* to be on these registers? I can't just use any other register10:06
markosok10:06
markosI see10:06
lkclit is a qualifier that happens to contain the word "r3", or the word "r10"10:06
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC10:06
lkclthere are only 3 bits for predicate-masks, a 4th says whether to use the GPR or CRfield10:07
markoswell, r3/r10/r31 are registers, it's a bit confusing10:07
lkclwe can't use up an entire 6 bits in a precious 24-bit prefix for one single predicate source10:07
lkcland another 6 bits for a second predicate10:07
lkclthat's 50% of the 24-bits solely and exclusively dedicated to selecting predicate masks10:08
lkcleven Power ISA v3.1's new MMA instructions only have 4-bit for selection of a predicate mask10:08
lkclRVV only has *one* bit "v0 or not to use v0"10:08
markosnah, it's ok, now that I know it and why it is so, I will make a mental note not to use anything else10:09
lkclSVE/2 has only the one register (sometimes *zero* registers!) for predicates10:10
lkclthe alternative would be to have some sort of "tagging" which means a setup and a teardown instruction10:10
lkcl"set the predicate for the next instruction(s)"10:10
lkclwhich gets messy pretty quickly10:10
markosok, it compiles now, but does need -mregnames10:13
lkclsounds about right10:17
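
The point made above, that a predicate is a fixed qualifier token and not a substitutable register name or number, can be sketched as a simple membership check. This is a minimal illustration only: `EXAMPLE_QUALIFIERS` is a deliberately partial, hypothetical subset of the tokens quoted in the discussion, and `check_predicate` / `try_predicate` are made-up helper names. The authoritative list is the one in svp64.py.

```python
# Deliberately partial set of qualifier tokens quoted in the discussion above.
# The authoritative list lives in svp64.py; this subset is an assumption
# made only for illustration.
EXAMPLE_QUALIFIERS = frozenset({"r3", "~r3", "r10", "~r10", "eq", "ne", "lt"})


def check_predicate(token, allowed=EXAMPLE_QUALIFIERS):
    """Return the token if it is an exact qualifier, else raise ValueError."""
    if token not in allowed:
        # Same wording as the assembler error reported in the log.
        raise ValueError(f"unrecognized mode: {token!r}")
    return token


def try_predicate(token):
    """Hypothetical convenience wrapper: return the token or the error text."""
    try:
        return check_predicate(token)
    except ValueError as exc:
        return str(exc)


print(try_predicate("r10"))    # accepted: exact qualifier token
print(try_predicate("pred1"))  # rejected: a macro name is not a qualifier
print(try_predicate("6"))      # rejected: a bare register number is not one either
```

Under this model, a macro such as pred1 has to be expanded to a literal token (for example with "gcc -E" or sed, as suggested above) before the assembler sees it.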
lkclprogrammerjake, if you can swap RA and RB in pcdec, the Power ISA and TestIssuer and PowerDecoder2 are all set up for RA|0 (but not RB|0)10:27
lkclyou can then use "if rb_used | (_RA = 0)"10:27
lkcland in HDL that will be a flag in the pipeline's RecordSubset "ra_zero"10:28
lkcl            comb += self.do_copy("zero_a", dec_ai.immz_out)  # RA==0 detected10:29
markoslkcl, is it possible to iterate the indices in sv instructions? ie currently I'm doing:10:53
markossv.add/dm=r10   *t, *ip, *ip+3          # a1 = ip[0] + ip[3]10:53
markos        sv.add/dm=r10   *t+4, *ip+4, *ip+710:53
markosdamn the tabs10:53
markosbut if I can just do a loop and add +4 to the registers, that would be great10:54
markosor... I could just use the predicate for that...10:55
markostrying it now10:55
markosf*cking hell, it works!10:56
markosnever mind, ignore the question10:56
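
Why the predicate can stand in for the manual "+4" stepping can be seen in a simplified model of a predicated vector add. This is a sketch under assumptions: `predicated_add` is a hypothetical helper, the input values are invented, and the mask 0b0001000100010001 with VL=16 is taken from later in the discussion. It models only "element i runs when bit i of the mask is set" and nothing else about SVP64.

```python
def predicated_add(dest, src_a, src_b, mask, vl):
    """Simplified model: element i is computed only if bit i of mask is set."""
    for i in range(vl):
        if (mask >> i) & 1:
            dest[i] = src_a[i] + src_b[i]
    return dest


VL = 16
ip = list(range(100, 100 + VL + 3))  # hypothetical input values
t = [0] * VL
mask = 0b0001000100010001            # enables elements 0, 4, 8, 12

# Models "sv.add/dm=r10 *t, *ip, *ip+3" as t[i] = ip[i] + ip[i + 3].
predicated_add(t, ip, ip[3:], mask, VL)

print([i for i, v in enumerate(t) if v])  # indices that were written
```

With one bit set per group of four, a single instruction covers what the two hand-stepped instructions (*t and *t+4) did, and the remaining rows as well.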
*** jevinskie[m] <jevinskie[m]!~jevinskie@2001:470:69fc:105::bb3> has quit IRC11:00
*** underpantsgnome[ <underpantsgnome[!~tinybronc@2001:470:69fc:105::2:1af6> has quit IRC11:00
*** sadoon[m] <sadoon[m]!~sadoonunr@2001:470:69fc:105::1:f0fa> has quit IRC11:00
*** programmerjake <programmerjake!~programme@2001:470:69fc:105::172f> has quit IRC11:00
*** cesar <cesar!~cesar@2001:470:69fc:105::76c> has quit IRC11:00
*** programmerjake <programmerjake!~programme@2001:470:69fc:105::172f> has joined #libre-soc11:06
markosis there a sv.srdi (shift right immediate)? trying to do:  sv.srdi/dm=r10          *op+1, *op+1, 12        # op[1] >>= 1211:11
*** cesar <cesar!~cesar@2001:470:69fc:105::76c> has joined #libre-soc11:11
*** jevinskie[m] <jevinskie[m]!~jevinskie@2001:470:69fc:105::bb3> has joined #libre-soc11:11
*** sadoon[m] <sadoon[m]!~sadoonunr@2001:470:69fc:105::1:f0fa> has joined #libre-soc11:11
*** psydroid <psydroid!~psydroid@user/psydroid> has joined #libre-soc11:11
*** underpantsgnome[ <underpantsgnome[!~tinybronc@2001:470:69fc:105::2:1af6> has joined #libre-soc11:11
markosactually, this code makes as complain:11:12
markos        # op[1] = (c1 * 2217 + d1 * 5352 + 14500) >> 12;11:12
markos        sv.maddld/dm=r10        *op+1, *op+2, const2, const4    # op[1] = c1 * 2217 + 1450011:12
markos        sv.maddld/dm=r10        *op+1, *op+3, const3, *op+1     # op[1] += d1 * 535211:12
markos        sv.srdi/dm=r10          *op+1, *op+1, 12                # op[1] >>= 1211:12
markosvp8_dct4x4_real.s:44: Error: vector register cannot fit into EXTRA211:12
markosvp8_dct4x4_real.s:45: Error: vector register cannot fit into EXTRA211:12
markosvp8_dct4x4_real.s:46: Error: unrecognized opcode: `*op+1,*op+1,12'11:12
markosany suggestions?11:14
lkclmarkos, see unit tests, grep for "iota"11:34
markosalso how would you do that with asm: + (d1 != 0)11:34
lkcluse "sv.svstate"11:34
lkcl1 sec...11:35
markosfull expression:         # op[4] = ((c1 * 2217 + d1 * 5352 + 12000) >> 16) + (d1 != 0)11:35
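
As a reference to check an assembly conversion against, the two expressions quoted above can be written directly in Python. This is a minimal sketch: the function names `op1` and `op4` are hypothetical, and Python integers are unbounded, so any 16-bit truncation the C code relies on is not modelled here.

```python
def op1(c1, d1):
    """op[1] = (c1 * 2217 + d1 * 5352 + 14500) >> 12"""
    return (c1 * 2217 + d1 * 5352 + 14500) >> 12


def op4(c1, d1):
    """op[4] = ((c1 * 2217 + d1 * 5352 + 12000) >> 16) + (d1 != 0)"""
    # int() makes the 0-or-1 contribution of the comparison explicit.
    return ((c1 * 2217 + d1 * 5352 + 12000) >> 16) + int(d1 != 0)


print(op1(8, 16))
print(op4(100, 100))
```

Python's `>>` on negative integers is an arithmetic (flooring) shift, which is the behaviour most C compilers give for signed values, so negative c1/d1 can be compared too.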
lkclok you're limited in the range of regs for the 4-operand instructions (annoyingly) unless you use REMAP "offsets"11:35
markoswhat's the largest offset I can use?11:36
lkclis this with "spacing" on the predicate mask?  like.... r10=0b100010001000100011:36
lkcl811:36
lkclor i think 16 1 sec11:37
markosyes, predicate mask is actually the reverse: 0b000100010001000111:37
markosand VL=1611:37
lkclok then just shift it up11:38
markosah so that index is at 011:39
markosyup11:39
* lkcl thinking11:39
lkclyes11:39
lkclunfortunately madd* is only 1P otherwise you'd be able to use twin-predication11:39
lkclbut you need to get all the "starting" points onto even-numbered boundaries.11:40
lkclblech, you're doing this fully loop-unrolled, aren't you? :)11:41
lkclblech!11:41
markosyes, unfortunately11:41
markoswell, if11:41
markosif I can get the indices to increase as well I could convert it into a loop11:42
markosand set VL=4 and pred mask to 0b000111:42
markossv.svstep might be what I need11:43
lkclsv.svstep blats a series of indices into a thingy for you.11:44
lkclbut... you know what? you might try using REMAP Indexed-mode11:44
lkcland keep the predicate-mask to 0b0000_0000_0000_1111 (to do 4 operations only)11:44
lkcl1 sec11:44
lkclhttps://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/test_caller_svindex.py;hb=HEAD#l22611:46
markosI need to understand how this works11:47
lkclhttps://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/test_caller_setvl.py;h=542e652bb909b9d13eff3c2d752a96b859ffcbcc;hb=101e3a30f90f567eaa2b7f5f7fd2306a04bfcad4#l100711:48
lkclbasically you can create your *own* offsets.11:48
lkclanything you like.11:48
lkclit's misnamed "permutation" pretty much right across the board of computer science but of course it's not *actually* a mathematical permutation11:49
lkclhttps://libre-soc.org/openpower/sv/remap/#svindex11:50
lkclso11:50
lkclset up an area of registers you want to use as the offsets11:50
lkclbut only 4 of them11:50
markos4 registers as indices you mean11:51
lkclyes.11:51
lkclthe other 12 all zero11:51
lkcl(because you'll be using a mask of 0b0000_0000_0000_1111 so why bother)11:52
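
The "create your own offsets" idea can be pictured as a generic indexed gather. The sketch below is an assumption-heavy model, not the actual svindex semantics: `indexed_read` is a hypothetical helper, the register file is a plain list, and the offsets 0, 4, 8, 12 are invented for illustration.

```python
def indexed_read(regs, base, indices, mask, vl):
    """Generic gather model: enabled element i reads regs[base + indices[i]]."""
    out = []
    for i in range(vl):
        if (mask >> i) & 1:
            out.append(regs[base + indices[i]])
    return out


VL = 16
regs = list(range(64))               # hypothetical register file: regs[n] == n
indices = [0, 4, 8, 12] + [0] * 12   # only 4 offsets matter, the other 12 are zero
mask = 0b0000_0000_0000_1111         # do 4 operations only

print(indexed_read(regs, 10, indices, mask, VL))
```

Because only the low four mask bits are set, the twelve zero entries are never consulted, which is why they can be left at zero.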
markosok, this needs some more reading to understand how it works11:52
lkclit's like "vperm"11:52
markosI might ping you later11:52
markoss/might/will certainly/ :)11:52
lkcl:)11:52
lkclv3.0C 6.8.4 p25811:53
lkcllook that up first11:53
markosI know how vperm works, I need to get my head around how such behaviour fits in this example :)11:54
*** jn <jn!~quassel@user/jn/x-3390946> has quit IRC12:04
lkclahh :)12:06
lkclhmmm in theeoryyy... you could set up the whole lot as a transposed group12:08
lkclor use the built-in transpose capability (yx=1)12:08
lkclset all the indices to 0123...15 with iota (svstep)12:09
markosI think I'll try svstep first12:10
lkclbut then use svindex SVd=4, sk=0, yx=112:10
lkcljust remember to use setvl first before doing svindex because it uses VL to *compute* the size of the 2nd dimension12:11
lkcl(not enough space for all the parameters, to be able to specify the 2nd dimension as an operand, sigh)12:11
lkclbut by using that, you should be able to issue all 16 sv.madd*s in one instruction.12:13
*** jn <jn!~quassel@2a02:908:1065:960:20d:b9ff:fe49:15fc> has joined #libre-soc12:15
*** jn <jn!~quassel@user/jn/x-3390946> has joined #libre-soc12:15
*** octavius <octavius!~octavius@202.147.93.209.dyn.plus.net> has joined #libre-soc12:16
markosok, I figured out what svstep does, now svindex12:23
markosok, I'm just beginning to realize how powerful REMAP is12:27
markosstill incredibly complex, could use with a few hands-on examples on how to use it12:27
markoslkcl, this is the function I'm trying to convert: https://chromium.googlesource.com/webm/libvpx/+/refs/heads/main/vp8/encoder/dct.c#1512:33
markosrather trivial but that's why I chose it12:33
markosbut I don't see how it can fit the DCT remap, there is no triple loop12:33
lkclblegh. they loop-unrolled it12:47
lkcllines 21-35 are the 4 row-dcts12:48
lkcllines 38-51 are the 4 column-dcts12:48
lkcllines 22-25 are the inner-butterfly12:48
lkcllines 27-28 are the outer butterfly12:48
lkcland 30-31 likewise12:49
lkclprobably with some sqrt(2) divisions (in integer) thrown-in12:49
markosso can I use DCT remap on this or is too hackish an implementation?12:57
*** jn <jn!~quassel@user/jn/x-3390946> has quit IRC12:58
lkclyou can *replace* it with QTY 2x DCT remaps13:00
lkcllines 22-25 *are* the triple-loop inner-butterfly13:01
lkclbut you can't tell that because it's only 4 operations13:01
lkclfor a 2-wide DCT the triple-loop actually degenerates to a single operation13:01
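
The inner/outer butterfly structure described above can be sketched with plain sum/difference pairs. This is an illustration under assumptions: it uses the common textbook butterfly shape, the names a1, b1, c1, d1 follow the variables mentioned in the discussion (b1 is assumed), and the constant multiplies and scaling of a real DCT are left out entirely.

```python
def butterfly(a, b):
    """One butterfly: a sum and a difference of the same pair."""
    return a + b, a - b


def butterfly4(x):
    """Inner butterfly on the outer/inner pairs, then an outer one on the sums."""
    a1, d1 = butterfly(x[0], x[3])   # inner: outermost pair
    b1, c1 = butterfly(x[1], x[2])   # inner: innermost pair
    out0, out2 = butterfly(a1, b1)   # outer: combine the two sums
    return out0, out2, c1, d1


print(butterfly4([1, 2, 3, 4]))
```

With only four inputs the loop nest is invisible, which matches the remark that the unrolled code hides the triple loop.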
lkclbtw if it is easier, use sv.add which *can* do odd-numbered register numbers (uses EXTRA3)13:04
lkclfollowed by a mulli *=813:04
markosthat's what I'm doing, I've done the conversion up to the point of calculating op[1], op[3]13:06
lkclok. suggest doing similar for now - break it down into sv.add and sv.mulli13:07
lkclreally better off with butterfly-integer ops i feel - the ff* and fd* set - https://libre-soc.org/openpower/isa/svfparith/13:09
lkclbut it's, well, hoo they're expensive.  3-in 2-out13:10
ghostmansd[m]lkcl, actually predicates can be substituted13:47
ghostmansd[m]I added support for it13:47
ghostmansd[m]But yes, numbers are not allowed.13:47
ghostmansd[m]I'm not sure this works for all scenarios, though13:48
markosok, got around that by using mulld+add, however, I'm still having trouble with sv.srdi, I get unrecognized opcode14:13
markossv.srdi/dm=r10          *op, *op, 1214:13
markosghostmansd[m], ^^14:31
ghostmansd[m]Is srdi an alias?14:55
markosah yes14:55
markosequivalent to : rldicl ra,rs,64-n,n14:55
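
The alias expansion quoted above can be checked numerically: rotating left by 64-n and then clearing the top n bits gives the same result as a logical right shift by n. The helper names below (`rotl64`, `rldicl`, `srdi`) are a small Python model written for this check, assuming 64-bit unsigned register values and n in the range 1 to 63.

```python
MASK64 = (1 << 64) - 1


def rotl64(x, sh):
    """Rotate a 64-bit value left by sh bits."""
    sh %= 64
    return ((x << sh) | (x >> (64 - sh))) & MASK64


def rldicl(rs, sh, mb):
    """Rotate left by sh, then clear the mb most-significant bits."""
    return rotl64(rs, sh) & (MASK64 >> mb)


def srdi(rs, n):
    """srdi ra,rs,n modelled as rldicl ra,rs,64-n,n."""
    return rldicl(rs, 64 - n, n)


x = 0xFEDCBA9876543210
print(all(srdi(x, n) == x >> n for n in range(1, 64)))
```

So until the alias is supported, writing the rldicl form by hand should be a drop-in replacement for the shift.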
markosok, I confirm that the original works, still it would be nice to have the aliases work also15:09
ghostmansd[m]Yeah this is one of the things we need to do15:11
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc15:49
*** jn <jn!~quassel@2a02:908:1065:960:20d:b9ff:fe49:15fc> has joined #libre-soc16:00
*** jn <jn!~quassel@user/jn/x-3390946> has joined #libre-soc16:00
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC16:23
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.40.7> has joined #libre-soc16:24
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.40.7> has quit IRC17:08
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc17:11
markosis there a 16-bit multiplication instruction for Power or do I have to just AND the result?17:58
lkclno, only 32- and 64-17:58
lkclsigh elwidth-overrides, sigh, still to be done, sigh17:59
programmerjakemulli is technically 64x16-bit multiplication, probably not what you were thinking of though...18:00
markosno, I need to truncate the values to 16-bit as it skews computation and I'm getting different results18:04
lkclyep just AND.18:05
lkclandi18:06
markosright18:06
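
What the AND does to a wider product can be sketched as follows. The helper names `trunc_u16` and `to_s16` are hypothetical. One assumption worth flagging: masking alone yields a zero-extended 16-bit value, so if the data is meant to behave like a C `short`, a sign extension (extsh on Power) would presumably be needed as well.

```python
def trunc_u16(x):
    """Keep only the low 16 bits (what an AND with 0xFFFF gives)."""
    return x & 0xFFFF


def to_s16(x):
    """Reinterpret the low 16 bits as a signed 16-bit value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


product = 300 * 300                  # does not fit in 16 bits
print(trunc_u16(product), to_s16(product))
print(trunc_u16(-1), to_s16(-1))     # the two views differ for negative values
```

The second line shows where the two interpretations diverge, which is the kind of mismatch that can skew a comparison against the C reference.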
*** tplaten <tplaten!~isengaara@55d44a52.access.ecotel.net> has joined #libre-soc18:18
tplatenyesterday I had a look at the output path of litedram and gram, today I will have a look at the input path18:21
lkcltplaten, cool.18:37
*** octavius <octavius!~octavius@202.147.93.209.dyn.plus.net> has quit IRC18:47
tplatennext will be comparing the bitslip thing19:12
markosI must be misunderstanding something wrt predicates19:13
markosI have these 2 lines:19:13
markosori                     pred, pred, 0b110011001100110019:13
markossv.mulli/dm=r10         *t2, *t, 221719:13
markospred =r1019:13
tplatenI already found one difference, there is a DELAYG in litedram, with configurable cmd_delay19:13
markoswhat I expect is to have t2 registers as follows: 0, 0, (*t+2)* 2217, (*t+3)*2217, 0, 0, (*t+6)*2217, (*t+7)*2217, etc19:14
markosfor VL=1619:15
markosbut I also get the elements for pos=0, 4, 8, etc19:15
lkclthat's ORing *into* the old value of pred19:30
lkclyou probably want ori pred, 0, 0b110011001100110019:30
lkclcheck a dump of what pred is.19:30
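
The effect of OR-ing into a stale predicate value can be reproduced in a few lines. This is a sketch with assumptions: the stale value 0b0001000100010001 is taken to be the mask left over from the earlier code in this log, and `enabled` is a hypothetical helper listing which element indices a mask switches on. The "fresh" case simply starts from a zero value; whether the middle operand of ori reads as a literal zero or as register r0 is worth checking against the ISA.

```python
def enabled(mask, vl=16):
    """Return the element indices whose predicate bit is set."""
    return [i for i in range(vl) if (mask >> i) & 1]


stale = 0b0001000100010001       # assumed leftover mask from earlier code
new_bits = 0b1100110011001100    # the mask the second block wants

accumulated = stale | new_bits   # models: ori pred, pred, 0b1100110011001100
fresh = 0 | new_bits             # models starting from a zero value

print(enabled(accumulated))      # includes 0, 4, 8, 12 as well
print(enabled(fresh))            # only elements 2, 3, 6, 7, ...
```

The extra indices 0, 4, 8, 12 in the accumulated case are exactly the unexpected elements reported above.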
lkclhooraaay after several weeks i have a first pack/unpack unit test running19:32
tplatenin gram the whole bitslip thing seems to be missing19:34
markoslkcl, gaaaaah, it's embarrassing when you're right all the time :D19:34
markosit's been bothering me for the past hour and I was trying to figure out where have I gotten it wrong with all those instructions!19:37
lkclmarkos, the penalty is: i don't do much else :)19:41
tplatenIn gram/common.py I found class BitSlip(Elaboratable), the class is not used anywhere. So I add this to ecp5ddrphy.py19:41
*** tplaten <tplaten!~isengaara@55d44a52.access.ecotel.net> has quit IRC20:09
markosghostmansd[m], lkcl just a curiosity, -mregnames is an optional argument, but it's required for predicates to work, should it not be allowed to use numeric values instead of r3/r10/r31, etc?20:45
ghostmansd[m]Not really. How would you encode `1<<r3` without marking it as a register? `1<<3` makes a totally different meaning to me, nothing related to predicates.20:46
ghostmansd[m]But, well, if we can come up with names different to registers, I can support these.20:46
ghostmansd[m]All other uses for registers assume -mregnames, other than disassembly.20:47
markosghostmansd[m], right20:47
markosis it possible to enable it by default for -mlibresoc then?20:48
ghostmansd[m]Assembly needs it. To me, frankly speaking, allowing `add 0,1,2` is a bigger mistake than forcing -mregnames to allow `add r0,r1,r2`.20:48
markosI don't disagree on this, but people don't always enable it20:48
ghostmansd[m]Yeah it's possible. I kept it for our assembly, which doesn't support register names. :-)20:48
ghostmansd[m]See, that's kinda curriculum uitiosum.20:49
markosfair enough, just a note for future discussion :)20:49
ghostmansd[m]Hm, curriculus vitiosus20:49
markoswhen the bugs arrive from a horde of angry developers that cannot compile their assembly :D20:49
ghostmansd[m]It's been a while with Latin20:49
markosnever studied tbh, I've had my fill of ancient languages with ancient greek :)20:50
ghostmansd[m]Frankly, I'd enforce explicit register names everywhere.20:50
ghostmansd[m]That I had too :-)20:50
ghostmansd[m]Are you from Greece?20:51
ghostmansd[m]Because there's virtually no other reason to learn it20:51
ghostmansd[m]Other than being from Greece or studying the classical philology20:51
ghostmansd[m]I had the latter20:51
ghostmansd[m]And apparently you had the former, eh?20:52
markosyes, but not really20:52
markosancient greek is mandatory for all children right up to the point where they have to choose a direction for their studies20:52
markosso, around 16?20:53
ghostmansd[m]Wow, really?20:53
ghostmansd[m]This is really cool20:53
markosyes, that doesn't mean everyone can read Plato, because the way it's taught is pretty sterile20:53
ghostmansd[m]Well to me ancient Greek was one of the most difficult languages I ever encountered20:53
markosit is difficult even for us20:53
ghostmansd[m]Well likely you study some kind of koine20:53
ghostmansd[m]Xenophon perhaps too20:53
ghostmansd[m]We started with Anabasis20:53
markosXenophon, Plato, Lysias, Herodotus20:54
ghostmansd[m]This is kinda "classic of classics"20:54
markosand some Homer but that one is really hard20:54
ghostmansd[m]Homer is extremely difficult20:54
markosit's archaic20:54
ghostmansd[m]Hell they even had digamma these days20:54
ghostmansd[m]I mean, if you try to keep the rhyme20:54
markosyes, I have the book of Iliad and I'm reading it and it's taking me ages to finish a single page20:55
ghostmansd[m]You know in his times they had it20:55
markoswith the translation next even :D20:55
ghostmansd[m]Well Odyssea is simpler20:55
markostrue20:55
ghostmansd[m]Anthra moi ennepe Mousa20:55
markosPolytropon os mala polle20:55
ghostmansd[m]Planthe epei Troies20:55
markosor something similar, my memory is not as good20:55
markoshahaha20:55
ghostmansd[m]No that was correct20:56
markosok you probably know more than I do20:56
ghostmansd[m]You just forgot "ton" before polytropon :-)20:56
ghostmansd[m]But hey, still correct20:56
ghostmansd[m]I still recall the beginning of Anabasis20:56
programmerjakemy dad learned ancient greek and hebrew as part of school for becoming a pastor because that's what the bible was mostly originally written in...so there are other reasons to learn ancient greek20:57
ghostmansd[m]Dareiou kai Parysatidai gignontai paides dyo, presbyteros men Artaxerxes, neoteros ho de Kyros20:57
markosthankfully, the bible has much simpler form of ancient greek, hellenistic20:57
markoswhich is much closer to the modern syntax20:57
markosand I can read it without much effort20:58
ghostmansd[m]Yeah this is kinda even more stable than koine20:58
markosI have a whole bookshelf of such books20:58
lkclwe studied both at stonyhurst college (roman catholic jesuit boarding school) but honestly i sucked at both of them20:58
markosI am grateful for learning even that little ancient greek that I know, because they give you a strong basis for etymology20:58
ghostmansd[m]But yeah, Greek is hard, no joking. All these time forms.20:59
ghostmansd[m]I literally hated aoristos and perfectum forms.20:59
ghostmansd[m]IIRC we had to learn 6 forms for some verbs.20:59
ghostmansd[m]I mean, 6 initial forms.20:59
markosaoristos is the coolest form tbh, or should I say was :)20:59
ghostmansd[m]Not even mentioning persons and plurals.21:00
programmerjakewell, imho learning to read/write chinese/japanese is waay harder...21:00
markosyes, it becomes impossible to remember21:00
ghostmansd[m]Well we had aoristos in Slavic languages too.21:00
ghostmansd[m]This is really cool form.21:00
markosprogrammerjake, japanese is a very simple language, the writing is hard, but there is literally one single form for a verb21:00
markosit's hard because you have to learn a few thousand symbols21:00
programmerjakehence why I said reading/writing, not speaking21:01
markosbut it's pretty simple in terms of number of forms, want to transform a phrase into a question? add 'ka' at the end21:01
ghostmansd[m]Well ancient Greek helps a lot with writing modern Greek correctly.21:01
ghostmansd[m]All these sounds you converted to sound like "i"...21:02
markosI'd say japanese is the closest thing to state machine. Person + verb + time , there you go21:02
markosa single word for each :)21:02
ghostmansd[m]All these oi, ei, i, H, y...21:02
programmerjakewell, if you think japanese is simple, try learning formal japanese...i know a bit due to watching too much anime, but formal japanese is nearly incomprehensible...21:02
markosprogrammerjake, not saying it's easy, far from it21:03
markosbut it's easier than eg. ancient greek or even german21:03
markosexcluding the writing difficulties21:03
markosI've actually just enrolled to start japanese in a few days21:03
programmerjakefor me, chinese is even harder, because my brain is programmed to mostly ignore tone21:04
markosprogrammerjake, no argument there :)21:04
markoswhat is it, 70k glyphs?21:04
markosand 7 variations of 's'?21:04
markosmy oldest son wants to learn chinese, and my youngest russian21:05
programmerjakeand chinese has distinctions that I think of as exactly the same sound...qi vs chi and xi vs shi...21:05
markosI chose japanese and my wife arabic, with english we have covered pretty much the world population :D21:05
programmerjakewell, if you just choose spanish that covers like half the world anyway...21:06
markosone of the kids wants to learn spanish eventually, but they're simple to learn, at least just to communicate21:06
ghostmansd[m]Russian is difficult too21:07
markosghostmansd[m], but it's still easier than chinese :)21:08
markoswhen we see something incomprehensible, we say 'it looks chinese to me' - rest of the world says 'it looks greek to me' but obviously we can't use that :)21:08
lkclyyeah the spanish version of Fawlty Towers had Manuel be portuguese... :)21:11
markoshahaha21:11
markos'never mention the War!'21:11
jn"Das kommt mir Spanisch vor" -- when germans don't understand a text21:13
ghostmansd[m]markos, well we also say "Chinese", but this is not related to knowledge of Greek :-)21:16
*** octavius <octavius!~octavius@202.147.93.209.dyn.plus.net> has joined #libre-soc21:24
