Thursday, 2022-09-08

*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has quit IRC00:17
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc00:22
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has quit IRC01:54
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc02:09
sadoon[m]<programmerjake> "in theory mounting a ext4..." <- True, I'm constructing a disk image which should work fine05:06
ghostmansd[m]lkcl, could you, please, remind, why do we "split" registers by type (s and d) in remap?06:38
ghostmansd[m]The records are in form s:X;b:Y.06:39
*** ghostmansd <ghostmansd!> has joined #libre-soc06:40
*** ghostmansd <ghostmansd!> has quit IRC07:06
*** ghostmansd <ghostmansd!> has joined #libre-soc08:14
ghostmansdlkcl, I've started checking svshape208:59
programmerjakelkcl: discovered we forgot fp min/max, see #92308:59
ghostmansdMy first remark is, offs is a bad name, it's way too broad and likely will interfere with something in binutils.09:00
programmerjakei'm going to try to get to bed sorta on-time, so gn all09:00
ghostmansdHow about SVo or SVO?09:00
ghostmansdI'll use this form in binutils for now.09:05
ghostmansdIt's a pity that bit 23 for yx is occupied...09:07
ghostmansdWe split XO field. This is not something traditionally present; I only managed to find XX3 instructions from VSX which do it. I suspect this was already discussed and this is the only option, but still: could this be contiguous?09:26
ghostmansdSVM2 has this form: |0     |6     |10|11      |16    |21 |24|25 |26    |31  |09:34
ghostmansdThere's nothing though which uses bit 31. Is it reserved for future extensions?09:34
programmerjakesetvl uses that for Rc iirc...maybe it's reserved for consistency?09:36
programmerjakei'd have to look09:36
ghostmansdPerhaps. I think it's just adopted from some other insn.09:43
ghostmansdOr reserved for future extension purposes.09:43
ghostmansdWe also lack all bitmanip instructions from opcode 22. Should these be added, too?09:50
ghostmansdAlso, do we have some way to identify whether the 32-bit instruction is present only in SVP64? I guess it's "unofficial" field, right?09:53
*** markos <markos!> has joined #libre-soc11:45
lkclghostmansd, "s:" means "source" and "d:" means "destination".  source will be in1/2/3 or CRin, dest will be out1/out2 or CRout *or* CR0/CR1 (for Rc=1)11:59
lkclprogrammerjake, yep indexed/offset makes sense for DSP/AV level12:00
ghostmansd[m]I know what these mean :-)12:01
ghostmansd[m]What do these affect?12:01
lkclahh :)12:01
ghostmansd[m]I always traverse all operands12:02
ghostmansd[m]Regardless of in/out12:02
lkclok, so this needs some explanation of Twin-Predication12:02
lkclyou know about predicate masks in other Vector/SIMD ISAs?12:02
ghostmansd[m]Not at all12:02
lkclso you have a sequence of operations (on elements)12:02
lkclyou want to skip some of them12:02
lkclhow would you do that?12:02
ghostmansd[m]That's why I asked for high level introduction to common vector ISA concepts12:02
ghostmansd[m]What do you mean by skip?12:03
ghostmansd[m]Let's perhaps choose some insn, say add12:03
lkclnot write out element 5 for example12:03
lkclsv.add/pm=r3 *RT,*RA,*RB12:03
lkclwhere VL=812:03
lkclat *runtime* you know that you *do not* want to write out to element 5.12:03
ghostmansd[m]And let's literally translate what we mean here?12:04
lkclyou want it left alone12:04
ghostmansd[m]What does this insn do?12:04
ghostmansd[m]Add all elements from vector RA to vector RB and put to RT?12:04
lkclwhen VL=8 it *conditionally* performs 8 adds12:04
ghostmansd[m]So we have vec of 8 elements12:04
lkclchecking *each bit* of r3 to decide whether the *element* operation is allowed to be carried out12:05
lkclfor i in range(VL):12:05
lkcl   if r3 & (1<<i): GPR(RT+i) = GPR(RA+i) + GPR(RB+i)12:05
lkclthe crucial bit is that *conditional* check12:05
ghostmansd[m]Aha, good12:05
lkclin SIMD ISAs this is termed "PredicatedSIMD"12:06
ghostmansd[m]So that's mask what to check12:06
lkclit's absolutely standard fare for pretty much every single 3D GPU on the planet.12:06
ghostmansd[m]If bit X is set, do op on bit X12:06
ghostmansd[m]It's not like that I used it :-)12:06
lkclif bit X is set do op on *element* X12:06
ghostmansd[m]Seriously, I lack fundamental concepts12:06
ghostmansd[m]Oops, yes, element X12:06
ghostmansd[m]Sorry, I meant this12:07
lkclnot a problem, it's pretty easy to explain.12:07
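The single-predicate loop described above can be sketched in plain Python. This is only an illustration, not the simulator's code: the register file is modelled as a Python list, 64-bit wraparound is ignored, and the function name `predicated_add` and the register numbers chosen for the vectors are assumptions made for the example.

```python
def predicated_add(gpr, rt, ra, rb, mask, vl):
    """Element-wise add over vl elements, skipping elements whose mask bit is 0."""
    for i in range(vl):
        if mask & (1 << i):  # bit i of the predicate enables element i
            gpr[rt + i] = gpr[ra + i] + gpr[rb + i]
    return gpr


# Hypothetical register layout: RA at r8..r15, RB at r16..r23, RT at r24..r31.
gpr = [0] * 64
for i in range(8):
    gpr[8 + i] = i
    gpr[16 + i] = 10 * i
    gpr[24 + i] = -1  # pre-fill the destination so a skipped element is visible

mask = 0xFF & ~(1 << 5)  # every element enabled except element 5
predicated_add(gpr, rt=24, ra=8, rb=16, mask=mask, vl=8)
print(gpr[24:32])  # element 5 keeps its old value of -1
```

With VL=8 and bit 5 of the mask clear, seven adds are performed and the destination element 5 is left alone.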
lkclso the next step up from that - which you will *not* find in any other ISA - is: Twin-Predication12:07
ghostmansd[m]Aha, OK12:07
lkclfor that you have now effectively *two* for-loops, twinned/entwined inside each other12:07
lkclone predicate mask is for the SOURCE operands12:08
lkclthe other predicate mask is for the DESTINATION operands12:08
lkcland they *both* can "skip", completely independently12:08
lkcllet's say that we have sv.add/sm=r3/dm=r10 *RT,*RA,*RB12:08
lkclthat VL=412:08
lkclthat r3=0b011112:09
lkcland r10=0b110112:09
lkclthe *SOURCE* indices will be12:09
lkclhang on12:09
lkclthe for-loop to VL will go12:09
lkcl[0 1 2 3]12:09
lkclthe SOURCE indices will be12:09
lkcl[0 1 2 X] where X is "skip"12:10
lkclthe DESTINATION indices will be12:10
lkcl[0 X 2 3] (again X is "skip")12:10
lkclwhich thunks down to12:10
lkclsrc: [0 1 2]12:10
lkcldst: [0 2 3]12:10
lkcland thus you will have the following operations performed12:10
lkclADD r0,r0,r012:11
lkclADD r2,r1,r112:11
lkclADD r3,r2,r212:11
ghostmansd[m]That's really cool12:11
lkclit basically combined "VGATHER" and "VSCATTER" into one12:12
lkclthe src has been "GATHERED"12:12
ghostmansd[m]Funny though that someone who has that little knowledge about vectors does incorporate their support to binutils, eh?12:12
lkclthe dst has been "SCATTERED"12:12
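The gather/scatter pairing in this example can be reproduced with a short Python sketch. It assumes the usual operand order where the first operand (RT) is the destination, and that all three vectors start at register 0 as in the example; the function name `twin_predicated_ops` is made up for illustration.

```python
def twin_predicated_ops(src_mask, dst_mask, vl, rt=0, ra=0, rb=0):
    """List the scalar ADDs that a twin-predicated vector add breaks down into."""
    # "Gather": keep only the source element indices whose mask bit is set.
    src = [i for i in range(vl) if src_mask & (1 << i)]
    # "Scatter": keep only the destination element indices whose mask bit is set.
    dst = [i for i in range(vl) if dst_mask & (1 << i)]
    # Pair them up in order; the shorter list decides how many ops are issued.
    return [f"ADD r{rt + d},r{ra + s},r{rb + s}" for s, d in zip(src, dst)]


ops = twin_predicated_ops(src_mask=0b0111, dst_mask=0b1101, vl=4)
for op in ops:
    print(op)
```

The source indices come out as [0 1 2] and the destination indices as [0 2 3], so source element 1 lands in destination element 2, and source element 2 in destination element 3.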
lkcluh-huhn :)12:12
lkclwell it's surprisingly little actually needed, because it's just data-manipulation12:13
lkclok so are you with me so far?12:13
ghostmansd[m]Ok, so we basically need these for predicates?12:13
ghostmansd[m]Because, well, remap doesn't use it12:13
ghostmansd[m]I mean src/dst12:13
lkclfor supporting the src-indexing and dst-indexing in hardware and in Simulators as a Finite State Machine12:13
lkclyou need *TWO* separate indices12:13
lkclone called srcstep12:14
lkclthe other called dststep12:14
lkclyou can't possibly do Twin Predication without keeping track separately of the amount that src is incrementing (or skipping) - where src is in the for-loop12:14
lkclfrom dst12:15
lkclfinally to answer your question12:15
lkcld: refers to the need to use *dststep*12:15
lkcland s: refers to the need to use *srcstep*12:15
lkclin hardware.12:15
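A rough way to picture the two separate counters is a small state machine that advances srcstep and dststep independently, each skipping over its own masked-out elements. The sketch below is an assumption-laden illustration in Python (the generator name `twin_steps` is invented, and real hardware or the simulator may order the steps differently); it only shows why one shared index would not be enough.

```python
def twin_steps(src_mask, dst_mask, vl):
    """Yield (srcstep, dststep) pairs, each counter skipping its own masked-out bits."""
    srcstep = dststep = 0
    while True:
        # Advance srcstep past source elements that are predicated out.
        while srcstep < vl and not (src_mask >> srcstep) & 1:
            srcstep += 1
        # Advance dststep past destination elements that are predicated out.
        while dststep < vl and not (dst_mask >> dststep) & 1:
            dststep += 1
        if srcstep >= vl or dststep >= vl:
            return  # either side running out ends the loop
        yield srcstep, dststep
        srcstep += 1
        dststep += 1


steps = list(twin_steps(src_mask=0b0111, dst_mask=0b1101, vl=4))
print(steps)
```

An operand tagged s: would be indexed with the first value of each pair, and an operand tagged d: with the second.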
lkclwell... ahh... you _say_ REMAP doesn't need/use predication... but... ah... :)12:16
lkclit's emerged very recently that the different REMAP types might need it. but it will be in SPRs.12:16
lkcli take it you are presently running around screaming as your brain has melted12:18
*** ghostmansd <ghostmansd!> has quit IRC12:27
ghostmansd[m]Aaaaah no please not yet another remap journey12:37
ghostmansd[m]But other than that... it's really cool12:37
*** lx0 <lx0!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc12:42
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has quit IRC12:44
*** ghostmansd[m] <ghostmansd[m]!> has quit IRC12:50
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has joined #libre-soc12:53
lkclbrain-overload is a common syndrome with SV. including me.12:54
lkclit's only sheer bloody-minded persistence of development/review of accumulated concepts that's got this far12:55
lkclso, ghostmansd[m], transcendentals is probably the most sensible thing to look at.12:55
lkclit's pretty boring / by-the-numbers12:55
lkcl(which is a refreshing change)12:56
ghostmansd[m]Ok will check13:03
ghostmansd[m]lkcl could you please also check this thread too? I think I will complete this today.13:04
lkclSVo is great.13:05
ghostmansd[m]This task to me is like hanging to an investigators13:05
ghostmansd[m]Ok I'll update it when I get to it13:09
ghostmansd[m]Don't waste time on this :-)13:10
lkclif you can add these assertions into binutils-svshape13:10
lkcl    assert SVrm not in [0b1000, 0b1001], \13:10
lkclthen that's one task closed and an RFP can be submitted13:10
ghostmansd[m]Sure, this will need a custom callback IIRC13:13
ghostmansd[m]I will get to this in the evening13:13
*** lx0 <lx0!~lxo@gateway/tor-sasl/lxo> has quit IRC14:26
*** lx0 <lx0!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc14:27
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has quit IRC15:15
*** ghostmansd[m] <ghostmansd[m]!> has joined #libre-soc15:16
*** ghostmansd[m] <ghostmansd[m]!> has quit IRC15:19
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has joined #libre-soc15:19
*** tplaten <tplaten!> has joined #libre-soc16:00
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has quit IRC16:24
*** ghostmansd[m] <ghostmansd[m]!> has joined #libre-soc16:27
*** ghostmansd <ghostmansd!> has joined #libre-soc16:43
ghostmansd[m]lkcl, I'm confused. Task 911 is about svshape2, but there's no SVrm field at all. Do you mean another task?16:45
*** octavius <octavius!> has joined #libre-soc17:17
*** markos <markos!> has quit IRC17:28
*** markos <markos!> has joined #libre-soc17:41
lkclghostmansd, sorry - svshape2 sits *inside* the two reserved values of svshape's SVRM=[0b1000, 0b1001].17:43
lkcltherefore it is *svshape* that must be checked to stop 0b1000 and 0b1001 from being allowed17:43
lkcl(and svshape2 must set the XO to 0b100---1somethingsomething)17:44
ghostmansdsvshape needs changes17:52
ghostmansdHoly fuck how I hate MSB017:52
ghostmansdSooo annoying to count these bits and at the same time convert ranges to mask17:53
ghostmansdYou know what? I won't do it in the future. This is already a sufficient reason to generate the code.17:53
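Converting an MSB0 bit range to a mask is mechanical enough to hand to a helper, which is one argument for generating the tables. A minimal sketch in Python, assuming 32-bit instruction words and inclusive MSB0 ranges (bit 0 is the most significant bit); the name `msb0_mask` is hypothetical and not taken from binutils or openpower-isa.

```python
def msb0_mask(start, end, width=32):
    """Mask covering MSB0 bits start..end inclusive in a width-bit word."""
    if not 0 <= start <= end < width:
        raise ValueError(f"bad MSB0 range {start}..{end} for width {width}")
    nbits = end - start + 1
    shift = width - 1 - end  # MSB0 bit `end` is LSB0 bit (width - 1 - end)
    return ((1 << nbits) - 1) << shift


print(hex(msb0_mask(0, 5)))    # primary opcode field
print(hex(msb0_mask(21, 30)))  # a 10-bit extended opcode field
print(hex(msb0_mask(6, 8)))    # a 3-bit field at MSB0 bits 6..8
```

A split field would simply be the OR of two such masks.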
ghostmansdThe split XO makes it particularly difficult.18:00
ghostmansdOnly if I put the svshape2 opcode before svshape it works in disassembly.18:01
ghostmansdI will keep it like this and add checks in callbacks.18:02
ghostmansdJust to clarify. The field SVrm is present only in svshape for now, and I will ban 0b1000 and 0b1001 for this field entirely.18:03
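The ban on the two values can be pictured as a tiny operand check. The sketch below is Python rather than the C callback binutils would actually need, and it assumes SVrm is a 4-bit field; the names `RESERVED_SVRM` and `check_svshape_svrm` are invented for the example.

```python
# Values of svshape's SVrm that are taken over by svshape2 (from the discussion above).
RESERVED_SVRM = (0b1000, 0b1001)


def check_svshape_svrm(svrm):
    """Return svrm unchanged if it is a legal svshape SVrm value, else raise."""
    if not 0 <= svrm <= 0b1111:  # assumption: SVrm is 4 bits wide
        raise ValueError(f"SVrm out of range: {svrm}")
    if svrm in RESERVED_SVRM:
        raise ValueError(f"SVrm {svrm:#06b} is reserved for svshape2")
    return svrm


for value in (0b0000, 0b0111, 0b1000, 0b1001, 0b1010):
    try:
        check_svshape_svrm(value)
        print(f"{value:#06b} ok")
    except ValueError as err:
        print(f"{value:#06b} rejected: {err}")
```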
ghostmansdIn the unlikely scenario that we have SVrm in another instruction where these values are permitted, we'll have to introduce another field, preferably named differently. I already had to rename yx field to yx10, due to the fact binutils already has a yx field with a different description.18:05
ghostmansdLuckily we're not the first to employ this trick.18:06
ghostmansdOK I think this is done, I'll submit the patch. It's tricky a bit due to split XO field and these 0b1000/0b1001 overlaps, but otherwise is pretty straightforward.18:36
lkcli have no problem renaming yx to something else (xy?)18:38
lkcland there was that other one, SVo18:38
ghostmansdCalled it yx1018:38
* lkcl head spinning writing an rfc18:39
ghostmansdSVo is renamed, I'm almost done with the patch for openpower-isa, but I'll push it to standalone branch to make sure nothing's broken.18:39
ghostmansdFor yx10, we're not the only ones playing such dirty games with idiotic names, so I think we're safe.18:40
lkclmmm i'd just rather not have them18:40
ghostmansdHm. Luke lists also inv field.18:41
ghostmansdSo the range for SVo is not 6..9, but rather 6..8.18:41
ghostmansdI think fields.text is the higher priority and the docs are simply not updated yet. Is this assumption correct?18:42
ghostmansdHow about calling yx for SVM2 SVM2d?18:44
ghostmansdOr SVM2yx18:44
ghostmansdChose SVM2yx for now18:49
ghostmansdlkcl, openpower-isa branch svshape2-fixup19:02
*** ghostmansd <ghostmansd!> has quit IRC19:07
lkcla bit long19:10
*** octavius <octavius!> has quit IRC19:12
*** ghostmansd <ghostmansd!> has joined #libre-soc19:38
*** octavius <octavius!> has joined #libre-soc19:52
ghostmansd[m]Ideas are welcome :-)20:06
ghostmansd[m]I've already submitted to binutils, but I have a feeling that I'll have to update the patch anyway20:07
ghostmansd[m]And, even if not, should the good name come to our heads, I'll update anyway20:07
*** ghostmansd <ghostmansd!> has quit IRC21:05
*** octavius <octavius!> has quit IRC21:11
*** ghostmansd <ghostmansd!> has joined #libre-soc22:09
*** octavius <octavius!> has joined #libre-soc22:15
lkclSVyx and SVo is good.  btw you forgot to alter the pseudocode. variables used that don't exist: baaaad22:26
lkcltest_caller_*.py underway, 15min22:26
ghostmansdAh yes22:43
ghostmansdCR_index = 7-(BA>>2)      # top 3 bits but BE22:43
ghostmansdbit_index = 3-(BA & 0b11) # low 2 bits but BE22:43
ghostmansdCR_reg = CR{CR_index}     # get the CR22:43
ghostmansdI cannot find equivalent for this in src/openpower/decoder/power_svp64_extra.py22:44
ghostmansdIt seems that by the time we enter SVP64CRExtra spec is already 3-bit22:44
ghostmansdOn the other hand, I think the pseudocode below is complete22:45
ghostmansdreturn ((BA >> 2)<<6) | # hi 3 bits shifted up22:45
ghostmansd          (spec[1:2]<<4) | # to make room for these22:45
ghostmansd          (BA & 0b11)      # CR_bit on the end22:45
ghostmansd# scalar constructs "00 spec[1:2] BA[0:4]"22:45
ghostmansdreturn (spec[1:2] << 5) | BA22:45
ghostmansdI'll use that one22:46
ghostmansdA question on this: how to check? Some particular insn you have in mind?22:46
ghostmansdlkcl ^22:46
lkclghostmansd, 1 sec let me check.... probably isel?23:27
lkclisel RT,RA,RB,BC23:28
lkclyep, that's BC23:28
lkclBC there is *5* bit, remember, though23:28
lkclso for that you take the top 5 bits...23:29
ghostmansdI got that from pseudocode :-)23:29
lkcl1 sec... on svp64extra...23:29
lkclah right - i cheated :)23:29
lkclbecause they're identical (BA 5-bit and BFA 3-bit) if you drop the bottom 2 bits23:30
lkclcan use SVP64CRExtra by chopping off 2-bits from BA, running the top 3-bits through SVP64CRExtra, then putting the bottom 2-bits *back* on the LSBs23:30
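The chop-and-reattach trick can be sketched as follows. This is a hedged illustration only: `remap_cr_field` is a stand-in callback for whatever the 3-bit CR-field path does, not the real SVP64CRExtra interface, and `remap_ba` is a name made up for the example.

```python
def remap_ba(ba, remap_cr_field):
    """Remap a 5-bit BA-style operand by reusing a CR-field (BFA-style) remapper."""
    cr_field = ba >> 2  # top bits: which CR field
    bit = ba & 0b11     # bottom 2 bits: which bit inside that CR field
    new_field = remap_cr_field(cr_field)  # run only the field number through the remapper
    return (new_field << 2) | bit         # put the 2 LSBs back on


# With an identity remapper the operand is unchanged.
print(remap_ba(31, lambda field: field))
# A remapper that moves the field to CR8 keeps the same bit within the field.
print(remap_ba(31, lambda field: 8))
```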
lkclall good on svshape2-fixup btw (rebased into master already)23:32
ghostmansdhow long ago this has been tested?23:34
ghostmansd  File "/home/ghostmansd/src/openpower-isa/src/openpower/sv/trans/", line 657, in crf_extra23:34
ghostmansd    (rname, str(extras[extra_idx]))23:34
ghostmansdNameError: name 'rname' is not define23:34
ghostmansdtried pysvp64asm on this code23:34
ghostmansdsv.isel 4, 1, 2, 3123:34
ghostmansdsv.isel 4, 1, 2, 6323:34
ghostmansdsv.isel 4, 1, 2, *3123:34
ghostmansdsv.isel 4, 1, 2, *6323:34
lkcl1 sec23:35
ghostmansdaha, I know23:35
ghostmansdI'll push the fix23:35
ghostmansdmissing arg23:36
lkclah you see what happened, there?23:36
lkcl31 has LSBs that cannot fit into the numbering of EXTRA2/323:36
lkcl13.3 and 13.423:37
lkclok sv.isel is almost certainly an EXTRA2, because it is 4-operand23:37
lkclso that means 13.4 CR EXTRA223:38
ghostmansdOK, and something extra3?23:38
ghostmansd(these are the same with GPR/FPR though, the code is shared)23:38
ghostmansdor, well, the same23:38
lkcland the scalar numbering can *only* go up to CR1523:38
lkclyou have to chop off the bottom 2 LSBs of the number 3123:39
lkcl31>>2 = 7!23:39
lkclso that should be perfectly fine23:39
lkclgood .long 0x05400000; isel 4, 1, 2, 31 # sv.isel 4,1,2,3123:39
lkclgood .long 0x05400040; isel 4, 1, 2, 31 # sv.isel 4,1,2,6323:40
ghostmansdI assume you already pushed it?23:40
lkclbad sv.isel 4, 1, 2, *3123:40
lkcllet's work it out23:40
ghostmansdI mean pysvp64asm fix23:40
lkclno, haven't changed anything.23:40
ghostmansdif you managed to obtain the assembly23:40
lkclok so that is a 31.23:41
lkclshifted down by 2 is 7.23:41
lkcli know23:41
lkclNote that Vectors may only start from CR0, CR8, CR16, CR24, CR32...23:41
lkclis CR7 in that list?23:41
lkclno it is not23:41
lkclso this is a correct assertion.23:41
ghostmansdah OK the same as with regs23:41
lkclyou need to use CR823:41
lkclyes but lovingly-confusingly you have to shift CR8 *back* up by 2, then put the 2 LSBs on (you picked 0b11)23:42
lkclso... 3523:42
lkclif you try sv.isel 4,1,2,*35 it should work23:42
lkcl.long 0x054000c0; isel 4, 1, 2, 3 # sv.isel 4,1,2,*3523:42
lkclnow let's try 63 - same logic23:42
ghostmansdhow do you manage to get asm working??23:43
lkcl63>>2 = 1523:43
lkclby avoiding the bug :)23:43
lkcl1 sec23:43
ghostmansdah OK23:43
ghostmansdI pushed the fix to master23:44
lkclok so let's go back to sv.isel 4, 1, 2, *6323:46
lkcl15 is not in the list CR0,CR8,CR16,....23:46
lkclso let's pick CR16 instead23:46
lkclshift that up by 223:46
lkcland OR with 0b1123:46
lkcllet's try that23:46
lkcl.long 0x05400080; isel 4, 1, 2, 7 # sv.isel 4, 1, 2, *6723:47
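The arithmetic behind *35 and *67 can be checked with a few lines of Python. The sketch takes the rule quoted above at face value (vector CR fields start only at CR0, CR8, CR16, ...) and rounds up to the next allowed start, which is what was done by hand for 31 and 63; the helper names are hypothetical.

```python
def split_operand(operand):
    """Split a BA/BC-style operand into (CR field number, bit within the field)."""
    return operand >> 2, operand & 0b11


def is_valid_vector_cr(cr_field):
    """Assumption from the discussion: vector CR fields start at multiples of 8."""
    return cr_field % 8 == 0


def next_vector_operand(operand):
    """Round the CR field up to the next allowed vector start, keeping the bit number."""
    cr_field, bit = split_operand(operand)
    rounded = (cr_field + 7) // 8 * 8
    return (rounded << 2) | bit


for operand in (31, 63):
    cr_field, bit = split_operand(operand)
    print(operand, "-> CR", cr_field, "bit", bit,
          "valid" if is_valid_vector_cr(cr_field) else "invalid",
          "-> try", next_vector_operand(operand))
```

31 splits into CR7 bit 3, which is not an allowed vector start, so CR8 bit 3 gives 35; 63 splits into CR15 bit 3, and CR16 bit 3 gives 67.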
ghostmansdthank you!23:47
ghostmansdI'll try some code I have on these23:47
ghostmansdfuck insndb can't find isel23:50
ghostmansdor well, strictly speaking, match the opcode23:51
ghostmansdisel 0x54000007c8117de FieldsOpcode(value=0x7c00001e, mask=0xfc0007fe)23:52
ghostmansdthe first hex is what we want to match, then goes what we had for isel23:52
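The failed lookup can be reproduced by hand with the usual `(insn & mask) == value` test. The Python sketch below uses only the numbers printed above; the second mask, which keeps just MSB0 bits 26..30 for the extended opcode, is an assumption added for comparison, and whether that is the actual cause of the insndb mismatch is not established in the log.

```python
def opcode_matches(insn, value, mask):
    """Standard opcode match: the masked instruction bits must equal the opcode value."""
    return (insn & mask) == value


word = 0x54000007c8117de     # 64-bit value as logged: prefix followed by suffix
prefix = word >> 32          # upper 32 bits
suffix = word & 0xFFFFFFFF   # lower 32 bits, the isel word itself

print(hex(prefix), hex(suffix))
print(hex(suffix & 0xFC0007FE))  # what the logged mask leaves behind
print(opcode_matches(suffix, 0x7C00001E, 0xFC0007FE))  # logged value/mask pair
print(opcode_matches(suffix, 0x7C00001E, 0xFC00003E))  # assumed 5-bit XO mask
```

With the logged mask the masked word still contains operand bits, so it cannot equal the opcode value; with the narrower mask the comparison succeeds.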
ghostmansdOK I think I'll sort it out tomorrow, 2 AM here23:53
