Tuesday, 2022-08-16

*** lx0 <lx0!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc01:29
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has quit IRC01:29
*** lx0 <lx0!~lxo@gateway/tor-sasl/lxo> has quit IRC04:32
*** lx0 <lx0!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc04:37
*** lx0 <lx0!~lxo@gateway/tor-sasl/lxo> has quit IRC07:03
*** lx0 <lx0!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc07:03
*** yambo <yambo!~yambo@184.166.145.119> has quit IRC07:27
*** yambo <yambo!~yambo@184.166.145.119> has joined #libre-soc07:39
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC09:44
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.160.109> has joined #libre-soc09:44
lkclprogrammerjake, what you describe in https://bugs.libre-soc.org/show_bug.cgi?id=908 will not work / is not the right area10:56
lkclwhat you describe *is* however how predicate-result has to work: mixing of the condition-register (Rc=1) test into the predicate mask (actually, the write-enable lines on the regfile)10:57
lkclhowever10:58
lkclthat is completely inappropriate for Indexed REMAP because it is for actually literally dynamically changing which results require which registers10:58
lkclit's right the way back at the Dependency Matrices.10:59
lkclthere is no other scheme that will help there other than to convert Simple-V to Cray-style Vector Registers10:59
lkcldynamic shuffles are normally isolated to within such Vector Registers.  VSX, RVV, AVX512, NEON, SVE/2, they all have Vector Registers, the shuffling is done within them, and the Dependency Hazards are dead-easy11:01
lkclRead Hazard on the source to be shuffled11:01
lkclRead Hazard on the source of the shuffle-indices11:01
programmerjakewell, imho it would be better to go back to having mv.x instead of indexed remap, then each output element is known and is written once (unless predicated off). dynamic shuffle where the dest is indexed instead of the src is *extremely* uncommon, and waay more complex to implement in hardware11:01
lkclRead Hazard on the masks11:01
lkclWrite Hazard on the destination11:02
lkclyep that's not happening either, i've done the assessment (took several weeks)11:02
programmerjakeread hazard on ra..ra+vl ... it's fine if that's slow and takes multiple cycles11:03
lkcland it's actually pretty much the same identical issue11:03
lkclyes, that's the start of the Hazard solution - actually, extended conceptually - for Indexed REMAP11:03
lkclexcept SPR.SVGPR for ra11:03
lkcland MAXVL for vl11:03
lkclso the range to set the read hazard on is from SPR.SVGPR .. SPR.SVGPR+MAXVL11:04
lkcl(in the dumb/naive version)11:04
lkclbut because of the rules i set11:04
lkclthe Indices may be read in advance then cached11:04
lkcland a bitmap created11:05
programmerjakebut indexed remap makes it waay harder by trying to shove a dynamic shuffle into *every* input/output...mv.x only shuffles the input11:05
lkclin a Deterministic and Cached fashion, where the Hazards needed may be set extremely easily from a bitmap, yes.11:05
programmerjakeno bitmap needed, just block until all elements in the input vectors are available11:06
lkclagain: you are conflating the elements in the input vector with the indices which tell you where those elements actually are11:06
lkcland for that reason i closed the bugreport as invalid11:07
lkcli've thought it through, and i gave you two opportunities to help write some assembler that would usefully demonstrate how to use this11:07
programmerjakeno, you block for all elements of *both* input vectors, both indexes and indexee11:07
lkclagain: i repeat: the indices may be cached11:08
lkclyes, initially, there would be blocking on indexes and indexees11:08
lkclhowever the indexes would be read immediately as a top priority11:08
lkcland once available they are cached11:08
programmerjakecaching the indexes is unhelpful because basically *every* shuffle will have a different pattern, you're just making it more complex and wasting gates for no reason11:09
lkclfrom there those indices go straight into the *existing standard infrastructure that we have to have in place anyway for the rest of REMAP*11:09
lkcloverlap with what i just wrote11:09
lkclread what i just wrote11:09
lkclwhat you are viewing as "complex" has to exist anyway, for Matrix REMAP, DCT and FFT REMAP11:09
lkclthat performs indexing-shuffling on both read and writes anyway11:10
programmerjakei did, i already knew you wanted to put the indexes into the issue fsm...imho that's a bad idea11:10
lkclDCT performs indexing-shuffling in *Gray* Code (!!)11:10
lkcltough11:10
lkclit's the entire premise on which SV is founded!11:11
lkcl(actually it goes in between decode and issue)11:11
lkclthis *has* been in the spec for over two years11:11
programmerjakegray code is simpler than reading from arbitrary registers and gray code is the same every time, shuffle is different every time so now it has to take more cycles and more instructions and more spec oddities because you put the shuffle in the wrong spot11:12
lkclonce again11:12
lkclfrom the top11:12
lkclfrickin hellfire how many times do i have to repeat this11:12
lkclthe rules11:12
lkclthat i set11:12
lkclspecifically allow for the caching of the indices11:13
lkclsuch that those indices may drop into the exact same location that the rest of REMAP uses11:13
programmerjaketelling me the same thing again about how indexed remap is user-specifiable remap doesn't mean it's any more suitable for dynamic shuffle where caching indexes is detrimental11:13
programmerjakedetrimental because they're different nearly every time and the extra cycles taken to cache it are wasted11:14
lkclplease take the time to understand it11:14
lkclyes they will change11:15
lkcli do not perceive that to be a problem11:15
lkclby "cache" i mean "extremely short-lived cache useable pretty much only by the Dependency Matrices"11:16
lkclthere are other such caches of registers11:16
lkclsuch as MSR.11:16
lkclMSR is cached, even in TestIssuer11:16
lkclPC is cached11:16
lkclSVSTATE is cached11:16
programmerjakethen why not just have mv.x without all the weird unmodifiable register or you get UB restrictions? it takes less instructions, less cycles, and less sw (and to some degree hw too) complexity11:18
programmerjakesv.mv.x *rt, *ra, *rb has simple dependencies(assuming rb is indexes): block on ra..ra+vl and rb+i, and write to rt+i where i is the sv loop counter11:21
lkclalready been through it.  it is the same problem11:21
lkclthere was something else.11:22
programmerjakethe writes can run in parallel if rt doesn't overlap ra or rb or if rt=rb and doesn't overlap ra11:22
lkcli can't recall what it was11:22
programmerjakewhereas remap indexed can have to serialize operations when writing to a remapped dest even if none of the vectors overlap, because of WaW hazards where you can have duplicate indexes11:26
programmerjakewaay more complex11:26
programmerjakewell, please see if you can find where you wrote down the problem with mv.x11:28
programmerjakeah, found it: https://lists.libre-soc.org/pipermail/libre-soc-dev/2022-June/004900.html 11:33
programmerjakeyour concern was mv.x doesn't make sense as a scalar operation -- i pointed out it can be justified as a cryptographic s-box operation. also (pointing out now), there are operations such as setvl that are inherently part of SV and shouldn't need to make sense as an operation separated from SV, therefore imho having mv.x require being sv-prefixed is perfectly fine and justifiable11:37
lkclmv.x is not justifiable as an independent scalar instruction, sorry.11:38
lkclalso the problem that you think is there (overlaps) is not there in Indexed REMAP11:38
programmerjakeit is...crypto s-box11:38
lkclthe hazard management is too massive a step up for a simple scalar instruction11:38
lkclthe reason why overlaps do not occur in Indexed REMAP is precisely the reason why overlaps do not occur in the rest of REMAP11:39
lkclIndexed REMAP is simply a generalisation of the Deterministic Scheduling11:40
programmerjakeno, it's there...every dest-indexed-shuffle has it...since two indexes can specify to write to the same output. the reason overlaps don't occur in the rest of remap is because there aren't duplicate indexes11:40
lkclthink it through11:40
lkclwhat you've said is incorrect11:41
lkclplease think through why i am saying that what you've said is incorrect.11:41
lkcli leave it with you11:41
programmerjakeso, if your dest shuffle is [1, 2, 3, 4, 3], how can you avoid having overlap when output element 3 is written to twice?11:41
programmerjakeunless you explicitly defined that to be UB, you'll have that problem...you need to realize that11:42
programmerjakemv.x bypasses that because the indexing is on the src, not the dest11:43
lkclREMAP does not work in the same way as sv.mv.x11:44
programmerjakeand reading from the same input twice is no problem because RaR hazards aren't a thing11:44
lkclhint: twin-predication ==> back-to-back VGATHER-VSCATTER11:44
lkclin effect the "read" items go into a "queue" from which "write" items get their data11:45
programmerjakeoh...really....vscatter has that overlap issue too, just memory is designed to handle that11:45
lkcl... ok11:46
* lkcl thinking11:46
programmerjakeregisters can handle overlap too, just they need a bunch of extra hw we don't want to need...11:46
lkclyep, got it.11:46
lkclthen that needs to go in the notes.11:46
lkcland that the hardware shall follow strict "Program Order"11:47
lkclthank you for being persistent in raising this11:48
lkcli have to deal with a priority response to one of RED's Directors11:48
programmerjakeah, ok. i have to deal with it being nearly 4am here and needing a better sleep schedule...hope it goes well for you11:49
programmerjake:)11:49
programmerjakettyl11:49
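
The overlap issue discussed above can be illustrated with a small software model. This is only a sketch under stated assumptions: the register file is a plain Python list, `src_indexed_move` and `dest_indexed_move` are hypothetical helper names (not part of any Libre-SOC code), and the element semantics are inferred from the conversation (mv.x reads `ra + index` and writes `rt + i`; a destination-indexed shuffle writes `rt + index`). With the indices [1, 2, 3, 4, 3] the destination-indexed form writes one register twice, so the result depends on strict Program Order, while the source-indexed form writes every destination exactly once.

```python
def src_indexed_move(regs, rt, ra, rb, vl):
    """Gather (mv.x-style, assumed semantics): regs[rt+i] = regs[ra + regs[rb+i]]."""
    # Read every source first, modelling "block until all input elements are available".
    gathered = [regs[ra + regs[rb + i]] for i in range(vl)]
    for i in range(vl):
        regs[rt + i] = gathered[i]  # each destination is written exactly once


def dest_indexed_move(regs, rt, ra, indices):
    """Scatter in program order: regs[rt + indices[i]] = regs[ra + i].

    Returns the destination registers written more than once (WaW overlaps).
    """
    write_counts = {}
    for i, idx in enumerate(indices):
        dest = rt + idx
        write_counts[dest] = write_counts.get(dest, 0) + 1
        regs[dest] = regs[ra + i]  # a later element overwrites an earlier one
    return sorted(d for d, n in write_counts.items() if n > 1)


indices = [1, 2, 3, 4, 3]

# Destination-indexed: source vector at r0..r4, destination base at r16.
scatter_regs = [0] * 32
scatter_regs[0:5] = [10, 20, 30, 40, 50]
waw = dest_indexed_move(scatter_regs, rt=16, ra=0, indices=indices)

# Source-indexed: source vector at r0..r4, indices at r8..r12, destination at r16.
gather_regs = [0] * 32
gather_regs[0:5] = [10, 20, 30, 40, 50]
gather_regs[8:13] = indices
src_indexed_move(gather_regs, rt=16, ra=0, rb=8, vl=5)

print(waw)                  # registers with a write-after-write overlap
print(scatter_regs[16:21])  # outcome relies on program-order writes
print(gather_regs[16:21])   # duplicate index only causes a repeated read
```

In this model the duplicate index 3 makes register 19 a WaW overlap in the scatter case; in the gather case the same duplicate just reads one source twice, which is harmless.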
cesarGot an invite to try out Dall-E 2 (AI image generator) and typed “a cartoon of a CPU chip running and holding a pencil and a ruler”. What I got:12:07
cesarhttps://labs.openai.com/s/kV42IQklfKmFhapwCsuoMZ88 12:07
cesarhttps://labs.openai.com/s/KCKGbC1HUAtZfEWmV1ZCLCvL 12:07
cesarLiked how, in the second one, arms and legs derive from the SMT pins, clever.12:10
cesarThe idea was to try making it generate a new mascot for Libre-SOC...12:17
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.160.109> has quit IRC13:03
*** octavius <octavius!~octavius@172.147.93.209.dyn.plus.net> has joined #libre-soc13:03
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.160.109> has joined #libre-soc13:04
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.160.109> has quit IRC13:09
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc13:09
*** ckie <ckie!~ckie@user/cookie> has quit IRC13:58
*** ckie <ckie!~ckie@user/cookie> has joined #libre-soc14:00
lkclcesar, cool!14:22
*** octavius <octavius!~octavius@172.147.93.209.dyn.plus.net> has quit IRC15:05
*** choozy <choozy!~choozy@75-63-174-82.ftth.glasoperator.nl> has joined #libre-soc15:19
*** choozy <choozy!~choozy@75-63-174-82.ftth.glasoperator.nl> has quit IRC17:58
*** octavius <octavius!~octavius@172.147.93.209.dyn.plus.net> has joined #libre-soc18:20
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC20:26
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc20:59
ghostmansdlkcl, new changes are available in pysvp64dis branch. I wanted to re-use our special mapping classes (SVP64PrefixFields, SVP64RMFields), but found that I should extend and refactor them to make things work correctly.22:42
ghostmansdSince these changes touch some common code at selectable_int.py, I decided this code is not a candidate for a master branch yet.22:43
ghostmansdThis stuff enables things like this:22:43
ghostmansdinsn = SVP64Instruction(b"\x05\x40\x00\x00", b"\x7c\x41\x02\x14")22:45
ghostmansdprint(insn.prefix.major)22:45
ghostmansdprint(insn.prefix.pid)22:45
ghostmansdprint(insn.prefix.rm)22:45
ghostmansdThis will print 1, 3 and RM(value=0x0, bits=24) respectively.22:46
ghostmansdNotice that 'rm' is overloaded. In fact, any of these fields can now be overloaded in the child classes (previously it couldn't work due to __getattr__ tricks).22:47
ghostmansdSo, the next stage is to extend the prefix.rm stuff with special methods which allow to re-construct the original stuff (e.g. notice the sketch property sv_mode).22:48
ghostmansdThis took almost the whole day to find out why the fuck MappingSelectableInt couldn't work as is. But at least we have real properties for any fields-like class, even shown in help!22:49
ghostmansdAnyway, enough for today. Sorry, had to post and flood the chat, really needed to share this crap with anyone. :-) Feel free to skip.22:50
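
The values quoted above (1, 3 and a 24-bit RM of 0) can be reproduced by hand with plain bit extraction. This is a minimal sketch, not the pysvp64dis implementation: it assumes the prefix bytes are big-endian, that fields use MSB0 bit numbering, and that the SVP64 prefix layout is major opcode in bits 0-5, the identifier ("pid") in bits 7 and 9, and RM in bits 6, 8 and 10-31. The helpers `msb0_bit` and `gather_bits` are hypothetical names introduced only for this illustration.

```python
def msb0_bit(word, index, width=32):
    """Return bit `index` of `word`, where index 0 is the most significant bit."""
    return (word >> (width - 1 - index)) & 1


def gather_bits(word, indices):
    """Concatenate the selected MSB0-numbered bits into one integer, first index highest."""
    value = 0
    for index in indices:
        value = (value << 1) | msb0_bit(word, index)
    return value


# Prefix bytes from the example above, interpreted as big-endian (an assumption).
prefix = int.from_bytes(b"\x05\x40\x00\x00", "big")

major = gather_bits(prefix, range(0, 6))                  # assumed: bits 0-5
pid = gather_bits(prefix, [7, 9])                         # assumed: bits 7 and 9
rm = gather_bits(prefix, [6, 8] + list(range(10, 32)))    # assumed: 24 RM bits

print(major, pid, hex(rm))
```

Under those assumptions the three fields come out as 1, 3 and 0x0, matching the output reported for `insn.prefix.major`, `insn.prefix.pid` and `insn.prefix.rm`.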
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC22:54
*** octavius <octavius!~octavius@172.147.93.209.dyn.plus.net> has quit IRC23:45
