Tuesday, 2022-05-24

[07:10] *** kylel1 is now known as kylel
[13:59] <ghostmansd> I've been looking at CRs again today. I think the binutils code is wrong: both fields.text and the spec say that the BC field specifies a bit in CR, but binutils doesn't set the PPC_OPERAND_CR_BIT flag for it, as it does for the BA and BB fields.
[13:59] <ghostmansd> I'm going to prepare the corresponding patch and raise this topic at binutils.
[14:01] <ghostmansd> (Well, we don't even have a BC field in the first place, but we have CRB instead)
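Editor's sketch of what "specifies a bit in CR" means for an operand like BC: the 5-bit value selects one of 32 CR bits, that is, one of the four bits (lt, gt, eq, so) in one of the eight 4-bit CR fields, and Power assembly conventionally spells it as 4*crN+bit. The helper name `format_cr_bit` is hypothetical and this is not binutils code; that a disassembler prints operands flagged as CR bits in exactly this form is an assumption.

```python
# Hypothetical helper, not taken from binutils: it only illustrates the
# conventional symbolic spelling of a 5-bit CR bit index.
CR_BIT_NAMES = ("lt", "gt", "eq", "so")


def format_cr_bit(value: int) -> str:
    """Render a CR bit index (0..31) as 'bit' or '4*crN+bit'."""
    if not 0 <= value < 32:
        raise ValueError("CR bit index must be in 0..31")
    # Each CR field is 4 bits wide: the quotient is the field number,
    # the remainder is the bit within that field.
    field, bit = divmod(value, 4)
    name = CR_BIT_NAMES[bit]
    # Bits in cr0 are conventionally written without the field prefix.
    return name if field == 0 else f"4*cr{field}+{name}"


if __name__ == "__main__":
    for index in (2, 4, 30):
        print(index, format_cr_bit(index))
```

Without such a marking, the same operand would simply be shown as a plain number.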
[14:36] <ghostmansd> lkcl, programmerjake, FYI, irclog search is broken
[15:29] <lkcl> ghostmansd, yes i know. i have to (manually) install cgi-bin capability, which is making me nervous
[15:30] <lkcl> i tried making sure to keep track of the irc discussions as much as possible
[15:30] <lkcl> https://bugs.libre-soc.org/show_bug.cgi?id=550
[15:57] <octavius> lkcl, thanks for drilling the WB point
[15:57] <octavius> took me far to long, even with the printed spec in front of me XD
[15:57] <octavius> *too long
[17:29] <lkcl> markos, hi, i'd be interested in your take on conflictd https://libre-soc.org/openpower/sv/vector_ops/
[17:30] <lkcl> i don't feel it's worth adding as an explicit instruction, because of crrweird https://libre-soc.org/openpower/sv/cr_int_predication/
[17:31] <lkcl> you can do a staggered (triangular) sv.cmpi which produces the Vector-against-scalar compares src1[i] == src2[j]
[17:32] <lkcl> then *transfer* those Vector-of-CR-Field-Results into a single 64-bit integer using crrweird
[17:32] <lkcl> then OR those together and you've synthesised conflictd
[17:33] <lkcl> the really nice thing is, crweird can do multi-bit combinations of the sv.cmpi CR checks
[17:34] <lkcl> so you can do "if src1[i] >= src2[j]" just as easily
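Editor's sketch of the idea above, as a plain scalar model rather than an SVP64 instruction sequence: a triangular set of compares (element i against every earlier element j), each row packed into one integer, gives a per-element conflict mask. The function names are hypothetical, a single source vector is used for simplicity, and the choice of putting element j's result in bit j follows the AVX-512 vpconflictd convention; the bit order that crweird/crrweird would actually produce is not modelled here.

```python
import operator


def triangular_compare(src, cmp=operator.eq):
    """Row i holds cmp(src[i], src[j]) for every earlier element j < i."""
    return [[cmp(src[i], src[j]) for j in range(i)] for i in range(len(src))]


def pack_bits(flags):
    """Pack a list of booleans into an integer, flag j going to bit j."""
    mask = 0
    for j, flag in enumerate(flags):
        if flag:
            mask |= 1 << j
    return mask


def conflict_masks(src, cmp=operator.eq):
    """One integer per element: which earlier elements satisfy cmp."""
    return [pack_bits(row) for row in triangular_compare(src, cmp)]


if __name__ == "__main__":
    # Equality gives conflictd-style masks.
    print(conflict_masks([3, 5, 3, 3, 5]))
    # Swapping the comparison gives the ">=" variant mentioned above.
    print(conflict_masks([1, 3, 2], cmp=operator.ge))
```

The comparison is a parameter precisely because the point made in the discussion is that the same compare-then-pack structure works for conditions other than equality.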
[17:44] <markos> is it actually needed?
[17:44] <markos> I mean from what I read, conflictd is for vectorizing loops, but SVP64 solves the problem differently
[17:45] <markos> I mean not using SIMD
[17:45] <lkcl> that's what i'm thinking, but if crweird didn't exist it would be damn hard to do
[17:45] <markos> gather scatter needs this on AVX512
[17:45] <markos> but on SVP64? it's just a load
[17:45] <markos> load/store
[17:45] <markos> or a bunch of those anyway
[17:46] <markos> I'm not sure what to think
[17:46] <markos> I like simple
[17:47] <lkcl> yes, it's a lot of load/stores, but there's still the same underlying problem
[17:47] <markos> perhaps, in order to emulate SIMD behaviour?
[17:47] <lkcl> i do need to understand more about what the hell they're trying to solve
[17:47] <markos> well I can understand the gather/scatter case
[17:48] <markos> avoiding issuing multiple loads/stores to the same addresses
[17:48] <markos> ie, gather takes a base address plus offsets and steps, but there is no guarantee that these will not overlap
[17:49] <markos> so it's possible that one or more loads/stores might be to the same addresses
[17:49] <markos> but that's the least of the problems gather/scatter has on avx512
[17:50] <markos> basically it sucks
[18:07] <lkcl> :)
[18:11] <lkcl> i liked that conflictd can be used for histogram counting
[20:02] <lkcl> markos, if i understand the stackexchange question correctly we will have exactly the same issue
[20:03] <lkcl> https://stackoverflow.com/questions/39913707/how-do-the-conflict-detection-instructions-make-it-easier-to-vectorize-loops
[20:03] <lkcl> the instructions will be different names but ultimately the same
[20:03] <lkcl> * load indices
[20:04] <lkcl> * detect conflicts
[20:04] <lkcl> * create mask
[20:04] <lkcl> * use as gather-mask on load
[20:07] <lkcl> one nice thing though, doing popcount on the conflicts is easy because just use popcnt. duh
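Editor's sketch of the four steps listed above, applied to the histogram case, as a scalar Python model and not SVP64 or AVX-512 code. The names `histogram`, `conflict_masks` and the block length `vl` are hypothetical. One possible way to build the mask, assumed here, is to let only the last occurrence of each index in a block stay active, and have it add one plus the popcount of its conflict mask, so duplicates within a block are counted without ever updating the same bin twice in one pass.

```python
def conflict_masks(block):
    """For each lane i, set bit j for every earlier lane j with an equal value."""
    masks = []
    for i, value in enumerate(block):
        mask = 0
        for j in range(i):
            if block[j] == value:
                mask |= 1 << j
        masks.append(mask)
    return masks


def histogram(indices, nbins, vl=8):
    """Count occurrences of each index, processing vl lanes per pass."""
    hist = [0] * nbins
    for start in range(0, len(indices), vl):
        block = indices[start:start + vl]       # 1. load indices
        conflicts = conflict_masks(block)       # 2. detect conflicts
        covered = 0
        for mask in conflicts:
            covered |= mask                     # lanes duplicated by a later lane
        all_lanes = (1 << len(block)) - 1
        active = ~covered & all_lanes           # 3. create mask (last occurrences)
        for lane, idx in enumerate(block):      # 4. masked gather, add, scatter
            if (active >> lane) & 1:
                earlier = bin(conflicts[lane]).count("1")  # popcount
                hist[idx] += 1 + earlier
    return hist


if __name__ == "__main__":
    print(histogram([3, 5, 3, 3, 5], nbins=8))
```

The result does not depend on `vl`: duplicates that fall in different blocks are simply handled by successive passes.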
[21:08] <ghostmansd[m]> lkcl, sent the patch about the BC field
[21:09] <ghostmansd[m]> It also obviously affects the disassembly but it seems that in a good way
[21:45] <lkcl> curious as to why it's been missing
[21:47] <programmerjake> meeting in 13min
[22:11] <ghostmansd[m]> lkcl, no fricking idea, but, likely, since isel seems to be an old opcode, could it happen that it was different in older revisions?
[22:12] <ghostmansd[m]> Or, well, perhaps nobody bothered: it looks like not that many use explicit CRs. I mean, most people would use the suffixed version.
[22:13] <ghostmansd[m]> Anyway... https://youtu.be/pWdd6_ZxX8c
[22:14] <ghostmansd[m]> I'm kinda pissed that I have no explicit "God bless you with this 16-bit index" statement, though
[22:15] <ghostmansd[m]> It'd also be great to have some feedback from NLnet as well :-)
[22:15] <ghostmansd[m]> lkcl, please confirm that at least you received the mails :-)
