nick | message | time |
---|---|---|
lkcl | programmerjake, sorry about the misunderstanding on the reg-format (int madd vs FP fmadd) | 02:25 |
lkcl | i'd not looked closely at madd, it's unimplemented at the moment | 02:25 |
lkcl | "sv.ori." is neat. it only does the MSB however. | 02:27 |
lkcl | grevlut gets a surprisingly large number of combinations of bit-patterns: 0xaaaaa, 0x3333, 0x969696, 0xaaaaffff, i spent 30 mins experimenting and only explored < 0.5% of the possibilities | 02:29 |
*** alMalsamo is now known as lumberjack123 | 02:47 | |
programmerjake | pmovmask only does the sign bits, so sv.ori. only getting MSBs is exactly what we want | 04:14 |
lkcl | for that task, yes. | 05:51 |
lkcl | there are thousands, if not tens or potentially hundreds of thousands, of constants that can be generated | 05:51 |
lkcl | saying "the instruction is worthless because one of those possible constants can be covered by another instruction" is missing a huge number of opportunities | 05:52 |
lkcl | i had this kind of nightmare conversation with the RISC-V Founders | 05:53 |
programmerjake | i never said the instruction was worthless, i meant we need a different motivation than "emulates pmovmaskb" | 05:55 |
programmerjake | unless it does a better job somehow than sv.ori. or equivalents | 05:56 |
lkcl | a list of additional tasks that it's suited to will help | 05:58 |
lkcl | that it takes 6 instructions to create any given arbitrary constant [without a LD] is a good start | 06:00 |
lkcl | (addi, addis, rlwimi, ori, oris, something-else) | 06:01 |
lkcl | that sets the context / benchmark for having a single instruction that can do [part-of-a-job-of-] six | 06:02 |
lkcl | don't ask me how btw, but it can also do 0x222222..., 0x77777.. and many others | 06:06 |
lkcl | 0x202020... 0x200020002000200... | 06:07 |
programmerjake | paddi, sldi, paddi: 3 instructions for a 64-bit constant | 06:07 |
lkcl | paddi? | 06:07 |
programmerjake | add a 34-bit immediate | 06:07 |
lkcl | ok so that's still 64 bits, 32 bits, 64 bits | 06:08 |
lkcl | where this is one (single) 32-bit (not prefixed, not 64-bit) instruction | 06:09 |
programmerjake | yup! | 06:09 |
lkcl | there's a lot of overlap, but it's going to be somewhere of the order of... 2^8 * 2^6 * 2 potential constants | 06:10 |
lkcl | 2^15 | 06:10 |
programmerjake | sv.addi/elwid=16 r5.v, r0.v, 0x1234 gives 0x1234123412341234 | 06:11 |
programmerjake | if vl=1, or there's probably a way to do it with subvl=4 and scalar | 06:12 |
lkcl | only when setvl has also been called, so that's 64 bit not 32 bit | 06:12 |
lkcl | sorry... | 06:12 |
lkcl | sv.xxx is 64-bit | 06:12 |
lkcl | ew=32/subvl=4 yes | 06:12 |
lkcl | still 64-bit though | 06:13 |
lkcl | ew=16/subvl=4 sorry | 06:13 |
programmerjake | in any case, it gives some nice constants! makes me wish there was a "gimme a powerpc rotate mask" instruction | 06:14 |
lkcl | how would that work? | 06:15 |
* lkcl curious | 06:15 | |
lkcl | you mean, "if ya gone to all the trouble in rlwimi to create a rotate mask, gimme it"? | 06:16 |
programmerjake | you know MASK from the pseudo-code? (RT) = MASK((RA)[57:63], (RB)[57:63]) or some of those could be immediates | 06:16 |
lkcl | well, we're doing a Draft bitmanip, if there's a good reason hey what the heck, let's add it :) | 06:17 |
lkcl | next to (or in) the bm* group | 06:18 |
lkcl | i can totally (intuitively) see it being valuable | 06:19 |
programmerjake | only reason i can come up with at the moment is: lookie at all the pretty masks! we have the hardware anyway...why not use it?! | 06:20 |
programmerjake | note that a fully immediate version is: `addi r3, r0, -1; rldimi r3, r3, A, B` | 06:23 |
programmerjake | wait, that's wrong. rldic is correct | 06:26 |
lkcl | programmerjake, found the "recommended" sequence by IBM | 10:41 |
lkcl | https://git.libre-soc.org/?p=microwatt.git;a=blob;f=hello_world/head.S;h=63576063f040c707d307a6c0ea4216e16f3f2da9;hb=882ace781e457db791a70ecfa536e45c7b9d942b#l33 | 10:41 |
*** alMalsamo is now known as lumberjack123 | 16:00 | |
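
The "gimme a powerpc rotate mask" idea discussed at 06:14-06:20 can be illustrated with a small model of the MASK(x, y) helper from the Power ISA pseudo-code: ones in bit positions x through y inclusive, wrapping around when x > y, with bit 0 being the most significant bit of a 64-bit register. This is only a sketch of the mask function itself, not of any proposed instruction; the name `mask64` is hypothetical, and restricting both arguments to 0..63 is an assumption made for the example.

```python
def mask64(mb: int, me: int) -> int:
    """Model of the Power ISA pseudo-code MASK(mb, me) for 64-bit registers.

    Bits are numbered MSB0 (bit 0 is the most significant bit).
    The result has ones in positions mb..me inclusive, wrapping if mb > me.
    Assumption for this sketch: both arguments are in the range 0..63.
    """
    if not (0 <= mb < 64 and 0 <= me < 64):
        raise ValueError("mb and me must be in 0..63")

    if mb <= me:
        positions = list(range(mb, me + 1))
    else:
        # wrap-around case: mb..63 followed by 0..me
        positions = list(range(mb, 64)) + list(range(0, me + 1))

    result = 0
    for pos in positions:
        result |= 1 << (63 - pos)  # convert MSB0 position to an LSB0 shift
    return result


if __name__ == "__main__":
    for mb, me in [(0, 63), (32, 63), (48, 55), (60, 3)]:
        print(f"MASK({mb:2}, {me:2}) = 0x{mask64(mb, me):016x}")
```

The wrap-around case is what makes the set of reachable masks larger than plain contiguous runs: `mask64(60, 3)` sets the top four and bottom four bits at once.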
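
Two of the constant-building approaches mentioned at 06:07 and 06:11 can be modelled arithmetically. The sketch below is not an instruction simulator: `paddi` and `sldi` here are hypothetical Python helpers that model only a 34-bit signed immediate add and a 64-bit left shift, and `replicate` models only the resulting bit pattern of repeating a 16-bit element across a 64-bit register, not the SVP64 semantics of `sv.addi` with `elwid=16`. The split into two 32-bit halves is one possible choice, assumed for the example.

```python
MASK64 = (1 << 64) - 1


def paddi(ra_value: int, imm34: int) -> int:
    """Model: add a signed 34-bit immediate, truncating to 64 bits."""
    assert -(1 << 33) <= imm34 < (1 << 33), "immediate must fit in 34 signed bits"
    return (ra_value + imm34) & MASK64


def sldi(value: int, shift: int) -> int:
    """Model: shift left by an immediate amount, truncating to 64 bits."""
    return (value << shift) & MASK64


def build_constant(constant: int) -> int:
    """Build a 64-bit constant in three steps: paddi, sldi, paddi.

    Assumption: the constant is split into two unsigned 32-bit halves,
    each of which fits comfortably in a signed 34-bit immediate.
    """
    hi = (constant >> 32) & 0xFFFFFFFF
    lo = constant & 0xFFFFFFFF
    reg = paddi(0, hi)    # load the upper half
    reg = sldi(reg, 32)   # move it into the top 32 bits
    reg = paddi(reg, lo)  # add the lower half
    return reg


def replicate(element: int, elwid: int, width: int = 64) -> int:
    """Repeat an elwid-bit element across a width-bit value."""
    element &= (1 << elwid) - 1
    result = 0
    for shift in range(0, width, elwid):
        result |= element << shift
    return result


if __name__ == "__main__":
    print(hex(build_constant(0x123456789ABCDEF0)))
    print(hex(replicate(0x1234, 16)))
```

The second function reproduces the pattern quoted in the log: a 16-bit element of 0x1234 repeated four times gives 0x1234123412341234.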