*** gnucode <gnucode!~gnucode@user/jab> has quit IRC | 02:54 | |
*** sauce <sauce!~sauce@hollandaise.sauce.icu> has joined #libre-soc | 03:52 | |
programmerjake | lkcl: maybe mention extsb -- imho by the logic applied to *w insns word -> XLEN/2 it would sign extend the least significant XLEN/8 bits, which imho would be useful, similarly extsh, extsw extend XLEN/4 and XLEN/2 bits | 09:29 |
---|---|---|
programmerjake | sign extending 1, 2, and 4 bit fields in one instruction seems useful to me | 09:30 |
markos | sign-extend (8->16, 16->32, 32->64-bit) sounds like the movl instructions in Arm; these are extremely useful when you need to increase accuracy in certain calculations, and are usually paired with movn (narrowing) | 09:48 |
markos | eg. the video data is in 8/16-bit but the calculations need to be done in 32-bit, so movl is used, then narrowed down with movn | 09:49 |
markos | and because it's arm, you have a ton of those instructions, eg. qmovn (move with saturation), qmovun (move with saturation to unsigned), etc etc | 09:50 |
programmerjake | powerisa already has scalar sign extension instructions extsw/h/b, i'm just speculating how we might change them when given different srcelwid | 09:50 |
programmerjake | currently we chose to always have extsb extend from 8 bits, ignoring XLEN | 09:51 |
markos | yes, I agree | 09:51 |
programmerjake | which i think might be unnecessarily tying our hands | 09:51 |
markos | you might get packed data of 16-bit elements | 09:52 |
programmerjake | if you have 16-bit data and want to sign extend the lower 8-bits, with my speculative change you'd just use extsw with XLEN=16 | 09:53 |
programmerjake | if you have 16-bit data and want to sign extend the lower 4-bits, with my speculative change you'd just use extsh with XLEN=16 | 09:53 |
programmerjake | if you have 32-bit data and want to sign extend the lower 16-bits, with my speculative change you'd just use extsw with XLEN=32 | 09:54 |
markos | 4-bits? | 09:54 |
programmerjake | yeah, sometimes (bitfields and other stuff) you have 4 bit data | 09:54 |
markos | tbh I haven't seen such a use case, but if it's easy to do why not | 09:55 |
markos | most of the problems I've worked on are rather simpler | 09:55 |
markos | eg. 8->16, 16->32, 32->64 | 09:55 |
markos | and you would need 2 instructions for those, or an extra option to take the lower/upper half of the original word | 09:56 |
programmerjake | traditionally 4-bit data would use (x << 60) >> 60 or similar to sign extend | 09:56 |
markos | intel even has 8->32, 8->64 | 09:57 |
markos | in our case, one could create a set of 8 vector registers, straight from a single extend of 8->64 of a single source register with 8x8-bit elements | 09:58 |
programmerjake | well, scalar extsb is 8 -> 8/16/32/64, and extsh is 16 -> 16/32/64 since Power stores smaller ints in 64-bit regs | 09:58 |
programmerjake | my idea is to just proportionally shrink all sizes when using them on vectors with smaller element sizes | 09:59 |
markos | yes, but that's scalar, what if you have packed 8-bit/16-bit data and want to sign-extend these elements? both intel and Arm provide such instructions for SIMD | 09:59 |
markos | since we don't do SIMD, we could/should extend the functionality of the scalar functions to create vectors in this manner | 10:00 |
markos | arbitrarily sized for that matter | 10:00 |
programmerjake | SVP64 allows setting different src/dest elwid, so just do whatever move/sign-extend op with different src/dest elwid | 10:00 |
markos | so it's already possible? | 10:01 |
programmerjake | not 4-bit sign extension, but 8/16/32/64 -> any of 8/16/32/64 | 10:01 |
markos | eg. say I have 16x16-bit elements in packed mode, in 4 x 64-bit registers and I want to sign-extend them to 32-bit, that means I will need 8x 64-bit registers if they are still in packed mode | 10:03 |
markos | is that possible now? | 10:03 |
programmerjake | yes | 10:04 |
markos | nice | 10:04 |
markos | wow | 10:04 |
programmerjake | even in-place if you set reverse mode | 10:04 |
markos | I'm constantly amazed by what SVP64 can do | 10:04 |
programmerjake | :) | 10:06 |
programmerjake | so, lkcl, what do you think of changing extsb/h/w to be: | 10:09 |
programmerjake | RA <- EXTSXL(RS, XLEN/N) for N = 8/4/2 respectively | 10:09 |
programmerjake | instead of: RA <- EXTSXL(RS, N) for N = 8/16/32 respectively? | 10:09 |
programmerjake | that will strictly increase functionality | 10:10 |
programmerjake | actually i'll open a bug so we don't lose the idea | 10:11 |
markos | programmerjake, I don't think it's possible to "sign" extend 1-bit numbers :) | 10:24 |
markos | it would just have to be plain extend in such cases, even 4-bit is pushing it | 10:25 |
programmerjake | it totally is, you get either -1 or 0 | 10:25 |
markos | well possible in the mathematical sense, yes, but the usefulness of it is questionable though | 10:26 |
programmerjake | since imho a 1-bit 2s complement number has just the sign bit and nothing else | 10:26 |
programmerjake | it's quite useful, existing simd code really likes to use -1/0 for masks | 10:27 |
programmerjake | this makes a quick and easy way to generate those | 10:27 |
markos | it's just 1 and 0 in masks, you don't treat the sign bit as a separate case | 10:27 |
markos | don't misunderstand me, I don't think it's a bad idea, far from it, I'm just referring to the "sign" naming | 10:28 |
programmerjake | well, 1-bit to 8-bit sign extension is basically just testing the lsb and splatting it to all 8 bits; whatever you call it, it is commonly useful for traditional simd code | 10:28 |
markos | yes I agree on that | 10:29 |
markos | it's a way to emulate movemasks as on Intel | 10:29 |
programmerjake | also, do note the sign extension matches what verilog and probably vhdl do with 1-bit signed wires | 10:29 |
programmerjake | except intel takes it from the msb | 10:29 |
programmerjake | iirc arm uses the lsb for something like that | 10:30 |
markos | bbl | 10:34 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 11:26 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.160.232> has joined #libre-soc | 11:26 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.160.232> has quit IRC | 11:41 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 11:42 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has quit IRC | 13:39 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc | 14:39 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc | 14:40 | |
lkcl | programmerjake, i like it. in fact i like it that much i'm going to include it in ls005 :) | 15:57 |
lkcl | https://bugs.libre-soc.org/show_bug.cgi?id=1061 | 15:57 |
lkcl | although grevlut(i) will do a lot more than extsb/ew=8 | 16:19 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 17:16 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.168.147> has joined #libre-soc | 17:16 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.168.147> has quit IRC | 17:36 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 17:36 | |
*** octavius <octavius!~igloo@78.143.218.138> has joined #libre-soc | 19:22 | |
*** octavius <octavius!~igloo@78.143.218.138> has quit IRC | 19:25 | |
programmerjake | lkcl: note that imho you waay overcomplicated the extsb pseudocode, i think it should just be: RT <- EXTSXL((RA), XLEN/8) | 19:51 |
programmerjake | it sign extends from the LSB 1/2/4/8-bits, not from the msb 1/2/4/8-bits of the lsb byte | 19:53 |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 20:40 | |
*** gnucode <gnucode!~gnucode@user/jab> has quit IRC | 22:39 | |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 22:40 | |
*** gnucode <gnucode!~gnucode@user/jab> has quit IRC | 23:13 | |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 23:13 | |
lkcl | no. | 23:18 |
lkcl | that is not its purpose | 23:18 |
lkcl | within the scope of this rfc. | 23:18 |
programmerjake | i added the expanded pseudocode so you can easily see which bits go where, the short pseudocode is so people don't complain that it's much longer than necessary | 23:20 |
programmerjake | i'll note you still have a syntax error with ]] | 23:20 |
programmerjake | imho having `in` as a temporary variable makes it more confusing hence why i removed it | 23:21 |
programmerjake | e.g. you wrote if XLEN = 8 then RT <- in[7]] * 8 which has too many ] | 23:23 |
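
The short pseudocode proposed in the log, `RT <- EXTSXL((RA), XLEN/8)` for extsb (with XLEN/4 for extsh and XLEN/2 for extsw), can be modelled in a few lines of Python. This is only a sketch under stated assumptions: `exts`, `exts_scaled`, `exts_by_shift` and `DIVISOR` are hypothetical names invented for illustration, not identifiers from the Power ISA or the Libre-SOC RFC, and registers are modelled as plain unsigned integers. It also includes the traditional shift idiom mentioned at 09:56 so the two can be compared.

```python
def exts(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the low `from_bits` of `value` to `to_bits`.

    Returns the result as an unsigned bit pattern in range(2**to_bits).
    """
    if not 1 <= from_bits <= to_bits:
        raise ValueError("need 1 <= from_bits <= to_bits")
    field = value & ((1 << from_bits) - 1)   # keep only the low field
    sign = 1 << (from_bits - 1)              # sign bit of that field
    signed = (field ^ sign) - sign           # two's-complement value
    return signed & ((1 << to_bits) - 1)     # wrap to the result width


# Divisors from the proposal in the log (assumption: nothing else changes).
DIVISOR = {"extsb": 8, "extsh": 4, "extsw": 2}


def exts_scaled(op: str, rs: int, xlen: int = 64) -> int:
    """Hypothetical model of XLEN-scaled extsb/extsh/extsw."""
    return exts(rs, xlen // DIVISOR[op], xlen)


def exts_by_shift(x: int, from_bits: int, xlen: int = 64) -> int:
    """The shift idiom: (x << k) >> k with an arithmetic right shift."""
    k = xlen - from_bits
    mask = (1 << xlen) - 1
    shifted = (x << k) & mask
    if shifted >> (xlen - 1):                # reinterpret top bit as sign
        shifted -= 1 << xlen
    return (shifted >> k) & mask             # Python's >> is arithmetic


if __name__ == "__main__":
    # XLEN=64 keeps today's behaviour: extsb extends from 8 bits.
    print(hex(exts_scaled("extsb", 0x80, 64)))
    # XLEN=16: extsw extends from 8 bits, extsh from 4 bits.
    print(hex(exts_scaled("extsw", 0x0080, 16)))
    print(hex(exts_scaled("extsh", 0x0008, 16)))
    # XLEN=8: extsb extends from 1 bit, splatting the lsb into a mask.
    print(hex(exts_scaled("extsb", 0x01, 8)))
```

With XLEN=8 the extsb case reduces to the 1-bit extension discussed above: the least significant bit is copied into every bit of the element, giving the all-ones or all-zeros mask values (-1 or 0) that programmerjake describes.
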
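
The widening case markos asks about at 10:03 (16 packed 16-bit elements in 4 x 64-bit registers, sign-extended to 32-bit and so occupying 8 x 64-bit registers) can be illustrated with a small model. This is a rough sketch, not the SVP64 implementation: the register file is assumed to be a flat little-endian byte array, the helper names (`read_elem`, `write_elem`, `widen`) are hypothetical, and "reverse mode" is modelled simply as walking the elements from last to first.

```python
def read_elem(regs: bytearray, index: int, bits: int, signed: bool = False) -> int:
    """Read element `index` of width `bits` from the packed byte array."""
    n = bits // 8
    return int.from_bytes(regs[index * n:(index + 1) * n], "little", signed=signed)


def write_elem(regs: bytearray, index: int, bits: int, value: int) -> None:
    """Write `value` (wrapped to `bits`) as element `index`."""
    n = bits // 8
    wrapped = value & ((1 << bits) - 1)
    regs[index * n:(index + 1) * n] = wrapped.to_bytes(n, "little")


def widen(src: bytearray, dst: bytearray, vl: int,
          src_bits: int, dst_bits: int, reverse: bool = False) -> None:
    """Sign-extend `vl` elements from src_bits to dst_bits.

    `src` and `dst` may be the same buffer (in-place operation).
    """
    order = range(vl - 1, -1, -1) if reverse else range(vl)
    for i in order:
        value = read_elem(src, i, src_bits, signed=True)
        write_elem(dst, i, dst_bits, value)


if __name__ == "__main__":
    values = [((-1) ** i) * i * 1000 for i in range(16)]

    src = bytearray(4 * 8)                   # 4 x 64-bit registers
    for i, v in enumerate(values):
        write_elem(src, i, 16, v)

    dst = bytearray(8 * 8)                   # 8 x 64-bit registers
    widen(src, dst, 16, 16, 32)
    print([read_elem(dst, i, 32, signed=True) for i in range(16)] == values)

    # In place: only safe when the elements are processed last to first.
    buf = bytearray(8 * 8)
    buf[:32] = src
    widen(buf, buf, 16, 16, 32, reverse=True)
    print([read_elem(buf, i, 32, signed=True) for i in range(16)] == values)
```

The reverse walk matters for the in-place case because each 32-bit result overlaps 16-bit source elements at higher indices; going forward would overwrite element 1 before it is read, while going backward only overwrites sources that have already been consumed.
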