lkcl | programmerjake, i'm doing the bare minimum proof-of-concept so that other people can take over | 00:55 |
---|---|---|
programmerjake | ok, then, ghostmansd can you do that? | 01:09 |
*** tplaten <tplaten!~tplaten@195.52.147.211> has joined #libre-soc | 06:13 | |
*** tplaten <tplaten!~tplaten@195.52.147.211> has quit IRC | 07:01 | |
*** ghostmansd[hexch <ghostmansd[hexch!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 08:43 | |
*** ghostmansd[pc] <ghostmansd[pc]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 08:46 | |
*** ghostmansd[hexch <ghostmansd[hexch!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 08:58 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 11:27 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.162.3> has joined #libre-soc | 11:28 | |
*** ghostmansd <ghostmansd!~ghostmans@178.18.209.122> has joined #libre-soc | 13:31 | |
ghostmansd | programmerjake, sure, I'll handle the separate temporary files | 13:31 |
*** ghostmansd <ghostmansd!~ghostmans@178.18.209.122> has quit IRC | 13:47 | |
*** tplaten <tplaten!~tplaten@195.52.147.211> has joined #libre-soc | 15:17 | |
*** tplaten <tplaten!~tplaten@195.52.147.211> has quit IRC | 15:22 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.162.3> has quit IRC | 16:29 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 16:29 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 16:34 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 16:35 | |
markos_ | fun fact, so, AV1 has this Super Resolution mode, where encoding is done at a lower resolution (downscaled from the original) and then upscaled in the final decoding. This is done mostly by a complex 2D convolution function (<100 C lines) but it takes ~64% of total cpu time | 17:04 |
markos_ | this is for low bandwidth scenarios | 17:05 |
markos_ | the NEON-optimized version takes ~41%, which gives a 38% reduction in total encoding time | 17:05 |
markos_ | but at the cost of being very complex and much longer (~610 C lines) | 17:06 |
markos_ | I'm really anxious to see how an SVP64 implementation would compare, both in code size and performance-wise | 17:07 |
markos_ | I spent days trying to track down an overflow bug in the NEON code, because I used vmull_s16/vmlal_s16 instructions which do sign-extend the inputs to larger values, and then I realized that the input values were unsigned :-/ | 17:09 |
markos_ | anyway, just ranting as I spent 3 weeks on this function because of this overflow error... | 17:10 |
markos_ | lkcl, btw, those convolutions are *full* of VF loops :) | 17:23 |
lkcl | joy! :) | 19:14 |
lkcl | ghostmansd[m], needs an option to ISACaller (target file) as the file is not a temporary | 19:15 |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 22:09 | |
*** gnucode <gnucode!~gnucode@user/jab> has quit IRC | 23:50 | |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 23:50 |
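
Editor's note on the overflow bug markos_ describes at 17:09: the sketch below models, in plain Python rather than NEON, what happens in one lane when a signed widening multiply (the behaviour of `vmull_s16`) is fed a 16-bit value that was meant to be unsigned. The helper names and the sample values `pixel` and `coeff` are hypothetical and chosen only for illustration; this is not the AV1 code under discussion.

```python
import struct


def as_int16(bits: int) -> int:
    """Reinterpret a 16-bit pattern as a signed 16-bit integer."""
    return struct.unpack("<h", struct.pack("<H", bits & 0xFFFF))[0]


def widening_mul_signed(a_bits: int, b_bits: int) -> int:
    """Model one lane of a signed 16x16 -> 32 multiply: sign-extend, then multiply."""
    return as_int16(a_bits) * as_int16(b_bits)


def widening_mul_unsigned(a_bits: int, b_bits: int) -> int:
    """Model one lane of an unsigned 16x16 -> 32 multiply: zero-extend, then multiply."""
    return (a_bits & 0xFFFF) * (b_bits & 0xFFFF)


# Hypothetical inputs: an unsigned 16-bit sample above 32767 and a small coefficient.
pixel = 40000
coeff = 3

signed_result = widening_mul_signed(pixel, coeff)      # top bit is read as a sign bit
unsigned_result = widening_mul_unsigned(pixel, coeff)  # the intended product

print(signed_result, unsigned_result)
```

Both variants agree for inputs up to 32767 and diverge only once the top bit is set, which is one reason a bug of this kind can stay hidden until large sample values show up. NEON also provides unsigned counterparts (`vmull_u16`, `vmlal_u16`); whether the actual fix used them is not stated in the log.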