Thursday, 2023-05-11

<lkcl> programmerjake, i'm doing the bare minimum proof-of-concept so that other people can take over 00:55
<programmerjake> ok, then, ghostmansd can you do that? 01:09
*** tplaten <tplaten!~tplaten@195.52.147.211> has joined #libre-soc 06:13
*** tplaten <tplaten!~tplaten@195.52.147.211> has quit IRC 07:01
*** ghostmansd[hexch <ghostmansd[hexch!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc 08:43
*** ghostmansd[pc] <ghostmansd[pc]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC 08:46
*** ghostmansd[hexch <ghostmansd[hexch!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC 08:58
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC 11:27
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.162.3> has joined #libre-soc 11:28
*** ghostmansd <ghostmansd!~ghostmans@178.18.209.122> has joined #libre-soc 13:31
<ghostmansd> programmerjake, sure, I'll handle the separate temporary files 13:31
*** ghostmansd <ghostmansd!~ghostmans@178.18.209.122> has quit IRC 13:47
*** tplaten <tplaten!~tplaten@195.52.147.211> has joined #libre-soc 15:17
*** tplaten <tplaten!~tplaten@195.52.147.211> has quit IRC 15:22
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.162.3> has quit IRC 16:29
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc 16:29
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC 16:34
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc 16:35
<markos_> fun fact, so, av1 has this Super Resolution mode, where encoding is done at a lower resolution (downscaled from the original) and then upscaled in the final decoding. This is done mostly by a complex 2D convolution function (<100 C lines) but takes ~64% of total cpu time 17:04
<markos_> this is for low bandwidth scenarios 17:05
<markos_> the NEON optimization takes ~41%, and gives a 38% reduction in total encoding time 17:05
<markos_> but at the cost of being very complex and much longer (~610 C lines) 17:06
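A quick sanity check on the figures quoted above, under one reading of them: the function's share of total run time drops from ~64% to ~41%, and everything outside the function is assumed to cost the same as before. The function name `total_time_after` is made up for this sketch; only the 64% and 41% figures come from the log.

```python
def total_time_after(share_before, share_after):
    """Relative total run time after optimising one function.

    share_before: fraction of the original total spent in the function.
    share_after:  fraction of the new total spent in the function.
    Assumption: the rest of the program takes exactly as long as before.
    """
    rest = 1.0 - share_before          # unchanged part, in units of the old total
    return rest / (1.0 - share_after)  # new total, in units of the old total


new_total = total_time_after(0.64, 0.41)
saving = 1.0 - new_total               # fraction of total time saved

# Time spent in the function itself, before and after (units of the old total).
func_before = 0.64
func_after = 0.41 * new_total
speedup = func_before / func_after

print(f"new total: {new_total:.3f} of the old total")
print(f"saving:    {saving:.1%}")
print(f"function speedup: {speedup:.2f}x")
```

Under that assumption the saving comes out at about 39%, which is close to the 38% quoted, so the numbers are consistent with a reduction in total encoding time.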
<markos_> I'm really anxious to see how SVP64 implementation would compare, both in code size and performance-wise 17:07
<markos_> I spent days trying to track down an overflow bug in the NEON code, because I used vmull_s16/vmlal_s16 instructions which do sign-extend the inputs to larger values, and then I realized that the input values were unsigned :-/ 17:09
<markos_> anyway, just ranting as I spent 3 weeks on this function because of this overflow error... 17:10
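A minimal sketch of the kind of bug described above, emulated in plain Python rather than real NEON code. It models a single lane of a signed widening multiply (as `vmull_s16` does) against an unsigned one (as `vmull_u16` does). The helper names and the sample values are hypothetical, and the log does not say how the actual fix was done.

```python
def as_int16(x):
    """Reinterpret the low 16 bits of x as a signed two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def widening_mul_signed(a, b):
    """One lane of a signed 16x16 -> 32 multiply (inputs are sign-extended)."""
    return as_int16(a) * as_int16(b)


def widening_mul_unsigned(a, b):
    """One lane of an unsigned 16x16 -> 32 multiply (inputs are zero-extended)."""
    return (a & 0xFFFF) * (b & 0xFFFF)


sample = 40000  # hypothetical unsigned 16-bit input, above the int16 maximum of 32767
coeff = 3       # hypothetical filter coefficient

# The signed path sees 40000 as a negative number, so the product is wrong.
print("signed:  ", widening_mul_signed(sample, coeff))
print("unsigned:", widening_mul_unsigned(sample, coeff))

# Inputs below 32768 give the same answer either way, which is why this
# kind of bug can stay hidden until large values show up.
print("small input agrees:",
      widening_mul_signed(1000, coeff) == widening_mul_unsigned(1000, coeff))
```

The two paths only diverge once an input has its top bit set, so a test set with small values would pass with the signed instructions.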
<markos_> lkcl, btw, those convolutions are *full* of VF loops :) 17:23
<lkcl> joy! :) 19:14
<lkcl> ghostmansd[m], needs an option to ISACaller (target file) as the file is not a temporary 19:15
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc 22:09
*** gnucode <gnucode!~gnucode@user/jab> has quit IRC 23:50
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc 23:50
