*** octavius <octavius!~octavius@92.40.168.173.threembb.co.uk> has joined #libre-soc | 10:38 | |
*** octavius <octavius!~octavius@92.40.168.173.threembb.co.uk> has quit IRC | 11:29 | |
*** octavius <octavius!~octavius@92.40.168.168.threembb.co.uk> has joined #libre-soc | 13:26 | |
*** josuah <josuah!~josuah@46.23.94.12> has quit IRC | 13:28 | |
*** josuah <josuah!~josuah@46.23.94.12> has joined #libre-soc | 13:29 | |
markos_ | lkcl, from #talos-workstation channel, https://sleef.org/ | 14:47 |
---|---|---|
markos_ | pytorch switched to using it | 14:47 |
markos_ | now that's a good project to port to SVP64 | 14:48 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 16:22 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.160.10> has joined #libre-soc | 16:22 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.160.10> has quit IRC | 17:01 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 17:01 | |
lkcl | let's take a look... ooo | 19:25 |
lkcl | markos_, you saw i worked out how to do polynomials? | 19:26 |
lkcl | thx to jacob's new parallel-prefix-sum REMAP schedule, creating 1, x, x^2, x^3, x^4, ... is really easy. | 19:26 |
lkcl | https://sleef.org/ppc64.xhtml | 19:27 |
lkcl | haaaa, the *accuracy* is specified as part of the function name. yowser | 19:27 |
lkcl | that is one hell of a lot of work, that. | 19:27 |
*** WhyNotHugo <WhyNotHugo!bc7d0f0b52@2604:bf00:561:2000::28> has quit IRC | 19:28 | |
*** alethkit <alethkit!23bd17ddc6@sourcehut/user/alethkit> has quit IRC | 19:28 | |
programmerjake | lkcl, not really, all functions are implemented in terms of a few base simd operations such as add, sub, type cast, select | 19:29 |
programmerjake | we only need to make svp64 versions of that base simd operation layer | 19:29 |
programmerjake | this does mean we'd want to wait until we have svp64 compiler support however | 19:30 |
*** WhyNotHugo <WhyNotHugo!bc7d0f0b52@2604:bf00:561:2000::28> has joined #libre-soc | 19:31 | |
*** alethkit <alethkit!23bd17ddc6@sourcehut/user/alethkit> has joined #libre-soc | 19:31 | |
programmerjake | since sleef critically relies on inlining and compiler intrinsics for its speed | 19:31 |
*** octavius <octavius!~octavius@92.40.168.168.threembb.co.uk> has quit IRC | 19:34 | |
markos_ | that's true | 19:47 |
markos_ | technically, we could do a *working* port, but an inefficient one | 19:52 |
markos_ | ie, the assembly functions would just not be inlined | 19:52 |
programmerjake | well, tbh that sounds like people would try to benchmark that and declare SVP64 to be inefficient and not worth investigating | 19:55 |
markos_ | how could they do that? run it on the simulator? :) | 19:55 |
markos_ | by the time we have hardware we *should* also have compiler support | 19:55 |
markos_ | at least I would hope we do | 19:56 |
programmerjake | assuming you mean having all the base simd functions written as separate non-inlineable functions in assembly | 19:56 |
markos_ | yes | 19:56 |
markos_ | I'm obviously not suggesting we run pytorch on the simulator | 19:56 |
programmerjake | well, iirc ghostmansd is currently writing an in-order hw cycle-accurate simulator based on the simulator | 19:57 |
markos_ | but we could have a working port that would show how we could do it, and when intrinsics are available we just convert assembly to C | 19:57 |
markos_ | that's by definition inefficient and no one in their right mind would compare such a cpu with other implementations | 19:57 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 19:58 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.174.152> has joined #libre-soc | 19:58 | |
programmerjake | but the simulator is supposed to give you an idea how fast the code would run by telling you how many clock cycles it takes, so you'd look at the cycle count, not raw runtime of the python code | 19:59 |
markos_ | on a *reference* in-order hardware implementation | 19:59 |
programmerjake | or are you saying the assembly impl of sleef is inefficient? | 20:00 |
markos_ | no, I'm saying that benchmarking against in-order cpus is wrong | 20:00 |
markos_ | to compare performance | 20:00 |
markos_ | or rather to compare architectures | 20:01 |
markos_ | anyway, meeting time now isn't it? | 20:01 |
programmerjake | ok, but people may do it anyway...also in-order cpus are one of svp64's design targets and people will expect decent perf on them | 20:01 |
programmerjake | at least as good as neon on smaller arm in-order cores | 20:02 |
markos_ | I'm sure people will understand | 20:02 |
markos_ | that performance will be suboptimal due to no compiler | 20:02 |
markos_ | in any case I'm not suggesting we do it *now* | 20:02 |
markos_ | but that we add it to the list | 19:02 |
programmerjake | ok, as long as it's after basic compiler support | 20:03 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.174.152> has quit IRC | 20:03 | |
markos_ | one more reason I would suggest we do it now is to use its unit tests to make sure our implementations produce the required precision/accuracy | 20:03 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.42.164> has joined #libre-soc | 20:04 | |
programmerjake | i'll be a bit late to the meeting, sorry | 20:04 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.42.164> has quit IRC | 20:13 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 20:13 |
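
Note on lkcl's 19:26 remark: the idea of producing 1, x, x^2, x^3, ... with a prefix scan and then forming a polynomial can be sketched in plain Python as below. This is only an illustration of the arithmetic, not SVP64 code and not the actual REMAP schedule, whose operation order is not described in this log. The function names (`prefix_product`, `powers`, `eval_poly`) are hypothetical, and the scan uses a generic Hillis-Steele pattern as an assumption.

```python
from typing import List, Sequence


def prefix_product(values: Sequence[float]) -> List[float]:
    """Inclusive prefix product, computed in log2(n) passes.

    Each pass combines every element with the one `step` places to its
    left (Hillis-Steele scan). All multiplies within a pass are
    independent, which is what makes the pattern vector-friendly.
    """
    out = list(values)
    step = 1
    while step < len(out):
        prev = list(out)  # snapshot so one pass only reads old values
        for i in range(step, len(out)):
            out[i] = prev[i - step] * prev[i]
        step *= 2
    return out


def powers(x: float, count: int) -> List[float]:
    """Return [1, x, x^2, ..., x^(count-1)] via a prefix product."""
    if count <= 0:
        return []
    # Scanning [1, x, x, ..., x] with multiplication yields the powers.
    return prefix_product([1.0] + [x] * (count - 1))


def eval_poly(coeffs: Sequence[float], x: float) -> float:
    """Evaluate coeffs[0] + coeffs[1]*x + coeffs[2]*x^2 + ..."""
    return sum(c * p for c, p in zip(coeffs, powers(x, len(coeffs))))


if __name__ == "__main__":
    print(powers(2.0, 5))
    print(eval_poly([1.0, 2.0, 3.0], 2.0))
```

Once the powers are in a vector, the polynomial is an elementwise multiply by the coefficients followed by a sum, so the whole evaluation is two vector operations plus the scan.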