*** josuah <josuah!~josuah@46.23.94.12> has quit IRC | 00:52 | |
*** josuah <josuah!~josuah@46.23.94.12> has joined #libre-soc | 00:53 | |
programmerjake | lkcl: note that you broke src/openpower/decoder/isa/test_caller_setvl.py::DecoderTestCase::test_svstep_inner_loop_8_jl | 02:41 |
---|---|---|
programmerjake | so the test needs to be updated | 02:41 |
programmerjake | found a weird bug, for some reason when decoding nego. the decoder says CR0 isn't written afaict (cr_out == NONE/0), get_cr_out returns None, False | 04:00 |
programmerjake | details in https://bugs.libre-soc.org/show_bug.cgi?id=972#c7 | 04:06 |
programmerjake | ah, found it after 1h debugging the decoder...I missed that the cr out column in the csv should have been CR0, but it's set to NONE, fixing that. | 04:59 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 06:18 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.160.124> has joined #libre-soc | 06:19 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.160.124> has quit IRC | 09:52 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 09:53 | |
*** josuah <josuah!~josuah@46.23.94.12> has quit IRC | 10:35 | |
*** josuah <josuah!~josuah@46.23.94.12> has joined #libre-soc | 10:35 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 11:22 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.41.165> has joined #libre-soc | 11:24 | |
*** guest_ <guest_!~igloo@mob-5-91-160-35.net.vodafone.it> has joined #libre-soc | 11:24 | |
*** guest_ <guest_!~igloo@mob-5-91-160-35.net.vodafone.it> has quit IRC | 11:37 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.41.165> has quit IRC | 11:40 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 11:40 | |
lkcl | ahh good catch - but wait did you remember to re-run "pywriter noall simplev"? | 11:41 |
* lkcl checking myself | 11:41 | |
lkcl | i modified the behaviour of setvl in simplev.mdwn, and although i did actually run "pytest-3 -v -n 8" in the openpower/decoder/isa directory i can't remember if i properly checked the results, whoops | 11:43 |
lkcl | i only have a vague recollection of doing so | 11:43 |
lkcl | File "test_caller_setvl.py", line 775, in test_svstep_inner_loop_8_jl | 11:44 |
lkcl | self.assertEqual(sim.svstate.vl, 12) | 11:44 |
lkcl | ah there we go | 11:44 |
lkcl | ok i'll sort that. | 11:44 |
lkcl | it's down to changes in behaviour of setvl | 11:44 |
lkcl | ah wait... no... | 11:45 |
lkcl | it's down to changes in *DCT*. | 11:45 |
lkcl | mode 2 is now deprecated | 11:45 |
lkcl | i'll have to pick a different function | 11:45 |
lkcl | wasn't expecting that | 11:45 |
* lkcl wheeee :) | 11:48 | |
*** guest_ <guest_!~igloo@mob-5-91-165-85.net.vodafone.it> has joined #libre-soc | 13:31 | |
sadoon[m] | markos: I can't find any documentation on how to tell mini-buildd to build certain packages, the only format it seems to accept is *.changes, any idea how to deal with this? | 14:41 |
sadoon[m] | The manual shows an example of sending a *.dsc file which would be what I need but it's not accepting that.. | 14:42 |
markos | mini-buildd is just like the normal buildd in that respect, you have to upload a source package (with its *.changes and *.dsc files) to a particular upload directory | 14:47 |
markos | it will constantly scan this directory, when it finds any new package it will send it to the first available builder in the farm | 14:47 |
markos | ofc first it will check gpg signatures, if dependencies are satisfied etc | 14:48 |
sadoon[m] | See the thing is, there's no .changes because I have no changes over the original sources | 14:51 |
markos | changes is produced when you build the source package | 14:51 |
markos | eg. with dpkg-buildpackage -S iirc | 14:52 |
sadoon[m] | Ah so it doesn't do something like "apt source" by itself? | 14:52 |
markos | but I think you can tell it to just pick the .dsc and rebuild a package | 14:52 |
sadoon[m] | It keeps giving me errors when I do | 14:53 |
markos | it's been a few years since I last worked with mini-buildd | 14:53 |
markos | what kind of errors? | 14:53 |
sadoon[m] | Well to start | 14:53 |
sadoon[m] | "Not in *.changes format!" | 14:53 |
markos | :) | 14:53 |
sadoon[m] | When using dput and dput-ng | 14:53 |
markos | yes, dput is the tool that is doing the upload, like dupload, etc | 14:54 |
sadoon[m] | So it doesn't accept .dsc at all even though there's an example of that in the manual | 14:54 |
sadoon[m] | Similar error when using mini-buildd-dput | 14:54 |
markos | it's 2 different things | 14:55 |
markos | the dsc is part of the source package itself | 14:55 |
sadoon[m] | Yeah I got that part, my problem I guess is with the expectation | 14:56 |
markos | and it basically says "the source package consists of the following (2 or 3) files, here are the checksums/sizes" | 14:56 |
markos | the changes file though is the descriptor of the actual upload, and it can consist of a source and/or multiple binaries for one or more architectures | 14:56 |
markos | so with a changes file you can upload in a single move all binaries as well -if you can | 14:57 |
markos | however that's discouraged | 14:57 |
markos | and now only source uploads are accepted | 14:57 |
markos | anyway | 14:57 |
markos | let me check a bit | 14:58 |
sadoon[m] | Unironically my janky script that rebuilds a list of packages (by default fetches the full repo) is more helpful in this case which is.. Funny. | 14:59 |
markos | check the advanced topics "Porting packages (“automatic no-changes ports”)" | 14:59 |
markos | basically you don't want to compile a new package, but you want to "port" existing packages, already in the archive to your local buildd | 15:00 |
sadoon[m] | Alright, building the hello package to test | 15:07 |
sadoon[m] | Not a terrible option but still needs to be scripted from my end :( | 15:07 |
sadoon[m] | Unless.. I use the mirror I have and just apply the API call to literally every .dsc :D | 15:07 |
markos | well, this is really not to trigger mass-rebuilds | 15:09 |
sadoon[m] | It's the only option afaict though? | 15:11 |
markos | I think what we ended up doing is indeed uploading each of the core packages in sequence and triggering a rebuild | 15:11 |
markos | so yes | 15:11 |
markos | we scripted that | 15:12 |
sadoon[m] | Not a terrible idea and they should be parallelized | 15:12 |
markos | and we only did it for "only" about 3k packages | 15:12 |
sadoon[m] | And not even very difficult to script | 15:12 |
markos | packages would not build until all the dependencies were satisfied anyway | 15:12 |
markos | but in order for this to work there is a tricky part | 15:13 |
markos | in the beginning you will have the "target" repo, which is the sffs one | 15:13 |
markos | initially this will be empty | 15:13 |
markos | in order for any package to build, you will need to have a source repo with the original ppc binaries | 15:13 |
markos | otherwise no dependencies will be satisfied | 15:14 |
sadoon[m] | Of course | 15:14 |
markos | sorry, source *and* binary repo | 15:14 |
markos | so for each package built you will need to have 2 binary repos available | 15:14 |
markos | the sffs one with high priority | 15:14 |
markos | and the original ppc one to satisfy the dependencies | 15:14 |
sadoon[m] | Ahhh | 15:15 |
markos | as packages get built the sffs one will start filling | 15:15 |
sadoon[m] | So the one you're building will be used as the high priority one, makes sense | 15:15 |
markos | yes | 15:15 |
sadoon[m] | So it pulls from there when it can | 15:15 |
markos | however after a point- when vital packages are built- sffs repo will be sufficient | 15:15 |
markos | at which point you will have to remove the extra ppc one | 15:16 |
sadoon[m] | Yup yup | 15:16 |
markos | and when all is done, just to make sure, trigger a full rebuild of all packages with only sffs repo as source | 15:16 |
markos | apt source that is | 15:16 |
markos | at that point you will have a completely sufficient and self-contained sffs repo :) | 15:16 |
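The bootstrap markos describes can be modelled roughly as below. This is only a sketch under stated assumptions: repos are plain sets of package names, `build_deps` is a hypothetical map from a package to its build dependencies, and nothing here talks to mini-buildd or apt. It shows the high-priority sffs repo being preferred as it fills up, the original ppc repo acting as a fallback, and a final check that sffs alone is sufficient.

```python
def resolve(dep, sffs_repo, ppc_repo):
    """Pick where a dependency comes from; the sffs repo has priority."""
    if dep in sffs_repo:
        return "sffs"
    if dep in ppc_repo:
        return "ppc"
    return None


def bootstrap(build_deps, ppc_repo):
    """Build every package, preferring already-rebuilt sffs packages.

    build_deps: dict mapping package name -> set of build dependencies
                (hypothetical input, not a mini-buildd data structure).
    ppc_repo:   set of binary package names in the original ppc repo.
    Returns (sffs_repo, log) where log records where each dep came from.
    """
    sffs_repo = set()   # the target repo starts out empty
    log = []
    pending = set(build_deps)
    while pending:
        progressed = False
        for pkg in sorted(pending):
            origins = {d: resolve(d, sffs_repo, ppc_repo)
                       for d in build_deps[pkg]}
            if None in origins.values():
                continue            # dependencies not satisfied yet
            sffs_repo.add(pkg)      # "built" and uploaded to sffs
            log.append((pkg, origins))
            pending.discard(pkg)
            progressed = True
        if not progressed:
            raise RuntimeError(f"unsatisfiable dependencies: {sorted(pending)}")
    return sffs_repo, log


def self_contained(build_deps, sffs_repo):
    """True if every build dependency is available from sffs alone."""
    return all(d in sffs_repo
               for deps in build_deps.values() for d in deps)
```

With an empty `ppc_repo` and packages that depend on each other (gcc and glibc, say), `bootstrap` raises immediately, which is the "otherwise no dependencies will be satisfied" case. Once `self_contained` holds, the ppc repo can be dropped and a full rebuild run against sffs only.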
sadoon[m] | Awesome | 15:17 |
sadoon[m] | Thanks! | 15:17 |
markos | it's a lot of manual work in the beginning but it should get easier as you go | 15:17 |
markos | np, glad to help | 15:17 |
sadoon[m] | So now | 15:17 |
sadoon[m] | 1- script the rebuild with API calls | 15:17 |
sadoon[m] | 2- find out how to apply CFLAGS on all builds | 15:17 |
sadoon[m] | 3- profit | 15:17 |
markos | I've lost many hours in building debian packages in the past :) | 15:18 |
markos | ah for 2, you need dpkg-buildflags :) | 15:18 |
markos | for most packages that should work | 15:18 |
markos | some will need manual overrides though | 15:18 |
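A rough sketch of steps 1 and 2, with several assumptions: the mirror is a local directory tree containing `*.dsc` files, and `submit` is a hypothetical callback standing in for whatever mini-buildd upload or API call ends up being used (no such call is modelled here). The `DEB_CFLAGS_APPEND` and `DEB_CXXFLAGS_APPEND` environment variables are the ones dpkg-buildflags reads; how they reach the builder's environment is left open.

```python
import os
from pathlib import Path


def find_dsc_files(mirror_root):
    """Return every .dsc file below mirror_root, in a stable order."""
    return sorted(Path(mirror_root).rglob("*.dsc"))


def build_env(extra_cflags, base_env=None):
    """Copy an environment and add flags that dpkg-buildflags will append.

    The variables only matter in the environment where the package
    build (and hence dpkg-buildflags) actually runs.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DEB_CFLAGS_APPEND"] = extra_cflags
    env["DEB_CXXFLAGS_APPEND"] = extra_cflags
    return env


def queue_rebuilds(mirror_root, submit):
    """Call submit(path) once per .dsc file; return how many were queued.

    submit is a hypothetical placeholder for the real upload/API call.
    """
    count = 0
    for dsc in find_dsc_files(mirror_root):
        submit(dsc)
        count += 1
    return count
```

Passing `print` as `submit` gives a dry run that just lists what would be rebuilt. Packages that ignore dpkg-buildflags would still need the manual overrides markos mentions.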
sadoon[m] | Awesome | 15:18 |
markos | the most tricky ones will be -unsurprisingly- the compilers, gcc/llvm and glibc | 15:19 |
sadoon[m] | Unfortunately we'll hardly be able to tell which ones don't work until we encounter an "illegal instruction" after the fact | 15:19 |
markos | true | 15:19 |
sadoon[m] | markos: On gentoo those were fine | 15:19 |
markos | on gentoo :) | 15:19 |
markos | good luck with gcc packages on Debian :) | 15:19 |
markos | if there is an award for most complicated package in Debian, that should definitely go to gcc, imho | 15:20 |
sadoon[m] | Ahh :') | 15:20 |
markos | but maybe it will be simpler now, who knows | 15:21 |
markos | anyway | 15:21 |
markos | if you need help with anything, just let me know | 15:21 |
sadoon[m] | Thanks again :) | 15:31 |
sadoon[m] | http://mini-buildd.installiert.net/extra/mini-buildd/documentation/_static/man/mini-buildd-super-portext.html#lbAB | 15:32 |
sadoon[m] | Aha! | 15:32 |
markos | yes, portext is the one! | 15:48 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 15:51 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.164.160> has joined #libre-soc | 15:51 | |
sadoon[m] | But this is super portext! | 16:29 |
sadoon[m] | There's a warning that it's experimental so maybe not the best option but it's there | 16:30 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.164.160> has quit IRC | 16:48 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 16:50 | |
lkcl | likely because gentoo is a pure cross-compiler... and consequently gcc itself *disables* most self-testing because it goes "oh you're cross-compiling, let me not run certain tests" | 18:23 |
lkcl | whereas this is "oh you're running native testing, you asked me to create a gcc native IBM POWER9 compiler, let me just compile up using native IBM POWER9 VSX instructions and then try running that at you" | 18:24 |
lkcl | which is whyyyyy the defaults | 18:24 |
lkcl | are always changed | 18:24 |
lkcl | on any native /usr/bin/gcc | 18:24 |
lkcl | to be | 18:24 |
lkcl | those | 18:24 |
lkcl | of the host | 18:24 |
lkcl | system | 18:24 |
lkcl | that you want to run on | 18:24 |
lkcl | i.e. | 18:25 |
lkcl | you can't make a cross-compiler the native /usr/bin/gcc on a debian native compiled system | 18:25 |
markos | lkcl, remember the 20 line function for convolutions? wanna take a guess at how long the NEON implementation is? not finished yet, some tests fail, but almost there | 18:26 |
markos | actually let me just give the number, 1350 | 18:27 |
markos | I used to be a fan of SIMD | 18:27 |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 19:03 | |
lkcl | markos, oh god. | 19:31 |
lkcl | 20 lines of c | 19:31 |
lkcl | can i possibly lessen the screaming going on inside your brain by first inviting you to write it in RISC-V RVV assembler? | 19:31 |
lkcl | it will help as a transition-point to make the shock of SVP64 much less by comparison | 19:32 |
lkcl | i'm actually serious! | 19:32 |
lkcl | the good news being it's only a 20 line function | 19:32 |
markos | I would! if I could actually get a f*cking RVV capable board! | 19:33 |
lkcl | well, there's always spike | 19:33 |
programmerjake | well, if you only need lengths <= minimum guaranteed by V extension, it won't need loops due to unknown VL | 19:33 |
markos | I'm not using yahe (yet another hardware emulator)! | 19:33 |
lkcl | which is properly feature-complete for running user-space binaries | 19:33 |
lkcl | https://github.com/riscv-software-src/riscv-isa-sim | 19:34 |
lkcl | it works - fully. 100%. | 19:34 |
markos | no, waiting for pypowersim is enough, previously I was doing SVE2 code on qemu | 19:34 |
markos | not fun at all | 19:34 |
lkcl | no waiting needed | 19:34 |
markos | also no time | 19:34 |
lkcl | it *already has* RVV support - has for years | 19:34 |
lkcl | ah that'll do it | 19:34 |
markos | RVV is on the list, but not right now | 19:34 |
lkcl | i was going to say i'm happy to put a budget towards it but if it's not... yeah the learning-curve *alone* would be too great for the time you've got available | 19:35 |
markos | I am juggling way too many projects at the moment, am close to a burn out | 19:35 |
lkcl | i forget i know RVV already (to a large extent) | 19:35 |
lkcl | and the spec's *massively* long | 19:35 |
lkcl | 192 instructions or something | 19:35 |
lkcl | so yeah, sadly, scratch that :) | 19:35 |
markos | yes, I'm having trouble learning SVE/SVE2 as it is, don't need yet another platform right this instant | 19:36 |
markos | having said that, I do intend to learn it, when it's stable | 19:36 |
lkcl | well the *assembler* is stable | 19:36 |
lkcl | it's the *intrinsics* that aren't | 19:36 |
markos | ... and there is hardware available :D | 19:36 |
programmerjake | imho RVV is too much of a distraction to justify our putting a budget to writing a complex algorithm in it | 19:36 |
lkcl | the assembler's been stable for... err... 2+ years? | 19:36 |
lkcl | programmerjake, indeed - that's why i suggested convolution because at its heart it will be literally one instruction in a loop | 19:37 |
lkcl | (20 lines) | 19:37 |
markos | also, the truth is that no matter how I dislike huge implementations, that's what pays the bills right now (NEON/SVE* optimizations) | 19:37 |
lkcl | but the learning-curve of RVV is itself too steep for markos right now | 19:37 |
lkcl | :) | 19:37 |
programmerjake | since we are making PowerISA hw, not RVV | 19:37 |
lkcl | yep, don't knock it | 19:38 |
markos | from that aspect, I'm glad it's complicated as heck because I can charge more :) | 19:38 |
lkcl | haha | 19:38 |
markos | but the romantic programmer in me wishes things would go back to simpler architectures, where you didn't need to learn 3k+ page ISA manuals to know which instruction to use | 19:39 |
markos | that's why I *loved* SVP64 | 19:39 |
markos | I don't need to remember every f*cking instruction in the ISA to implement an algorithm | 19:39 |
markos | I am pretty sure I could optimize *all* of libvpx in 3-4 months compared to more than a year that it took me *and* a few dedicated Arm engineers/specialists to finish the NEON port | 19:41 |
markos | possibly less than that once I get the hang of it and have actual hardware or FPGA | 19:41 |
markos | so imagine the time/cost saving when porting a project to SVP64, that's a huge money-saving factor for a company | 19:42 |
programmerjake | well, at least you don't have to write it in INTERCAL -- https://en.wikipedia.org/wiki/INTERCAL | 19:42 |
markos | :D | 19:42 |
markos | yeah, I did some COBOL many many years ago, not much fun at all :D | 19:43 |
markos | though I hear they are getting paid insane amounts of money | 19:43 |
markos | makes sense, they sold their soul! | 19:44 |
programmerjake | COBOL was actually designed to be used, INTERCAL is designed to annoy programmers to no end | 19:46 |
markos | indeed | 19:48 |
markos | I remember there was a contest once to create the most annoying/obfuscated programming language | 19:48 |
markos | I forget which was the winner | 19:48 |
markos | I remember there was one that used only whitespace characters | 19:50 |
programmerjake | well, currently reading how one dialect of javascript is a strong contender: https://en.wikipedia.org/wiki/JSFuck | 19:50 |
markos | and unsurprisingly it's called 'whitespace' :) | 19:51 |
programmerjake | oh, yeah, the programming language named "whitespace" iirc | 19:51 |
markos | and ofc brainfuck :) | 19:51 |
markos | https://en.wikipedia.org/wiki/Esoteric_programming_language | 19:52 |
markos | because some people obviously have too much time on their hands | 19:52 |
programmerjake | well, there are implementations of a minsky register machine in game of life...a programming language soo simple it only has increment and decrement | 19:54 |
programmerjake | though it's turing complete | 19:55 |
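For a sense of how small such a machine is, here is a minimal interpreter sketch. The tuple encoding of the two instructions is an assumption made for illustration, not a standard format and not the Game of Life construction itself.

```python
def run(program, registers, max_steps=10_000):
    """Run a two-instruction register machine and return the registers.

    Instruction encoding (hypothetical, chosen for this sketch):
      ("inc", reg, next_pc)
      ("dec", reg, next_pc_if_nonzero, next_pc_if_zero)
    A next_pc of None halts the machine.
    """
    regs = dict(registers)
    pc = 0
    steps = 0
    while pc is not None:
        if steps >= max_steps:
            raise RuntimeError("step limit reached")
        steps += 1
        op = program[pc]
        if op[0] == "inc":
            _, reg, nxt = op
            regs[reg] = regs.get(reg, 0) + 1
            pc = nxt
        elif op[0] == "dec":
            _, reg, nxt, if_zero = op
            if regs.get(reg, 0) > 0:
                regs[reg] -= 1
                pc = nxt
            else:
                pc = if_zero      # register is empty: branch instead
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return regs


# Addition: repeatedly move one unit from b to a until b is empty.
ADD = [
    ("dec", "b", 1, None),  # if b > 0: b -= 1, go to 1; else halt
    ("inc", "a", 0),        # a += 1, go back to 0
]
```

The decrement doubles as the only conditional branch, which is what lets increment and decrement alone express loops.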
gnucode | so you are working on more graphics related things for power9 right now? I've jumped in the middle of the convo and have no idea what is being discussed. | 20:02 |
gnucode | also the power9 laptop when that ships should be cool. | 20:02 |
programmerjake | http://www.igblan.free-online.co.uk/igblan/ca/ | 20:03 |
programmerjake | we were discussing esoteric programming languages, since they'd be more terrible than what markos has to do for his job, create a simd convolution for SVE, which is taking 100x as many insns as svp64 | 20:05 |
gnucode | simd convolution for SVE -> what does that mean? | 20:06 |
markos | imagine a for loop in C | 20:06 |
markos | actually 3 nested for loops | 20:06 |
markos | with a bunch of math expressions | 20:06 |
markos | about 20 lines | 20:06 |
markos | not very complicated either | 20:07 |
markos | now imagine writing specialized versions of it for a SIMD engine | 20:07 |
markos | in particular for Arm NEON | 20:07 |
markos | in C the loop is almost exactly the same, whether one of the loops has 6 or 8 or 12 iterations | 20:08 |
markos | the inner loop in particular | 20:08 |
programmerjake | https://en.wikipedia.org/wiki/Convolution -- the math that markos is implementing | 20:08 |
markos | yeah, that's the generic math theory behind it | 20:09 |
markos | in C the number of filters coefficients does not change the code at all | 20:09 |
markos | using NEON, I have to provide 3 different implementations depending if the number of filter coefficients is 6 or 8 or 12 | 20:09 |
markos | each implementation has to cater for the small widths (<=4) and larger | 20:10 |
markos | and a 20 line C function becomes about 1350 lines with a ton of helper functions | 20:10 |
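To illustrate why the scalar version does not care about the tap count, here is a sketch in Python rather than C. It is not the libaom function being discussed; it is a generic horizontal filter over a 2D block, assumed here only to show the three-nested-loop shape. The taps are applied without flipping; reverse `taps` for textbook convolution.

```python
def convolve_rows(block, taps):
    """Filter each row of a 2D block with a sliding window of taps.

    block: list of rows, each a list of numbers.
    taps:  list of filter coefficients; its length (6, 8, 12, ...)
           is just data, the loops below do not change shape.
    """
    out = []
    for row in block:                                  # loop 1: rows
        out_row = []
        for x in range(len(row) - len(taps) + 1):      # loop 2: output columns
            acc = 0
            for k, coeff in enumerate(taps):           # loop 3: filter taps
                acc += row[x + k] * coeff
            out_row.append(acc)
        out.append(out_row)
    return out
```

Calling it with 6, 8 or 12 taps runs exactly the same code; a fixed-width SIMD version is where each tap count and block width tends to need its own specialised path.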
markos | fun | 20:10 |
gnucode | markos: may I ask who is paying you for work on ARM ? | 20:10 |
markos | Arm | 20:10 |
gnucode | sounds like a great time | 20:11 |
markos | our company is contracted by Arm to do NEON optimizations in a couple of projects, this time it's libaom, previously it was libvpx | 20:11 |
markos | well it works, the resulting code can be many times faster than the original C | 20:12 |
markos | but that doesn't mean it's easy to write | 20:12 |
markos | and definitely not short | 20:12 |
gnucode | you are writing in assembly then? | 20:12 |
markos | gnucode, basically that's what we do, SIMD optimizations | 20:13 |
markos | not even, C intrinsics | 20:13 |
markos | thankfully! | 20:13 |
markos | though I have done some asm | 20:13 |
markos | I prefer not to | 20:13 |
markos | unless I have no alternative | 20:13 |
markos | in the past it was because the compiler could not use the right instructions | 20:13 |
markos | ie missing intrinsic, buggy code generation, etc | 20:14 |
markos | but right now things are much better | 20:14 |
markos | and that's a fundamental difference with SVP64 | 20:14 |
markos | I've already written a few algorithms in SVP64, mostly small functions but about the same size, 10-20 lines of C | 20:15 |
markos | and I can translate that to SVP64 in *assembly* in a fraction of the time I have to spend to convert the same algorithm to NEON or SSE/AVX in *C* | 20:16 |
markos | which for me is a game changer | 20:16 |
markos | not to mention the difference in debugging, it's much easier | 20:17 |
markos | and we don't even have hardware yet :D | 20:17 |
gnucode | ok. SVP64 is the extension for openpower that luke and his team created. ok | 20:18 |
markos | yes, exactly | 20:18 |
markos | there are many great features, but in summary as a developer a) I don't have to learn thousands of new instructions, it's essentially a handful of instructions to convert Power ISA to a vector architecture | 20:19 |
markos | b) you don't need the SIMD implementation to convert to SVP64, it's actually easier -in the cases I tested at least- to start from the C implementation and convert that to SVP64 | 20:20 |
markos | c) you can do clever tricks with the register topology that are just impossible on SIMD, eg. remap, vertical-first, etc | 20:21 |
markos | on SIMD you have to load and arrange your data in the form you want | 20:21 |
programmerjake | that's not always the case though, for utf-8 validation, i did use a simd implementation translated to svp64 | 20:21 |
markos | yes, it's not in every case, but it does apply to many cases | 20:22 |
markos | with SVP64 you can load the data and just remap the indices to the structure you want, almost without any data rearrangement, in register or in memory | 20:23 |
markos | this is a tricky feature to understand at first, but once you figure out how it works, it's pretty powerful | 20:23 |
markos | it took me a while at least | 20:24 |
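The general idea of remapping indices instead of moving data can be sketched as follows. This is a conceptual illustration only, assuming a flat list standing in for a register file; it does not model the actual SVP64 REMAP encoding or instructions.

```python
def transposed_indices(rows, cols):
    """Yield flat offsets that walk a row-major rows x cols matrix
    column by column, i.e. in transposed order."""
    for c in range(cols):
        for r in range(rows):
            yield r * cols + c


def sum_columns(regs, rows, cols):
    """Sum each column of a row-major matrix stored flat in regs.

    The data in regs is never rearranged; only the order in which
    elements are visited changes, via the index schedule.
    """
    sums = [0] * cols
    for step, idx in enumerate(transposed_indices(rows, cols)):
        sums[step // rows] += regs[idx]
    return sums
```

On a fixed-width SIMD unit the equivalent usually means shuffling or transposing the data first so that the elements line up in lanes.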
markos | finally the fact that it offers 128 registers is just beyond cool! | 20:24 |
markos | I know lkcl and programmerjake could go on with cool features in SVP64, but for me that's a good start :) | 20:24 |
gnucode | markos: will we see SVP64 in non-power hardware ? | 20:27 |
markos | I doubt it | 20:28 |
markos | but then again you never know what the future holds, but at least I don't expect it to happen in the next 5 years let's say :) | 20:28 |
gnucode | markos: you expect to see it in power inside 5 years? That's awesome! | 20:32 |
markos | I hope, but what I said is I don't expect to see it in non-power in the next 5 years :) | 20:34 |