nick | message | time |
---|---|---|
lkcl | programmerjake, the damage that you are doing to the reputation of Libre-SOC by continuing to systematically ignore my technical assessments is not something i can tolerate much longer | 00:18 |
lkcl | i gave an *extremely detailed* analysis as to why GPRs are unacceptable for crbinlut and you completely and utterly disregarded and ignored it | 00:19 |
lkcl | to then request to put in a note to the ISA WG, without thinking of the potential for damage that would cause by *even requesting* to put in an "objection" when it is BLOODY OBVIOUS to anyone reading the bugreport that you've IGNORED MY ASSESSMENT | 00:20 |
lkcl | is making us look REALLY bad. | 00:20 |
lkcl | you HAVE TO STOP THIS | 00:20 |
lkcl | i can't tolerate it much longer | 00:21 |
lkcl | when | 00:21 |
lkcl | i | 00:21 |
lkcl | say | 00:21 |
lkcl | STOP | 00:21 |
lkcl | it | 00:21 |
lkcl | FUCKING | 00:21 |
lkcl | well means | 00:21 |
lkcl | SSSTTTT OOOOOO PPPPP | 00:21 |
lkcl | when i say drop the matter immediately | 00:21 |
lkcl | it FUCKING well means DROP THE FUCKING MATTER FUCKING IMMEDIATELY | 00:21 |
lkcl | when i say NO it FUCKING WELL MEANS NO | 00:21 |
lkcl | i've said FIVE TIMES now that GPRS ARE NOT GOING INTO CRBINLUT | 00:22 |
lkcl | i have given you rational explanations, you have ignored them | 00:22 |
programmerjake | I did not ignore your assessment, I responded in detail with technical justification why I think your assessment is flawed. | 00:23 |
lkcl | where? | 00:23 |
lkcl | where is your response to the technical evaluation of IBM Power 9/10 Hardware that i gave? | 00:23 |
programmerjake | i'm not saying you have to add GPRs, but that we considered it...just a sec while I search | 00:23 |
lkcl | which explained that IBM will *already have* a layout | 00:23 |
lkcl | there is a TIMING issue associated with pipelines that will already be in IBM's design that we CANNOT damage | 00:24 |
lkcl | i said NO on GPRs and that really is the end of the matter | 00:24 |
lkcl | please listen | 00:24 |
lkcl | i said NO | 00:24 |
lkcl | that's the END of the discussion | 00:24 |
lkcl | i said that 60 bits is wasted on a GPR, you did not listen | 00:25 |
lkcl | i said that the IBM Hardware team will have a design that GPRs will disrupt due to two register files being needed, you did not listen | 00:26 |
programmerjake | https://bugs.libre-soc.org/show_bug.cgi?id=1017#c19 | 00:26 |
lkcl | that is *not* a valid argument unfortunately | 00:27 |
lkcl | pipelines can be put in different areas and have different timings | 00:27 |
lkcl | it's not uniform | 00:28 |
lkcl | we have no idea if IBM *actually* puts the CR regfile right next to the GPRs. | 00:29 |
lkcl | and in fact they put warnings saying "if you use Rc=1 (or OE=1) it may significantly degrade performance on some systems" | 00:30 |
lkcl | ok? | 00:30 |
lkcl | search for it in the specification. | 00:30 |
lkcl | i forget the exact words used, but it's there. | 00:30 |
programmerjake | it seems entirely reasonable to me that they would have to or at least have a fast path between CRs/GPRs due to the huge number of compares and branches in programs | 00:30 |
lkcl | or | 00:31 |
lkcl | they load up the Reservation Stations with speculative instructions and take the hit | 00:31 |
programmerjake | those have other valid reasons why Rc=1/OE=1 are slow, such as needing to read/write SO which prevents instruction-level parallelism | 00:31 |
lkcl | we have no way of telling | 00:31 |
lkcl | that all comes out in the wash by loading up with enough speculative execution that people simply can't tell | 00:32 |
programmerjake | speculative execution doesn't solve dependency chains which still have to execute one instruction at a time | 00:32 |
programmerjake | unless you're also doing value speculation | 00:32 |
programmerjake | which isn't on all cpus, hence why they may be slow | 00:33 |
lkcl | i mean: you can have a dependency chain that is single-instruction-at-a-time but that does not prevent you from having multiple *other* chains that *can* be executed *in parallel* | 00:34 |
lkcl | you can't keep extending the number of RSes infinitely however | 00:34 |
programmerjake | that's true, to some extent. an instruction running slow doesn't necessarily affect other instructions, however that instruction still is running slow | 00:35 |
lkcl | but 1000+ RSes (1000 in-flight instructions) hides a lot of such single-instruction-at-a-time chains | 00:35 |
lkcl | it averages out / gets hidden - that's the point of having such vast numbers of Reservation Stations | 00:35 |
programmerjake | also OE=1 instructions aren't very common, so some cpus may only have 1 ALU capable of running them | 00:35 |
lkcl | if you end up with a critical loop which has such a chain, then yes, tough titty. | 00:36 |
programmerjake | or something like that | 00:36 |
lkcl | no - they're so bad that they're just avoided entirely except in unit tests (that's information from paul mackerras) | 00:36 |
lkcl | which is one of the reasons why i put in SVP64 that OE=1 is ignored | 00:37 |
programmerjake | Rc=1 instructions are commonly used by llvm whenever they can replace a cmp. so it's reasonable to think cpus optimize for that | 00:37 |
lkcl | we genuinely have no way of telling. | 00:37 |
lkcl | we have no idea if it has a single-clock-cycle penalty, or none | 00:38 |
programmerjake | we can look at a2o... | 00:38 |
lkcl | A2O is 12+ years old and IBM cares very little about it. it's nowhere near indicative of what went into Bill Starke's design (POWER9, POWER10) | 00:39 |
programmerjake | it's better than nothing. also, we can look at the gcc/llvm scheduler models which are supposed to be more accurate models of the cpus | 00:39 |
lkcl | with their focus on VSX, we don't even know if they *care* about the performance of Scalar instructions! | 00:39 |
programmerjake | they'd be stupid not to focus on scalar instructions, it's what most programs use a lot of, e.g. database and web stuff | 00:40 |
programmerjake | s/focus/optimize | 00:40 |
lkcl | btw, you do realise, that after all of this assessment and analysis (of llvm etc), the answer is still "no"? | 00:40 |
lkcl | what i am trying to get you to realise is that there is no point in pursuing one particular technical path when there are other paths that already eliminate a particular decision | 00:41 |
programmerjake | yes, but I think we should still put the note that we considered it, since the ISA WG may want that option instead | 00:41 |
lkcl | "60 bits wasted when 4 bits is all that's needed" was enough | 00:42 |
lkcl | i'll argue - and vote - no on that. | 00:42 |
lkcl | because the purpose of crbinlut and crternlogi are to be CR-based, not GPR-based | 00:42 |
lkcl | to be contained within the CR pipeline | 00:43 |
lkcl | as close to the CR regfile as possible | 00:43 |
lkcl | *without* compromising performance by requiring timing-dependent linkage to the GPR regfile, and *without* requiring extra read-ports on the GPR regfile | 00:44 |
lkcl | or increasing the Dependency Matrix sizes by bringing in a GPR | 00:44 |
programmerjake | well, imho the purpose is to do bitwise ops on CRs, the selection of which bitwise op to do doesn't need to also be in a CR, instead it should be taken from the most logical set of registers, which is imho GPRs. | 00:44 |
lkcl | i know. | 00:45 |
lkcl | noted | 00:45 |
programmerjake | i'd expect there to be other ops that operate on CRs and use GPRs as an input | 00:45 |
lkcl | the crweirds i specifically added so as to increase the communication bandwidth between CRs and GPRs. | 00:46 |
lkcl | but that pipeline will (obviously) sit half-way between GPRs and CRs in terms of timing (wires) | 00:47 |
programmerjake | so crbinlog can likely just share the dependency matrix with crweirds and other similar insns, it likely wouldn't need extra dependency-management hardware | 00:47 |
lkcl | it doesn't quite work that way | 00:48 |
lkcl | the DMs are going to be massive. | 00:49 |
lkcl | it's... complicated | 00:49 |
lkcl | the total number of rows/columns has to match the total outstanding operations expected. | 00:50 |
lkcl | if you want 1,000 instructions outstanding you need (on a naive calculation) a MILLION-entry Dependency Matrix - 1,000 rows x 1,000 columns | 00:51 |
*** gnucode <gnucode!~gnucode@user/jab> has quit IRC | 00:53 | |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 00:53 | |
lkcl | if however some instructions are completely unrelated - the results of the output from one are never the input to others (or vice-versa) then there is *never* going to be a Dependency | 00:53 |
programmerjake | well, we're likely to want to run a whole bunch of crweird ops in sequence, just like we might want a whole bunch of crbinlog ops in sequence, so you e.g. have 8 entries for the GPR/CR ALU, so you can run 8 crweirds at a time or 8 crbinlogs at a time... | 00:54 |
lkcl | and the "cell" for those instructions is empty | 00:54 |
programmerjake | yeah, so that means the dependency matrix is more sparse -- takes less hardware | 00:54 |
lkcl | therefore you *don't* want inter-mixed instructions reading/writing to/from multiple register files | 00:54 |
lkcl | or if they are, you want the absolute bare minimum of operands. aka "mv" instructions | 00:55 |
lkcl | (mv or convert) | 00:56 |
lkcl | mtcr, mfocr, mv, fmv, fcvt, etc. | 00:56 |
lkcl | please understand: i have a really bad memory, it takes considerable effort (and in some cases a lot of stress) to extract information that it sounds perfectly reasonable to expect me to provide immediately | 00:58 |
lkcl | ... but i can't | 00:58 |
lkcl | i get a *subconscious* "ping" - an echo - of why something is wrong/right | 00:59 |
lkcl | but it's very vague, and very faint, due to the memory problems i have | 00:59 |
lkcl | it's sometimes taken me *weeks* to properly recall something, sufficient to answer a question or an issue "properly" | 01:00 |
programmerjake | don't worry, you're not the only one who sometimes can't remember things unless prompted | 01:01 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 01:13 | |
programmerjake | lkcl, you changed my mind: https://bugs.libre-soc.org/show_bug.cgi?id=1017#c28 | 01:22 |
programmerjake | i'm guessing you missed my earlier message because the email server is going really slow...I got some bugzilla emails out of order even though the comments were posted >40min apart! | 01:29 |
programmerjake | (referring to the comment #19 one) | 01:29 |
programmerjake | (as the one you missed -- the ones I got out of order are later messages) | 01:30 |
programmerjake | luke, thank you for being persistent and explaining stuff! | 01:31 |
programmerjake | example of llvm generating a Rc=1 instruction rather than using cmp: https://clang.godbolt.org/z/4YxeTWdb3 | 01:45 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 01:47 | |
*** gnucode <gnucode!~gnucode@user/jab> has quit IRC | 01:48 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 02:35 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 02:37 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 04:59 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 05:23 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 08:09 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.172.224> has joined #libre-soc | 08:10 | |
programmerjake | found an interesting article on rotors and bivectors -- a much easier to understand way to handle rotations than quaternions: https://marctenbosch.com/quaternions/ | 10:15 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.172.224> has quit IRC | 10:32 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.172.224> has joined #libre-soc | 10:33 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.172.224> has quit IRC | 10:59 | |
*** markos <markos!~Konstanti@static062038151250.dsl.hol.gr> has quit IRC | 11:31 | |
*** markos <markos!~Konstanti@static062038151250.dsl.hol.gr> has joined #libre-soc | 11:31 | |
lkcl | julia longtin tried explaining some of this to me, once :) | 13:34 |
sadoon[m] | So, small update: I think my next step is to create the new sffs target in gcc (and clang afterwards) before trying to build bookworm and realizing that not all packages respect CFLAGS | 13:50 |
sadoon[m] | For gentoo we already know what the relevant flags are. | 13:50 |
sadoon[m] | We can call gentoo done basically because of that | 13:50 |
markos | sadoon[m], I would advise not changing the triplet | 13:50 |
markos | it would make things simpler | 13:50 |
markos | we did discuss this here previously | 13:51 |
markos | it's much easier keeping the same triplet and just changing the specs | 13:51 |
markos | because the changes required to the platform detection in all of the packages would be absolutely overwhelming | 13:52 |
sadoon[m] | My memory is a little shot, pardon me :) | 13:52 |
sadoon[m] | It does make my job a hell of a lot easier, as long as you're sure that's the right move | 13:52 |
markos | it's simple really, in pretty much all packages, esp those that have different configuration flags in configure.ac/etc scripts you would have to add an extra entry for the sffs triplet | 13:53 |
markos | in a few packages it does make sense | 13:53 |
markos | i.e. in those where -maltivec is enabled for example or similar flags | 13:53 |
markos | but those are the minority | 13:53 |
markos | so it's easier to fix those few packages with the extra configuration to enable runtime detection when it's possible | 13:54 |
markos | rather than fix 20k packages with a possible new addition to the triplet configuration in each script | 13:55 |
markos | I did that once, adding a new triplet for armhf | 13:55 |
markos | and I had to send bug reports to 100s of packages | 13:55 |
markos | just to add armhf triplet to the configure scripts | 13:55 |
markos | it's trivial, but quite annoying and tedious | 13:56 |
markos | so it's much easier to just keep the same triplet, recompile with new specs | 13:56 |
markos | and fix the few packages that pose problems manually | 13:56 |
sadoon[m] | brb, on the road | 13:58 |
sadoon[m] | Alright, markos: sounds good to me | 15:38 |
sadoon[m] | Perhaps an -mcpu patch for gcc/clang would be in order then? | 15:38 |
sadoon[m] | One that enables these options | 15:39 |
markos | yes | 15:39 |
sadoon[m] | We can discuss this in the meeting to hear what the rest of the team has to say too | 15:39 |
markos | agreed | 15:39 |
sadoon[m] | I bet an mcpu would be very easy to implement | 15:40 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 15:42 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has quit IRC | 15:46 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.174.70> has joined #libre-soc | 15:47 | |
*** choozy <choozy!~choozy@75-63-174-82.ftth.glasoperator.nl> has joined #libre-soc | 16:41 | |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 18:04 | |
*** gnucode <gnucode!~gnucode@user/jab> has quit IRC | 18:18 | |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 18:18 | |
*** gnucode <gnucode!~gnucode@user/jab> has quit IRC | 18:52 | |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 18:53 | |
*** markos <markos!~Konstanti@static062038151250.dsl.hol.gr> has quit IRC | 18:59 | |
*** markos <markos!~Konstanti@static062038151250.dsl.hol.gr> has joined #libre-soc | 19:05 | |
programmerjake | lkcl, toshywoshy, markos, etc.: meeting in 16min | 19:44 |
gnucode | I really wish I could listen in to said meeting...but I'm at work. and I left my headphones at home. | 20:31 |
*** gnucode <gnucode!~gnucode@user/jab> has quit IRC | 20:32 | |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 20:33 | |
lkcl | gnucode, aw doh | 20:36 |
gnucode | :( | 20:37 |
*** awilfox <awilfox!~awilfox@kelsey.foxkit.us> has quit IRC | 21:04 | |
*** awilfox <awilfox!~awilfox@kelsey.foxkit.us> has joined #libre-soc | 21:05 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.174.70> has quit IRC | 22:26 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 22:26 | |
*** choozy <choozy!~choozy@75-63-174-82.ftth.glasoperator.nl> has quit IRC | 23:03 |
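
The "60 bits wasted when 4 bits is all that's needed" point made at 00:42 follows from the size of a two-input truth table: two 1-bit inputs give four input combinations, so four bits are enough to select any of the 16 possible binary operations. Below is a minimal Python sketch of that idea. The function name `binlut` and the bit ordering (index = 2*a + b, counted from the least significant bit) are assumptions made for illustration and are not taken from the crbinlut specification.

```python
def binlut(a: int, b: int, lut4: int) -> int:
    """Look up the result of a 2-input boolean op in a 4-bit truth table.

    Assumption: bit (2*a + b) of lut4, counted from the LSB, holds the result.
    The real instruction's bit numbering may differ.
    """
    assert a in (0, 1) and b in (0, 1)
    assert 0 <= lut4 < 16
    return (lut4 >> ((a << 1) | b)) & 1


# Truth tables under the assumed ordering:
# bit 3 -> (a=1, b=1), bit 2 -> (a=1, b=0), bit 1 -> (a=0, b=1), bit 0 -> (a=0, b=0)
AND = 0b1000
OR = 0b1110
XOR = 0b0110

# A 64-bit GPR used only to carry the 4-bit table leaves the rest unused.
GPR_BITS = 64
LUT_BITS = 4
UNUSED_BITS = GPR_BITS - LUT_BITS
```

Under these assumptions `binlut(1, 1, AND)` returns 1 and `UNUSED_BITS` is 60, which is the figure quoted in the discussion.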
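
The naive Dependency Matrix sizing given at 00:51, and the sparsity argument that follows it (instructions that never feed each other leave their cells empty), can be sketched as simple arithmetic. This is a rough illustration only: the helper names are hypothetical, and the even split into two independent groups is a made-up example rather than a figure from the discussion.

```python
def naive_dm_cells(in_flight: int) -> int:
    """Cells in a full rows x columns matrix covering every in-flight instruction."""
    return in_flight * in_flight


def partitioned_dm_cells(group_sizes: list[int]) -> int:
    """Cells needed when instructions fall into groups that never depend on
    one another, so only the per-group square blocks must be populated."""
    return sum(n * n for n in group_sizes)


# 1,000 outstanding instructions, naive calculation: 1,000 rows x 1,000 columns.
full = naive_dm_cells(1000)

# Hypothetical split: 500 instructions touching only one register file and
# 500 touching only another, with no cross-file dependencies.
split = partitioned_dm_cells([500, 500])
```

Here `full` is 1,000,000 and `split` is 500,000. Each instruction that reads or writes more than one register file pulls cells from both blocks back into play, which is the argument for keeping cross-file operations down to minimal-operand moves and converts.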