programmerjake | the OpenPower Summit NA videos were just posted: https://youtube.com/playlist?list=PLEqfbaomKgQrYjscb-2cQt_S1v_xbg9Cq | 00:50 |
---|---|---|
lkcl | oh fantastic | 00:51 |
lkcl | will put that on the wiki | 00:51 |
programmerjake | i misread that as "will you put that on the wiki" and proceeded to discover that your editing is faster than me trying to find the right page :) | 00:54 |
lkcl | :) | 00:55 |
lkcl | oh whoops | 00:55 |
lkcl | https://hardware.slashdot.org/story/21/11/15/2233229/ddr4-memory-protections-are-broken-wide-open-by-new-rowhammer-technique | 00:55 |
lkcl | i have a whole stack of stuff in browser url history | 00:55 |
lkcl | so i just typed "openpower2021" and it came up | 00:56 |
programmerjake | well, what do you expect, with intel saying ecc is only for servers... | 00:56 |
lkcl | which is why the google chromeos team is going apeshit | 00:56 |
programmerjake | at least ddr5 includes ecc by default | 00:57 |
lkcl | because google is legally responsible - under contract - for *seven years* - to provide secure OSes on 3rd party chromebook products | 00:57 |
lkcl | which they can't damn well do if the actual memory - at the hardware level - is borked | 00:58 |
lkcl | hence the initiative by google to actually make their own chromebook SoC | 00:58 |
lkcl | because none of the suppliers of SoCs can be bothered to get this right | 00:58 |
programmerjake | well...google can sue intel? idk | 00:58 |
lkcl | that'd be as fun as watching SCO try to take on IBM. for 20 years | 00:59 |
lkcl | oh: can you keep the TODO list up-to-date on the ternary bugreport? | 01:00 |
programmerjake | well, at least google would have some concrete flaw that they can point to that intel should have known about ... they had the same issue with ddr3 | 01:00 |
lkcl | edit the comment field | 01:00 |
programmerjake | sco was basically all hot air | 01:00 |
lkcl | tired. 1am. need rest | 01:01 |
programmerjake | yeah, i can do that | 01:02 |
sadoon_albader[m | Installing symbiflow-arch-defs ate up my laptop's 8GB of RAM and is now devouring the swap lmao | 02:27 |
sadoon_albader[m | I had built it already on my workstation, then transferred the files and used make install, which should install the already-built files, but for some reason it's building them from scratch | 02:28 |
octavius | Morning lkcl, saw the changes to the code and was able to run. I'll be busy first half of the day, so I'll continue working on it after that. | 08:04 |
lkcl | programmerjake, star. it's a surprisingly long list, but important to maintain as it's a measure of progress by which we can gauge whether the contract is likely to be met on time | 09:46 |
lkcl | (and cut - or add - functionality as appropriate in order to do so) | 09:46 |
lkcl | sadoon_albader[m, yyeah 8GB is nowhere near enough for VLSI :) | 09:59 |
sadoon_albader[m | <lkcl> "sadoon_albader, yyeah 8GB is..." <- I'm finding out heh | 10:24 |
sadoon_albader[m | But I do wanna be able to use the laptop for this kind of work even if it's just for studying the code and maybe some light edits | 10:25 |
lkcl | yehyeh editing locally then rsync to a server for testing / running is perfectly reasonable. | 10:43 |
sadoon_albader[m | <lkcl> "yehyeh editing locally then..." <- That too xD | 12:26 |
sadoon_albader[m | But if I only wanted editing I would use the PowerBook G4 <3 | 12:26 |
sadoon_albader[m | I guess what I'm trying to achieve is having most of the needed software on my laptop and seeing what can be run there | 12:27 |
sadoon_albader[m | For example maybe simulation is fine but synthesis is not, etc | 12:27 |
lkcl | that sounds plausible if kept to nmigen / verilator / cocotb | 12:55 |
lkcl | friickin ellfire the data structures for hazard detection on the regfiles are complicated | 14:04 |
programmerjake | hmm, i didn't look at what you have yet, but wouldn't a table of write-counts per register work? once it works, maybe convert to instead be a cam so you don't need 384 entries? | 14:57 |
lkcl | the maximum number of outstanding write-counts per register that is permitted is: one | 15:22 |
lkcl | therefore a single bit suffices | 15:23 |
lkcl | the design of the unary regfiles permits multiple enable/set lines. | 15:23 |
lkcl | by a nice coincidence there is no problem of data overlap, because the input data is 1 bit. | 15:24 |
lkcl | effectively it is creating a bank of single-bit DFFs | 15:24 |
programmerjake | that's not completely true, some instructions can write to multiple registers, which could be set to the same register | 15:24 |
lkcl | each addressable with their own unique enable wire | 15:24 |
lkcl | fortunately the potentially-perceived problem you're describing won't occur | 15:25 |
lkcl | because the request to set the exact same register bit twice will be ORed together and result in only the one request. | 15:25 |
lkcl | because there is only one bit (one "count") | 15:26 |
lkcl | all of this is easy | 15:26 |
lkcl | it's the fact that the regfiles are entirely managed via dictionaries of lists | 15:26 |
lkcl | and associated PriorityPickers | 15:26 |
lkcl | and that i wrote this code 18 months ago in under 2 weeks flat and haven't come back to it since | 15:27 |
programmerjake | well, as long as you don't: decrement the count each time you write then when something writes twice you end up with a zero count before the second write | 15:28 |
lkcl | the Power ISA already prohibits those circumstances | 15:30 |
lkcl | there are very few double-output register write situations | 15:30 |
lkcl | one of them iisss... LD/ST-with-update | 15:31 |
lkcl | i have a vague recollection of RT==RA being prohibited | 15:31 |
lkcl | RS==RA. something like that | 15:31 |
programmerjake | yeah, sounds right...idk if our decoder detects that tho | 15:31 |
lkcl | again, a vague recollection, of it being "undefined" behaviour | 15:32 |
lkcl | If RA=0 or RA=RT, the instruction form is invalid. | 15:33 |
lkcl | v3.0C p51 | 15:33 |
lkcl | yep, prohibited (invalid) | 15:33 |
lkcl | not undefined | 15:33 |
programmerjake | undefined behavior should not be an escape hatch that accidentally allows instructions to see things they shouldn't cuz our cpu assumes they won't happen... | 15:34 |
lkcl | write_cr.ok is not getting through to the write flags, and i don't know why. argh | 15:34 |
programmerjake | so, ra=rt *is* decoded as a trap, then? | 15:35 |
lkcl | write_o.ok - for the output to the INT regfile - is fine | 15:35 |
lkcl | don't know - am in the middle of trying to track down something else at the moment | 15:35 |
lkcl | can you raise it as a bug to investigate? link to #690? | 15:36 |
programmerjake | k, i'll open a bug report so we remember to check | 15:36 |
lkcl | need to examine microwatt decoder. | 15:38 |
programmerjake | https://bugs.libre-soc.org/show_bug.cgi?id=747 | 15:41 |
lkcl | star | 15:42 |
lkcl | argh i worked it out. | 16:20 |
lkcl | frickin ellfire. it's because write data has a data field and an ok field | 16:22 |
lkcl | i'd returned the full field | 16:22 |
lkcl | sigh | 16:22 |
lkcl | frickin 'ell that was a tough bug to find | 16:27 |
programmerjake | if you used pyls it would have highlighted that as a type mismatch making it trivial to find... | 16:33 |
lkcl | yeh no chance. this was a single bit assignment from a Record: it's a perfectly legal and legitimate .eq assignment | 18:16 |
lkcl | also it was a codepath that was not necessary because of using a FSM | 18:28 |
octavius | lkcl, what is the distinction between a pin and a port? I guess that a port has some extra meta-data, whereas a pin just a signal/record? One of the issues you mentioned was the duplication of "pin" and "port" which you then changed to "padpin" and "padport". | 18:30 |
octavius | Printing the "pin" and "port" objects, I get: | 18:34 |
octavius | (rec uart_0__rx i) <- pin | 18:34 |
octavius | (rec uart_0__rx io) <- port | 18:34 |
octavius | So I guess the port contains extra signals (o, oe)? | 18:35 |
octavius | Tried printing port.io.i (as well as .o and .oe), and got AttributeError (the object doesn't have them) | 18:37 |
lkcl | pin is supposed to be "a physical pin". in ASIC terminology: an IO pad | 18:37 |
lkcl | port is the wires connecting *to* that pin (pad) | 18:37 |
octavius | printing port.io gives me a port.io signal | 18:38 |
octavius | ok | 18:38 |
lkcl | print out the layout | 18:38 |
lkcl | port.layout | 18:38 |
octavius | gives me Layout([('io', 1)]) | 18:38 |
octavius | so there's one signal called "io" | 18:39 |
octavius | but what is "io"? | 18:39 |
lkcl | excellent, then there is a Signal of size 1 named "io" | 18:39 |
lkcl | search the code for the word "io" | 18:39 |
lkcl | up at line 64, oh look! Subsignal("io", ...) | 18:40 |
lkcl | but hmmm | 18:40 |
octavius | It's a subsignal with 3 pins though... | 18:40 |
octavius | ah but it has width 3 | 18:40 |
lkcl | yeah the bi-directional one is | 18:40 |
lkcl | yes | 18:40 |
lkcl | so that isn't it | 18:40 |
lkcl | but | 18:40 |
lkcl | sigh | 18:40 |
lkcl | there is a limitation of nmigen's codebase here | 18:41 |
octavius | so from the point of .eq statement, does nmigen allow multiple drivers? | 18:41 |
lkcl | ResourceManager has not been designed to cope with ASICs - only FPGAs | 18:41 |
octavius | ok | 18:41 |
lkcl | no. | 18:41 |
lkcl | the maximum number of possible bits is taken - as a linear sequence | 18:41 |
octavius | what do you mean? | 18:42 |
lkcl | anything with higher linear bit index numbers is completely ignored | 18:42 |
lkcl | s1 = Signal(3) | 18:42 |
lkcl | s2 = Signal(2) | 18:42 |
lkcl | s2.eq(s1) will take *two* bits only | 18:42 |
octavius | yeah, makes sense | 18:42 |
lkcl | s1.eq(s2) will.... take 2 bits only. | 18:42 |
octavius | so assigning io to something will take fewer bits | 18:43 |
lkcl | "as many bits as possible" | 18:43 |
octavius | but you could do a combined .eq where you form a record right? | 18:43 |
lkcl | btw i think get_input and get_output are ok | 18:44 |
octavius | assign one bit from io to 3 diff signals | 18:44 |
lkcl | Records are treated as simply... a linear sequence of bits | 18:44 |
lkcl | which makes for some absolutely bloody awful yosys graphs | 18:44 |
octavius | yeah | 18:44 |
lkcl | because the assignments are literally done by zip(LHS, RHS) iterating bit-by-bit | 18:45 |
lkcl | followed by individual assignment at the bit-level | 18:45 |
lkcl | but hey | 18:45 |
octavius | hahaha | 18:45 |
octavius | why did you add the tribuffers back to get_tristate and get_input_output? | 18:46 |
lkcl | commented out | 18:46 |
lkcl | so that if ever coriolis2 does not do the IOpad cells itself, this code can be used to do it | 18:46 |
lkcl | without having to go, "err what was that complex instantiation of an external cell again?" | 18:46 |
octavius | in the "no JTAG" path, tribufs are *not* commented out. Shall I fix that? | 18:47 |
octavius | And as for get_tristate and get_input_output, "pin" and "port" are still duplicated, shall I correct those with the "pad" suffix as well? | 18:48 |
lkcl | yes. | 18:48 |
octavius | sure | 18:48 |
lkcl | get_input_output is the important one for now | 18:49 |
octavius | ok | 18:49 |
lkcl | which should be easy (but laborious) to construct by literal cut/paste from get_input and get_output | 18:51 |
lkcl | BUT | 18:51 |
lkcl | butbutbut | 18:51 |
lkcl | note how i have already split out the 3 parts of the port - the 3-long Signal - into i, o and oe | 18:51 |
octavius | yes | 18:51 |
lkcl | so rather than try to assign a 3-wide Signal to a single-bit pad (which, as explained above, ain't gonna produce the right thing but will SILENTLY SUCCEED) | 18:53 |
lkcl | use the temp variables | 18:53 |
octavius | sure | 18:53 |
lkcl | HA. hazard bitvector is being set and cleared | 18:54 |
lkcl | bout frickin time | 18:54 |
octavius | lkcl, can you check if my understanding is correct in this diagram: https://ibb.co/7KJ4y5W | 19:05 |
lkcl | looks about right | 19:13 |
lkcl | basically it's a literal cut/paste job of the code from get_input but using io[0] instead of just "io" | 19:14 |
lkcl | combined with a literal cut/paste job of the code from get_output but using io[1] (or its temp assignment) instead of just "io" | 19:14 |
lkcl | and blindingly-obviously-the-same for oe | 19:15 |
lkcl | cesar, hooray, i have regfile write bitvectors reasonably working | 19:15 |
lkcl | there is one known (expected) bug at the moment | 19:15 |
octavius | of course, I just wanted to make sure I understood the abstraction of a "port" (which seems like a 'wire' or 'bus' to me) and a "pin" (the start/end point of a signal) | 19:16 |
lkcl | when the list of registers to write to is created by PowerDecoder2, and regspec_decode_write() turns those into appropriate tuples ("yes please write" and "to this register number") | 19:16 |
lkcl | it is *not* necessarily *guaranteed* that the ALU will, actually, write to that regfile port | 19:17 |
lkcl | e.g. XER.so only needs to be written to... *sometimes* | 19:17 |
lkcl | so the ALU is allowed to *decide* whether to set ospec().someoutputreg_data.ok | 19:18 |
lkcl | this means that the write port will not be requested (no wr.req_o), which of course saves on regfile writes | 19:18 |
lkcl | the PriorityPicker will not be enabled, etc. etc. | 19:18 |
lkcl | which is the whole point of letting the ALU decide | 19:19 |
lkcl | BUT | 19:19 |
lkcl | butbutbut | 19:19 |
lkcl | the problem is: *the bitvector write bit was raised for that register* | 19:19 |
lkcl | and if the ALU does not request to write to it, it will never be cleared. | 19:19 |
lkcl | so | 19:19 |
lkcl | the situation, when the ALU has finished, but the "data.ok" is False and yet the wrmask has HI for that bit, | 19:20 |
lkcl | this has to be taken as a request to *immediately* clear that write-vector bit | 19:20 |
lkcl | gaah | 19:20 |
lkcl | octavius, sorry was in the middle of that train of thought | 19:21 |
octavius | no worries, it was entertaining XD | 19:22 |
octavius | ;) | 19:22 |
octavius | lkcl, I guess it's time for me to work on unit tests XD | 21:52 |
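
The write-hazard bitvector lkcl describes (15:22 to 15:26, then 19:16 to 19:20) can be modelled in a few lines of plain Python. This is only a rough sketch and not the Libre-SOC code: the class `WriteHazardBitvector` and its method names are hypothetical, and the model assumes one pending-write bit per register, duplicate set requests ORed together, and a bit that was raised but never written being cleared as soon as the ALU finishes.

```python
class WriteHazardBitvector:
    """One pending-write bit per register (hypothetical model)."""

    def __init__(self, nregs):
        self.nregs = nregs
        self.bits = 0  # bit i set => register i has an outstanding write

    def issue(self, write_regs):
        # Requests naming the same register twice OR together,
        # so they collapse into a single set request.
        mask = 0
        for r in write_regs:
            mask |= 1 << r
        self.bits |= mask
        return mask  # the "wrmask" raised for this instruction

    def is_hazard(self, reg):
        return bool((self.bits >> reg) & 1)

    def complete(self, wrmask, ok_regs):
        # ok_regs: registers the ALU really wrote (data.ok was True).
        written = 0
        for r in ok_regs:
            written |= 1 << r
        # Raised in wrmask but never written: must be cleared right
        # away, otherwise the bit would stay set forever.
        skipped = wrmask & ~written
        self.bits &= ~(written | skipped)
        return skipped


hz = WriteHazardBitvector(nregs=8)
mask = hz.issue([3, 3, 5])               # register 3 named twice
skipped = hz.complete(mask, ok_regs=[3])  # ALU chose not to write reg 5
```

In this usage, register 5 plays the role of something like XER.so: it is flagged at issue time, the ALU decides not to write it, and `complete` reports it in `skipped` so that its bit is cleared anyway.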
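
The width mismatch discussed at 18:41 to 18:53 can be illustrated without nmigen at all. The sketch below is a plain-Python model, not nmigen code: `assign` and `split_port` are hypothetical helpers, values are assumed unsigned, and unused upper bits on the left-hand side are assumed to be zero-filled. It shows why assigning a 3-wide port to a 1-bit pad silently keeps only bit 0, and why splitting the port into i, o and oe first avoids that.

```python
def assign(lhs_width, rhs_value, rhs_width):
    """Model an unsigned .eq(): return the value that lands in the LHS."""
    rhs_value &= (1 << rhs_width) - 1          # RHS holds rhs_width bits
    return rhs_value & ((1 << lhs_width) - 1)  # higher bits are dropped


def split_port(port_value):
    """Split a 3-wide port into (i, o, oe), assuming bit order 0, 1, 2."""
    i = port_value & 1
    o = (port_value >> 1) & 1
    oe = (port_value >> 2) & 1
    return i, o, oe


s2_from_s1 = assign(2, 0b101, 3)   # s2.eq(s1): only two bits survive
s1_from_s2 = assign(3, 0b11, 2)    # s1.eq(s2): only two bits are supplied
pad = assign(1, 0b110, 3)          # 3-wide port onto a 1-bit pad
i, o, oe = split_port(0b110)       # explicit temporaries instead
```

No error is raised in the `pad` case, which matches the "SILENTLY SUCCEED" warning in the log: the o and oe bits simply vanish unless they are picked out individually.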