Tuesday, 2021-11-16

programmerjakethe OpenPower Summit NA videos were just posted: https://youtube.com/playlist?list=PLEqfbaomKgQrYjscb-2cQt_S1v_xbg9Cq 00:50
lkcloh fantastic00:51
lkclwill put that on the wiki00:51
programmerjakei misread that as "will you put that on the wiki" and proceeded to discover that your editing is faster than me trying to find the right page :)00:54
lkcloh whoops00:55
lkcli have a whole stack of stuff in browser url history00:55
lkclso i just typed "openpower2021" and it came up00:56
programmerjakewell, what do you expect, with intel saying ecc is only for servers...00:56
lkclwhich is why the google chromeos team is going apeshit00:56
programmerjakeat least ddr5 includes ecc by default00:57
lkclbecause google is legally responsible - under contract - for *seven years* - to provide secure OSes on 3rd party chromebook products00:57
lkclwhich they can't damn well do if the actual memory - at the hardware level - is borked00:58
lkclhence the initiative by google to actually make their own chromebook SoC00:58
lkclbecause none of the suppliers of SoCs can be bothered to get this right00:58
programmerjakewell...google can sue intel? idk00:58
lkclthat'd be as fun as watching SCO try to take on IBM. for 20 years00:59
lkcloh: can you keep the TODO list up-to-date on the ternary bugreport?01:00
programmerjakewell, at least google would have some concrete flaw that they can point to that intel should have known about ... they had the same issue with ddr301:00
lkcledit the comment field01:00
programmerjakesco was basically all hot air01:00
lkcltired. 1am. need rest01:01
programmerjakeyeah, i can do that01:02
sadoon_albader[mInstalling symbiflow-arch-defs ate up my laptop's 8GB of RAM and is now devouring the swap lmao02:27
sadoon_albader[mI had built it already on my workstation and then transferred the files and used make install, which should install the already built files, but for some reason it's building them from scratch02:28
octaviusMorning lkcl, saw the changes to the code and was able to run. I'll be busy first half of the day, so I'll continue working on it after that.08:04
lkclprogrammerjake, star. it's a surprisingly long list, but important to maintain as it's a measure of progress by which we can gauge whether the contract is likely to be met on time09:46
lkcl(and cut - or add - functionality as appropriate in order to do so)09:46
lkclsadoon_albader[m, yyeah 8GB is nowhere near enough for VLSI :)09:59
sadoon_albader[m<lkcl> "sadoon_albader, yyeah 8GB is..." <- I'm finding out heh10:24
sadoon_albader[mBut I do wanna be able to use the laptop for this kind of work even if it's just for studying the code and maybe some light edits10:25
lkclyehyeh editing locally then rsync to a server for testing / running is perfectly reasonable.10:43
sadoon_albader[m<lkcl> "yehyeh editing locally then..." <- That too xD12:26
sadoon_albader[mBut if I only wanted editing I would use the PowerBook G4 <312:26
sadoon_albader[mI guess what I'm trying to achieve is having most of the needed software on my laptop and seeing what can be run there12:27
sadoon_albader[mFor example maybe simulation is fine but synthesis is not, etc12:27
lkclthat sounds plausible if kept to nmigen / verilator / cocotb12:55
lkclfriickin ellfire the data structures for hazard detection on the regfiles are complicated14:04
programmerjakehmm, i didn't look at what you have yet, but wouldn't a table of write-counts per register work? once it works, maybe convert to instead be a cam so you don't need 384 entries?14:57
lkclthe maximum number of outstanding write-counts per register that is permitted is: one15:22
lkcltherefore a single bit suffices15:23
lkclthe design of the unary regfiles permits multiple enable/set lines.15:23
lkclby a nice coincidence there is no problem of data overlap, because the input data is 1 bit.15:24
lkcleffectively it is creating a bank of single-bit DFFs15:24
programmerjakethat's not completely true, some instructions can write to multiple registers, which could be set to the same register15:24
lkcleach addressable with their own unique enable wire15:24
lkclfortunately the potentially-perceived problem you're describing won't occur15:25
lkclbecause the request to set the exact same register bit twice will be ORed together and result in only the one request.15:25
lkclbecause there is only one bit (one "count")15:26
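The single-bit scheme described above can be sketched in plain Python. This is only an illustrative model, not the project's code: the class name `WriteHazardBits`, its methods, and the 32-register size are all hypothetical. It shows why two requests to set the same register bit collapse into one once they are ORed together.

```python
class WriteHazardBits:
    """One pending-write bit per register (a bank of single-bit flags).

    Hypothetical model for illustration; names are not from the real code.
    """

    def __init__(self, n_regs):
        self.n_regs = n_regs
        self.bits = 0  # bit i set means register i has a write outstanding

    def set_pending(self, *requests):
        # Each request is a unary (one-hot) mask. ORing them together means
        # duplicate requests for the same register merge into a single bit.
        merged = 0
        for req in requests:
            merged |= req
        self.bits |= merged & ((1 << self.n_regs) - 1)

    def clear(self, mask):
        # Clear every bit that is set in mask.
        self.bits &= ~mask

    def is_pending(self, reg):
        return bool((self.bits >> reg) & 1)


hazards = WriteHazardBits(n_regs=32)
# Two write requests from one instruction both target register 5.
hazards.set_pending(1 << 5, 1 << 5)
```

Because there is a flag rather than a counter, there is nothing to decrement, so a double request cannot leave the state out of step.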
lkclall of this is easy15:26
lkclit's the fact that the regfiles are entirely managed via dictionaries of lists15:26
lkcland associated PriorityPickers15:26
lkcland that i wrote this code 18 months ago in under 2 weeks flat and haven't come back to it since15:27
programmerjakewell, as long as you don't: decrement the count each time you write then when something writes twice you end up with a zero count before the second write15:28
lkclthe Power ISA already prohibits those circumstances15:30
lkclthere are very few double-output register write situations15:30
lkclone of them iisss... LD/ST-with-update15:31
lkcli have a vague recollection of RT==RA being prohibited15:31
lkclRS==RA.  something like that15:31
programmerjakeyeah, sounds right...idk if our decoder detects that tho15:31
lkclagain, a vague recollection, of it being "undefined" behaviour15:32
lkclIf RA=0 or RA=RT, the instruction form is invalid.15:33
lkclv3.0C p5115:33
lkclyep, prohibited (invalid)15:33
lkclnot undefined15:33
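The rule quoted above ("If RA=0 or RA=RT, the instruction form is invalid") can be written as a small check. This is a sketch only: the function name `load_update_form_is_valid` is hypothetical, and it says nothing about how the actual decoder does or does not detect the case.

```python
def load_update_form_is_valid(rt, ra):
    """Return False for the invalid forms quoted above: RA=0 or RA=RT.

    rt, ra: register numbers taken from the instruction fields.
    Hypothetical helper for illustration only.
    """
    return ra != 0 and ra != rt


ok_form = load_update_form_is_valid(rt=3, ra=4)
ra_zero = load_update_form_is_valid(rt=3, ra=0)
ra_is_rt = load_update_form_is_valid(rt=3, ra=3)
```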
programmerjakeundefined behavior should not be an escape hatch that accidentally allows instructions to see things they shouldn't cuz our cpu assumes they won't happen...15:34
lkclwrite_cr.ok is not getting through to the write flags, and i don't know why. argh15:34
programmerjakeso, ra=rt *is* decoded as a trap, then?15:35
lkclwrite_o.ok - for the output to the INT regfile - is fine15:35
lkcldon't know - am in the middle of trying to track down something else at the moment15:35
lkclcan you raise it as a bug to investigate? link to #690?15:36
programmerjakek, i'll open a bug report so we remember to check15:36
lkclneed to examine microwatt decoder.15:38
lkclargh i worked it out.16:20
lkclfrickin ellfire.  it's because write data has a data field and an ok field16:22
lkcli'd returned the full field16:22
lkclfrickin 'ell that was a tough bug to find16:27
programmerjakeif you used pyls it would have highlighted that as a type mismatch making it trivial to find...16:33
lkclyeh no chance.  this was a single bit assignment from a Record: it's a perfectly legal and legitimate .eq assignment18:16
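A plain-Python model of how this kind of bug can hide. It does not use nmigen; the helpers `flatten` and `assign`, the field order (data first, in the low bits) and the 64-bit data width are all assumptions made for illustration. It shows that assigning the whole record to a one-bit destination is accepted and picks up bit 0 of the data, not the ok flag.

```python
def flatten(record, layout):
    """Pack fields into one integer, first field in the lowest bits."""
    value, shift = 0, 0
    for name, width in layout:
        value |= (record[name] & ((1 << width) - 1)) << shift
        shift += width
    return value, shift


def assign(lhs_width, rhs_value):
    """Model an assignment that keeps only the low lhs_width bits."""
    return rhs_value & ((1 << lhs_width) - 1)


LAYOUT = [("data", 64), ("ok", 1)]  # assumed field order and widths
write_cr = {"data": 0x10, "ok": 1}

flat, total_width = flatten(write_cr, LAYOUT)
wrong_flag = assign(1, flat)            # takes bit 0 of data
right_flag = assign(1, write_cr["ok"])  # takes the ok field itself
```

Here `wrong_flag` is 0 even though `ok` is 1, and nothing in the assignment flags a problem.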
lkclalso it was a codepath that was not necessary because of using a FSM18:28
octaviuslkcl, what is the distinction between a pin and a port? I guess that a port has some extra meta-data, whereas a pin just a signal/record? One of the issues you mentioned was the duplication of "pin" and "port" which you then changed to "padpin" and "padport".18:30
octaviusPrinting the "pin" and "port" objects, I get:18:34
octavius(rec uart_0__rx i) <- pin18:34
octavius(rec uart_0__rx io) <- port18:34
octaviusSo I guess the port contains extra signals (o, oe)?18:35
octaviusTried printing port.io.i (as well as .o and .oe), and got AttributeError (the object doesn't have them)18:37
lkclpin is supposed to be "a physical pin".  in ASIC terminology: an IO pad18:37
lkclport is the wires connecting *to* that pin (pad)18:37
octaviusprinting port.io gives me a port.io signal18:38
lkclprint out the layout18:38
octaviusgives me Layout([('io', 1)])18:38
octaviusso there's one signal called "io"18:39
octaviusbut what is "io"?18:39
lkclexcellent, then there is a Signal of size 1 named "io"18:39
lkclsearch the code for the word "io"18:39
lkclup at line 64, oh look!  Subsignal("io", ...)18:40
lkclbut hmmm18:40
octaviusIt's a subsignal with 3 pins though...18:40
octaviusah but it has width 318:40
lkclyeah the bi-directional one is18:40
lkclso that isn't it18:40
lkclthere is a limitation of nmigen's codebase here18:41
octaviusso from the point of .eq statement, does nmigen allow multiple drivers?18:41
lkclResourceManager has not been designed to cope with ASICs - only FPGAs18:41
lkclthe maximum number of possible bits is taken - as a linear sequence18:41
octaviuswhat do you mean?18:42
lkclanything with higher linear bit index numbers is completely ignored18:42
lkcls1 = Signal(3)18:42
lkcls2 = Signal(2)18:42
lkcls2.eq(s1) will take *two* bits only18:42
octaviusyeah, makes sense18:42
lkcls1.eq(s2) will.... take 2 bits only.18:42
octaviusso assigning io to something will take fewer bits18:43
lkcl"as many bits as possible"18:43
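The two cases above, modelled in plain Python rather than nmigen. The helper `eq` is hypothetical and assumes unsigned signals, where a missing upper bit reads as zero; it only shows which bits survive when the widths differ.

```python
def eq(lhs_width, rhs_value, rhs_width):
    """Model an unsigned assignment: the value the LHS holds afterwards."""
    rhs_value &= (1 << rhs_width) - 1          # RHS only has rhs_width bits
    return rhs_value & ((1 << lhs_width) - 1)  # LHS keeps only its low bits


s1_value = 0b101  # held by a 3-bit signal
s2_value = 0b11   # held by a 2-bit signal

s2_after = eq(2, s1_value, 3)  # top bit of s1 is dropped: 0b01
s1_after = eq(3, s2_value, 2)  # only two bits arrive: 0b011
```

Neither direction raises an error, which is why a width mismatch goes unnoticed.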
octaviusbut you could do a combined .eq where you form a record right?18:43
lkclbtw i think get_input and get_output are ok18:44
octaviusassign one bit from io to 3 diff signals18:44
lkclRecords are treated as simply... a linear sequence of bits18:44
lkclwhich makes for some absolutely bloody awful yosys graphs18:44
lkclbecause the assignments are literally done by zip(LHS, RHS) iterating bit-by-bit18:45
lkclfollowed by individual assignment at the bit-level18:45
lkclbut hey18:45
octaviuswhy did you add the tribuffers back to get_tristate and get_input_output?18:46
lkclcommented out18:46
lkclso that if ever coriolis2 does not do the IOpad cells itself, this code can be used to do it18:46
lkclwithout having to go, "err what was that complex instantiation of an external cell again?"18:46
octaviusin the "no JTAG" path, tribufs are *not* commented out. Shall I fix that?18:47
octaviusAnd as for get_tristate and get_input_output, "pin" and "port" are still duplicated, shall I correct those with the "pad" suffix as well?18:48
lkclget_input_output is the important one for now18:49
lkclwhich should be easy (but laborious) to construct by literal cut/paste from get_input and get_output18:51
lkclnote how i have already split out the 3 parts of the port - the 3-long Signal - into i, o and oe18:51
lkclso rather than try to assign a 3-wide Signal to a single-bit pad (which, as explained above, ain't gonna produce the right thing but will SILENTLY SUCCEED)18:53
lkcluse the temp variables18:53
lkclHA. hazard bitvector is being set and cleared18:54
lkclbout frickin time18:54
octaviuslkcl, can you check if my understanding is correct in this diagram: https://ibb.co/7KJ4y5W 19:05
lkcllooks about right19:13
lkclbasically it's a literal cut/paste job of the code from get_input but using io[0] instead of just "io"19:14
lkclcombined with a literal cut/paste job of the code from get_output  but using io[1] (or its temp assignment) instead of just "io"19:14
lkcland blindingly-obviously-the-same for oe19:15
lkclcesar, hooray, i have regfile write bitvectors reasonably working19:15
lkclthere is one known (expected) bug at the moment19:15
octaviusof course, I just wanted to make sure I understood the abstraction of a "port" (which seems like a 'wire' or 'bus' to me) and a "pin" (the start/end point of a signal)19:16
lkclwhen the list of registers to write to is created by PowerDecoder2, and regspec_decode_write() turns those into an appropriate tuple ("yes please write" and "to this register number")19:16
lkclit is *not* necessarily *guaranteed* that the ALU will, actually, write to that regfile port19:17
lkcle.g. XER.so only needs to be written to... *sometimes*19:17
lkclso the ALU is allowed to *decide* whether to set ospec().someoutputreg_data.ok19:18
lkclthis means that the write port will not be requested (no wr.req_o), which of course saves on regfile writes19:18
lkclthe PriorityPicker will not be enabled, etc. etc.19:18
lkclwhich is the whole point of letting the ALU decide19:19
lkclthe problem is: *the bitvector write bit was raised for that register*19:19
lkcland if the ALU does not request to write to it, it will never be cleared.19:19
lkclthe situation, when the ALU has finished, but the "data.ok" is False and yet the wrmask has HI for that bit,19:20
lkclthis has to be taken as a request to *immediately* clear that write-vector bit19:20
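The clearing rule just described, as a minimal plain-Python sketch. The function `clear_unwritten` and its arguments are hypothetical names, and the bitvectors are modelled as integers; the real logic lives in hardware signals.

```python
def clear_unwritten(pending, wrmask, alu_done, data_ok):
    """Drop pending bits the ALU reserved but decided not to write.

    pending: bitvector of registers with a write outstanding
    wrmask:  bit(s) raised for this output when the instruction issued
    alu_done: the ALU has finished
    data_ok:  the ALU's ok flag for this output
    Hypothetical model for illustration only.
    """
    if alu_done and not data_ok:
        # No write request will ever arrive, so clear immediately.
        return pending & ~wrmask
    # Otherwise leave the bit alone: either the ALU is still busy, or a
    # real regfile write is coming and will clear it later.
    return pending


pending = 0b1010
skipped = clear_unwritten(pending, wrmask=0b0010, alu_done=True, data_ok=False)
written = clear_unwritten(pending, wrmask=0b0010, alu_done=True, data_ok=True)
busy = clear_unwritten(pending, wrmask=0b0010, alu_done=False, data_ok=False)
```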
lkcloctavius, sorry was in the middle of that train of thought19:21
octaviusno worries, it was entertaining XD19:22
octaviuslkcl, I guess it's time for me to work on unit tests XD21:52
