Wednesday, 2021-09-29

lkclBen: we've a GPIO mux (aka pinmux) on the list for Libre-SOC. there's room for collaboration / re-use there07:19
lkclBen: for clock gating, you need an actual clock gating Cell, and you need the tools to understand that cell.07:20
lkclAnton: we've been told by Tim Edwards that they're not interested in supporting any project of any kind that wishes to do replacements for the Skywater / Google Open MPW Management Core.07:22
lkclit is considered "too much effort"07:22
lkclAnton: for Libre-SOC we used C4M-JTAG which provides a full JTAG TAP Finite State Machine. it works extremely well and is written in nmigen07:23
lkclhttps://gitlab.com/Chips4Makers/c4m-jtag/-/blob/master/c4m/nmigen/jtag/tap.py07:23
lkclwith full unit tests (in cocotb)07:24
lkclit also supports boundary scan chains.07:24
lkclalready07:24
lkclwhen the Libre-SOC 180nm ASIC comes back next month we will be able to do both a full IO boundary scan07:26
lkcl*and* still be able to test the read and write functionality of the peripherals via the JTAG port if any of the individual IOpads happen to be damaged / non-functional during manufacture07:27
lkcli have support for this already integrated into litex.07:27
lkcli strongly recommend not underestimating how complex it is to integrate a full boundary scan with litex and a core into an ASIC07:28
lkclit was approximately two months to write and about *six to eight* months to fully debug07:28
lkclfighting litex (and the fragility inherent in migen) every single damn step of the way.07:29
lkclBen: both clang and gcc work when building yosys. not for any particular reason, out of curiosity i tend to build it with clang :)07:31
lkclin Libre-SOC, i've already done the ASIC-level integration of litex, behind a boundary-scannable set of IO pads.07:32
lkclit was... complex.  made even more difficult by the fact that florent was, i have to point out, extremely hostile and obstructive.07:33
lkcltechnically, you have to *replace* the front-end litex peripheral classes - which with very few exceptions do not have their IO pads abstracted out - with non-FPGA-specific variants07:34
lkclthis in many cases (particularly those that are at bit-banging-level) actually means duplicating and replacing the entirety of the FPGA-only-supporting litex peripheral07:35
lkclone of the rare exceptions is litex DRAM.07:35
lkclthat is because it has supported so many different systems that its front-end connectivity was abstracted out with a base "IO" class a loooong time ago07:36
lkcli was therefore able to quite easily write a replacement "ASIC" IO front-end class that conformed to the API, which i was able to deduce, then hook that into the boundary-scan-system i developed07:37
lkcland it all "worked"07:37
lkcllikewise, interestingly, SD/MMC also turned out to be similarly abstracted out, although the abstraction is quite extensive, because (like DRAM) it involves clocks07:38
lkclI2C, SPI, GPIO, PWM and UART, these were not so fortunate.07:38
lkcli had to pretty much literally duplicate the entire functionality of all of those litex peripherals (including their CSR register portions)07:39
lkclbecause the IO - which assumes FPGA - was so directly tied *to* the actual code that implements the CSRs, because they're all barely above bit-banging level07:39
lkclhere's the resultant code:07:42
lkclhttps://git.libre-soc.org/?p=libresoc-litex.git;a=blob;f=ls180soc.py;hb=HEAD07:42
lkclone of the more complex issues of doing boundary scanning is, you need to keep in sync **FOUR** separate and completely distinct pieces of code with the **EXACT** same pin definition and pin order.07:43
lkcltrying to do that manually would clearly be insane, and result in costly duplication of effort even just to change one pin (four sets of changes), and potentially sync-up mistakes07:44
lkcltherefore i auto-generated them all07:45
lkclthis is the program where the specifications for pinouts start:07:45
lkclhttps://git.libre-soc.org/?p=pinmux.git;a=blob;f=src/spec/ls180.py;hb=HEAD07:45
lkclit can indeed cope with being able to specify GPIO pinmuxes, Ben07:46
lkclalthough the work there was initially done (over 2 years ago now) to help IIT Madras University, so the only peripheral interconnect and pinmux code-generator-backend that works is the Bluespec one07:47
lkcl(that's the work that inspired OpenTITAN to develop earl-grey)07:47
lkcl(which is another auto-generated peripheral-interconnect-auto-generator, using PHP-style templated verilog code-snippets)07:48
lkcl(do the math on that one :) )07:48
lkclthe ls180 pinmux generates simple straightforward machine-readable text files as its output, which could potentially be read by other tools.07:49
lkclthere is a corresponding "reader" (in python) which picks up those machine-readable simple text files (and JSON files) and understands the pinmux format07:49
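
The single-source-of-truth idea described above (one pin specification, written out as a machine-readable file, then read back by every consumer) can be sketched as follows. This is only a minimal illustration: the JSON layout, the pin names, and every function name here are hypothetical and are not the actual pinmux file format or API.

```python
import json

# Hypothetical pin specification: the single source of truth.
# Each entry is (pin name, pad type); list order is the pin order.
PIN_SPEC = [
    ("uart0_tx", "out"),
    ("uart0_rx", "in"),
    ("gpio0", "inout"),
]


def write_spec(pins):
    """Serialise the pin list to JSON text, preserving order."""
    return json.dumps({"pins": [{"name": n, "type": t} for n, t in pins]}, indent=2)


def read_spec(text):
    """Shared reader: returns (name, type) tuples in file order."""
    return [(p["name"], p["type"]) for p in json.loads(text)["pins"]]


# Three illustrative consumers. Each derives its own view from the
# same parsed spec, so pin order cannot drift between them.
def litex_io_names(pins):
    return [name for name, _ in pins]


def jtag_scan_order(pins):
    return [name for name, _ in pins]


def ioring_pads(pins):
    return [f"pad_{name}" for name, _ in pins]


spec_text = write_spec(PIN_SPEC)
pins = read_spec(spec_text)
```

Changing one pin then means editing `PIN_SPEC` once and regenerating, rather than making four matching edits by hand.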
lkclinside the Libre-SOC HDL, because i in absolutely no way wanted to do this in migen / litex, i used the ls180 *pinmux* to create the connectivity to Staf (Chips4Makers) JTAG Boundary scanning07:51
lkclyou have to create *pairs* of pins.07:51
lkclone set of IO connections (in, out, in-out-enable) to connect to the actual IO-pads07:51
lkclone set which goes into the JTAG TAP boundary scan07:52
lkclthat's a *pair* of connections *per pin*07:52
lkclwhere each IOpad can have up to *three* wires *each*07:52
lkcland - this is where it was absolute hell - of course the direction (its I/O) changes depending on whether you are routing it *in* to the JTAG tap or *out* to the IO pad07:53
lkclins become outs and outs become ins07:53
lkclhorribly confusing :)07:53
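
The pairing and direction-flipping described above can be illustrated with a small sketch. It assumes a hypothetical naming scheme (`<pin>_pad_<wire>` and `<pin>_core_<wire>`) and records directions as seen from the JTAG boundary-scan block; none of these names come from c4m-jtag or the Libre-SOC HDL.

```python
# Wires each pad type may need (up to three: in, out, output-enable).
WIRES = {
    "in": ["i"],
    "out": ["o"],
    "inout": ["i", "o", "oe"],
}


def flip(direction):
    """Swap 'in' and 'out'."""
    return "out" if direction == "in" else "in"


def pad_side_direction(wire):
    """Direction as seen by the JTAG block on its pad-facing side.

    Data on 'i' arrives from the pad, so it is an input to the JTAG block;
    'o' and 'oe' are driven towards the pad, so they are outputs.
    """
    return "in" if wire == "i" else "out"


def boundary_scan_wires(pin, pad_type):
    """Return (signal name, direction) pairs for one pin.

    Every wire appears twice: once facing the IO pad and once facing the
    peripheral (core), with the opposite direction.
    """
    result = []
    for wire in WIRES[pad_type]:
        pad_dir = pad_side_direction(wire)
        result.append((f"{pin}_pad_{wire}", pad_dir))
        result.append((f"{pin}_core_{wire}", flip(pad_dir)))
    return result


gpio_wires = boundary_scan_wires("gpio0", "inout")
```

For an in-out pad this yields six signals: three facing the pad and three facing the core, each core-side signal having the opposite direction to its pad-side twin.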
lkclsome of the bugs were only caught after the coriolis2 tools - at the *transistor* level - reported them!07:54
lkcllitex was *completely* incapable of helping out to catch errors... because it uses migen, and migen has absolutely zero checking of its pin directions of any kind07:54
lkclhence why the debugging was spread out over something mad like an 8-10 month period07:55
lkclso the 2nd location was inside the Libre-SOC HDL07:57
lkclthe *third* location which needed to understand the pinouts, in order to do boundary scan, was litex.07:58
lkcli had to use the pinmux-reader system to actually *dynamically* generate a set of IO specifications, which are normally done statically in litex07:59
lkclcorrection, sorry: https://git.libre-soc.org/?p=libresoc-litex.git;a=blob;f=libresoc/ls180.py;hb=HEAD08:00
lkcllooks like i defined the *external* pins "by hand"08:00
lkcli _meant_ to auto-generate that :)08:00
lkcli did however have to create a second ConstraintManager08:01
lkclhttps://git.libre-soc.org/?p=libresoc-litex.git;a=blob;f=libresoc/core.py;h=f22925bbd7c045f7483d65579956a9f5faa9db0c;hb=42f7357660b245c4491297d24eebc28b4ac2c21f#l32508:01
lkclremember you have *two duplicate* sets of wires!08:02
lkclone goes straight from the IO-pads directly to the JTAG module08:02
lkclthe other comes **OUT** of the JTAG module and goes directly to the peripherals08:02
lkcllitex has to be made to understand that!08:03
lkclgiven that the front-end for IO is handled by a ConstraintManager, i had to therefore have a duplicate ConstraintManager08:03
lkclit was a frickin lot of work08:04
lkcland i can categorically state, with confidence, that litex - and its developers - are in no way prepared to cope with this level of complexity.08:05
lkclmithro: i know you like litex. i am aware that you've stated many times, "nobody *in your experience* has ever had serious problems with litex"08:09
lkclthe reason is that nobody has ever attempted a third party modification / use-case as complex (in such a short timeframe) as what i did08:09
lkclthey've always either used *what is already there*08:10
lkclor they've done it with Florent's help (and paid him EUR 100/hr)08:10
lkclwhich is five or more times the rate i've been working at (around EUR 1500 a month) for the past four years.08:12
lkcloh, nearly forgot: the 4th location is the actual IO ring of the ASIC08:17
lkclyou also need to specify the IO pads, their positions, and their pin-types08:17
lkcland ensure that the corona matches up perfectly (in the correct, expected order) with the HDL08:17
lkclearlier versions of this code were horribly complex (manually spelled out)08:19
lkclhttps://git.libre-soc.org/?p=soclayout.git;a=blob;f=experiments9/tsmc_c018/coriolis2/ioring.py;hb=HEAD08:19
lkclbut i managed to persuade Jean-Paul to let the pinmux auto-generator handle it08:19
lkcland we committed the JSON file to the repository, along with the verilog, in order to not have any build-tool dependency problems08:20
lkclah here we go:08:22
lkclhttps://git.libre-soc.org/?p=soclayout.git;a=blob;f=experiments9/tsmc_c018/doDesign.py;h=30206b63647c64b170ac2ba3a7996231585a1f66;hb=c6c9c87c733c82657fd9f11935219f838469f81708:22
lkclthis is where Jean-Paul decided to do the IO pads manually08:22
lkclduplicating the auto-generated work08:22
lkcli left his code in (for posterity) but you can see here, it's *not used*:08:24
lkclhttps://git.libre-soc.org/?p=soclayout.git;a=blob;f=experiments9/tsmc_c018/doDesign.py;hb=HEAD#l21708:24
lkclinstead that reads the auto-generated IOpad spec, from the auto-generated JSON file, that matches *EXACTLY* with the litex peripheral set, matches *EXACTLY* with the JTAG boundary-scan peripheral set08:25
lkclbecause they all use the exact same JSON file, generated by the pinmux specification program08:25
lkclit's.... absolutely mental to think that the majority of Industry-standard commercial ASIC designers do this entirely by hand.08:26
openpowerbot[slack] <Benjamin Herrenschmidt> What IRC channel is this ?09:34
openpowerbot[slack] <Benjamin Herrenschmidt> I wouldn't recommend using a LiteX UART ...09:35
openpowerbot[slack] <Benjamin Herrenschmidt> but a standard 16550 instead, there are plenty of these around09:35
lkcl#microwatt on irc.libera.chat, thanks to toshywoshy's IRC-slack-mattermost bridge09:45
lkclyeah we ran out of time / resources to use a 16550 compliant UART in ls180.09:46
lkclturns out that the litex uart is read-compatible with a 16550 uart.09:46
lkcljust not interrupt-compatible or write-compatible09:47
lkclAnton: it would be great not to have to use the "Management Core" of SkyWater 130nm.  we have a PLL and IOpads that we'd like to test: we cannot confirm that the IO pads will be functional...15:21
lkcl... because what's the point of laying out an IO pad Cell if you can't actually access it via the external pins, and it doesn't have a bond-wire?15:21
lkclif you're forced to use the "Management Core" (caravel?), you'll still have the same type of "corona" - it's the one defined as containing the 48 IO pad connections, around the edge15:23
lkclthat will still need to be "declared" in some fashion that the tools you use understand at the netlist level15:23
lkclAnton: with OpenLANE did you happen to run into high fan-out driving issues?16:35
lkclone pin drives 64 or greater outputs, and of course the current required to do so is enormous, so it impacts the timing?16:36
lkclJean-Paul very kindly added High-Fanout to coriolis2 so that it can cope with that16:36
lkclwe have in some cases a massive fanout - well over 128 "drivers"16:36
lkclhe therefore added an automatic system for creating buffer fan-out cascades, using a Standard 1-in 8-out Buffer Cell16:37
lkcland - and this is the kicker - made sure that *all* netlists then had the exact same number of buffer chains (even if they were only 1-in 1-out)16:38
lkclin some cases (128+) we had to have 3-deep buffer fan-outs16:38
lkcltherefore, Jean-Paul's work added 3-deep buffer fan-outs to *all* netlists16:38
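
The depth arithmetic behind the buffer cascades above can be sketched like this. The model is an assumption, not coriolis2's actual algorithm: every level is built from 1-in 8-out buffers, the original driver feeds a single root buffer, and every net is then padded to the deepest tree so that all paths see the same number of buffer stages.

```python
def buffer_depth(fanout, branching=8):
    """Buffer levels needed so that no buffer drives more than `branching` loads.

    A depth-d tree of 1-in `branching`-out buffers can reach branching**d loads.
    """
    depth = 1
    capacity = branching
    while capacity < fanout:
        capacity *= branching
        depth += 1
    return depth


def uniform_depth(fanouts, branching=8):
    """Depth applied to every net: the deepest tree any single net needs."""
    return max(buffer_depth(f, branching) for f in fanouts)


# Hypothetical fan-outs for three nets, including a 1-in 1-out net.
depth_for_all = uniform_depth([1, 20, 130])
```

Under this model a fan-out of up to 64 fits in two levels (8 x 8), while anything from 65 to 512 needs three, which is consistent with the 3-deep cascades mentioned for the 128+ cases.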
lkclthen tied that in properly to the H Clock-Trees as well16:38
lkclit's one of the reasons why we were delayed by about... mmm... 8 months16:39
lkclbecause after doing that he also had to add Antenna Diodes (to protect gates from charge build-up on long wires during fabrication)16:39
lkclwhich again are fully automated in coriolis216:39
lkcland - oh - repeater-buffers as well.16:40
lkclsome signals are so long, they have to have repeaters16:40
lkclwhiiiichh theeeen of course mean that the gate delays are now increaaased16:40
lkclwhiiich means that then *all* gate delays have to be increased by the exact same amount :)16:41
lkclamazing how much work he did. so impressed with it.16:41
openpowerbot[slack] <mithro> If Coriolis had SKY130 support, then I could potentially fund them -- but I'm not interested in funding stuff which only works with closed source PDKs18:01
lkclmithro: we've funding to do exactly that - https://bugs.libre-soc.org/show_bug.cgi?id=58918:16
lkclfrom the Skywater 130nm PDK rules, Staf can do an automated build of FlexLib --> FlexLibC4MSky13018:17
lkclhe's done FreePDK45 already (not under NDA, but it's not practical / useable, obviously, except for parallel builds / demos)18:18
lkclhttps://gitlab.com/Chips4Makers/c4m-pdk-freepdk45/-/releases18:18
lkclthe Imec TSMC 180nm build of FlexLib was obviously under NDA18:19
lkclFlexLib will, if given the Skywater 130nm PDK, produce output that's useable by *both* coriolis2 and the OpenLANE (magic) toolchain18:20
lkclso that's really good to hear. really encouraging.  also, if it's planned far enough in advance, and there's a clear plan, it's quite likely that we can put in an NLnet Grant Request for it.18:22
lkclcaveat: one participant has to have a home address in the EEC. they can have minimal participation :)18:22
