lkcl | Ben: we've a GPIO mux (aka pinmux) on the list for Libre-SOC. there's room for collaboration / re-use there | 07:19 |
---|---|---|
lkcl | Ben: for clock gating, you need an actual clock gating Cell, and you need the tools to understand that cell. | 07:20 |
lkcl | Anton: we've been told by Tim Edwards that they're not interested in supporting any project of any kind that wishes to do replacements for the Skywater / Google Open MPW Management Core. | 07:22 |
lkcl | it is considered "too much effort" | 07:22 |
lkcl | Anton: for Libre-SOC we used C4M-JTAG which provides a full JTAG TAP Finite State Machine. it works extremely well and is written in nmigen | 07:23 |
lkcl | https://gitlab.com/Chips4Makers/c4m-jtag/-/blob/master/c4m/nmigen/jtag/tap.py | 07:23 |
lkcl | with full unit tests (in cocotb) | 07:24 |
lkcl | it also supports boundary scan chains. | 07:24 |
lkcl | already | 07:24 |
lkcl | when the Libre-SOC 180nm ASIC comes back next month we will be able to do both a full IO boundary scan | 07:26 |
lkcl | *and* still be able to test the read and write functionality of the peripherals via the JTAG port if any of the individual IOpads happen to be damaged / non-functional during manufacture | 07:27 |
lkcl | i have support for this already integrated into litex. | 07:27 |
lkcl | i strongly recommend not underestimating how complex it is to integrate a full boundary scan with litex and a core into an ASIC | 07:28 |
lkcl | it was approximately two months to write and about *six to eight* months to fully debug | 07:28 |
lkcl | fighting litex (and the fragility inherent in migen) every single damn step of the way. | 07:29 |
lkcl | Ben: both clang and gcc work when building yosys. not for any particular reason, out of curiosity i tend to build it with clang :) | 07:31 |
lkcl | in Libre-SOC, i've already done the ASIC-level integration of litex, behind a boundary-scannable set of IO pads. | 07:32 |
lkcl | it was... complex. made even more difficult by the fact that florent was, i have to point out, extremely hostile and obstructive. | 07:33 |
lkcl | technically, you have to *replace* the front-end litex peripheral classes - which with very few exceptions do not have their IO pads abstracted out - with non-FPGA-specific variants | 07:34 |
lkcl | this in many cases (particularly those that are at bit-banging-level) actually means duplicating and replacing the entirety of the FPGA-only-supporting litex peripheral | 07:35 |
lkcl | one of the rare exceptions is litex DRAM. | 07:35 |
lkcl | that is because it has supported so many different systems that its front-end connectivity was abstracted out with a base "IO" class a loooong time ago | 07:36 |
lkcl | i was therefore able to quite easily write a replacement "ASIC" IO front-end class that conformed to the API, which i was able to deduce, then hook that into the boundary-scan-system i developed | 07:37 |
lkcl | and it all "worked" | 07:37 |
lkcl | likewise, interestingly, SD/MMC also turned out to be similarly abstracted out, although the abstraction is quite extensive, because (like DRAM) it involves clocks | 07:38 |
lkcl | I2C, SPI, GPIO, PWM and UART, these were not so fortunate. | 07:38 |
lkcl | i had to pretty much literally duplicate the entire functionality of all of those litex peripherals (including their CSR register portions) | 07:39 |
lkcl | because the IO - which assumes FPGA - was so directly tied *to* the actual code that implements the CSRs, because they're all barely above bit-banging level | 07:39 |
lkcl | here's the resultant code: | 07:42 |
lkcl | https://git.libre-soc.org/?p=libresoc-litex.git;a=blob;f=ls180soc.py;hb=HEAD | 07:42 |
lkcl | one of the more complex issues of doing boundary scanning is, you need to keep in sync **FOUR** separate and completely distinct pieces of code with the **EXACT** same pin definition and pin order. | 07:43 |
lkcl | trying to do that manually would clearly be insane, and result in costly duplication of effort even just to change one pin (four sets of changes), and potentially sync-up mistakes | 07:44 |
lkcl | therefore i auto-generated them all | 07:45 |
lkcl | this is the program where the specifications for pinouts start: | 07:45 |
lkcl | https://git.libre-soc.org/?p=pinmux.git;a=blob;f=src/spec/ls180.py;hb=HEAD | 07:45 |
lkcl | it can indeed cope with being able to specify GPIO pinmuxes, Ben | 07:46 |
lkcl | although the work there was initially done (over 2 years ago now) to help IIT Madras University, so the only peripheral interconnect and pinmux code-generator-backend that works is the Bluespec one | 07:47 |
lkcl | (that's the work that inspired OpenTITAN to develop earl-grey) | 07:47 |
lkcl | (which is another auto-generated peripheral-interconnect-auto-generator, using PHP-style templated verilog code-snippets) | 07:48 |
lkcl | (do the math on that one :) ) | 07:48 |
lkcl | the ls180 pinmux generates simple straightforward machine-readable text files as its output, which could potentially be read by other tools. | 07:49 |
lkcl | there is a corresponding "reader" (in python) which picks up those machine-readable simple text files (and JSON files) and understands the pinmux format | 07:49 |
lkcl | inside the Libre-SOC HDL, because i in absolutely no way wanted to do this in migen / litex, i used the ls180 *pinmux* to create the connectivity to Staf (Chips4Makers) JTAG Boundary scanning | 07:51 |
lkcl | you have to create *pairs* of pins. | 07:51 |
lkcl | one set of IO connections (in, out, in-out-enable) to connect to the actual IO-pads | 07:51 |
lkcl | one set which goes into the JTAG TAP boundary scan | 07:52 |
lkcl | that's a *pair* of connections *per pin* | 07:52 |
lkcl | where each IOpad can have up to *three* wires *each* | 07:52 |
lkcl | and - this is where it was absolute hell - of course the direction (its I/O) changes depending on whether you are routing it *in* to the JTAG tap or *out* to the IO pad | 07:53 |
lkcl | ins become outs and outs become ins | 07:53 |
lkcl | horribly confusing :) | 07:53 |
lkcl | some of the bugs were only caught after the coriolis2 tools - at the *transistor* level - reported them! | 07:54 |
lkcl | litex was *completely* incapable of helping out to catch errors... because it uses migen, and migen has absolutely zero checking of its pin directions of any kind | 07:54 |
lkcl | hence why the debugging was spread out over something mad like an 8-10 month period | 07:55 |
lkcl | so the 2nd location was inside the Libre-SOC HDL | 07:57 |
lkcl | the *third* location which needed to understand the pinouts, in order to do boundary scan, was litex. | 07:58 |
lkcl | i had to use the pinmux-reader system to actually *dynamically* generate a set of IO specifications, which are normally done statically in litex | 07:59 |
lkcl | correction, sorry: https://git.libre-soc.org/?p=libresoc-litex.git;a=blob;f=libresoc/ls180.py;hb=HEAD | 08:00 |
lkcl | looks like i defined the *external* pins "by hand" | 08:00 |
lkcl | i _meant_ to auto-generate that :) | 08:00 |
lkcl | i did however have to create a second ConstraintManager | 08:01 |
lkcl | https://git.libre-soc.org/?p=libresoc-litex.git;a=blob;f=libresoc/core.py;h=f22925bbd7c045f7483d65579956a9f5faa9db0c;hb=42f7357660b245c4491297d24eebc28b4ac2c21f#l325 | 08:01 |
lkcl | remember you have *two duplicate* sets of wires! | 08:02 |
lkcl | one goes straight from the IO-pads directly to the JTAG module | 08:02 |
lkcl | the other comes **OUT** of the JTAG module and goes directly to the peripherals | 08:02 |
lkcl | litex has to be made to understand that! | 08:03 |
lkcl | given that the front-end for IO is handled by a ConstraintManager, i had to therefore have a duplicate ConstraintManager | 08:03 |
lkcl | it was a frickin lot of work | 08:04 |
lkcl | and i can categorically state, with confidence, that litex - and its developers - are in no way prepared to cope with this level of complexity. | 08:05 |
lkcl | mithro: i know you like litex. i am aware that you've stated many times, "nobody *in your experience* has ever had serious problems with litex" | 08:09 |
lkcl | the reason is that nobody has ever attempted as complex a third party modification / use-case (in such a short timeframe) as what i did | 08:09 |
lkcl | they've always either used *what is already there* | 08:10 |
lkcl | or they've done it with Florent's help (and paid him EUR 100/hr) | 08:10 |
lkcl | which is five or more times the rate i've been working at (around EUR 1500 a month) for the past four years. | 08:12 |
lkcl | oh, nearly forgot: the 4th location is the actual IO ring of the ASIC | 08:17 |
lkcl | you also need to specify the IO pads, their positions, and their pin-types | 08:17 |
lkcl | and ensure that the corona matches up perfectly (in the correct, expected order) with the HDL | 08:17 |
lkcl | earlier versions of this code were horribly complex (manually spelled out) | 08:19 |
lkcl | https://git.libre-soc.org/?p=soclayout.git;a=blob;f=experiments9/tsmc_c018/coriolis2/ioring.py;hb=HEAD | 08:19 |
lkcl | but i managed to persuade Jean-Paul to let the pinmux auto-generator handle it | 08:19 |
lkcl | and we committed the JSON file to the repository, along with the verilog, in order to not have any build-tool dependency problems | 08:20 |
lkcl | ah here we go: | 08:22 |
lkcl | https://git.libre-soc.org/?p=soclayout.git;a=blob;f=experiments9/tsmc_c018/doDesign.py;h=30206b63647c64b170ac2ba3a7996231585a1f66;hb=c6c9c87c733c82657fd9f11935219f838469f817 | 08:22 |
lkcl | this is where Jean-Paul decided to do the IO pads manually | 08:22 |
lkcl | duplicating the auto-generated work | 08:22 |
lkcl | i left his code in (for posterity) but you can see here, it's *not used*: | 08:24 |
lkcl | https://git.libre-soc.org/?p=soclayout.git;a=blob;f=experiments9/tsmc_c018/doDesign.py;hb=HEAD#l217 | 08:24 |
lkcl | instead that reads the auto-generated IOpad spec, from the auto-generated JSON file, that matches *EXACTLY* with the litex peripheral set, matches *EXACTLY* with the JTAG boundary-scan peripheral set | 08:25 |
lkcl | because they all use the exact same JSON file, generated by the pinmux specification program | 08:25 |
lkcl | it's.... absolutely mental to think that the majority of Industry-standard commercial ASIC designers do this entirely by hand. | 08:26 |
openpowerbot | [slack] <Benjamin Herrenschmidt> What IRC channel is this ? | 09:34 |
openpowerbot | [slack] <Benjamin Herrenschmidt> I wouldn't recommend using a LiteX UART ... | 09:35 |
openpowerbot | [slack] <Benjamin Herrenschmidt> but a standard 16550 instead, there are plenty of these around | 09:35 |
lkcl | #microwatt on irc.libera.chat, thanks to toshywoshy's IRC-slack-mattermost bridge | 09:45 |
lkcl | yeah we ran out of time / resources to use a 16550 compliant UART in ls180. | 09:46 |
lkcl | turns out that the litex uart is read-compatible with a 16550 uart. | 09:46 |
lkcl | just not interrupt-compatible or write-compatible | 09:47 |
lkcl | Anton: it would be great not to have to use the "Management Core" of SkyWater 130nm. we have a PLL and IOpads that we'd like to test: we cannot confirm that the IO pads will be functional... | 15:21 |
lkcl | ... because what's the point of laying out an IO pad Cell if you can't actually access it via the external pins, and it doesn't have a bond-wire? | 15:21 |
lkcl | if you're forced to use the "Management Core" (caravel?), you'll still have the same type of "corona" - it's the one defined as containing the 48 IO pad connections, around the edge | 15:23 |
lkcl | that will still need to be "declared" in some fashion that the tools that you use understand at the netlist level | 15:23 |
lkcl | Anton: with OpenLANE did you happen to run into high fan-out driving issues? | 16:35 |
lkcl | one pin drives 64 or greater outputs, and of course the current required to do so is enormous, so it impacts the timing? | 16:36 |
lkcl | Jean-Paul very kindly added High-Fanout to coriolis2 so that it can cope with that | 16:36 |
lkcl | we have in some cases a massive fanout - well over 128 "drivers" | 16:36 |
lkcl | he therefore added an automatic system for creating buffer fan-out cascades, using a Standard 1-in 8-out Buffer Cell | 16:37 |
lkcl | and - and this is the kicker - made sure that *all* netlists then had the exact same number of buffer chains (even if they were only 1-in 1-out) | 16:38 |
lkcl | in some cases (128+) we had to have 3-deep buffer fan-outs | 16:38 |
lkcl | therefore, Jean-Paul's work added 3-deep buffer fan-outs to *all* netlists | 16:38 |
lkcl | then tied that in properly to the H Clock-Trees as well | 16:38 |
lkcl | it's one of the reasons why we were delayed by about... mmm... 8 months | 16:39 |
lkcl | because after doing that he also had to add Antenna Diodes (to stop ESD) | 16:39 |
lkcl | which again are fully automated in coriolis2 | 16:39 |
lkcl | and - oh - repeater-buffers as well. | 16:40 |
lkcl | some signals are so long, they have to have repeaters | 16:40 |
lkcl | whiiiichh theeeen of course mean that the gate delays are now increaaased | 16:40 |
lkcl | whiiich means that then *all* gate delays have to be increased by the exact same amount :) | 16:41 |
lkcl | amazing how much work he did. so impressed with it. | 16:41 |
openpowerbot | [slack] <mithro> If Coriolis had SKY130 support, then I could potentially fund them -- but I'm not interested in funding stuff which only works with closed source PDKs | 18:01 |
lkcl | mithro: we've funding to do exactly that - https://bugs.libre-soc.org/show_bug.cgi?id=589 | 18:16 |
lkcl | from the Skywater 130nm PDK rules, Staf can do an automated build of FlexLib --> FlexLibC4MSky130 | 18:17 |
lkcl | he's done FreePDK45 already (not under NDA, but it's not practical / useable, obviously, except for parallel builds / demos) | 18:18 |
lkcl | https://gitlab.com/Chips4Makers/c4m-pdk-freepdk45/-/releases | 18:18 |
lkcl | the Imec TSMC 180nm build of FlexLib was obviously under NDA | 18:19 |
lkcl | FlexLib will, if given the Skywater 130nm PDK, produce output that's useable by *both* coriolis2 and the OpenLANE (magic) toolchain | 18:20 |
lkcl | so that's really good to hear. really encouraging. also, if it's planned far enough in advance, and there's a clear plan, it's quite likely that we can put in an NLnet Grant Request for it. | 18:22 |
lkcl | caveat: one participant has to have a home address in the EEC. they can have minimal participation :) | 18:22 |
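
Editor's note on the buffer fan-out discussion (16:36 to 16:38): the "3-deep" figure follows from simple arithmetic if one assumes a uniform tree built from the 1-in 8-out buffer cell mentioned in the log. The sketch below is not coriolis2 code; the function names `buffer_depth` and `balanced_depths` and the net names are hypothetical and only illustrate why 128+ loads need three levels, and what "the same depth on *all* netlists" means.

```python
def buffer_depth(loads, fanout=8):
    """Smallest number of buffer levels d such that fanout**d >= loads.

    Assumes a uniform tree of identical 1-in `fanout`-out buffers
    (an illustrative simplification, not the coriolis2 algorithm).
    """
    depth, reach = 0, 1
    while reach < loads:
        reach *= fanout
        depth += 1
    return depth


def balanced_depths(net_loads, fanout=8):
    """Give every net the depth required by the worst-case net,
    so that all signals see the same number of buffer stages."""
    needed = {net: buffer_depth(n, fanout) for net, n in net_loads.items()}
    worst = max(needed.values(), default=0)
    return {net: worst for net in needed}


# Hypothetical nets: one heavily loaded, one point-to-point.
nets = {"wide_enable": 130, "single_wire": 1}
print(buffer_depth(64), buffer_depth(128))
print(balanced_depths(nets))
```

With a fan-out of 8, two levels reach at most 64 loads, so anything from 65 to 512 loads needs three; balancing then pads even a 1-in 1-out net to that same depth, as described in the log.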
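
Editor's note on the boundary-scan wiring (07:43 to 07:53): the idea of deriving every consumer from one machine-readable pin spec, and the way directions flip between the pad side and the core side of the JTAG block, can be sketched as follows. This is a minimal illustration under stated assumptions: the JSON layout, the `PINS` list, the wire naming and the function names are all hypothetical and are not the actual Libre-SOC pinmux format or the C4M-JTAG API.

```python
import json

# Hypothetical pin spec: (name, type), type is "in", "out" or "inout".
PINS = [("uart_tx", "out"), ("uart_rx", "in"), ("gpio0", "inout")]

# Wires per pad type: i = input, o = output, oe = output-enable.
# An in-out pad therefore has up to three wires.
WIRES = {"in": ["i"], "out": ["o"], "inout": ["i", "o", "oe"]}


def to_json(pins):
    """Serialise the spec once; every consumer reads this same text."""
    return json.dumps([{"name": n, "type": t} for n, t in pins])


def scan_wires(spec_json):
    """Return (pad_side, core_side) wire lists in spec order.

    Directions are as seen from the boundary-scan block: an "i" wire
    enters from the pad and leaves towards the core, while "o" and
    "oe" wires enter from the core and leave towards the pad.
    """
    pad_side, core_side = [], []
    for pin in json.loads(spec_json):
        for w in WIRES[pin["type"]]:
            pad_dir = "in" if w == "i" else "out"
            core_dir = "out" if w == "i" else "in"
            pad_side.append((f"{pin['name']}_pad_{w}", pad_dir))
            core_side.append((f"{pin['name']}_core_{w}", core_dir))
    return pad_side, core_side


spec = to_json(PINS)
pad, core = scan_wires(spec)
for p, c in zip(pad, core):
    print(p, c)
```

Because both lists come from one pass over the same spec, they cannot drift out of order, and every pad-side direction is the opposite of its core-side partner, which is the "ins become outs and outs become ins" effect described in the log.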