Tuesday, 2021-11-09

octavius	lkcl, thanks for the comments, I'll update the pins as per the requirements.	08:49
octavius	As well as merge the wiki pages	08:51
octavius	Today will be a little busy as I'm attending the Cambridge Wireless conference (CWIC2021) online. I was thinking of dropping an email with any interesting (public) info that I come across. Which mailing list should I use for it?	08:51
lkcl	octavius, or rename it to a sub-page.	10:40
lkcl	libre-soc-dev is probably fine	10:40
sadoon_albader[m	Hi	10:42
lkcl	hi sadoon_albader[m	10:42
sadoon_albader[m	So I'm trying to get into libre-soc and I'm reading the relevant pages on the website	10:42
sadoon_albader[m	I'm really impressed with all this but also extremely intimidated and don't know where to start :')	10:42
sadoon_albader[m	I have a background in computer engineering and specifically embedded system design, I've done VHDL, Verilog, and SystemVerilog work, but this whole nmigen thing is scaring me xD	10:43
sadoon_albader[m	Any suggestions on where I should start? I've been making a small 8-bit microprocessor of my own, on an FPGA, I'm thinking of completing that first to understand the challenges that I might face. Am I on the right track?	10:45
sadoon_albader[m	Any tips and suggestions are highly appreciated	10:46
octavius	Hi Sadoon, I'd look at this page for info on nmigen. https://libre-soc.org/docs/learning_nmigen/	10:52
sadoon_albader[m	Thanks, I'll read it as soon as I finish the HDL workflow page :)	10:53
octavius	I went through Robert Baruch's tutorial series, covering the nmigen language. Now I can somewhat read nmigen (however same as you, the learning curve is quite steep XD)	10:53
sadoon_albader[m	It sounds like I'm about to learn a very different workflow from basic HDL stuff though right?	10:54
sadoon_albader[m	I guess it's part of being a computer engineer with things evolving all the time, gotta keep up heh	10:54
octavius	The difference is what you write is more of a behavioural model. nMigen isn't an HDL as much as an HDL _generator_	10:54
octavius	what you get out of it is either intermediate representation (yosys IR) or Verilog	10:55
octavius	With this workflow, HDL is treated as assembly or machine code (which you don't touch most of the time)	10:55
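
A toy illustration of the "HDL generator" idea octavius describes: ordinary Python that emits Verilog text. This is not nmigen's API. `generate_adder` and the module naming are made up for this sketch, and nmigen builds a structured design before producing Verilog or yosys IR rather than pasting strings together.

```python
def generate_adder(width):
    """Return Verilog source for an unsigned adder of the given bit width."""
    # Hypothetical helper: the Python program is the thing you edit,
    # the Verilog it prints is treated as generated output.
    return "\n".join([
        f"module adder{width}(",
        f"    input  [{width - 1}:0] a,",
        f"    input  [{width - 1}:0] b,",
        f"    output [{width}:0] sum",  # one extra bit for the carry
        ");",
        "    assign sum = a + b;",
        "endmodule",
    ])


print(generate_adder(8))
```

Changing the argument regenerates a different module without touching any Verilog by hand, which is the sense in which the HDL is treated like assembly.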
sadoon_albader[m	Very interesting	10:56
lkcl	sadoon_albader[m, nice!	10:56
sadoon_albader[m	Thanks everyone! :D	10:56
lkcl	yes, as a software engineer aged 51 i have been able to adapt to new things continuously for 44 years programming	10:56
lkcl	so i learned HDL like, only 2.5 years ago	10:57
lkcl	i found using yosys "show top" to be the most useful thing	10:57
sadoon_albader[m	Amazing, I'm in the virtual presence of veterans heh	10:57
lkcl	by outputting the design (verilog or ilang) to a file every time after an edit	10:58
lkcl	then running yosys "show top" i was able to see the gate-level representation, which i understood better than the python code itself	10:58
lkcl	but over a period of 6 months got used to it	10:58
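
The edit, dump, view loop lkcl describes could be scripted roughly like this. A sketch only: `yosys_show_command` is a hypothetical helper, the file names are placeholders, and it assumes the design has already been written out as a `.v` or `.il` file. It only builds the command line and leaves running it to the caller.

```python
import shlex


def yosys_show_command(design_path, top="top"):
    """Build a yosys invocation that loads a design and draws one module."""
    # Assumption: choose the yosys front end from the file extension.
    if design_path.endswith(".v"):
        reader = "read_verilog"
    elif design_path.endswith(".il"):
        reader = "read_ilang"
    else:
        raise ValueError(f"unrecognised design file: {design_path}")
    # "show <module>" opens the graphical view of that module.
    script = f"{reader} {design_path}; show {top}"
    return ["yosys", "-p", script]


cmd = yosys_show_command("top.il")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` after each edit, on a machine where yosys and a graph viewer are installed.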
sadoon_albader[m	That's familiar territory to me lkcl	10:58
lkcl	sadoon_albader[m, very cool	10:59
lkcl	took about 3 weeks to adapt	10:59
sadoon_albader[m	Let's see how long it takes me :D	10:59
lkcl	and yes, we use software engineering practices, so develop modules that start from "requirements"	10:59
lkcl	then unit tests for those	10:59
lkcl	then write a module that uses other modules, and write unit tests for that.	11:00
lkcl	chain-chain-chain-chain	11:00
sadoon_albader[m	Also thanks for keeping the website lightweight, I like sitting in coffee shops and using my old PowerBook G4 to do light work like reading and stuff :D	11:00
lkcl	:)	11:00
lkcl	you can git clone the wiki repo and use it offline if you like	11:00
lkcl	ooo a G4, ooo :)	11:00
lkcl	it's entirely static pages https://git.libre-soc.org/?p=libreriscv.git;a=summary	11:01
sadoon_albader[m	That workflow is very similar to what I did in uni, I designed a poly1305 hardware processor core like that, module, unit test, simulation, then hardware	11:01
lkcl	very cool	11:02
* sadoon_albader[m loves my good ol powerpc machines		11:02
octavius	How sensible? I wonder why my uni didn't focus on writing tests? Only really learned about the concept a few years ago	11:02
lkcl	sadoon_albader[m, you found this page? https://libre-soc.org/HDL_workflow/	11:04
sadoon_albader[m	Uni didn't teach me much tbqh	11:07
sadoon_albader[m	It's mostly self-learning octavius	11:07
sadoon_albader[m	lkcl: yes, I'm almost halfway through that page	11:07
lkcl	i hope you appreciate some of the dry humour in it	11:08
sadoon_albader[m	The AOL and gmail bashing is keeping me going	11:11
octavius	sadoon_albader[m: very true	11:12
sadoon_albader[m	If I'm using an OpenPOWER machine I assume I won't need qemu right?	11:12
lkcl	ah you will - until we add a runner that can set up an on-demand (command-line) Virtual Machine	11:14
lkcl	which, ironically, involves KVM, and, ironically, the easiest way to access that is... qemu	11:14
lkcl	but, we haven't used qemu for development in about.... mmm... 2 years?	11:15
lkcl	it was used very early on when developing the integer instructions, because how else would we confirm the unit tests were correct?	11:15
lkcl	we had to compare them against something	11:16
lkcl	but we weren't expecting that process to actually find obscure bugs in qemu, but it did	11:16
lkcl	a divide-overflow bug	11:16
lkcl	by running qemu single-step and extracting full registers automatically with python-gdbmi, we could compare against the HDL and the simulator, ISACaller	11:18
lkcl	i did side-by-side comparisons against microwatt in a slightly different way	11:19
lkcl	dumping the regs via the DMI interface, which was deliberately made 100% compatible with microwatt's DMI interface	11:19
sadoon_albader[m	Nice	11:21
sadoon_albader[m	I see that GHDL is part of the workflow, are you using VHDL in libre-soc as well?	11:22
octavius	https://ftp.libre-soc.org/course_18oct2021/drawing-2.svg	11:34
octavius	From what I know, we use nmigen exclusively and we have no verilog/vhdl modules that we add to the top level. Is that right Luke?	11:35
octavius	You may see VHDL at the alliance stage (before the IC layout is generated)	11:36
octavius	This presentation Luke gave for the OpenPOWER course is pretty good at summarising the overall flow: https://www.youtube.com/watch?app=desktop&v=hzbLEEjJdOI	11:37
sadoon_albader[m	Awesome, I'll look at that in a bit	11:53
lkcl	octavius, GHDL is used by cocotb	12:05
lkcl	and also microwatt, which is a critical research resource that we're tracking (in many cases by literally verbatim translating its source code to nmigen - thousands of lines of it) is in VHDL	12:06
octavius	"COroutine based COsimulation TestBench", I keep hearing about it, but haven't looked into it yet: https://docs.cocotb.org/en/stable/index.html	12:08
sadoon_albader[m	Ah I see	12:08
octavius	So you use it to verify microwatt behaviour Luke?	12:08
lkcl	octavius, yes	12:16
lkcl	https://git.libre-soc.org/?p=libresoc-litex.git;a=blob;f=sim.py;hb=HEAD	12:16
lkcl	note the "from microwatt import Microwatt"	12:17
lkcl	by (cough) commenting in/out the alternative class, and, note the use of DMI "dump" total-mess-of-a-FSM below	12:17
lkcl	$display can dump out full regfile contents after executing each instruction	12:17
lkcl	so you run a program with Libre-SOC, blat, a massive debug log appears	12:18
lkcl	then comment-in microwatt, re-run it, blat, another massive debug log appears	12:18
lkcl	it's then a matter of "diff -u" to find regfile discrepancies	12:18
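
The same comparison can be done from Python with the standard `difflib` module. A minimal sketch: the log format below (one `pc=... r3=...` line per instruction) is invented for illustration and is not the actual `$display` output of either core.

```python
import difflib


def diff_reg_dumps(libresoc_log, microwatt_log):
    """Return unified-diff lines between two register dump logs."""
    return list(difflib.unified_diff(
        libresoc_log.splitlines(),
        microwatt_log.splitlines(),
        fromfile="libresoc.log",
        tofile="microwatt.log",
        lineterm="",
    ))


# Hypothetical dumps that agree until the third instruction.
libresoc = "\n".join([
    "pc=0000 r3=0000000000000001",
    "pc=0004 r3=0000000000000002",
    "pc=0008 r3=00000000ffffffff",
])
microwatt = "\n".join([
    "pc=0000 r3=0000000000000001",
    "pc=0004 r3=0000000000000002",
    "pc=0008 r3=0000000000000000",
])

for line in diff_reg_dumps(libresoc, microwatt):
    print(line)
```

The first `-`/`+` pair in the output marks the instruction where the two register files diverge, which is the input to turn into a unit test.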
octavius	So when running with Libre-SOC, is cocotb used?	12:19
lkcl	find a problem, write a unit test with that exact same input, run, debug, repeat.	12:19
lkcl	mmmm no not yet. ok, long story	12:19
octavius	Well I guess it can't, right?	12:19
lkcl	yes, but only for pre-PnR extraction from coriolis2	12:19
octavius	You'd need to compile to VHDL	12:19
octavius	yeah	12:20
lkcl	which was so insanely large for the post-PnR we didn't end up running it	12:20
lkcl	but did for a few test ASICs	12:20
octavius	hehehe	12:20
lkcl	yes, all the scripts are there	12:20
lkcl	https://git.libre-soc.org/?p=soc-cocotb-sim.git;a=summary	12:20
octavius	The joy of order-of-magnitude complexity XD	12:20
lkcl	mental	12:21
lkcl	i estimated it would be 150 days to compile the full ASIC with verilator	12:21
octavius	On the super-powerful machine?	12:22
lkcl	that's just compiling - not "running"	12:22
lkcl	on any super-powerful modern machine with at least 128 GB of RAM	12:22
octavius	hahahaha	12:22
octavius	I'm a little short	12:22
lkcl	one of the modules required 36 GB of resident RAM, the c++ code was so large	12:22
octavius	I guess swap could work (very badly)	12:22
lkcl	not a snowball in hell's chance	12:23
octavius	Too much I/O delay?	12:23
lkcl	you'd need 2-3 orders of magnitude longer compile time	12:23
octavius	damn	12:23
lkcl	it's down to how inter-connected the c++ code is	12:23
lkcl	you'd swap out one page, only to have to re-read it back in a few ms later	12:24
lkcl	aka "thrashing"	12:24
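
A small model of the thrashing behaviour described above. It assumes pure least-recently-used eviction, which real kernels only approximate, and the page and frame counts are made up. It shows that a cyclic access pattern over a working set just one page larger than memory faults on every access.

```python
from collections import OrderedDict


def count_page_faults(accesses, n_frames):
    """Count faults for a page access sequence under LRU with n_frames frames."""
    resident = OrderedDict()  # keys are resident pages, oldest first
    faults = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            if len(resident) >= n_frames:
                resident.popitem(last=False)  # evict least recently used
            resident[page] = True
    return faults


# Ten passes over five pages, touched in the same order each time.
pattern = list(range(5)) * 10
print(count_page_faults(pattern, n_frames=5))  # working set fits
print(count_page_faults(pattern, n_frames=4))  # one frame short
```

With five frames only the first pass faults (5 faults); with four frames all 50 accesses fault, because each page is evicted just before it is needed again.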
lkcl	there's a long-standing binutils gnu-ld bug about that, which after multiple years still hasn't been addressed	12:24
octavius	http://www.thrashing.com/thrashing-in-computer-science.html	12:24
octavius	Probably not critical enough a bug?	12:24
lkcl	much as i don't like to use the word, some... idiot... went and removed Dr Stallman's in-memory algorithms from gnu-ld, in the late 90s.	12:25
lkcl	on the basis, "4gb address space is enough for anybody"	12:25
lkcl	oh it's a real serious one.	12:25
octavius	Do you remember what version of gcc that was?	12:25
octavius	2.9.5?	12:25
lkcl	it's not gcc, it's binutils (gnu ld)	12:25
octavius	ah ok	12:25
lkcl	gcc fortunately still has the in-memory restriction	12:26
lkcl	i belieeeve somebody tried to remove that too, "because it's soooo complicated, whyyy would anybody need thaaaaat"	12:26
lkcl	and of course they soon found out why	12:26
octavius	One of the first search results: https://mail.gnu.org/archive/html/bug-binutils/2018-12/msg00170.html	12:27
lkcl	yyep, that's my bugreport	12:27
lkcl	i created a repro case - a gnu ld/gold torture generator	12:27
octavius	is it on a public repo?	12:28
lkcl	it's a program (in python of course) which auto-generates random programs with a command-line specified number of files, functions, parameters-to-functions, and number of calls to other auto-generated functions	12:28
sadoon_albader[m	<lkcl> "on any super-powerful modern..." <- That's the point where I mention "hey I have that much RAM on my Talos II Lite"	12:28
lkcl	with some static arrays and stack-based arrays thrown in	12:29
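
A rough idea of what such a generator might look like. This is not lkcl's actual tool: every name here (`generate_sources`, `fn_N`, `gen_N.c`, the array sizes) is an assumption for illustration. It produces C source text for a chosen number of files, functions per file, parameters per function and calls per function, with a static array and a stack array in each function.

```python
import random


def generate_sources(n_files, n_funcs, n_params, n_calls, seed=0):
    """Return {filename: C source} for a randomly interlinked program."""
    rng = random.Random(seed)  # seeded so a failing case can be reproduced
    total = n_files * n_funcs
    params = ", ".join(f"int p{i}" for i in range(n_params)) or "void"
    args = ", ".join(f"p{i}" for i in range(n_params))
    # Every file declares every function so calls can cross file boundaries.
    protos = "".join(f"int fn_{i}({params});\n" for i in range(total))
    sources = {}
    for f in range(n_files):
        chunks = [protos]
        for k in range(n_funcs):
            index = f * n_funcs + k
            body = [
                f"int fn_{index}({params}) {{",
                "    static int table[64];",
                "    int scratch[16] = {0};",
                "    int acc = table[0] + scratch[0];",
            ]
            for _ in range(n_calls):
                # Only call higher-numbered functions, so there is no recursion.
                if index + 1 < total:
                    callee = rng.randrange(index + 1, total)
                    body.append(f"    acc += fn_{callee}({args});")
            body += ["    return acc;", "}", ""]
            chunks.append("\n".join(body))
        sources[f"gen_{f}.c"] = "\n".join(chunks)
    return sources


sources = generate_sources(n_files=3, n_funcs=4, n_params=2, n_calls=2, seed=1)
print(sorted(sources))
```

Scaling the four counts up is what pushes the object files, and therefore the link step, towards the multi-gigabyte sizes mentioned in the discussion.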
lkcl	sadoon_albader[m, cooool :)	12:29
octavius	lkcl, "it's a program (in python of course)" why would I even think any different XD	12:29
lkcl	so i was able to use it to exceed 20 GB program sizes	12:29
sadoon_albader[m	Hey if you get 16GB RDIMMs for cheap, you buy a bunch of em	12:29
octavius	You have a Talos II sadoon? Very cool	12:29
lkcl	requiring over 6 GB of resident RAM at the linker phase	12:29
octavius	XD	12:30
lkcl	both gnu-ld and gnu-gold - the supposed "better" replacement - barfed	12:30
lkcl	that report was 2018 and it's still not been addressed	12:30
octavius	Why do you think that is? Not a common use-case?	12:31
lkcl	oh it's a common use-case. people here have said that they've encountered regular repeatable build failures	12:31
lkcl	when 3 or more large pieces of software end up compiling at the same time	12:31
octavius	Too difficult to solve then?	12:31
lkcl	of course because those pieces of software take a long time, they overlap regularly. 192 mb of RAM and they got hard catastrophic failures requiring a reboot	12:32
lkcl	yes, basically	12:32
octavius	So the solution is just to run one compilation job?	12:32
lkcl	it's as complex as large matrix multiply (large as in: 100,000+ sized matrices)	12:33
lkcl	no, it's much worse than that	12:33
lkcl	anyway, i have to focus	12:33
lkcl	i've an hour to get something done on the core	12:33
octavius	Thanks for the explanations luke!	12:33
lkcl	:)	12:33
lkcl	sadoon_albader[m, if you're around at UTC 22:00 (don't know your TZ) we have a jitsi meet	12:34
lkcl	octavius, could you pass on sadoon_albader[m the URL if interested?	12:34
lkcl	i leave it with you	12:34
octavius	Sure	12:34
sadoon_albader[m	I'm at UTC+3 so that'd be 1AM	12:35
sadoon_albader[m	I'll hang around if I'm up :)	12:35
sadoon_albader[m	Thanks for the invite	12:35
octavius	I do tend to find devs on libresoc stay in late more sadoon XD (I tend to go to bed earlier)	12:36
sadoon_albader[m	I like to wake up a little before sunrise which is about 5:30AM around here, everyone thinks it's weird but I find it very refreshing and sets me up for a productive day	12:42
octavius	I like waking up early too, much easier to get work done when no one's awake XD, sometimes harder to do it though (especially in winter)	12:43
*** kylel1 is now known as kylel		14:16
*** kylel1 is now known as kylel		14:49
lkcl	sadoon_albader[m, if you have an email address i can add you to the calendar invite btw	15:32
lkcl	send me a message to luke.leighton@gmail.com	15:33
sadoon_albader[m	Sure, one sec	15:33
lkcl	no rush	15:37
sadoon_albader[m	I sent you the email and also a dm here	15:38
lkcl	NLnet grants cavatools-power-isa and coriolis2 improvements have been approved!	15:44
octavius	Thanks lkcl!	15:44
octavius	So how many more years of development would that fund?	15:45
lkcl	EUR 50,000 - about... 8-10 man-months or so?	15:49
octavius	Noice	15:50
lkcl	no - more like 1 year	15:50
lkcl	that's each	15:50
lkcl	1 year for cavatools-power-isa	15:50
lkcl	1 year for coriolis2.	15:50
sadoon_albader[m	!*	16:01
sadoon_albader[m	<lkcl> "NLnet grants cavatools-power-isa..." <- Awesome;	16:01
sadoon_albader[m	Did you get my email btw? lkcl	16:02
kylel	Wow, awesome news.	16:24
lkcl	sadoon_albader[m, in spam, yes	16:42
lkcl	kylel, yeah :)	16:42
sadoon_albader[m	Damnit, well at least you received it	16:43
sadoon_albader[m	Stupid domain name issues	16:43
lkcl	i'll set a filter	16:49
sadoon_albader[m	Thanks	16:53
lkcl	cesar, i just added PriorityPickers into core, on issue of instructions	17:57
lkcl	now if there are more RSes (num_rows>1) it should, in theory, be ok	17:58
cesar	Does PriorityPickers guarantee in-order retirement? Remember, on retirement, we need to update the "in use" masks...	17:59
lkcl	you'll like this: it is technically possible for a FunctionUnit to support multiple Functions! :)	17:59
lkcl	ah no	17:59
lkcl	that's not its job	17:59
lkcl	it just prioritises (picks) one (and only one) of the many inputs	17:59
lkcl	so, for example, on regfile ports, you absolutely cannot have more than one FU try to use the same regfile port	18:00
cesar	Well, maybe PriorityPicker is not the best approach... Maybe a FIFO...	18:00
lkcl	so, you add a PriorityPicker in front, and whilst many FUs try to _request_ that regfile port, only one gets actual access	18:00
lkcl	yyeah anything that selects only one at a time	18:01
lkcl	although, a FIFO requires a latch, and a PriorityPicker is entirely combinatorial	18:01
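
A behavioural model of the "pick one and only one" idea, in plain Python rather than nmigen. It assumes the lowest-numbered requester has the highest priority, which is a choice made for this sketch. `priority_pick` is a hypothetical name, not the project's PriorityPicker class.

```python
def priority_pick(requests):
    """Return a one-hot grant mask for the lowest set bit of `requests`."""
    # x & -x isolates the lowest set bit: a pure function of the input,
    # with no stored state, matching "entirely combinatorial".
    return requests & -requests


# Three FUs (bits 0..2) requesting the same regfile port.
requests = 0b101                    # FU0 and FU2 both want the port
grant = priority_pick(requests)     # only FU0 is granted
print(f"{grant:03b}")
requests &= ~grant                  # FU0 is done; FU2 is still asking
print(f"{priority_pick(requests):03b}")
```

This prints `001` then `100`: many FUs may request, but at most one grant bit is ever set.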
cesar	Hmm, if the instructions are conflict-free, maybe it doesn't matter the order of retirement...	18:01
lkcl	yes, for now	18:02
cesar	* hazard-free	18:02
lkcl	yes, exactly	18:02
lkcl	so we have to arrange some instructions - some unit tests - which are hazard-free, initially	18:02
lkcl	because the code exists, the next task i will do is, to add RaW Hazard vector to TestIssuer	18:02
lkcl	then throw a DIV instruction at it, which should take ages	18:03
lkcl	long enough for an ADD to also be issued	18:03
lkcl	hilarious that even the TestIssuer FSM could be converted to RaW hazards :)	18:04
lkcl	heeeave, only one instruction every 10 cycles, but hey	18:04
lkcl	but, right now, it is time to eat :)	18:06
cesar	A FIFO could record the FunctionUnit dispatch order, and select the instruction to retire (which means, write back the regfile, and clear the bit in the hazard vector), which was originally the role of the FU-FU dependency matrix, if I understand well.	18:10
lkcl	FU-REGs	18:38
lkcl	FU-FU is like a linked-list of results-connected-to-results	18:38
lkcl	a Directed Acyclic Graph, more like.	18:38
lkcl	where one FU waits for the results from another FU, and the FU-FU DM stores that relationship	18:39
lkcl	in combination with that, you have to have an FU-Regs DM which records what registers the FU needs (both read and write)	18:39
lkcl	because, whilst FU-FU records "results" relationships, it does not record which regs those results came from (or go to)	18:40
lkcl	FU-Regs was called "Q-Tables" in the original 6600 literature and the patents	18:40
lkcl	very little mention or understanding of the FU-FU matrix is made in the patent or in Academic "studies" of the 6600 design	18:41
lkcl	leading to the 6600 scoreboard system being denigrated and completely undervalued for the 50 years of its existence	18:41
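
A much-simplified software model of the two matrices, to make the division of labour concrete. Assumptions: only read-after-write hazards are tracked, each FU holds one instruction, and Python sets stand in for the bit matrices. The class and method names are invented for this sketch and are not taken from the Libre-SOC source.

```python
class Scoreboard:
    """Toy FU-Regs and FU-FU bookkeeping (read-after-write hazards only)."""

    def __init__(self, n_fus):
        self.n_fus = n_fus
        # FU-Regs: which registers each FU reads and writes.
        self.fu_regs_rd = [set() for _ in range(n_fus)]
        self.fu_regs_wr = [set() for _ in range(n_fus)]
        # FU-FU: fu_fu[i] is the set of FUs whose results FU i waits for.
        self.fu_fu = [set() for _ in range(n_fus)]

    def issue(self, fu, reads, writes):
        reads, writes = set(reads), set(writes)
        # Use FU-Regs to find producers, then record only the FU-to-FU link.
        self.fu_fu[fu] = {
            other for other in range(self.n_fus)
            if other != fu and self.fu_regs_wr[other] & reads
        }
        self.fu_regs_rd[fu] = reads
        self.fu_regs_wr[fu] = writes

    def can_proceed(self, fu):
        return not self.fu_fu[fu]

    def complete(self, fu):
        self.fu_regs_rd[fu] = set()
        self.fu_regs_wr[fu] = set()
        for waits in self.fu_fu:
            waits.discard(fu)  # release every FU that was waiting on this one


sb = Scoreboard(n_fus=3)
sb.issue(0, reads=[1, 2], writes=[3])   # slow DIV producing r3
sb.issue(1, reads=[3, 4], writes=[5])   # ADD that needs r3
sb.issue(2, reads=[6, 7], writes=[8])   # independent ADD
print(sb.can_proceed(1), sb.can_proceed(2))
sb.complete(0)
print(sb.can_proceed(1))
```

The FU-FU entries say who waits on whom without naming a register, while the FU-Regs entries are what allow that relationship to be computed at issue time.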
cesar	So, how does one enforce in-order retirement (write-back to register files), which guarantees precise exceptions?	18:42
lkcl	Shadow Matrices	18:42
cesar	I think it was the role of the Reorder Buffer.	18:42
lkcl	actually, fascinatingly, you don't completely need in-order retirement	18:42
lkcl	you need "anything that cannot be undone" to be separate from "anything that can complete 100%"	18:43
lkcl	once committed to completing 100%, you absolutely cannot back out of that decision	18:43
lkcl	therefore, hilariously / fascinatingly, anything that is committed 100% to completion doesn't actually matter in which order it is done	18:43
lkcl	therefore, ironically, an in-order core does not actually _need_ to complete... in-order	18:44
lkcl	yes, the ROB (from Tomasulo) is an unnecessary restriction	18:44
lkcl	which is a characteristic of the DAG (from 6600) being represented as a cyclic buffer data structure (the ROB) in Tomasulo	18:45
lkcl	the DAG can complete in any order	18:45
lkcl	the ROB (cyclic buffer) has to complete - by definition - in FIFO (cyclic) order	18:45
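
The difference in completion order can be shown with a few lines of arithmetic. A sketch under strong assumptions: the instructions are hazard-free and already safe to complete, commit bandwidth is unlimited, and the cycle numbers are invented. Function names are hypothetical.

```python
def fifo_commit_cycles(finish_cycles):
    """Commit in program order: each entry waits for every earlier entry."""
    commits, latest = [], 0
    for cycle in finish_cycles:
        latest = max(latest, cycle)  # cannot commit before older entries
        commits.append(latest)
    return commits


def any_order_commit_cycles(finish_cycles):
    """Commit each result as soon as it is ready, regardless of age."""
    return list(finish_cycles)


# Program order: a DIV finishing at cycle 20, then an ADD finishing at cycle 3.
finish = [20, 3]
print(fifo_commit_cycles(finish))
print(any_order_commit_cycles(finish))
```

The FIFO model gives `[20, 20]`, so the ADD sits finished behind the DIV, while the any-order model gives `[20, 3]`.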
lkcl	it is possible to make "safe-to-complete" instructions of a ROB perform their result-commits out-of-order	18:46
programmerjake	rob in tomasulo is necessary for speculation, otherwise it isn't needed	18:46
lkcl	but as best i am aware none of the literature i have seen says it is possible	18:46
lkcl	yes, there are descriptions around online of Tomasulo algorithms without a ROB.	18:47
cesar	programmerjake: I thought the ROB was needed for precise exceptions with out-of-order execution, even with no speculation...	18:48
cesar	... at least it helps...	18:48
programmerjake	precise exceptions == speculation, since you're speculating that ld/st don't cause exceptions	18:52
cesar	Well, not just LD/ST cause exceptions... Could be an interrupt...	18:53
programmerjake	interrupts can easily be handled without speculation, you just tell the instruction fetch pipeline to insert a trap instruction	18:54
programmerjake	without speculation the trap would cause all later instructions to not start executing, all earlier instructions would just wait till they complete	18:55
cesar	Got it. I guess LD/ST will have to stall our in-order pipeline, just as branches will...	18:56
programmerjake	yup!	18:57
lkcl	the Solution To Everything (tm) in in-order: stall, stall, stall	19:14
lkcl	actually, the way the PowerDecoder2 works is: any interrupts make the instruction (the current instruction) be interpreted as an OP_TRAP	19:14
lkcl	you don't insert an actual trap instruction: the PowerDecoder2 ignores the current instruction entirely	19:15
lkcl	https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/power_decoder2.py;h=edf2893b3dec4749822db7d926efb4eaa0eea9b2;hb=HEAD#l1478	19:16
lkcl	everything before that was "current incoming instruction"	19:16
programmerjake	that works too	19:16
lkcl	everything after is optional and entirely erases what was done previously	19:16
lkcl	where anything that is an interrupt is converted to a type of trap	19:17
lkcl	for LD/ST, it means that when exc_happened=1, all that is needed is to hit the "exc_happened" flag in the PowerDecoder2 and then re-run the exact same instruction	19:18
lkcl	on the 2nd iteration it gets done as... a trap	19:18
lkcl	it's confusingly simple	19:19
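
The override-and-re-run behaviour can be modelled in a few lines. This is a behavioural sketch, not the PowerDecoder2 code: `decode` and `issue` are hypothetical functions, operations are plain strings, and only `OP_TRAP` and `exc_happened` are names taken from the discussion.

```python
def decode(insn_op, interrupt_pending=False, exc_happened=False):
    """Decode one instruction, letting a trap condition override the result."""
    op = insn_op  # normal decode of the current incoming instruction
    if interrupt_pending or exc_happened:
        op = "OP_TRAP"  # the incoming instruction is ignored entirely
    return op


def issue(insn_op, faults):
    """Return the sequence of operations executed for one instruction.

    `faults` is True when executing the instruction raises an exception,
    for example a LD/ST that hits a bad address.
    """
    trace = [decode(insn_op)]
    if faults:
        # Second iteration: same instruction, but with exc_happened set.
        trace.append(decode(insn_op, exc_happened=True))
    return trace


print(issue("OP_LOAD", faults=True))
print(issue("OP_ADD", faults=False))
print(decode("OP_ADD", interrupt_pending=True))
```

The faulting load shows up as `['OP_LOAD', 'OP_TRAP']`, the add as `['OP_ADD']`, and an add that arrives while an interrupt is pending decodes directly to `OP_TRAP`.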
*** kylel1 is now known as kylel		20:00
lkcl	meeting 10m	21:50
lkcl	programmerjake, lx0 sadoon_albader[m octavius jn rsc klys_ kylel cesar Veera[m] mikolajw	21:51
sadoon_albader[m	I need just a few minutes	21:57
lkcl	wifi gone funny here	23:44
