lkcl | first microwatt mmu.bin test works (test 1), now tackling test2 which is the one where a PTE has been added | 15:48 |
---|---|---|
lkcl | i may have to revive the microwatt-simulation-runner and get a full debug dump, compare instruction-for-instruction what the hell is going on | 15:48 |
lkcl | this is a frickin lot of work | 15:49 |
mikolajw | I'm moving all crtl processes to a single file -- it's nonsensical to have one file per process because they all share the slots | 18:06 |
mikolajw | (I was redefining the slots array in each file) | 18:14 |
programmerjake | you may still want to eventually split it into separate files if they're too big cuz that allows parallelization | 18:21 |
programmerjake | that can be put off till later tho | 18:21 |
programmerjake | the slots array can be defined in a .h file and #included (standard practice for C) | 18:22 |
mikolajw | yeah, that will be done later if it becomes necessary | 18:23 |
programmerjake | or just passed in as a function argument (probably better in the long run) | 18:23 |
programmerjake | :) | 18:23 |
mikolajw | I'm just figuring things out now, and it's actually quite shameful it's taking so long, this simulator is really simple | 18:24 |
programmerjake | no problem, it has a lot of non-obvious complexity | 18:24 |
lkcl | mikolajw, it sounds perfectly reasonable to have one single file | 18:33 |
lkcl | right up to the point where you try running the libre-soc core | 18:33 |
lkcl | at which point the file is over 500,000 lines of c code | 18:34 |
lkcl | and requires 128 GB of resident RAM to compile and link | 18:34 |
lkcl | putting the slots into their own file may be initially a good idea, but the functions definitely not | 18:35 |
programmerjake | for comparison, the generated spir-v parser for Kazan is a single 36kloc, 1.3MB rust source file, and it doesn't require an excessive amount of ram to compile (icr how much but i'd guess a few GB at most) | 18:41 |
lkcl | that's completely irrelevant and misleading | 18:45 |
lkcl | verilator and cxxrtl both produce absolutely insanely massive programs | 18:45 |
lkcl | i just had my 8-core i9 laptop hit a loadavg of 420 when compiling a 15 mbyte verilog file with verilator | 18:46 |
lkcl | when i extracted the VHDL netlist from coriolis2 and compiled just the one module it required 22 GB resident RAM and was still compiling 16 hours after i started it | 18:47 |
lkcl | simulation of HDL designs is a well-known insanely CPU-intensive task | 18:48 |
lkcl | there's just absolutely no comparison whatsoever with a SPIR-V parser | 18:48 |
lkcl | that is a problem at least 2 orders of magnitude smaller | 18:49 |
programmerjake | it's not irrelevant because it is a very large file with likely similar compilation speed to a simulator with the same number of lines of code (i'd expect similar complexity of the generated compiler ir) -- it still has to run all the code through llvm which is quite similar to the compiler backend mikolaj is likely using...which is likely the majority of the runtime/memory used by the c compiler | 19:09 |
programmerjake | I was never referring to running a simulator, but to compiling generated code for a simulator | 19:11 |
lkcl | yes - i get that you're referring to compiling generated code for a simulator | 19:11 |
mikolajw | the thing I wanted to convey was that I made slots to be redefined in each file, and that's wrong, but instead I just made myself look stupid. I didn't want you to argue over it :P | 19:11 |
lkcl | mikolajw, appreciated | 19:12 |
programmerjake | imho you didn't look stupid, if that helps any... | 19:12 |
* lkcl | agrees | 19:12 |
mikolajw | phew, for a moment I was scared that you would start arguing whether I looked stupid or not :) | 19:12 |
lkcl | you can't predict everything in advance, and i have memory issues so i've found that the *only* way to work around that is to try things anyway and correct them (repeatedly, and quickly) | 19:13 |
lkcl | you maaay be able to get away without creating header files. | 19:14 |
lkcl | it did occur to me that perhaps you might need an "init()" function which populates the relevant slots (used by each module) | 19:15 |
lkcl | but i'm currently dealing with the mmu so can't do a full context-switch at the moment | 19:15 |
lkcl | programmerjake: | 19:16 |
lkcl | -rw-r--r-- 1 lkcl lkcl 507M Dec 22 19:08 Vsim__ALL.a | 19:16 |
lkcl | ls -altrh build/sim/gateware/obj_dir/ | wc | 19:16 |
lkcl | 849 7634 47026 | 19:16 |
lkcl | that's a verilator compile of the current (15 mbyte verilog) libresoc core with the MMU and L1 caches | 19:17 |
lkcl | 850 object files, and a 500 mbyte executable binary | 19:17 |
lkcl | compiling it takes up 45 gigabytes of resident RAM | 19:17 |
lkcl | (and if i move the mouse to another window the loadavg jumps from over 70 to over 400) | 19:18 |
lkcl | this is just how it is | 19:18 |
cesar | As I recall, the Litex simulation of Microwatt and ls1280 didn't take too long to compile, and do use Verilator under the hood... | 19:22 |
cesar | *ls180 | 19:22 |
lkcl | cesar, yes. | 19:22 |
lkcl | i'm currently fighting TestIssuer's FSMs when running in single-step mode | 19:23 |
lkcl | ... in verilator :) | 19:23 |
lkcl | in test_issuer.py (actually HDLRunner) we cheat by letting the core run, and intrusively-inspect the internals to find out if an instruction is done | 19:24 |
lkcl | so HDLRunner is more "following along by desperately keeping track" | 19:25 |
lkcl | but that way it's possible to have overlapping instructions | 19:25 |
lkcl | in verilator, i purely use the DMI interface | 19:25 |
lkcl | * place the core in STOP mode | 19:25 |
lkcl | * issue a DMI STEP request | 19:25 |
lkcl | * loop-repeat read the DMI STATUS register | 19:26 |
lkcl | the core is supposed to run just the one instruction, leaving the "stopping" bit HI (bit 0) but "stopped" bit (bit 1) LO | 19:27 |
lkcl | until the instruction is completed, where it is supposed to set bit 1 HI | 19:27 |
lkcl | that _used_ to work... | 19:28 |
lkcl | and still does in microwatt | 19:28 |
cesar | You are maybe hitting this bug, still unfixed: https://bugs.libre-soc.org/show_bug.cgi?id=726 | 19:28 |
cesar | Got distracted with converting TestIssuer FSMs to pipelines... | 19:29 |
lkcl | but in TestIssuer, it now sets stopping immediately HI when "stopped" is requested | 19:29 |
lkcl | yes | 19:29 |
lkcl | that's a good thing :) | 19:29 |
lkcl | core.py in in-order mode should work fine. | 19:38 |
lkcl | test_core.py is now functional again. | 19:38 |
lkcl | what's really funny is, despite having no fetch/issue, it can actually run loops and handle branches | 19:38 |
lkcl | but the reason is because test_core.py totally cheats, by using the PC that *ISACaller* generates :) | 19:39 |
lkcl | so it's as if core.py had 100% accurate branch-prediction | 19:39 |
lkcl | and it means we can spam core.py with one instruction per clock as long as there's a FU that allows that | 19:40 |
lkcl | anything that could change MSR, PC, or memory, is banned from overlapping at the moment | 19:40 |
cesar | lkcl: It seems you tried to run TestIssuerInternalInOrder by patching TestIssuer (https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/simple/issuer.py;h=ab414f520d8285a06c0bfe34a84b688afc2aaa5a;hb=HEAD#l1530) | 20:31 |
cesar | It has no effect... We have to do it in HDLRunner (https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/simple/test/test_runner.py;h=2daa86a5c5f8663151eb52a83d635e2282d5b8eb;hb=HEAD#l181) | 20:31 |
lkcl | cesar, ah that was clever of me :) | 20:34 |
cesar | lkcl: On the DMI output of the Litex simulation, I'm seeing the PC stuck at zero. It wasn't like that before... | 22:18 |
cesar | It works with a libresoc.v that I generated on October 11... | 22:26 |
cesar | VCD output stops at timestamp zero for some reason... | 22:48 |
programmerjake | reminds me of what happens with unit tests that only use Settle and not Delay or Tick | 22:54 |
cesar | Confirmed October 11 works. Easiest way I think is to do git bisect. | 22:58 |
lkcl | cesar, i sorted it (just hadn't committed/pushed) | 23:20 |
lkcl | programmerjake, yyeah that's a fun one. | 23:21 |
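
The DMI single-step sequence lkcl describes at 19:25-19:27 (place the core in STOP mode, issue a STEP request, poll STATUS until "stopped" goes HI) can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeCore` is a hypothetical stand-in for the real core, and the register addresses and the CTRL bit positions are assumptions, not taken from the log. Only the STATUS bits (bit 0 "stopping", bit 1 "stopped") come from the discussion.

```python
# Sketch only: FakeCore is a hypothetical model, not the libre-soc or microwatt core.
CTRL = 0x10              # assumed address of the DMI control register
STAT = 0x11              # assumed address of the DMI status register
CTRL_STOP = 1 << 0       # assumed bit position for the STOP request
CTRL_STEP = 1 << 3       # assumed bit position for the STEP request
STAT_STOPPING = 1 << 0   # bit 0, as described in the chat
STAT_STOPPED = 1 << 1    # bit 1, as described in the chat


class FakeCore:
    """Hypothetical core: each instruction completes after `latency` status reads."""

    def __init__(self, latency=3):
        self.latency = latency
        self.busy = 0          # status reads remaining until the step completes
        self.stopping = False
        self.retired = 0       # number of instructions completed so far

    def dmi_write(self, addr, value):
        if addr != CTRL:
            return
        if value & CTRL_STOP:
            self.stopping = True
        if value & CTRL_STEP:
            self.stopping = True
            self.busy = self.latency

    def dmi_read(self, addr):
        if addr != STAT:
            return 0
        if self.busy > 0:
            self.busy -= 1
            if self.busy == 0:
                self.retired += 1
        status = STAT_STOPPING if self.stopping else 0
        # "stopped" stays LO while the stepped instruction is still in flight
        if self.stopping and self.busy == 0:
            status |= STAT_STOPPED
        return status


def single_step(core, max_polls=100):
    """Issue one STEP, then poll STATUS until 'stopped' goes HI.

    Returns the number of polls needed; raises TimeoutError if the
    'stopped' bit never appears within max_polls reads.
    """
    core.dmi_write(CTRL, CTRL_STEP)
    for polls in range(1, max_polls + 1):
        if core.dmi_read(STAT) & STAT_STOPPED:
            return polls
    raise TimeoutError("core never reported 'stopped'")


core = FakeCore(latency=3)
core.dmi_write(CTRL, CTRL_STOP)                 # place the core in STOP mode
polls = [single_step(core) for _ in range(2)]   # step two instructions
```

The bounded poll loop is a deliberate choice: if the core never reports "stopped", the caller gets a timeout instead of hanging forever. Note that a core which raises "stopped" before the instruction has actually completed, which may be what is happening in TestIssuer here, would not trip the timeout at all; catching that would need a separate check that the step really took effect.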