openpowerbot_ | [mattermost] <lkcl> ha, i was just going to suggest this - https://github.com/antonblanchard/microwatt/blob/7fa7b45faa17950de44591f7a73722fdf8a87385/dcache.vhdl#L1105 | 00:32 |
---|---|---|
openpowerbot_ | [mattermost] <lkcl> moving the computation of the mux out of the rams loop :) | 00:32 |
openpowerbot_ | [mattermost] <lkcl> and it's done already. | 00:33 |
openpowerbot_ | [slack] <joel> <@U02AYTB9926> have you looked at these messages at all? It looks like something is not quite working right, but I am not sure : https://files.slack.com/files-pri/T443QD9JA-F02PAU44B2T/download/untitled.txt | 04:03 |
openpowerbot_ | [slack] <Matt Johnston> I think the gist is that 528 bits is too small to bother with | 04:05 |
openpowerbot_ | [slack] <Matt Johnston> you can force it to use it anyway with `attribute ram_style of ram : signal is "block";` | 04:06 |
openpowerbot_ | [slack] <Matt Johnston> (or `attribute syn_ramstyle of ram : signal is "block_ram";` is the equiv for ecp5, I've tried in my tree) | 04:06 |
openpowerbot_ | [slack] <Matt Johnston> some other brams aren't chosen because it has async clock. but the main icache and dcache both seem to get created OK. the tlb etc around it takes up lots of luts though, I guess related to what lkcl noted ^ | 04:10 |
openpowerbot_ | [slack] <joel> It makes sense. I was confused by this one though: | 04:38 |
openpowerbot_ | [slack] <joel> > Rule for bram type $__ECP5_DP16KD (variant 4) rejected: requirement 'attribute syn_ramstyle="block_ram" ...' not met. | 04:38 |
openpowerbot_ | [slack] <Matt Johnston> "variant 4" is https://github.com/YosysHQ/yosys/blob/master/techlibs/ecp5/brams.txt#L25 `abits 13 dbits 2`, "Rule #5" is the match condition https://github.com/YosysHQ/yosys/blob/master/techlibs/ecp5/brams.txt#L97 | 04:48 |
openpowerbot_ | [slack] <Matt Johnston> it iterates through each of those looking for a match. here it can't match rule #5 because that requires that attribute to be set | 04:48 |
openpowerbot_ | [slack] <joel> Oh, we don't have the attribute set in the hdl? I misunderstood the message; I assumed it was set | 04:49 |
openpowerbot_ | [slack] <Matt Johnston> this is the litedram generated verilog. not sure how you pass them through there | 04:49 |
openpowerbot_ | [slack] <joel> I did the build on a POWER10 and an Apple M1. Both were much faster than my laptop, at around 15 minutes, but run-to-run times vary so much that it's hard to compare | 04:51 |
openpowerbot_ | [slack] <Matt Johnston> ```//------------------------------------------------------------------------------ | 04:54 |
openpowerbot_ | [slack] <Matt Johnston> // Memory storage_4: 66-words x 8-bit | 04:54 |
openpowerbot_ | [slack] <Matt Johnston> //------------------------------------------------------------------------------ | 04:54 |
openpowerbot_ | [slack] <Matt Johnston> // Port 0 | Read: Sync | Write: Sync | Mode: Read-First | Write-Granularity: 8 | 04:54 |
openpowerbot_ | [slack] <Matt Johnston> // Port 1 | Read: Sync | Write: ---- | ``` | 04:54 |
openpowerbot_ | [slack] <Matt Johnston> maybe the problem is that yosys doesn't handle dual port bram well at the moment? pending https://github.com/YosysHQ/yosys/issues/1959#issuecomment-903139936 | 04:57 |
openpowerbot_ | [slack] <Matt Johnston> @joel I tracked down where the GHDL `litedram_core.init` file open regressed, will see if my stab at fixing it is right. https://github.com/ghdl/ghdl/pull/1929 | 05:47 |
openpowerbot_ | [slack] <joel> @Matt Johnston ada hacker! Nice one | 05:57 |
openpowerbot_ | [slack] <joel> I'll give it a test. I spent an evening chasing my tail trying to create a reduced test case; I kept on reducing it past the fail point because I was also hitting an assert that might have been unrelated | 05:57 |
openpowerbot_ | [slack] <joel> It fixed it for me. Thanks! | 06:05 |
lkcl | appreciate the references, i know attributes can be added in nmigen, which should help enormously when compiling libre-soc on ECP5 FPGAs | 12:13 |
lkcl | ahhmm.... the dcache.vhdl / icache.vhdl isn't supposed to have any dual-ported RAM access | 12:14 |
openpowerbot_ | [slack] <Matt Johnston> that `storage_4` was from valentyusb | 12:14 |
lkcl | all the read/write points i have been able to identify are 1R1W | 12:15 |
lkcl | ahh :) | 12:15 |
lkcl | wheww that's a relief | 12:15 |
lkcl | from an ASIC perspective, cache SRAMs need to be 1R *or* 1W. it's the only way the SRAM cells can guarantee same-cycle. | 13:34 |
lkcl | luckily we've access to someone here with the expertise and knowledge (Staf Verhaegen of Chips4Makers) | 13:36 |
lkcl | but have to make sure that dcache.py and icache.py fit the rules! | 13:36 |
lkcl | Paul (or does anyone else know), does the I-Cache rely on the *D-Cache* for MMU page-table lookups? | 13:54 |
lkcl | i'm only seeing code-paths in *from* the MMU, but not *to* the MMU, in icache.vhdl | 13:55 |
lkcl | where dcache.vhdl has MmuToDcacheType *and* DcacheToMmuType | 13:55 |
openpowerbot_ | [slack] <Matt Johnston> is it because dcache can write data, but icache is read only? | 13:57 |
openpowerbot_ | [slack] <Matt Johnston> mmu_to_icache looks like it? | 13:57 |
lkcl | i don't honestly know | 14:05 |
lkcl | it looks like it's more sophisticated than that: m_in seems to contain tlbie and tlbld | 14:07 |
lkcl | which either set or clear I-Cache PTE (pagetable) entries | 14:07 |
lkcl | D-Cache likewise has m_in (MmuToDcacheType) and consequently allows setting or clearing D-Cache PTE entries | 14:10 |
lkcl | (m_in.addr + m_in.pte => stored in the cache) | 14:11 |
lkcl | but it happens to contain also an m_out which returns data, done and err notification back to the MMU | 14:13 |
lkcl | and that's what I-Cache is missing: back-communication to the MMU. | 14:14 |
lkcl | the question is: why | 14:14 |
lkcl | i just completed a unit test for instruction-fetch-using-virtual-memory | 16:08 |
lkcl | https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/experiment/test/test_loadstore1.py;h=c351d4e9e44ddcdb560dcf09de0e2d66ea7fd364;hb=d20858b784533ef8dac6952f470ebd9fe60205b5#l66 | 16:08 |
lkcl | i think i maaay have the answer to the question: the hypothesis is that the MMU PTE lookup is fired into *both* d-cache *and* i-cache | 16:09 |
lkcl | if that hypothesis is correct, then assuming identical parameters (identical TLB sizes, identical cache sizes) any errors can be caught by D-Cache triggering them | 16:10 |
lkcl | consequently I-Cache doesn't have to | 16:10 |
lkcl | that's the theory. | 16:10 |
lkcl | the unit test works great btw :) | 16:10 |
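
The block RAM discussion earlier in the log (the 66-word x 8-bit `storage_4` memory, the `abits 13 dbits 2` variant, and the rule that only matches when `syn_ramstyle="block_ram"` is set) can be illustrated with a rough sketch. This is not how yosys implements its matching; the `Memory` and `Rule` classes, the `first_match` helper, the rule names and the 2048-bit size threshold are all hypothetical, chosen only to show the "iterate through the rules until one matches" idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Memory:
    """A memory as described in the generated Verilog comment (hypothetical model)."""
    words: int
    width: int
    attributes: frozenset = frozenset()

    @property
    def bits(self) -> int:
        return self.words * self.width


@dataclass(frozen=True)
class Rule:
    """A simplified match rule: an optional size floor and an optional required attribute."""
    name: str
    min_bits: int = 0
    required_attribute: Optional[str] = None

    def rejection(self, mem: Memory) -> Optional[str]:
        """Return a reason string if the rule rejects the memory, else None."""
        if (self.required_attribute is not None
                and self.required_attribute not in mem.attributes):
            return f"requirement 'attribute {self.required_attribute}' not met"
        if mem.bits < self.min_bits:
            return f"requirement 'min bits {self.min_bits}' not met"
        return None


def first_match(mem: Memory, rules) -> Optional[str]:
    """Try each rule in order and return the name of the first one that accepts."""
    for rule in rules:
        if rule.rejection(mem) is None:
            return rule.name
    return None


BLOCK_RAM = 'syn_ramstyle="block_ram"'

# Assumed rule set: the threshold value is illustrative, not taken from brams.txt.
RULES = [
    Rule("size-based", min_bits=2048),
    Rule("attribute-forced", required_attribute=BLOCK_RAM),
]

storage_4 = Memory(words=66, width=8)
storage_4_forced = Memory(words=66, width=8, attributes=frozenset({BLOCK_RAM}))

# Capacity of a block RAM configured as "abits 13 dbits 2".
variant_4_bits = (1 << 13) * 2

print(storage_4.bits, variant_4_bits)
print(first_match(storage_4, RULES))
print(first_match(storage_4_forced, RULES))
```

Under these assumptions the unattributed memory falls through every rule and would be built from LUTs and flip-flops, while the same memory with the attribute set is accepted by the attribute-gated rule, which matches the "you can force it" suggestion in the log.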