*** openpowerbot <openpowerbot!~openpower@94-226-186-169.access.telenet.be> has quit IRC | 05:26 | |
*** openpowerbot <openpowerbot!~openpower@94-226-186-169.access.telenet.be> has joined #microwatt | 05:27 | |
*** openpowerbot <openpowerbot!~openpower@94-226-186-169.access.telenet.be> has quit IRC | 05:27 | |
*** openpowerbot <openpowerbot!~openpower@94-226-186-169.access.telenet.be> has joined #microwatt | 05:28 | |
*** openpowerbot_ <openpowerbot_!~openpower@94-226-186-169.access.telenet.be> has joined #microwatt | 12:18 | |
*** openpowerbot <openpowerbot!~openpower@94-226-186-169.access.telenet.be> has quit IRC | 12:18 | |
*** openpowerbot_ <openpowerbot_!~openpower@94-226-186-169.access.telenet.be> has quit IRC | 13:12 | |
*** openpowerbot_ <openpowerbot_!~openpower@94-226-188-34.access.telenet.be> has joined #microwatt | 13:28 | |
*** openpowerbot_ <openpowerbot_!~openpower@94-226-188-34.access.telenet.be> has quit IRC | 13:38 | |
lkcl | toshywoshy, openpowerbot's gone walkies here :) | 14:32 |
---|---|---|
*** openpowerbot_ <openpowerbot_!~openpower@94-226-188-34.access.telenet.be> has joined #microwatt | 14:39 | |
openpowerbot_ | [mattermost] <lkcl> with 64 PLRUs (because of 64 way cache lines) that's a hell of a lot of 6-bit binary-comparators against tlb_hit_index | 14:40 |
openpowerbot_ | [mattermost] <lkcl> https://ftp.libre-soc.org/2021-12-06_14-42.png | 14:45 |
openpowerbot_ | [mattermost] <lkcl> that's after the binary-to-unary converter, you can see the 64 PLRUs on the right half | 14:46 |
openpowerbot_ | [mattermost] <lkcl> i would love to be surprised to learn that VHDL is capable of spotting this and creating optimal unary-encoded logic, not binary-comparators :) | 14:46 |
lkcl | toshywoshy, thx | 14:49 |
openpowerbot_ | [slack] <Paul Mackerras> lkcl, interesting | 21:15 |
openpowerbot_ | [slack] <Paul Mackerras> doesn't your one_hot_hit = 1 << r1.tlb_hit_index turn into a 6 to 64 decoder? | 21:16 |
openpowerbot_ | [slack] <Paul Mackerras> I wonder if a 6 to 64 decoder is going to take fewer LUTs than a bunch of 6-bit compare-with-constant comparators, or not | 21:17 |
openpowerbot_ | [slack] <Paul Mackerras> With 6-input LUTs, the 6 to 64 decoder is probably just 64 LUTs, and a 6-bit comparator is going to take one LUT | 21:18 |
openpowerbot_ | [slack] <Paul Mackerras> (for the case where one comparator input is a constant) | 21:19 |
openpowerbot_ | [slack] <Paul Mackerras> With 4-input LUTs, I assume the decoder would be done as a 1-to-8 decoder on the top 3 bits followed by eight 1-to-8 decoders, total 72 LUTs | 21:22 |
openpowerbot_ | [slack] <Paul Mackerras> The comparators would be 2 LUTs each in the simple case but with 64 of them it should be possible to share logic | 21:23 |
openpowerbot_ | [mattermost] <lkcl> yes. and there's a special nmigen module called Decoder. our focus is more ASIC than FPGA | 21:42 |
openpowerbot_ | [slack] <Paul Mackerras> ok fair enough | 21:43 |
openpowerbot_ | [mattermost] <lkcl> LUT6s are cheating, unfair! :) | 21:43 |
openpowerbot_ | [mattermost] <lkcl> yes, all the comparator inputs are constant, luckily: for-loop from 0-63 | 21:44 |
openpowerbot_ | [mattermost] <lkcl> my mentor of 12 years did warn me of these kinds of optimisations, the differences between targeting an FPGA and targeting an ASIC | 21:45 |
openpowerbot_ | [slack] <Paul Mackerras> right | 21:45 |
openpowerbot_ | [mattermost] <lkcl> i'm redoing DTLB Updates as nmigen Memory btw | 21:46 |
openpowerbot_ | [mattermost] <lkcl> so it'll actually be declared as if it was an explicit SRAM (with 4-way write-enable, which nmigen supports) | 21:46 |
openpowerbot_ | [mattermost] <lkcl> so 128-bit wide for the TAG_WAY_BITs but with 4 write-enable lines @ 32-bit each | 21:47 |
openpowerbot_ | [slack] <Paul Mackerras> ah ok | 21:48 |
openpowerbot_ | [mattermost] <lkcl> deep breath: we need to ask Staf Verhaegen (Chips4Makers) to custom-write the SRAMs (or, make sure that the memory compiler he's writing can cope with the dimensions) | 21:49 |
openpowerbot_ | [mattermost] <lkcl> correction: 256-bit wide with 4 write-enable lines. TAG_WAYS=4, TAG_WIDTH=64. | 21:50 |
openpowerbot_ | [mattermost] <lkcl> paul, you may be interested to know, i'm using nmigen Memory with write-enable to avoid having to have the full PTE/WAY tags put back into the TLB row and updated "in full" | 23:11 |
openpowerbot_ | [mattermost] <lkcl> so Memory.write-enable == 1<<repl_way | 23:12 |
openpowerbot_ | [mattermost] <lkcl> the PTE still needs shifting up by repl_way*TLB_PTE_BITS, likewise the WAY | 23:13 |
openpowerbot_ | [mattermost] <lkcl> because that's the data being presented to the (256-bit-wide and 184-bit-wide) Memorys | 23:13 |
openpowerbot_ | [mattermost] <lkcl> but at least it means the ANDing/ORing/masking with the original (old, full, 256/184-bit-wide) row value is gone because that's now handled by the Memory's write-enable | 23:14 |
openpowerbot_ | [mattermost] <lkcl> whether BRAMs are capable of supporting that in FPGA tools i have no idea | 23:15 |
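
Two points from the discussion above can be sketched in plain Python. This is only a behavioural model under stated assumptions: it is not nmigen code and not the Microwatt or Libre-SOC source, and the names `one_hot_by_shift`, `one_hot_by_compare`, `LaneMemory` and `update_way` are hypothetical. The first pair of functions checks that `1 << tlb_hit_index` (a 6-to-64 decoder) yields the same 64 select bits as 64 compare-with-constant comparators. The `LaneMemory` class models a row memory with one write-enable bit per way, so a single way is replaced without reading back and masking the old row. The 4 ways of 64 bits follow the "TAG_WAYS=4, TAG_WIDTH=64" correction in the log.

```python
# Behavioural sketch only: plain Python, not nmigen, not the real TLB code.
# All function and class names here are illustrative assumptions.

NUM_PLRUS = 64  # one PLRU per entry, selected by a 6-bit index


def one_hot_by_shift(index: int) -> int:
    """6-to-64 decoder view: exactly one bit set, at position `index`."""
    return 1 << index


def one_hot_by_compare(index: int) -> int:
    """Comparator view: 64 compare-with-constant results packed into one word."""
    result = 0
    for i in range(NUM_PLRUS):
        if index == i:  # each PLRU compares the index against its own constant
            result |= 1 << i
    return result


class LaneMemory:
    """Row memory written one lane (way) at a time via per-lane write enables."""

    def __init__(self, depth: int, lanes: int, lane_bits: int) -> None:
        self.lanes = lanes
        self.lane_bits = lane_bits
        self.rows = [0] * depth

    def write(self, addr: int, data: int, enable: int) -> None:
        """Update only the lanes whose bit is set in `enable`."""
        lane_mask = (1 << self.lane_bits) - 1
        mask = 0
        for lane in range(self.lanes):
            if (enable >> lane) & 1:
                mask |= lane_mask << (lane * self.lane_bits)
        # The memory applies the mask; the caller never sees the old row.
        self.rows[addr] = (self.rows[addr] & ~mask) | (data & mask)


def update_way(mem: LaneMemory, addr: int, repl_way: int, value: int) -> None:
    """Shift the new value into its lane and enable just that lane."""
    mem.write(addr, value << (repl_way * mem.lane_bits), 1 << repl_way)
```

In this model `update_way` still shifts the new value up by `repl_way * lane_bits`, as noted in the log, but the AND/OR against the old row happens inside `write`, which is the role the write-enable lines play. Whether a given FPGA block RAM or ASIC SRAM macro offers lanes of the required width is a separate question that this sketch does not answer.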