lkcl | cesar, you saw my comments in the bugreport? if you use core.busy_o globally (combinatorially) as a "stall" signal, you should be able to avoid adding any kind of "flush" entirely | 13:57 |
---|---|---|
lkcl | core.busy_o is set when either: | 13:57 |
lkcl | * there are not enough ALU FUs or | 13:58 |
lkcl | * a LDST, Branch, or Trap FU is needed | 13:58 |
lkcl | ALUs will of course always complete, so it is enough to know that "at least one is available" | 13:58 |
lkcl | LDST, Branch or Trap could have errors (exceptions, or change PC), so those are important to know | 13:59 |
lkcl | by listening to that core.busy_o, the only stage that would need "flush" would be the decode one | 14:04 |
lkcl | because you would have this: | 14:05 |
lkcl | * fetch NNN+2 | 14:05 |
lkcl | * decode NNN+1 | 14:05 |
lkcl | * core busy (because of branch) | 14:05 |
lkcl | both fetch and decode would not proceed until core.busy_o is done, BUT, if it went the different way, the decode NNN+1 instruction would need to be removed | 14:06 |
cesar | Sure. | 14:23 |
cesar | ... got it. | 14:24 |
cesar | For flushing, I was thinking about a "switch box" pipeline device, which can redirect raw and decoded instructions to /dev/null when needed. | 14:40 |
cesar | ... instead of intrusively adding a flush signal to these stages. | 14:43 |
lkcl | mmm, at the gate level, both of them involve muxing | 14:44 |
cesar | Just on the ready/valid signals, not on the data. | 14:45 |
lkcl | a flush signal will be more "normal" | 14:45 |
lkcl | ahh | 14:45 |
cesar | The switch box would compare the PC of the incoming decoded instruction, and the PC on the register file. If they don't match, redirect to /dev/null. | 14:49 |
lkcl | mmm... that means adding yet another port to the StageRegs | 14:50 |
lkcl | there are already six | 14:51 |
lkcl | my feeling is, it is simpler to just add a global "flush" signal which is effectively a "hard reset" | 14:51 |
cesar | Sorry, I meant "compare the PC on the Core state", not the register file. | 14:52 |
cesar | Sure. | 14:52 |
cesar | Just an idea. | 14:52 |
lkcl | ah yes, maybe that would work | 14:52 |
lkcl | let me think | 14:52 |
lkcl | compare incoming PC to core_state.pc ... | 14:53 |
cesar | It can be a reset, yes, but only activated if the incoming PC is different from the core PC. | 14:54 |
cesar | That way, if/when we add a branch predictor, it can easily fit. | 14:55 |
lkcl | the "normal" way is: you pass the PC+insn along every pipeline stage | 14:55 |
lkcl | fetch @ 0000 will be passed to decode, so it has PC=0000,insn=XXXX | 14:56 |
lkcl | next cycle, decode passes PC=0000,insn=XXXX to execute (core) | 14:56 |
lkcl | whilst fetch passes @ 0004,insn=YYYY to decode | 14:57 |
lkcl | next cycle, decode passes PC=0004,insn=YYYY to execute | 14:57 |
cesar | Indeed. | 14:57 |
lkcl | whilst fetch passes @0008,insn=ZZZZ to decode | 14:57 |
lkcl | if XXXX was a branch, then both fetch and decode must drop their current data, and fetch must use the PC that was written into the State Regfile, by core | 14:59 |
cesar | Sure. Without a branch predictor, any taken branch instruction should trigger a flush. | 15:02 |
cesar | With a branch predictor, we should check the incoming PC, since it could actually be the right one. | 15:03 |
cesar | ... a flush of Fetch and Decode, that is. | 15:03 |
lkcl | urr i am messing with the dcache FSM and it is code that i do not properly grasp | 17:01 |
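
The flush scheme discussed above (pass PC+insn along every stage, and reset fetch and decode only when the incoming PC differs from the core PC) can be sketched as a small cycle-by-cycle model. This is only an illustration in plain Python, not the project's HDL: the names `Slot`, `simulate`, `core_pc`, `fetch_pc`, and the toy `add`/`branch`/`halt` instruction set are all hypothetical, and the `core.busy_o` stall is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Slot:
    """PC + instruction pair handed from one pipeline stage to the next."""
    pc: int
    insn: str


# Hypothetical program format: pc -> (mnemonic, branch target or None)
Program = Dict[int, Tuple[str, Optional[int]]]


def simulate(program: Program, max_cycles: int = 32) -> Tuple[List[int], List[int]]:
    core_pc = 0                        # architectural PC held in the core state
    fetch_pc = 0                       # next address fetch will read (always pc+4)
    in_decode: Optional[Slot] = None   # slot currently sitting in decode
    in_execute: Optional[Slot] = None  # slot currently sitting in execute (core)
    retired: List[int] = []            # PCs that actually executed
    flushed: List[int] = []            # PCs discarded as wrong-path

    for _ in range(max_cycles):
        # Execute stage: the core updates its PC, possibly to a branch target.
        if in_execute is not None:
            op, target = program[in_execute.pc]
            retired.append(in_execute.pc)
            if op == "halt":
                break
            core_pc = target if op == "branch" else in_execute.pc + 4

        # Compare the PC of the next incoming instruction with the core PC.
        # On a mismatch, drop what decode holds and what fetch is working on,
        # then restart fetch from the PC written by the core.
        if in_decode is not None and in_decode.pc != core_pc:
            flushed += [in_decode.pc, fetch_pc]
            in_decode = None
            in_execute = None
            fetch_pc = core_pc
            continue

        # No mismatch: every stage passes its PC+insn pair one step along.
        in_execute = in_decode
        in_decode = Slot(fetch_pc, program.get(fetch_pc, ("nop", None))[0])
        fetch_pc += 4

    return retired, flushed


if __name__ == "__main__":
    demo: Program = {
        0: ("add", None),
        4: ("branch", 16),   # taken branch: 8 and 12 are wrong-path
        8: ("add", None),
        12: ("add", None),
        16: ("add", None),
        20: ("halt", None),
    }
    print(simulate(demo))
```

With the demo program, the instructions at 0, 4, 16 and 20 execute, while the two wrong-path instructions at 8 and 12 are discarded. A branch whose target happens to equal the sequentially fetched PC triggers no flush at all, which is the property that would let a branch predictor slot in later: only a mismatch between the incoming PC and the core PC costs a flush.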