Thursday, 2022-01-13

*** rsc_ is now known as rsc [02:02]
<lkcl> cesar, you saw my comments in the bugreport? if you use core.busy_o globally (combinatorially) as a "stall" signal, you should be able to avoid adding any kind of "flush" entirely [13:57]
<lkcl> core.busy_o is set when either: [13:57]
<lkcl> * there are not enough ALU FUs or [13:58]
<lkcl> * a LDST, Branch, or Trap FU is needed [13:58]
<lkcl> ALUs will of course always complete, so it is enough to know that "at least one is available" [13:58]
<lkcl> LDST, Branch or Trap could have errors (exceptions, or change PC), so those are important to know [13:59]
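
The two conditions lkcl lists for core.busy_o can be sketched as a small Python model. This is only an illustration: `NextInsn`, its fields, and the `free_alus` counter are hypothetical names, and the real signal is combinatorial hardware logic rather than a function call.

```python
from dataclasses import dataclass


@dataclass
class NextInsn:
    """Hypothetical summary of which 'risky' Function Unit the next instruction needs."""
    needs_ldst: bool = False
    needs_branch: bool = False
    needs_trap: bool = False


def core_busy(insn: NextInsn, free_alus: int) -> bool:
    """Model of busy_o: stall when no ALU is free, or when a FU that
    could raise an exception or change the PC is needed."""
    # ALUs always complete, so "at least one is available" is enough.
    no_alu_free = free_alus < 1
    # LDST, Branch and Trap may fault or redirect the PC (assumed to stall every time).
    risky_fu_needed = insn.needs_ldst or insn.needs_branch or insn.needs_trap
    return no_alu_free or risky_fu_needed
```
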
<lkcl> by listening to that core.busy_o, the only stage that would need "flush" would be the decode one [14:04]
<lkcl> because you would have this: [14:05]
<lkcl> * fetch NNN+2 [14:05]
<lkcl> * decode NNN+1 [14:05]
<lkcl> * core busy (because of branch) [14:05]
<lkcl> both fetch and decode would not proceed until core.busy_o is done, BUT, if it went the different way, the decode NNN+1 instruction would need to be removed [14:06]
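
A rough model of the stall scenario just described, one cycle at a time. The function name `front_end_step`, the use of `None` for an empty decode slot, and the 4-byte instruction spacing are assumptions made for the sketch. Redirecting fetch to the new PC is also an assumption: the conversation only says that the decode instruction has to be removed.

```python
def front_end_step(fetch_pc, decode_pc, busy, redirect_pc=None):
    """Advance the fetch and decode stages by one cycle.

    fetch_pc    -- PC currently held by fetch
    decode_pc   -- PC currently held by decode, or None if empty
    busy        -- model of core.busy_o used as a global stall
    redirect_pc -- new PC if the branch "went the different way", else None
    Returns the new (fetch_pc, decode_pc).
    """
    if busy:
        # Stalled: neither stage proceeds, both hold their contents.
        return fetch_pc, decode_pc
    if redirect_pc is not None:
        # The branch changed the PC: the decode instruction is removed
        # and fetch restarts from the new PC.
        return redirect_pc, None
    # Normal advance: decode takes what fetch held, fetch moves on.
    return fetch_pc + 4, fetch_pc
```
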
<cesar> ... got it. [14:24]
<cesar> For flushing, I was thinking about a "switch box" pipeline device, which can redirect raw and decoded instructions to /dev/null when needed. [14:40]
<cesar> ... instead of intrusively adding a flush signal to these stages. [14:43]
<lkcl> mmm, at the gate level, both of them involve muxing [14:44]
<cesar> Just on the ready/valid signals, not on the data. [14:45]
<lkcl> a flush signal will be more "normal" [14:45]
<cesar> The switch box would compare the PC of the incoming decoded instruction, and the PC on the register file. If they don't match, redirect to /dev/null. [14:49]
<lkcl> mmm... that means adding yet another port to the StageRegs [14:50]
<lkcl> there are already six [14:51]
<lkcl> my feeling is, it is simpler to just add a global "flush" signal which is effectively a "hard reset" [14:51]
<cesar> Sorry, I meant "compare the PC on the Core state", not the register file. [14:52]
<cesar> Just an idea. [14:52]
<lkcl> ah yes, maybe that would work [14:52]
<lkcl> let me think [14:52]
<lkcl> compare incoming PC to core_state.pc ... [14:53]
<cesar> It can be a reset, yes, but only activated if the incoming PC is different from the core PC. [14:54]
<cesar> That way, if/when we add a branch predictor, it can easily fit. [14:55]
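
One way to picture cesar's "switch box" is as pure ready/valid gating with the data path left untouched. The sketch below is a guess at the idea, not the design that was implemented: `switch_box` and its signal names are hypothetical, and `core_pc` stands in for core_state.pc.

```python
def switch_box(in_valid, in_pc, core_pc, out_ready):
    """Gate a ready/valid handshake on a PC comparison.

    Returns (out_valid, in_ready, dropped):
      out_valid -- the item is offered to the downstream stage
      in_ready  -- the upstream stage may hand over its item this cycle
      dropped   -- the item was sunk (the "/dev/null" path)
    """
    pc_matches = in_pc == core_pc
    # Only matching instructions are presented downstream.
    out_valid = in_valid and pc_matches
    # A mismatching item is always accepted so it drains away;
    # a matching item waits for the downstream stage as usual.
    in_ready = out_ready if pc_matches else True
    dropped = in_valid and not pc_matches
    return out_valid, in_ready, dropped
```

Only the handshake signals are muxed here, which is the distinction cesar draws at 14:45.
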
<lkcl> the "normal" way is: you pass the PC+insn along every pipeline stage [14:55]
<lkcl> fetch @ 0000 will be passed to decode, so it has PC=0000,insn=XXXX [14:56]
<lkcl> next cycle, decode passes PC=0000,insn=XXXX to execute (core) [14:56]
<lkcl> whilst fetch passes @ 0004,insn=YYYY to decode [14:57]
<lkcl> next cycle, decode passes PC=0004,insn=YYYY to execute [14:57]
<lkcl> whilst fetch passes @ 0008,insn=ZZZZ to decode [14:57]
<lkcl> if XXXX was a branch, then both fetch and decode must drop their current data, and fetch must use the PC that was written into the State Regfile, by core [14:59]
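
The cycle-by-cycle walk-through above can be replayed with a toy three-stage model. This is a sketch only: the `run` function, the tuple encoding of instructions (`('nop',)` or `('b', target)`), and the assumption that every branch is taken are all invented for the illustration.

```python
def run(program, cycles):
    """Simulate fetch -> decode -> execute, passing (PC, insn) along.

    program -- dict mapping PC to ('nop',) or ('b', target);
               missing addresses are treated as ('nop',)
    Returns the list of PCs that reached execute, in order.
    """
    fetch_pc = 0
    fetch_out = None    # (pc, insn) handed from fetch to decode
    decode_out = None   # (pc, insn) handed from decode to execute
    executed = []
    for _ in range(cycles):
        new_pc = None
        if decode_out is not None:
            pc, insn = decode_out
            executed.append(pc)
            if insn[0] == 'b':
                new_pc = insn[1]  # the PC that core writes back
        if new_pc is not None:
            # Branch: fetch and decode both drop their current data,
            # and fetch restarts from the PC written by core.
            fetch_out, decode_out = None, None
            fetch_pc = new_pc
        else:
            decode_out = fetch_out
            fetch_out = (fetch_pc, program.get(fetch_pc, ('nop',)))
            fetch_pc += 4
    return executed
```

With a branch at 0000 targeting 0x20, the instructions fetched from 0004 and 0008 never reach execute, matching the "drop their current data" step.
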
<cesar> Sure. Without a branch predictor, any taken branch instruction should trigger a flush. [15:02]
<cesar> With a branch predictor, we should check the incoming PC, since it could actually be the right one. [15:03]
<cesar> ... a flush of Fectch and Decode, that is. [15:03]
<cesar> * Fetch [15:03]
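
cesar's two cases reduce to a single comparison if the front end always records which PC it chose to fetch next. A small sketch, assuming 4-byte sequential fetch; `next_fetch_pc` and `needs_flush` are hypothetical helper names, and no particular predictor design is implied.

```python
def next_fetch_pc(pc, predicted_target=None):
    """PC the front end fetches after `pc`.

    Without a predictor (predicted_target is None) it is simply pc + 4.
    """
    return pc + 4 if predicted_target is None else predicted_target


def needs_flush(in_flight_pc, resolved_pc):
    """Flush Fetch and Decode only if the PC already in flight differs
    from the PC the core resolved."""
    return in_flight_pc != resolved_pc
```

Without a predictor the in-flight PC is always the sequential one, so every taken branch that lands anywhere other than pc + 4 triggers a flush. With a predictor the in-flight PC may already be the right one, and the flush is skipped.
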
<lkcl> urr i am messing with the dcache FSM and it is code that i do not properly grasp [17:01]
