Monday, 2022-03-07

09:33 *** jfinkhae1ser is now known as jfinkhaeuser
21:27 <lkcl> programmerjake, apparently (i have vague recollections of seeing this before, i just forgot) you can use FFTs to do larger GF2^n math
21:27 <lkcl> guess what?
21:27 <lkcl> we have hardware-assisted FFT in SVP64... :)
21:27 * lkcl cackles manically
21:30 <programmerjake> yup, probably using the same algorithms as needed for fast integer multiply, just swapping in GF(2) polynomial ops (xor and clmul)
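
A minimal sketch of the "swap in xor and clmul" idea, assuming plain Python integers as bit-vectors of GF(2) polynomial coefficients. It is a schoolbook limb-by-limb multiply, not an FFT, and the names `clmul` and `clmul_limbs` plus the 8-bit limb width are illustrative choices, not anything from SVP64. The structure is the same as multi-word integer multiplication, with multiply replaced by carry-less multiply and add replaced by xor (so there are no carries to propagate).

```python
def clmul(a: int, b: int) -> int:
    """Carry-less multiply: product of two GF(2) polynomials held as ints."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # "add" a shifted copy of a; addition in GF(2) is xor
        a <<= 1
        b >>= 1
    return result


def clmul_limbs(a_limbs, b_limbs, limb_bits=8):
    """Schoolbook polynomial multiply on little-endian limbs.

    Same loop nest as multi-word integer multiply, but partial products
    come from clmul and are accumulated with xor instead of add.
    """
    mask = (1 << limb_bits) - 1
    out = [0] * (len(a_limbs) + len(b_limbs))
    for i, a in enumerate(a_limbs):
        for j, b in enumerate(b_limbs):
            prod = clmul(a, b)                    # at most 2*limb_bits - 1 bits
            out[i + j] ^= prod & mask             # low half of partial product
            out[i + j + 1] ^= prod >> limb_bits   # high half, no carry chain
    return out
```

For example, `clmul(0b11, 0b11)` gives `0b101`, since (x + 1)^2 = x^2 + 1 over GF(2).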
21:56 <lkcl> h
21:57 <lkcl> ha!
21:57 <lkcl> that'd be ridiculously funny
21:57 <lkcl> does mean we need a clmuladd though
21:57 <lkcl> eurrgh
21:57 <lkcl> sorry
21:57 <lkcl> clmultwinadd
21:58 <lkcl> or to be able to set the gftwinmuladd polynomial to 2^XLEN
21:58 <lkcl> which would probably be better, we're kinda running out of space in opcode 22
21:59 <lkcl> https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=0a9f45f2615f35902c4783bcdd07ab6151db841d
21:59 <lkcl> that's a neat algorithm you found btw
21:59 <lkcl> i'm so glad and relieved you're able to read those symbols in academic papers. i can never get my head round the "assumptions and missing information" :)
22:00 <lkcl> i managed to reconstruct the algorithm from the comments you gave, and it does seem to actually, like, y'know... give the right answer? :)
22:07 <programmerjake> they explained some of the symbols in the text right above the algorithm
22:09 <lkcl> good god, an academic paper that provided explanations??
22:11 <programmerjake> XD
22:11 <lkcl> ha, that's really exciting about the FFT.
22:12 <lkcl> and i realised we can use count-trailing-1s to short-circuit the gf_invert function in hardware
22:13 <lkcl> ohh btw, deep breath: i found a bug in microwatt's WB4 pipeline-burst-mode handling of stall
22:13 <programmerjake> we could...but i'm inclined to instead have a pipeline... 1 stage per iteration. then we could merge stages together since the mux & xor should be fast enough
22:13 <lkcl> nobody's noticed before, because everyone uses litex, and the "joiner" HDL uses the WB4-to-WB3 trick
22:13 <lkcl> true
22:14 <lkcl> stall = cyc & ~ack
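
A toy model of the `stall = cyc & ~ack` expression quoted above, assuming single-bit signals held as Python ints. The `run` helper, its fixed-latency slave and the request list are hypothetical and only illustrate the effect of the expression: a pipelined master is held off until each request has been acknowledged, so only one request is ever outstanding. It does not model microwatt, litex, or the bug being discussed.

```python
def wb4_to_wb3_stall(cyc: int, ack: int) -> int:
    """Derive a pipelined-mode stall from classic-mode signals (1-bit ints)."""
    return cyc & ~ack & 1


def run(requests, latency):
    """Hypothetical toy bus: the slave acks `latency` cycles after a request
    first appears. Returns (cycle, request) pairs at the cycle each request
    is accepted, i.e. when stall is low."""
    accepted = []
    pending = list(requests)
    countdown = None
    cycle = 0
    while pending:
        cyc = 1                      # bus cycle is active while work remains
        if countdown is None:
            countdown = latency      # a new request has just been presented
        ack = 1 if countdown == 0 else 0
        if wb4_to_wb3_stall(cyc, ack):
            countdown -= 1           # master must hold the request steady
        else:
            accepted.append((cycle, pending.pop(0)))
            countdown = None
        cycle += 1
    return accepted
```

With `latency=2`, `run(["a", "b"], 2)` accepts the two requests at cycles 2 and 5: the second request cannot start until the first has been acked.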
22:14 <lkcl> i've been frickin banging my head against a brick wall for 5 weeks and only just noticed / realised
22:14 <programmerjake> hmm, wonder if that's why my 3d maze demo randomly crashes if you type too many characters...
22:15 <lkcl> ah no, that'll almost certainly be because you ran the 16550 FIFOs out of space
22:15 <lkcl> which causes an interrupt
22:15 <programmerjake> well...it's not using a 16550...it's using valentyusb
22:15 <lkcl> run it under the microwatt_verilator branch, you should get a full gtkwave stack trace
22:15 <lkcl> ahh
22:16 <lkcl> no idea then :)
22:16 <programmerjake> (though i did make it so you could also use the uart as input/output rather than only through usb)
22:17 <lkcl> nice. must try it when i'm not in headless-chicken-meltdown mode. really want to
22:20 <programmerjake> note that in that gfinv algorithm, you need to use the reducing polynomial without the msb stripped
22:31 <lkcl> ehhmmm *without*? ok... i know how to deal with that
22:31 <lkcl> err are you sure? it produces the wrong answer
22:31 <lkcl> oh hang on...
22:32 <lkcl> coo! it produces the *right* answer :)
22:35 <programmerjake> :)
22:37 <lkcl> it gets the answer "x" if you do (x * y) / y
22:37 <lkcl> pffh
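
A rough sketch of the `(x * y) / y == x` check, assuming GF(2^8) with the AES reducing polynomial 0x11B as an example field. The inverse here uses the textbook extended Euclidean algorithm on GF(2) polynomials, which is not necessarily the gfinv algorithm from the paper discussed above. All function names are illustrative. Note that `poly` is passed with its top bit included (0x11B, not 0x1B), matching the remark about not stripping the msb.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less multiply of two GF(2) polynomials held as ints."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_divmod(a: int, b: int):
    """Polynomial long division over GF(2); b must be non-zero."""
    if b == 0:
        raise ZeroDivisionError("polynomial division by zero")
    q = 0
    while a.bit_length() >= b.bit_length():
        shift = a.bit_length() - b.bit_length()
        q ^= 1 << shift
        a ^= b << shift   # subtract (xor) the aligned divisor
    return q, a


def gf_mul(a: int, b: int, poly: int) -> int:
    """Multiply in GF(2^n); poly includes the x^n term (msb not stripped)."""
    return poly_divmod(clmul(a, b), poly)[1]


def gf_inv(a: int, poly: int) -> int:
    """Inverse via extended Euclid; assumes poly is irreducible."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse")
    r0, r1 = poly, a      # the full polynomial is needed as the first remainder
    t0, t1 = 0, 1         # invariant: t_i * a == r_i (mod poly)
    while r1:
        q, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        t0, t1 = t1, t0 ^ clmul(q, t1)
    return t0             # r0 is now 1, so t0 * a == 1 (mod poly)


def gf_div(a: int, b: int, poly: int) -> int:
    """Divide in GF(2^n) by multiplying with the inverse."""
    return gf_mul(a, gf_inv(b, poly), poly)
```

A quick usage check under those assumptions: with `POLY = 0x11B`, `gf_div(gf_mul(x, y, POLY), y, POLY)` returns `x` for every `x` in 0..255 and every non-zero `y`.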
