Monday, 2022-03-07

*** jfinkhae1ser is now known as jfinkhaeuser		09:33
lkcl	programmerjake, apparently (i have vague recollections of seeing this before, i just forgot) you can use FFTs to do larger GF2^n math	21:27
lkcl	guess what?	21:27
lkcl	we have hardware-assisted FFT in SVP64... :)	21:27
* lkcl cackles manically		21:27
programmerjake	yup, probably using the same algorithms as needed for fast integer multiply, just swapping in GF(2) polynomial ops (xor and clmul)	21:30
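
A minimal sketch of the two GF(2) polynomial operations mentioned above, assuming polynomials are packed into Python integers with bit i holding the coefficient of x^i. The names `gf2_add` and `clmul` are illustrative, not the SVP64 instruction names, and the shift-and-xor loop only shows the arithmetic that an FFT-style fast multiply would build on; it is not itself a fast algorithm.

```python
def gf2_add(a: int, b: int) -> int:
    """Add two GF(2) polynomials: coefficients add modulo 2, which is xor."""
    return a ^ b


def clmul(a: int, b: int) -> int:
    """Carry-less multiply: the product of a and b as GF(2) polynomials."""
    result = 0
    while b:
        if b & 1:
            # Accumulate the shifted partial product with xor,
            # so no carries propagate between bit positions.
            result ^= a
        a <<= 1
        b >>= 1
    return result


# (x + 1) * (x + 1) = x^2 + 1 over GF(2), because the two x terms cancel.
assert clmul(0b11, 0b11) == 0b101
```

Swapping integer add and multiply for these two operations is the substitution described above.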
lkcl	h	21:56
lkcl	ha!	21:57
lkcl	that'd be ridiculously funny	21:57
lkcl	does mean we need a clmuladd though	21:57
lkcl	eurrgh	21:57
lkcl	sorry	21:57
lkcl	clmultwinadd	21:57
lkcl	or to be able to set the gftwinmuladd polynomial to 2^XLEN	21:58
lkcl	which would probably be better, we're kinda running out of space in opcode 22	21:58
lkcl	https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=0a9f45f2615f35902c4783bcdd07ab6151db841d	21:59
lkcl	that's a neat algorithm you found btw	21:59
lkcl	i'm so glad and relieved you're able to read those symbols in academic papers. i can never get my head round the "assumptions and missing information" :)	21:59
lkcl	i managed to reconstruct the algorithm from the comments you gave, and it does seem to actually, like, y'know... give the right answer? :)	22:00
programmerjake	they explained some of the symbols in the text right above the algorithm	22:07
lkcl	good god, an academic paper that provided explanations??	22:09
programmerjake	XD	22:11
lkcl	ha, that's really exciting about the FFT.	22:11
lkcl	and i realised we can use count-trailing-1s to short-circuit the gf_invert function in hardware	22:12
lkcl	ohh btw, deep breath: i found a bug in microwatt's WB4 pipeline-burst-mode handling of stall	22:13
programmerjake	we could...but i'm inclined to instead have a pipeline... 1 stage per iteration. then we could merge stages together since the mux & xor should be fast enough	22:13
lkcl	nobody's noticed before, because everyone uses litex, and the "joiner" HDL uses the WB4-to-WB3 trick	22:13
lkcl	true	22:13
lkcl	stall = cyc & ~ack	22:14
lkcl	i've been frickin banging my head against a brick wall for 5 weeks and only just noticed / realised	22:14
programmerjake	hmm, wonder if that's why my 3d maze demo randomly crashes if you type too many characters...	22:14
lkcl	ah no, that'll almost certainly be because you ran the 16550 FIFOs out of space	22:15
lkcl	which causes an interrupt	22:15
programmerjake	well...it's not using a 16550...it's using valentyusb	22:15
lkcl	run it under the microwatt_verilator branch, you should get a full gtkwave stack trace	22:15
lkcl	ahh	22:15
lkcl	no idea then :)	22:16
programmerjake	(though i did make it so you could also use the uart as input/output rather than only through usb)	22:16
lkcl	nice. must try it when i'm not in headless-chicken-meltdown mode. really want to	22:17
programmerjake	note that in that gfinv algorithm, you need to use the reducing polynomial without the msb stripped	22:20
lkcl	ehhmmm without? ok... i know how to deal with that	22:31
lkcl	err are you sure? it produces the wrong answer	22:31
lkcl	oh hang on...	22:31
lkcl	coo! it produces the right answer :)	22:32
programmerjake	:)	22:35
lkcl	it gets the answer "x" if you do (x * y) / y	22:37
lkcl	pffh	22:37
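
A hedged sketch of the (x * y) / y check discussed above. It uses the textbook extended Euclidean inversion over GF(2) polynomials, which is not necessarily the algorithm from the paper or the one in the linked commit. The function names and the choice of the degree-8 polynomial 0x11B are assumptions made for illustration. The reducing polynomial is passed in full, with its top bit included, since the inversion uses its degree directly.

```python
def deg(p: int) -> int:
    """Degree of a GF(2) polynomial packed into an int (deg(0) is -1)."""
    return p.bit_length() - 1


def gf_mul(a: int, b: int, poly: int) -> int:
    """Multiply a and b in GF(2^n), n = deg(poly); poly includes its msb."""
    n = deg(poly)
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= poly  # degree reached n, so reduce modulo poly
    return result


def gf_inv(a: int, poly: int) -> int:
    """Invert nonzero a modulo an irreducible poly (msb NOT stripped)."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^n)")
    # Invariants: g1 * a == u and g2 * a == v, both modulo poly.
    u, v = a, poly
    g1, g2 = 1, 0
    while u != 1:
        j = deg(u) - deg(v)
        if j < 0:
            u, v = v, u
            g1, g2 = g2, g1
            j = -j
        u ^= v << j    # cancel the leading term of u
        g1 ^= g2 << j  # apply the same step to the cofactor
    return g1


def gf_div(x: int, y: int, poly: int) -> int:
    """Divide by multiplying with the inverse."""
    return gf_mul(x, gf_inv(y, poly), poly)


POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, full form with the x^8 bit set
for x in range(256):
    for y in range(1, 256):
        assert gf_div(gf_mul(x, y, POLY), y, POLY) == x
```

Passing the stripped form (0x1B here) would change `deg(poly)` and break both functions, which is consistent with the note above about keeping the msb.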
