Sunday, 2021-05-09

[00:09] <choozy> Hey
[00:09] <choozy> Did the 180nm tapeout work well?
[07:06] <programmerjake> choozy: I haven't heard yet...sorry
[07:10] <programmerjake> I was looking through the Vulkan specs again, and I just realized how horribly weak the requirements are for the sin and cos functions: for f32 they have absolute error <= 1/2048 in the range [-pi, pi] and no accuracy requirements outside that range!!
[07:12] <programmerjake> If Vulkan is all we cared about, that can easily be done with a small lookup table and linear interpolation!!
[07:13] <programmerjake> absolute error <= 1/2048 means less than half of the mantissa bits are correct
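A minimal sketch of the lookup-table idea mentioned above, not anything from the Libre-SOC code base. The table size (128 intervals), the names `lut_sin` and `max_abs_error`, and the use of Python double-precision floats rather than f32 hardware arithmetic are all assumptions made for illustration. It builds a sine table over [-pi, pi], interpolates linearly between entries, and measures the worst absolute error against `math.sin` on a dense sample grid.

```python
import math

# Assumed table size: 2**7 = 128 intervals across [-pi, pi].
TABLE_BITS = 7
N = 1 << TABLE_BITS
STEP = 2 * math.pi / N

# N + 1 entries so that the last interval has a right-hand endpoint.
SIN_TABLE = [math.sin(-math.pi + i * STEP) for i in range(N + 1)]


def lut_sin(x):
    """Approximate sin(x) for x in [-pi, pi] by linear interpolation."""
    pos = (x + math.pi) / STEP          # position measured in table steps
    i = min(max(int(pos), 0), N - 1)    # clamp to a valid interval index
    frac = pos - i                      # fractional distance into the interval
    return SIN_TABLE[i] + frac * (SIN_TABLE[i + 1] - SIN_TABLE[i])


def max_abs_error(samples=100_000):
    """Worst absolute error of lut_sin versus math.sin on a sample grid."""
    worst = 0.0
    for k in range(samples + 1):
        x = -math.pi + 2 * math.pi * k / samples
        worst = max(worst, abs(lut_sin(x) - math.sin(x)))
    return worst


if __name__ == "__main__":
    print("max abs error:", max_abs_error())
    print("Vulkan bound :", 1 / 2048)
```

For linear interpolation with step h, the error is bounded by h^2/8 times the maximum of |f''|, which is 1 for sine. With h = 2*pi/128 that bound is roughly 3.0e-4, below 1/2048 (about 4.9e-4, i.e. 2^-11). A real implementation would also have to account for rounding of the table entries and of the interpolation arithmetic.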
[11:12] <lkcl> programmerjake, dang. that saves vast amounts of silicon
[11:13] <lkcl> and it's quick
[11:13] <lkcl> plus, if we _don't_ do that, it means we won't be commercially competitive
[11:14] <lkcl> we'll need some "accuracy bits" to be set on FP.  there is such a bit in the FPCSR
[11:14] <lkcl> s/FPCSR/FPSPR/whatever
[13:01] <lkcl> cesar[m]1, i'm just endeavouring / working out how to add LD/ST exceptions
[13:02] <lkcl> this will allow a unit test to be written
[13:02] <lkcl> henriok, i went with "Optional features, if chosen, must be implemented in their entirety (partial implementation of an Optional feature is not permitted)"
[13:28] <lkcl> cesar[m]1, i'm starting with a misalignment trap, that should do it
[13:37] <lkcl> choozy: it's moved to 9th Jun.
[13:37] <choozy> lkcl, ah, thank you for the heads up
[13:38] <lkcl> Jean-Paul needs to do the Antenna https://gitlab.lip6.fr/vlsi-eda/coriolis/-/commit/bb5c99247a89b7fc892aeb61904ddab2b6e01b59
[13:39] <lkcl> that's critical: TSMC will not allow Antenna DRC violations
[13:39] <lkcl> choozy, http://lists.libre-soc.org/pipermail/libre-soc-dev/2021-April/002501.html
[13:41] <choozy> How many are you guys planning?
[13:45] <lkcl> choozy, MPWs are extremely small runs.  maybe 100 ASICs, of which maybe 30 are functional if you are lucky.
[13:45] <choozy> Ah, okay
[13:46] <choozy> Maybe a smaller part of a 200 or 250mm wafer?
[13:46] <lkcl> this is 180nm so the yields might be a bit higher. i don't know exactly how many we'll get
[13:46] <lkcl> yes, that's a Shuttle Run.
[13:46] <lkcl> multiple designs sharing the same wafer
[13:47] <lkcl> folks, the new tasks section needs filling in with "sentences" https://libre-soc.org/
[13:48] <lkcl> this is so Dr Stallman can help find people to help
[13:49] <lkcl> *snort* a new record for me
[13:49] <lkcl> lkcl@fizzy:~/src/libresoc/soc/src/soc$ ps auxww | grep  "vi " | wc
[13:49] <lkcl>    1306   15680  118669
[13:50] <lkcl> that beats my previous record by over 100% :)
[13:55] <choozy> Ah, nice
[13:55] <lkcl> it's the total number of vim editor commands i have running simultaneously on my laptop lol
[13:57] <jn__> wow. how do you switch between them? 1300 vims can't really fit on the screen at the same time
[13:58] <lkcl> jn__: 24 virtual fvwm2 screens, at 3840x2160 each, with between 8 and 12 80x65 xterms in each
[13:59] <lkcl> then using (in some extreme cases) "jobs | grep {insertkeyword}"
[13:59] <lkcl> https://libre-soc.org/HDL_workflow/640x-2020-01-24_11-56.png
[14:00] <lkcl> https://libre-soc.org/HDL_workflow/2020-01-24_11-56.png
[14:00] <jn__> ok, that brings it up to about 500 — same order of magnitude
[14:01] <lkcl> [79]-  Stopped                 vi fu/ldst/loadstore.py
[14:01] <lkcl> [80]+  Stopped                 vi fu/mmu/fsm.py
[14:01] <lkcl> lkcl@fizzy:~/src/libresoc/soc/src/soc$ jobs | wc
[14:01] <lkcl>      72     281    4050
[14:01] <lkcl> that's just one xterm
[14:01] <jn__> i see
[14:01] <lkcl> it's the only way i can keep track
[14:02] <lkcl> one virtual desktop deals with main soc development
[14:02] <lkcl> another with coriolis2
[14:02] <lkcl> another with "library investigation" (nmigen, ieee754fpu)
[14:02] <lkcl> another with litex
[14:03] <lkcl> another has web browsers
[14:03] <lkcl> etc. etc. etc. etc.
[14:09] <choozy> lkcl, you probably have a pretty hefty workstation?
[16:09] <lkcl> cesar[m]1, doh, misalignment exceptions have to be implemented in ISACaller first :)
[16:25] <cesar[m]1> Indeed...
[16:44] <lkcl> cesar[m]1, that would be really helpful if you could add exception handling to TestIssuer FSM
[16:44] <lkcl> https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/simple/issuer.py;hb=HEAD#l758
[16:44] <lkcl> it should actually be really straightforward
[16:47] <lkcl> sync += pdecode2.ldst_exc.eq(core.fus.get_exc("ldst0"))
[16:47] <lkcl> then *re-run* the instruction
[16:48] <cesar[m]1> Sure, I'm on the case.
[16:48] <lkcl> star
[16:48] <lkcl> other exceptions from other FUs will be different but the same principle
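The "latch the exception, then re-run the instruction" idea can be modelled in plain Python. This is a toy sketch, not the TestIssuer FSM or any nmigen code: `ToyIssuer`, `is_misaligned`, the `(address, size)` program encoding and the 4-byte instruction stride are all hypothetical, and 0x600 is used as the trap target only because it is the Power ISA alignment interrupt vector. It also assumes the latched flag makes the decoder turn the re-issued instruction into a trap.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative trap target (Power ISA alignment interrupt vector).
ALIGNMENT_VECTOR = 0x600


def is_misaligned(addr: int, size: int) -> bool:
    """True if addr is not a multiple of size (size must be a power of 2)."""
    return (addr & (size - 1)) != 0


@dataclass
class ToyIssuer:
    pc: int = 0
    ldst_exc: bool = False              # latched exception flag seen by the decoder
    trapped_from: Optional[int] = None  # pc of the instruction that trapped

    def step(self, program):
        """Run one pass over the (address, size) load at self.pc."""
        addr, size = program[self.pc // 4]
        if self.ldst_exc:
            # Second pass: the decoder sees the latched flag and turns
            # the re-issued instruction into a trap.
            self.ldst_exc = False
            self.trapped_from = self.pc
            self.pc = ALIGNMENT_VECTOR
            return "trap"
        if is_misaligned(addr, size):
            # First pass: latch the exception and leave pc unchanged,
            # so the same instruction is issued again.
            self.ldst_exc = True
            return "exception latched"
        self.pc += 4
        return "ok"


if __name__ == "__main__":
    program = [(0x1000, 4), (0x1002, 4)]   # second load is misaligned
    issuer = ToyIssuer()
    for _ in range(3):
        print(issuer.step(program), hex(issuer.pc))
```

In this model the misaligned load takes two passes: the first only records the exception, the second is the same instruction decoded as a trap. Exceptions from other function units would need their own flag but could follow the same two-pass pattern.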
[16:49] <lkcl> https://science.slashdot.org/story/21/05/09/0031246/mushrooms-on-mars-is-a-hoax-stop-believing-hacks
[16:49] <lkcl> mushrooms on mars haha
[19:14] <programmerjake> <lkcl "plus, if we _don't_ do that, it "> well, OpenCL has much stricter accuracy requirements...also Vulkan specifies the loosest possible requirements so all GPUs can meet Vulkan requirements, not necessarily because barely meeting those requirements is a good implementation strategy...
[19:41] <programmerjake> lkcl: https://bugs.libre-soc.org/show_bug.cgi?id=541#c2
[19:41] <programmerjake> > accuracy of sin GLSL function on Intel/AMD/NVidia GPUs:
[19:41] <programmerjake> https://community.khronos.org/t/builtin-math-function-execution-cost-issues-with-accuracy-of-builtins/75130/4
[19:42] <programmerjake> > Both AMD and NVidia GPUs are waay more accurate than is required by Vulkan, another reason I think we shouldn't implement horribly inaccurate functions just because they technically meet the Vulkan spec.
[20:44] <lkcl> programmerjake: in my mind that's all pointing towards "be flexible"
[20:44] <lkcl> choose high accuracy, that's high power consumption or longer time, we lose
[20:45] <lkcl> choose low accuracy, that's low power, people say "this isn't accurate enough", we lose
[20:45] <lkcl> it's pointing towards adding runtime flexibility
[21:03] <programmerjake> ok, except iirc the programs that run on amd gpus (which have the highest accuracy) aren't any different (they don't have an option saying give me high/low accuracy) than the ones that run on e.g. intel gpus (lowest accuracy out of amd, nvidia, intel), so having options is fine but we'd have to always just pick the high accuracy one to meet developer expectations who are used to gpus that greatly exceed khronos's
[21:03] <programmerjake> junk-tier minimum requirements
[21:04] <programmerjake> meaning it takes extra silicon to implement the low-accuracy variant that we can't use anyway
[22:50] <lkcl> extra silicon is not a problem
[22:50] <lkcl> meeting both end-user requirements when other vendors fail to meet both is, in my mind, a high priority
[22:51] <lkcl> what the Khronos Group says should be done can take a back seat
[22:51] <lkcl> we can always have a mode-switch that "strictly complies with Khronos requirements"
[22:51] <lkcl> then make available a mode-switch that provides *WHAT THE USERS* actually want
[23:03] <lkcl> remember: if done carefully (with the SIMD partitioning) we can get 2x the results in 1/2 the time (for a given O(N^2) algorithm)
[23:03] <lkcl> that's commercially deeply significant
