programmerjake | lkcl: looking through where you added divrem2du to the wiki, the definition needs to be changed so RC is the high half of the numerator otherwise the bigint / word division won't work with a single svp64 instruction because rs would feed into rc which you currently have set as the low half of the numerator which incorrect for the division algorithm. | 03:04 |
---|---|---|
programmerjake | which *is* incorrect for the division algorithm. | 03:05 |
ghostmansd[m] | Hi folks. I've recently migrated the VM I used for development to another laptop, and somehow lost git access. $(ssh -v -p922 gitolite3@git.libre-soc.org) hangs. Any tips? | 09:01 |
programmerjake | does it ask for a password? or does it just not connect? | 09:03 |
ghostmansd[m] | Just hangs. | 09:04 |
ghostmansd[m] | I'm not sure what's changed. The VM was moved as is. | 09:05 |
programmerjake | what's the last message in the debug output? | 09:05 |
ghostmansd[m] | MAC addresses of interfaces might have changed, that's one that comes to my mind. | 09:05 |
ghostmansd[m] | debug1: /etc/ssh/ssh_config line 19: Applying options for * | 09:06 |
ghostmansd[m] | Not particularly interesting | 09:06 |
programmerjake | hmm, sounds like you borked your local config...that's before it tries to start the tcp connection afaict | 09:08 |
programmerjake | can you resolve git.libre-soc.org through dns? | 09:09 |
programmerjake | nslookup git.libre-soc.org | 09:10 |
ghostmansd[m] | Let me check | 09:12 |
ghostmansd[m] | Yeah, nothing | 09:12 |
ghostmansd[m] | Hm | 09:12 |
ghostmansd[m] | Ok, looks like main adapter is the culprit | 09:13 |
ghostmansd[m] | Works now, thanks, programmerjake! | 09:42 |
programmerjake | :) | 09:44 |
ghostmansd | My primary laptop is dying, so I decided it's time to switch to my old yet good TP X220 :-) | 09:48 |
programmerjake | ah...maybe repair your laptop? i've soldered a new screen backlight connector on one of my previous laptops...the wire broke from metal fatigue | 10:05 |
lkcl | oo an X220, ooo | 10:20 |
lkcl | programmerjake, arg yes. of course. let me think about it... probably just inverting RC RA so (RA) || (RC) instead of (RC) || (RA) or er | 10:21 |
lkcl | uint64_t dig2 = ((uint64_t)k << 32) | u[j]; | 10:23 |
lkcl | q[j] = dig2 / v[0]; // divisor here. | 10:23 |
lkcl | k = dig2 % v[0]; // modulo back into next loop | 10:23 |
lkcl | => | 10:23 |
lkcl | dividend[0:(XLEN*2)-1] <- (RA) || (RC) | 10:23 |
lkcl | k=>RA | 10:24 |
lkcl | RC=>u[j] | 10:24 |
lkcl | which is the wrong way round | 10:24 |
programmerjake | well, as i proposed earlier: https://bugs.libre-soc.org/show_bug.cgi?id=817#c26 | 10:25 |
lkcl | Power ISA div/mod operations use RA as dividend and RB as divisor | 10:26 |
lkcl | dividend[0:(XLEN*2)-1] <- (RC) || (RA) | 10:26 |
programmerjake | yeah...we don't necessarily have to replicate that... | 10:26 |
lkcl | does not break expectations in that regard (RC is the "interloper") | 10:26 |
lkcl | divdu: https://libre-soc.org/openpower/isa/fixedarith/ | 10:27 |
programmerjake | though maybe rb being divisor is better...less muxes | 10:27 |
lkcl | dividend[0:XLEN-1] <- (RA) | 10:27 |
lkcl | indeed. | 10:27 |
programmerjake | did you see the 700k cell 128/64-bit divisor i wrote? | 10:28 |
lkcl | going off of divdeu rather than divdu was what confused the hell out of me | 10:28 |
lkcl | 700,000 cells... holy cow | 10:28 |
lkcl | not yet i've only just got up :) | 10:29 |
programmerjake | ah, it's the kan-ban email i sent. has details on dealing with that...maybe fsm or different optimization of parameters | 10:29 |
lkcl | you have to be careful with Memory() | 10:30 |
programmerjake | the rom is 3kbit...that's not the problem | 10:30 |
lkcl | and yes, it'll really need to be a FSM :) | 10:30 |
lkcl | need to be extra-extra-careful about the gate usage | 10:31 |
lkcl | microwatt has no 128-bit pathways *at all* in its division. | 10:31 |
lkcl | it's purely 64-bit | 10:31 |
programmerjake | yeah...i was more concerned with working rather than being efficient for now | 10:32 |
lkcl | yehyeh | 10:32 |
programmerjake | imho we should leave the fully pipelined version in there...it could be useful later when we build much higher performance cpus where 500M transistors per core isn't a problem | 10:33 |
lkcl | i've gone over that before. | 10:34 |
programmerjake | so don't try to code-morph that into a fsm...copy it first if you want to try | 10:34 |
lkcl | although in this case if the ROMs are particularly large (16-bit estimates for example) then they become a factor in the calculations | 10:36 |
programmerjake | it's also possible i made a mistake somewhere in the error analysis code that makes it think it needs waay more accuracy than it actually does...increasing word lengths to compensate | 10:40 |
lkcl | well that should be easy to test, by adding an extra check that F-F' is zero | 10:43 |
lkcl | (or, zero in the bits that matter) | 10:43 |
lkcl | if it is, then there was one too many iterations | 10:43 |
lkcl | is that right? | 10:44 |
lkcl | or does it need comparison of Q/F and D/F... hmmm | 10:44 |
lkcl | F gets scaled | 10:44 |
programmerjake | except you'd have to check all possible inputs...2^(128+64) is too many to check | 10:44 |
lkcl | you get the idea where i'm going with that | 10:45 |
lkcl | true... would a smaller bitwidth give a good idea? | 10:45 |
lkcl | otherwise it's wandering into formal-correctness territory, and multiply is an absolute pig there | 10:45 |
programmerjake | even 12/6 is pushing it on test time | 10:45 |
programmerjake | yeah...i'm expecting smt formal proofs to fall over on this one | 10:46 |
lkcl | :) | 10:47 |
programmerjake | probably...they do support real numbers tho (technically only algebraic numbers...they have no functions to create transcendental numbers) | 10:47 |
programmerjake | according to the standard smt interface language (smt-lib2 iirc) | 10:48 |
lkcl | frick fricking frick. gmail is terminating IMAP access next month. | 10:52 |
lkcl | or, terminating username/password access | 10:53 |
lkcl | means i can't use K9 from f-droid | 10:53 |
programmerjake | you probably still can, with an app password: https://support.google.com/accounts/answer/185833 | 10:55 |
programmerjake | see k9 mail's github issue about that: https://github.com/k9mail/k-9/issues/6020 | 11:06 |
programmerjake | lkcl: ^ | 11:06 |
lkcl | programmerjake, thx | 11:44 |
lkcl | the latest version of kmail is s*** | 11:44 |
lkcl | the UI is awful | 11:45 |
markos | kmail used to be really good, I had to switch away from that and kontact due to the akonadi crap hogging my cpu constantly | 11:46 |
lkcl | yeah i stopped using KDE when it went to plasma and qt5 | 11:46 |
lkcl | trinity desktop still maintains KDE 3.0 which, because it uses qt3, is lightning-quick and extremely small binary size (relatively speaking) | 11:47 |
lkcl | i'll see how i get on with xypto, it seems to be doing ok | 11:47 |
markos | I'm using kde5 now, I prefer it to gnome tbh, though I would switch to xfce again if they finally ever make the switch to gtk3 | 11:48 |
markos | kde 3? interesting, I'll take a look | 11:48 |
lkcl | https://www.trinitydesktop.org/ | 11:48 |
lkcl | well-maintained, works great | 11:48 |
lkcl | a fork of KDE 3.5 from 2010 | 11:48 |
lkcl | startup time is FAST | 11:49 |
markos | don't they have to also provide support for old qt3? | 11:49 |
lkcl | yyep | 11:49 |
lkcl | installed... sigh... in /opt | 11:49 |
lkcl | sigh | 11:49 |
lkcl | but hey. | 11:50 |
lkcl | one of the good things about xfce is precisely that it *doesn't* use gtk3 | 11:50 |
lkcl | imo both qt5 and gtk3 have gotten... scary in size | 11:50 |
markos | but also one of the reasons it doesn't work in wayland | 12:38 |
markos | or AFAIU it | 12:38 |
lkcl | i've used fvwm2 on x11/xorg for over 22 years so will entirely pass wayland by for many more :) | 12:45 |
ghostmansd | I'm not sure how it'd be better to handle the prefix (aka ".long xxx" part). We operate on md_assemble() level, but ".long" et al. are handled on the upper level (exactly in the routine which calls md_assemble(), read_a_source_file()). Both routines heavily rely on global context, and it's not really easy to change it. | 14:41 |
ghostmansd | The ".long" stuff goes to cons_worker() routine, but this one, again, operates on global identifiers. | 14:42 |
lkcl | blech | 14:42 |
ghostmansd | I have yet to come up with some clever way to handle this. Otherwise the binutils code will need more tuning. | 14:43 |
lkcl | well, a workaround is to create a new instruction "svp64" which takes the 24-bit RM as an argument | 14:43 |
ghostmansd | Hm. This might do the trick. | 14:43 |
lkcl | it's something i thought of doing, 6+ months ago | 14:43 |
ghostmansd | Yes, as pseudo-op, right? | 14:43 |
ghostmansd | Exactly like we have "add", "xor" or whatever else. | 14:44 |
lkcl | no, "actual" op | 14:44 |
lkcl | as a 32-bit op | 14:44 |
lkcl | pseudo-ops are things like "li r5, 99" which is an alias for "addi r5, r0, 99" | 14:45 |
ghostmansd | OK, call it as you like :-) | 14:45 |
ghostmansd | I mean reserving an entry in the table of insns | 14:45 |
lkcl | yes :) | 14:45 |
ghostmansd | Giving it an opcode... | 14:45 |
ghostmansd | Yes, this should do the trick, I think! | 14:45 |
lkcl | yes, ah ok i see what you mean, "not an official op" | 14:45 |
ghostmansd | Then we won't leave md_assemble level... | 14:45 |
ghostmansd | Yeah, right | 14:46 |
ghostmansd | https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=opcodes/ppc-opc.c;h=bd83d44a2ae8837ab6a2f7f2cf64edda230ae0f8;hb=HEAD#l4852 | 14:47 |
ghostmansd | Like allocating an entry here... | 14:47 |
ghostmansd | I have yet to think of the operands array | 14:47 |
lkcl | yes basically | 14:47 |
ghostmansd | But seems like our best choice | 14:48 |
lkcl | btw there will be some entries added there, setvl for example | 14:48 |
ghostmansd | Thanks Luke, this basically concludes some hours of thinking | 14:48 |
lkcl | and svstep | 14:48 |
lkcl | which really are 32-bit ops (actual 32-bit ops) | 14:48 |
ghostmansd | Yeah, we have some stuff done manually | 14:48 |
lkcl | but they definitely need to go under a "--experimental aka libresoc" flag | 14:48 |
ghostmansd | setvl, svstep, fmadds, etc. | 14:49 |
ghostmansd | I think they'll reside in standalone table | 14:49 |
lkcl | yes, ah i forgot about fmadds, yes | 14:49 |
ghostmansd | that's basically how they do it in binutils | 14:49 |
lkcl | they're in svp64.py at the moment - even as 32-bit ops - precisely because standard upstream binutils doesn't have them | 14:49 |
ghostmansd | e.g. https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=opcodes/ppc-opc.c;h=bd83d44a2ae8837ab6a2f7f2cf64edda230ae0f8;hb=HEAD#l10509 | 14:49 |
lkcl | ohh ok, nice idea. yes, very sensible | 14:50 |
ghostmansd | Like here, there's another table | 14:50 |
lkcl | then they can be brought in to the hashtable or not based on a runtime switch. nice idea | 14:50 |
ghostmansd | binutils' PPC code knows all these tables but still groups them | 14:50 |
lkcl | sensible | 14:50 |
ghostmansd | even though they end up in the same hash | 14:50 |
ghostmansd | Perfect | 14:50 |
lkcl | 10,000+ line long file but still sensible | 14:50 |
ghostmansd | OK, I think I'll follow this way with reserving some magic opcode | 14:52 |
ghostmansd | I think this fits even better than .long incantation | 14:53 |
lkcl | .long was - is - a hack that i knew binutils could handle | 14:53 |
ghostmansd | OK, so we basically teach an old dog a couple of new tricks | 14:55 |
ghostmansd | It'd still technically be a hack | 14:55 |
ghostmansd | But, at least, somewhat more apt to binutils structure | 14:55 |
ghostmansd | I still need to check whether binutils can handle it, though | 14:56 |
ghostmansd | Their reliance on global context is something that really pisses me off | 14:56 |
ghostmansd | Any single fucking routine can adjust the global pointer to the input buffer | 14:56 |
lkcl | sounds great! | 14:56 |
ghostmansd | And yet not a single argument that would make you suspect it | 14:56 |
*** kylel1 is now known as kylel | 15:56 | |
tplaten | I just saw the BOOT_INIT_BASE variable in the Makefile at LS2. | 16:27 |
lkcl | yes, i added it so that hacking the source code is not necessary | 16:34 |
lkcl | powerpc.ld is generated by a Makefile target from powerpc.ld.S | 16:34 |
lkcl | whatever you do, you can still always investigate the dump file and also the vcd to track what is going on | 16:35 |
lkcl | if the address is wrong it will be damn obvious because pc will be wrong as shown in gtkwave | 16:35 |
lkcl | tplaten, i added your write-perms for the wiki | 16:38 |
tplaten | I'll push the changes soon | 16:52 |
lkcl | ok great. do include the links to the various Makefiles | 16:56 |
* lkcl just building an experimental kernel with CONFIG_RELOCATABLE | 16:56 | |
lkcl | i switched on so much debug info that the kernel size now exceeds previous limits by 50% | 16:57 |
lkcl | urrr | 16:57 |
tplaten | I saw, parser.add_argument("--pc-reset"... | 17:00 |
tplaten | and microwatt_external_core_bram | 17:05 |
lkcl | yep that's it. | 17:06 |
lkcl | there's almost certainly a better way to do this but it needs corresponding modifications to microwatt | 17:07 |
lkcl | to be able to pass in a reset address as a parameter | 17:07 |
lkcl | sorry, as a combinatorial value, not a parameter. | 17:07 |
lkcl | messy | 17:07 |
tplaten | I'm going to update the documentation soon. | 17:09 |
lkcl | great | 17:10 |
* lkcl maybe has the relocatable kernel uploaded and running... maybe | 17:11 | |
lkcl | zImage starting: loaded at 0x0000000001000000 (sp: 0x00000000019c7ec0) | 17:11 |
lkcl | No valid compressed data found, assume uncompressed data | 17:11 |
lkcl | Allocating 0x14c12c0 bytes for kernel... | 17:11 |
lkcl | ahh goood | 17:12 |
lkcl | 0x9a4710 bytes of uncompressed data copied | 17:12 |
lkcl | Linux/PowerPC load: | 17:12 |
lkcl | Finalizing device tree... flat tree at 0x19c8c80 | 17:12 |
lkcl | whew | 17:12 |
lkcl | it's noticeably slower to run due to the debug / relocation | 17:14 |
lkcl | aand blat. this time at clockevent | 17:17 |
lkcl | drat | 17:17 |
* lkcl going to leave it running see what happens | 17:17 | |
tplaten | Some time ago I had a look at how the Wii and Gamecube boot. Like the Talos II those machines have an ARM processor that boots the IBM processor. Gekko and Broadway. Gekko is 180nm, Broadway 65nm. | 17:18 |
lkcl | ah yes there was someone... | 17:23 |
lkcl | jn, weren't you doing one of the boot processors? 2500 or something? | 17:23 |
lkcl | AST2500? | 17:23 |
jn | Nuvoton/Winbond WPCM450 | 17:23 |
lkcl | jn, thx | 17:35 |
lkcl | tplaten, was that one of them? | 17:38 |
jn | the Wii uses a special chip that was developed by/for Nintendo | 17:39 |
jn | more specifically: one PowerPC chip plus one chip that contains a whole lot of functionality: ARM CPU, GPIO, GPU, SD card controller, etc. | 17:40 |
jn | the GC lacked the ARM CPU, AFAIK | 17:41 |
lkcl | ooOoo | 17:41 |
jn | Wii system block diagram: https://fail0verflow.com/media/img/blockdia_wii_full.png | 17:43 |
tplaten | I still remember that the Wii has a Gamecube compatibility mode, that is controlled by the ARM processor. The GPU is an ATI one. | 17:49 |
lkcl | oh! nice! one of those embedded ATI ones, probably the exact same one used on the HTC Universal! | 17:50 |
lkcl | those were nice pieces of kit. a memory-mapped GPU on a standard MCU 8080 bus (in effect), a bit like an AT/XT graphics card, exact same pins/wires | 17:51 |
lkcl | but instead just directly wired-up on the same PCB | 17:51 |
tplaten | I got ls2 working in verilator, tomorrow I'll continue on the orangecrab port. | 19:12 |
tplaten | I still get | 19:13 |
tplaten | (TOP.top.uart.uart16550_0) UART INFO: Data bus width is 8. No Debug interface. | 19:13 |
tplaten | (TOP.top.uart.uart16550_0) UART INFO: Doesn't have baudrate output | 19:13 |
tplaten | and finally mw. ; Microwatt, it works. | 19:13 |
lkcl | hooray | 19:19 |
lkcl | ahh i think DEC is going into an infinite loop | 19:28 |
ghostmansd | I think we now have somewhat better foundations for SVP64 translation, based on the idea discussed earlier today. I called this magic opcode `sv.', because it takes exactly as many bytes as `nop' does, and it also is the thing which we use to determine whether we deal with SVP64 mnemonic, so the choice seems to be rather logical. | 21:40 |
ghostmansd | For now, we have the very basic sketch of a tool which splits `sv.ANYTHING ARG0,ARG1,...' to a pair of `sv. XXXXXX' and `ANYTHING ARG0,ARG1,...'. | 21:46 |
ghostmansd | For now, `sv. XXXXXXXX' is always `sv. 60000000' (I think you guess why), this will be addressed at the later stage. | 21:47 |
ghostmansd | However, at this stage I have some questions, and would like to ask you to take a look at some binutils structures. | 21:48 |
ghostmansd | Here are some links that cover the stuff I'm currently dealing with | 21:52 |
ghostmansd | powerpc_opcode: https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=include/opcode/ppc.h;h=5e2d143d81bb71799f29b5143ee02ab1c71dc1a9;hb=refs/heads/svp64#l37 | 21:52 |
ghostmansd | powerpc_operand: https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=include/opcode/ppc.h;h=5e2d143d81bb71799f29b5143ee02ab1c71dc1a9;hb=refs/heads/svp64#l274 | 21:53 |
ghostmansd | sv. magic opcode: https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=opcodes/ppc-opc.c;h=c0f1721d4275c3444e2f7438dfb8e6c99e56fe9f;hb=refs/heads/svp64#l11328 | 21:53 |
ghostmansd | SVP64RM operand: https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=opcodes/ppc-opc.c;h=c0f1721d4275c3444e2f7438dfb8e6c99e56fe9f;hb=refs/heads/svp64#l3844 | 21:53 |
ghostmansd | Considering "sv." magic opcode... | 21:53 |
ghostmansd | the `name' field is obvious | 21:54 |
ghostmansd | https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=opcodes/ppc-opc.c;h=24f62e4585c1540ddbce336045568b9d671374e7;hb=refs/heads/svp64#l11328 | 21:57 |
ghostmansd | next follows insn mask, which, as I understand it, is 0x2BF, because this corresponds to all bits we consider non-operand-based. That is, svp64_prefix.major (bits [0..5]) and svp64_prefix.pid (bits [7,9]). | 21:59 |
ghostmansd | next we have SVP64, which is simply an alias to PPC_OPCODE_SVP64, meaning that this opcode makes sense only for SVP64 | 22:00 |
ghostmansd | the next field might be of interest if we decide to deprecate something on the later stages... | 22:01 |
ghostmansd | and the last field, {SVP64RM}, is the only operand we have | 22:02 |
ghostmansd | And this is, actually, the place I'm most interested in. | 22:02 |
ghostmansd | https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=opcodes/ppc-opc.c;h=24f62e4585c1540ddbce336045568b9d671374e7;hb=refs/heads/svp64#l3844 | 22:02 |
ghostmansd | First, I'm not sure it's OK to take some arbitrary position in this table. It seems that the mapping is arbitrary, though: no code outside of that file refers to these indices in a special way, these are real indices. | 22:04 |
ghostmansd | So, if this is OK, let's proceed to the declaration itself... | 22:04 |
ghostmansd | The first is mask of bits we take from operand -- I deliberately set all 32 bits, but I think this should be 24 bits (as in svp64_prefix.rm). | 22:05 |
ghostmansd | I'm not sure what the second one should designate; for us it likely should be 0. | 22:06 |
ghostmansd | For the next two fields, they're function pointers, I still have to look at these. It'd be great for you to check these as well. | 22:07 |
ghostmansd | And the last field is empty, since the argument should be considered a simple expression w/o any special treatment (no % skip, as for registers, and so on). | 22:08 |
ghostmansd | So, now the questions. | 22:08 |
ghostmansd | 1. Is the overall understanding correct? | 22:08 |
ghostmansd | 2. Do you have ideas and tips on insert/extract function pointers? I guess this is exactly the place where we need to construct the actual insn. | 22:09 |
lkcl | ghostmansd, apologies was afk, can i take a look tomorrow? | 22:09 |
ghostmansd | 3. Any other ideas or tips? | 22:10 |
ghostmansd | Sure, that's just a summary whilst I remember all this :-) | 22:10 |
lkcl | :) | 22:10 |
ghostmansd | No need to hurry, take your time, that's mostly for me not to forget. | 22:10 |
ghostmansd | All in all, it seems the approach we agreed on might work, but I need you to take a look at it and confirm my understanding. | 22:11 |
ghostmansd | > but I think this should be 24 bits | 22:13 |
ghostmansd | done, switched mask to 0xFFFFFF from 0xFFFFFFFF | 22:13 |
ghostmansd | OK, the main task -- writing an annoying and bothering summary which pollutes the chat -- is done, see you later :-) | 22:13 |
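
The bigint / word division loop quoted by lkcl at 10:23 can be sketched in Python to show why the running remainder has to be the high half of each double-width dividend. This is only an illustration under stated assumptions: 32-bit words stored least significant first (as the `<< 32` in the quoted C suggests), and the names `bigint_div_word`, `to_words` and `from_words` are hypothetical helpers, not part of any Libre-SOC code or of the divrem2du definition.

```python
MASK32 = 0xFFFFFFFF  # assumed word size, matching the "<< 32" in the quoted C


def to_words(n, count):
    """Split a non-negative integer into `count` 32-bit words, least significant first."""
    return [(n >> (32 * i)) & MASK32 for i in range(count)]


def from_words(words):
    """Reassemble an integer from 32-bit words, least significant first."""
    return sum(w << (32 * i) for i, w in enumerate(words))


def bigint_div_word(u, v0):
    """Divide the multi-word number u by the single word v0.

    Returns (quotient words, remainder). Hypothetical helper for illustration.
    """
    if not 0 < v0 <= MASK32:
        raise ValueError("divisor must be a non-zero 32-bit word")
    q = [0] * len(u)
    k = 0  # remainder carried from the previous (more significant) word
    for j in reversed(range(len(u))):
        # The carried remainder is the HIGH half, the next word the LOW half.
        dig2 = (k << 32) | u[j]
        q[j] = dig2 // v0  # fits in one word because k < v0
        k = dig2 % v0      # feeds the high half of the next iteration
    return q, k


n = 0x123456789ABCDEF0FEDCBA9876543210
v = 0xDEADBEEF
quotient, remainder = bigint_div_word(to_words(n, 4), v)
print(hex(from_words(quotient)), hex(remainder))
```

Because `k` is always smaller than `v0`, each partial quotient fits in a single word, which is the property that lets the remainder output of one step feed the high half of the next step's dividend.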
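
The splitting step ghostmansd describes at 21:46, turning `sv.ANYTHING ARG0,ARG1,...` into a `sv. XXXXXXXX` line followed by `ANYTHING ARG0,ARG1,...`, might look roughly like the sketch below. It is a text-level illustration only, not the binutils implementation (which is C and works at the md_assemble() level); the function name `split_sv` is hypothetical, and the fixed `60000000` value is the placeholder mentioned in the log, standing in for an RM value that would be computed later.

```python
PLACEHOLDER_RM = "60000000"  # fixed placeholder quoted in the log, not a computed RM


def split_sv(line):
    """Split one assembly line into its output lines.

    `sv.NAME ARGS` becomes [`sv. <placeholder>`, `NAME ARGS`].
    Anything else, including an already split `sv. XXXXXXXX`, is returned as is.
    """
    stripped = line.strip()
    is_prefixed = (
        stripped.startswith("sv.")
        and len(stripped) > 3
        and not stripped[3].isspace()  # "sv. ..." is the magic opcode itself
    )
    if not is_prefixed:
        return [stripped]
    return [f"sv. {PLACEHOLDER_RM}", stripped[3:]]


for out in split_sv("sv.add r1,r2,r3"):
    print(out)
```

Treating `sv.` followed by whitespace as the magic opcode itself keeps the transformation idempotent, so running it twice over the same input does not emit a second prefix.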