Friday, 2022-04-29

programmerjakelkcl: looking through where you added divrem2du to the wiki, the definition needs to be changed so RC is the high half of the numerator otherwise the bigint / word division won't work with a single svp64 instruction because rs would feed into rc which you currently have set as the low half of the numerator which incorrect for the division algorithm.03:04
programmerjakewhich *is* incorrect for the division algorithm.03:05
ghostmansd[m]Hi folks. I've recently migrated the VM I used for development to other laptop, and somehow lost git access. $(ssh -v -p922 hangs. Any tips?09:01
programmerjakedoes it ask for a password? or does it just not connect?09:03
ghostmansd[m]Just hangs.09:04
ghostmansd[m]I'm not sure what's changed. The VM is passed as is.09:05
programmerjakewhat's the last message in the debug output?09:05
ghostmansd[m]MAC addresses of interfaces might have changed, that's one that comes to my mind.09:05
ghostmansd[m]debug1: /etc/ssh/ssh_config line 19: Applying options for *09:06
ghostmansd[m]Not particularly interesting09:06
programmerjakehmm, sounds like you borked your local config...that's before it tries to start the tcp connection afaict09:08
programmerjakecan you resolve through dns?09:09
programmerjakenslookup git.libre-soc.org09:10
ghostmansd[m]Let me check09:12
ghostmansd[m]Yeah, nothing09:12
ghostmansd[m]Ok, looks like main adapter is the culprit09:13
ghostmansd[m]Works now, thanks, programmerjake!09:42
ghostmansdMy primary laptop is dying, so I decided it's time to switch to my old yet good TP X220 :-)09:48
programmerjakeah...maybe repair your laptop? i've soldered a new screen backlight connector on one of my previous laptops...the wire broke from metal fatigue10:05
lkcloo an X220, ooo10:20
lkclprogrammerjake, arg yes. of course. let me think about it... probably just inverting RC RA so (RA) || (RC) instead of (RC) || (RA) or er10:21
lkcl            uint64_t dig2 = ((uint64_t)k << 32) | u[j];10:23
lkcl            q[j] = dig2 / v[0]; // divisor here.10:23
lkcl            k = dig2 % v[0];    // modulo back into next loop10:23
lkcl    dividend[0:(XLEN*2)-1] <- (RA) || (RC)10:23
lkclwhich is the wrong way round10:24
programmerjakewell, as i proposed earlier:
lkclPower ISA div/mod operations use RA as dividend and RB as divisor10:26
lkcl        dividend[0:(XLEN*2)-1] <- (RC) || (RA)10:26
programmerjakeyeah...we don't necessarily have to replicate that...10:26
lkcldoes not break expectations in that regard (RC is the "interloper")10:26
programmerjakethough maybe rb being divisor is better...less muxes10:27
lkcl    dividend[0:XLEN-1] <- (RA)10:27
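
A minimal C sketch of the bigint / word loop quoted above, assuming 32-bit digits stored least-significant first. The name bigint_div_word is hypothetical and not taken from the wiki or the source being quoted. It shows why the running remainder k has to be the high half of each two-digit dividend: the remainder of one step is carried into the next, which is the role RC would play if the remainder output of one element feeds the next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Divide the n-digit number u (32-bit digits, least significant first)
 * by the single digit v. Writes n quotient digits to q and returns the
 * final remainder. Hypothetical helper, for illustration only. */
uint32_t bigint_div_word(uint32_t *q, const uint32_t *u, size_t n, uint32_t v)
{
    uint32_t k = 0;                      /* remainder carried between digits */

    assert(v != 0);
    for (size_t j = n; j-- > 0; ) {      /* most significant digit first */
        /* k is the HIGH half, u[j] the LOW half of the 64-bit dividend */
        uint64_t dig2 = ((uint64_t)k << 32) | u[j];
        q[j] = (uint32_t)(dig2 / v);     /* fits in 32 bits because k < v */
        k = (uint32_t)(dig2 % v);        /* carried into the next digit */
    }
    return k;
}
```

Because k is always smaller than v, each 64-by-32 division yields a quotient that fits in one digit, which is what lets the loop map onto one two-word-by-one-word divide per element.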
programmerjakedid you see the 700k cell 128/64-bit divisor i wrote?10:28
lkclgoing off of divdeu rather than divdu was what confused the hell out of me10:28
lkcl700,000 cells... holy cow10:28
lkclnot yet i've only just got up :)10:29
programmerjakeah, it's the kan-ban email i sent. has details on dealing with that...maybe fsm or different optimization of parameters10:29
lkclyou have to be careful with Memory()10:30
programmerjakethe rom is 3kbit...that's not the problem10:30
lkcland yes, it'll really need to be a FSM :)10:30
lkclneed to be extra-extra-careful about the gate usage10:31
lkclmicrowatt has no 128-bit pathways *at all* in its division.10:31
lkclit's purely 64-bit10:31
programmerjakeyeah...i was more concerned with working rather than being efficient for now10:32
programmerjakeimho we should leave the fully pipelined version in, it could be useful later when we build much higher performance cpus where 500M transistors per core isn't a problem10:33
lkcli've gone over that before.10:34
programmerjakeso don't try to code-morph that into a fsm...copy it first if you want to try10:34
lkclalthough in this case if the ROMs are particularly large (16-bit estimates for example) then they become a factor in the calculations10:36
programmerjakeit's also possible i made a mistake somewhere in the error analysis code that makes it think it needs waay more accuracy than it actually does...increasing word lengths to compensate10:40
lkclwell that should be easy to test, by adding an extra check that F-F' is zero10:43
lkcl(or, zero in the bits that matter)10:43
lkclif it is, then there was one too many iterations10:43
lkclis that right?10:44
lkclor does it need comparison of Q/F and D/F... hmmm10:44
lkclF gets scaled10:44
programmerjakeexcept you'd have to check all possible inputs...2^(128+64) is too many to check10:44
lkclyou get the idea where i'm going with that10:45
lkcltrue... would a smaller bitwidth give a good idea?10:45
lkclotherwise it's wandering into formal-correctness territory, and multiply is an absolute pig there10:45
programmerjakeeven 12/6 is pushing it on test time10:45
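
A minimal C sketch of the "smaller bitwidth" idea, assuming a 12-bit dividend and 6-bit divisor as in the 12/6 figure above. The restoring shift-subtract divider here is only a stand-in for whatever divider is under test; it is not the 128/64-bit design being discussed, and both function names are hypothetical. In plain software this sweep is cheap; the cost mentioned in the chat refers to running the real design.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in divider under test: restoring shift-subtract division.
 * n is an nbits-wide dividend, d a nonzero divisor. */
static void div_restoring(uint32_t n, uint32_t d, unsigned nbits,
                          uint32_t *q, uint32_t *r)
{
    uint32_t rem = 0, quo = 0;

    for (unsigned i = nbits; i-- > 0; ) {
        rem = (rem << 1) | ((n >> i) & 1u);  /* bring down the next bit */
        quo <<= 1;
        if (rem >= d) {                      /* subtract when it fits */
            rem -= d;
            quo |= 1u;
        }
    }
    *q = quo;
    *r = rem;
}

/* Check every (dividend, divisor) pair at the given widths against the
 * C operators and return the number of mismatches found. */
unsigned long exhaustive_mismatches(unsigned nbits, unsigned dbits)
{
    unsigned long bad = 0;

    assert(nbits < 32 && dbits < 32);
    for (uint32_t n = 0; n < (1u << nbits); n++) {
        for (uint32_t d = 1; d < (1u << dbits); d++) {
            uint32_t q, r;
            div_restoring(n, d, nbits, &q, &r);
            if (q != n / d || r != n % d)
                bad++;
        }
    }
    return bad;
}
```

The pair count grows as 2^(nbits+dbits), which is why the full 128/64 case cannot be swept this way.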
programmerjakeyeah...i'm expecting smt formal proofs to fall over on this one10:46
programmerjakeprobably...they do support real numbers tho (technically only algebraic numbers...they have no functions to create transcendental numbers)10:47
programmerjakeaccording to the standard smt interface language (smt-lib2 iirc)10:48
lkclfrick fricking frick.  gmail is terminating IMAP access next month.10:52
lkclor, terminating username/password access10:53
lkclmeans i can't use K9 from f-droid10:53
programmerjakeyou probably still can, with an app password:
programmerjakesee k9 mail's github issue about that:
programmerjakelkcl: ^11:06
lkclprogrammerjake, thx11:44
lkclthe latest version of kmail is s***11:44
lkclthe UI is awful11:45
markoskmail used to be really good, I had to switch away from that and kontact due to the akonadi crap hogging my cpu constantly11:46
lkclyeah i stopped using KDE when it went to plasma and qt511:46
lkcltrinity desktop still maintains KDE 3.0 which, because it uses qt3, is lightning-quick and extremely small binary size (relatively speaking)11:47
lkcli'll see how i get on with xypto, it seems to be doing ok11:47
markosI'm using kde5 now, I prefer it to gnome tbh, though I would switch to xfce again if they finally ever make the switch to gtk311:48
markoskde 3? interesting, I'll take a look11:48
lkclwell-maintained, works great11:48
lkcla fork of KDE 3.5 from 201011:48
lkclstartup time is FAST11:49
markosdon't they have to also provide support for old qt3?11:49
lkclinstalled... sigh... in /opt11:49
lkclbut hey.11:50
lkclone of the good things about xfce is precisely that it *doesn't* use gtk311:50
lkclimo both qt5 and gtk3 have gotten... scary in size11:50
markosbut also one of the reasons it doesn't work in wayland12:38
markosor AFAIU it12:38
lkcli've used fvwm2 on x11/xorg for over 22 years so will entirely pass wayland by for many more :)12:45
ghostmansdI'm not sure how it'd be better to handle the prefix (aka ".long xxx" part). We operate on md_assemble() level, but ".long" et al. are handled on the upper level (exactly in the routine which calls md_assemble(), read_a_source_file()). Both routines heavily rely on global context, and it's not really easy to change it.14:41
ghostmansdThe ".long" stuff goes to cons_worker() routine, but this one, again, operates on global identifiers.14:42
ghostmansdI have yet to come up with some clever way to handle this. Otherwise the binutils code will need more tuning.14:43
lkclwell, a workaround is to create a new instruction "svp64" which takes the 24-bit RM as an argument14:43
ghostmansdHm. This might do the trick.14:43
lkclit's something i thought of doing, 6+ months ago14:43
ghostmansdYes, as pseudo-op, right?14:43
ghostmansdExactly like we have "add", "xor" or whatever else.14:44
lkclno, "actual" op14:44
lkclas a 32-bit op14:44
lkclpseudo-ops are things like "li r5, 99" which is an alias for "addi r5, r0, 99"14:45
ghostmansdOK, call it as you like :-)14:45
ghostmansdI mean reserving an entry in the table of insns14:45
lkclyes :)14:45
ghostmansdGiving it an opcode...14:45
ghostmansdYes, this should do the trick, I think!14:45
lkclyes, ah ok i see what you mean, "not an official op"14:45
ghostmansdThen we won't leave md_assemble level...14:45
ghostmansdYeah, right14:46
ghostmansdLike allocating an entry here...14:47
ghostmansdI have yet to think of the operands array14:47
lkclyes basically14:47
ghostmansdBut seems like our best choice14:48
lkclbtw there will be some entries added there, setvl for example14:48
ghostmansdThanks Luke, this basically concludes some hours of thinking14:48
lkcland svstep14:48
lkclwhich really are 32-bit ops (actual 32-bit ops)14:48
ghostmansdYeah, we have some stuff done manually14:48
lkclbut they definitely need to go under a "--experimental aka libresoc" flag14:48
ghostmansdsetvl, svstep, fmadds, etc.14:49
ghostmansdI think they'll reside in standalone table14:49
lkclyes, ah i forgot about fmadds, yes14:49
ghostmansdthat's basically how they do it in binutils14:49
lkclthey're in at the moment - even as 32-bit ops - precisely because standard upstream binutils doesn't have them14:49
lkclohh ok, nice idea. yes, very sensible14:50
ghostmansdLike here, there's another table14:50
lkclthen they can be brought in to the hashtable or not based on a runtime switch. nice idea14:50
ghostmansdbinutils' PPC code knows all these tables but still groups them14:50
ghostmansdeven though they end up in the same hash14:50
lkcl10,000+ line long file but still sensible14:50
ghostmansdOK, I think I'll follow this way with reserving some magic opcode14:52
ghostmansdI think this fits even better than .long incantation14:53
lkcl.long was - is - a hack that i knew binutils could handle14:53
ghostmansdOK, so we basically teach an old dog a couple of new tricks14:55
ghostmansdIt'd still technically be a hack14:55
ghostmansdBut, at least, somewhat more apt to binutils structure14:55
ghostmansdI still need to check whether binutils can handle it, though14:56
ghostmansdTheir reliance on global context is something that really pisses me off14:56
ghostmansdAny single fucking routine can adjust the global pointer to the input buffer14:56
lkclsounds great!14:56
ghostmansdAnd yet no single argument which makes you suspect it14:56
*** kylel1 is now known as kylel15:56
tplatenI just saw the BOOT_INIT_BASE variable in the Makefile at LS2.16:27
lkclyes, i added it so that hacking the source code is not necessary16:34
lkclpowerpc.ld is generated by a Makefile target from powerpc.ld.S16:34
lkclwhatever you do, you can still always investigate the dump file and also the vcd to track what is going on16:35
lkclif the address is wrong it will be damn obvious because pc will be wrong as shown in gtkwave16:35
lkcltplaten, i added your write-perms for the wiki16:38
tplatenI'll push the changes soon16:52
lkclok great. do include the links to the various Makefiles16:56
* lkcl just building an experimental kernel with CONFIG_RELOCATABLE16:56
lkcli switched on so much debug info that the kernel size now exceeds previous limits by 50%16:57
tplatenI saw, parser.add_argument("--pc-reset"...17:00
tplatenand microwatt_external_core_bram17:05
lkclyep that's it.17:06
lkclthere's almost certainly a better way to do this but it needs corresponding modifications to microwatt17:07
lkclto be able to pass in a reset address as a parameter17:07
lkclsorry, as a combinatorial value, not a parameter.17:07
tplatenI'm going to update the documentation soon.17:09
* lkcl maybe has the relocatable kernel uploaded and running... maybe17:11
lkclzImage starting: loaded at 0x0000000001000000 (sp: 0x00000000019c7ec0)17:11
lkclNo valid compressed data found, assume uncompressed data17:11
lkclAllocating 0x14c12c0 bytes for kernel...17:11
lkclahh goood17:12
lkcl0x9a4710 bytes of uncompressed data copied17:12
lkclLinux/PowerPC load:17:12
lkclFinalizing device tree... flat tree at 0x19c8c8017:12
lkclit's noticeably slower to run due to the debug / relocation17:14
lkclaand blat. this time at clockevent17:17
* lkcl going to leave it running see what happens17:17
tplatenSome time ago I had a look at how the Wii and Gamecube boot. Like the Talos II those machines have an ARM processor that boots the IBM processor. Gekko and Broadway. Gekko is 180nm, Broadway 65nm.17:18
lkclah yes there was someone...17:23
lkcljn, weren't you doing one of the boot processors? 2500 or something?17:23
jnNuvoton/Winbond WPCM45017:23
lkcljn, thx17:35
lkcltplaten, was that one of them?17:38
jnthe Wii uses a special chip that was developed by/for Nintendo17:39
jnmore specifically:  one PowerPC chip plus one chip that contains a whole lot of functionality: ARM CPU, GPIO, GPU, SD card controller, etc.17:40
jnthe GC lacked the ARM CPU, AFAIK17:41
jnWii system block diagram:
tplatenI still remember that the Wii has a Gamecube compatibility mode, that is controlled by the ARM processor. The GPU is an ATI one.17:49
lkcloh! nice! one of those embedded ATI ones, probably the exact same one used on the HTC Universal!17:50
lkclthose were nice pieces of kit.  a memory-mapped GPU on a standard MCU 8080 bus (in effect), a bit like an AT/XT graphics card, exact same pins/wires17:51
lkclbut instead just directly wired-up on the same PCB17:51
tplatenI got ls2 working in verilator, tomorrow I'll continue on the orangecrab port.19:12
tplatenI still get19:13
tplaten( UART INFO: Data bus width is 8. No Debug interface.19:13
tplaten( UART INFO: Doesn't have baudrate output19:13
tplatenand finally mw.  ;   Microwatt, it works.19:13
lkclahh i think DEC is going into an infinite loop19:28
ghostmansdI think we now have somewhat better foundations for SVP64 translation, based on the idea discussed earlier today. I called this magic opcode `sv.', because it takes exactly as many bytes as `nop' does, and it also is the thing which we use to determine whether we deal with SVP64 mnemonic, so the choice seems to be rather logical.21:40
ghostmansdFor now, we have the very basic sketch of a tool which splits `sv.ANYTHING ARG0,ARG1,...' to a pair of `sv. XXXXXX' and `ANYTHING ARG0,ARG1,...'.21:46
ghostmansdFor now, `sv. XXXXXXXX' is always `sv. 60000000' (I think you guess why), this will be addressed at the later stage.21:47
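
A minimal C sketch of the split described above, assuming the input is a single already-trimmed mnemonic line. The function name split_svp64 and its buffer-based interface are hypothetical and do not come from the binutils branch; the prefix operand is hard-wired to the 60000000 placeholder mentioned in the chat.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Split "sv.ANYTHING ARGS" into "sv. 60000000" and "ANYTHING ARGS".
 * Returns 1 when the line carried an SVP64 mnemonic, 0 otherwise (the
 * output buffers are left untouched in that case). Illustration only. */
int split_svp64(const char *line, char *prefix, size_t prefix_len,
                char *insn, size_t insn_len)
{
    assert(line != NULL);

    /* Need "sv." followed directly by a mnemonic, not by a space or EOL. */
    if (strncmp(line, "sv.", 3) != 0 || line[3] == '\0' || line[3] == ' ')
        return 0;

    /* Placeholder operand, as in the chat; the real value comes later. */
    snprintf(prefix, prefix_len, "sv. %08x", 0x60000000u);
    snprintf(insn, insn_len, "%s", line + 3);
    return 1;
}
```

For example, "sv.add 1,2,3" yields the pair "sv. 60000000" and "add 1,2,3", while a plain "add 1,2,3" is reported as not SVP64.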
ghostmansdHowever, at this stage I have some questions, and would like to ask you to take a look at some binutils structures.21:48
ghostmansdHere are some links covering the stuff I'm currently dealing with21:52
ghostmansdsv. magic opcode:;a=blob;f=opcodes/ppc-opc.c;h=c0f1721d4275c3444e2f7438dfb8e6c99e56fe9f;hb=refs/heads/svp64#l1132821:53
ghostmansdSVP64RM operand:;a=blob;f=opcodes/ppc-opc.c;h=c0f1721d4275c3444e2f7438dfb8e6c99e56fe9f;hb=refs/heads/svp64#l384421:53
ghostmansdConsidering "sv." magic opcode...21:53
ghostmansdthe `name' field is obvious21:54
ghostmansdnext follows the insn mask, which, as I understand it, is 0x2BF, because this corresponds to all bits we consider non-operand-based. That is, svp64_prefix.major (bits [0..5]) and (bits [7,9]).21:59
ghostmansdnext we have SVP64, which is simply an alias to PPC_OPCODE_SVP64, meaning that this opcode makes sense only for SVP6422:00
ghostmansdthe next field might be of interest if we decide to deprecate something on the later stages...22:01
ghostmansdand the last field, {SVP64RM}, is the only operand we have22:02
ghostmansdAnd this is, actually, the place I'm most interested in.22:02
ghostmansdFirst, I'm not sure it's OK to take some arbitrary position in this table. It seems that the mapping is arbitrary, though: no code outside of that file refers to these indices in a special way, these are real indices.22:04
ghostmansdSo, if this is OK, let's proceed to the declaration itself...22:04
ghostmansdThe first is mask of bits we take from operand -- I deliberately set all 32 bits, but I think this should be 24 bits (as in svp64_prefix.rm).22:05
ghostmansdI'm not sure what the second one should designate; for us it likely should be 0.22:06
ghostmansdFor the next two fields, they're function pointers, I still have to look at these. It'd be great for you to check these as well.22:07
ghostmansdAnd the last field is empty, since the argument should be considered a simple expression w/o any special treatment (no % skip, as for registers, and so on).22:08
ghostmansdSo, now the questions.22:08
ghostmansd1. Is the overall understanding correct?22:08
ghostmansd2. Do you have ideas and tips on insert/extract function pointers? I guess this is exactly the place where we need to construct the actual insn.22:09
lkclghostmansd, apologies was afk, can i take a look tomorrow?22:09
ghostmansd3. Any other ideas or tips?22:10
ghostmansdSure, that's just a summary whilst I remember all this :-)22:10
ghostmansdNo need to hurry, take your time, that's mostly for me not to forget.22:10
ghostmansdAll in all, it seems the approach we agreed on might work, but I need you to take a look at it and confirm my understanding.22:11
ghostmansd> but I think this should be 24 bits22:13
ghostmansddone, switched mask to 0xFFFFFF from 0xFFFFFFFF22:13
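
A minimal C sketch of what an insert/extract pair for the 24-bit RM operand could look like. It assumes the prefix layout hinted at above: major opcode 1 in bits 0..5, bits 7 and 9 set, and RM spread over bit 6, bit 8 and bits 10..31, all in MSB0 numbering. That layout, the constants and the function names are assumptions made for illustration; the functions deliberately do not use the binutils function-pointer signature.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed part of the prefix word (MSB0 bits 0..5, 7 and 9). */
#define SVP64_FIXED_MASK 0xFD400000u
#define SVP64_FIXED_BITS 0x05400000u  /* major opcode 1, bits 7 and 9 set */

/* Scatter a 24-bit RM value into the assumed prefix layout. */
uint32_t svp64_prefix_insert(uint32_t rm)
{
    rm &= 0xFFFFFFu;                          /* 24-bit operand mask */
    return SVP64_FIXED_BITS
         | (((rm >> 23) & 1u) << 25)          /* RM[0]     -> MSB0 bit 6 */
         | (((rm >> 22) & 1u) << 23)          /* RM[1]     -> MSB0 bit 8 */
         | (rm & 0x3FFFFFu);                  /* RM[2..23] -> bits 10..31 */
}

/* Gather the 24-bit RM value back out of a prefix word. */
uint32_t svp64_prefix_extract(uint32_t word)
{
    assert((word & SVP64_FIXED_MASK) == SVP64_FIXED_BITS);
    return (((word >> 25) & 1u) << 23)
         | (((word >> 23) & 1u) << 22)
         | (word & 0x3FFFFFu);
}
```

With these assumptions an RM of 0 gives the word 0x05400000 and an RM of 0xFFFFFF gives 0x07FFFFFF, and extract undoes insert for any 24-bit value.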
ghostmansdOK, the main task -- writing an annoying and bothering summary which pollutes the chat -- is done, see you later :-)22:13
