Sunday, 2023-05-21

*** openpowerbot_ <openpowerbot_!> has quit IRC00:14
*** openpowerbot_ <openpowerbot_!> has joined #libre-soc01:19
programmerjakelkcl: if you're talking about the fcvtfg unit tests, those are marked @skip_case since I know they're currently broken and i'm fixing them soon:;a=blob;f=src/openpower/test/fmv_fcvt/;h=9e8914cce83cec67142a1cbba500f9162afaee9b;hb=371d91b299c0e4bd7b23e660b9936ed40debb824#l43909:39
programmerjakeso of course they pass09:39
lkclprogrammerjake, good enough :) it's the right strategy (i should have been using it with the dd-ff-ldst tests, sigh)09:41
programmerjakethe fcvttgo. tests are not skipped however09:41
programmerjakeso it's good they pass!09:41
lkclawesome :)09:41
programmerjakethough i'm not testing how they behave when fp exception traps are enabled since those aren't implemented09:42
programmerjakefun...testing all the corner cases nearly no one ever uses...09:44
programmerjakesince enabling underflow traps changes how it decides when underflow occurred09:45
programmerjakethat's all specified by ieee 75409:46
lkclfrickin ellfire09:48
lkcl"enabling" traps is easy09:48
lkclwhatever conditions are required - and this has to go in around where handle_carry() and handle_overflow() are called - you just call the self.TRAP() function with the appropriate memory location09:49
lkcli believe it's 0x80009:49
lkcloh - no - call the call_trap() function09:50
programmerjakecan you add that? should be pretty easy, just trap if the enable bits in msr are set and FPSCR.VEX is set iirc, it's described in the spec09:51
lkclso if writing to overflow is supposed to come first, the call to a function called "self.handle_fp_exception()" should come *after* that09:51
lkclok i can take a look09:52
lkclthe bit that will be slightly irkesome (but can be added/tested/confirmed later) will be data-dependent fail-first09:53
programmerjakesee v3.1b book iii 7.5.909:56
lkclalmost useless.09:57
programmerjake?? you mean the referenced section?09:58
programmerjakewhat is almost useless?09:59
lkclfer goodness sake.  it's 0x70009:59
lkcli have no idea what i'm doing here.10:00
lkclok - bit 43.10:01
programmerjakebasically just evaluate (MSR.FE0 | MSR.FE1) & FPSCR.FEX after each insn and if it's 1 then the instruction causes a trap10:01
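A minimal sketch of the check described in the message above, assuming the three bits have already been read out of MSR and FPSCR. The `FPTrapInputs` container and the `fp_trap_needed` name are hypothetical and are not part of the simulator.

```python
from dataclasses import dataclass


@dataclass
class FPTrapInputs:
    """Hypothetical container for the three bits the check needs."""
    fe0: int  # MSR.FE0
    fe1: int  # MSR.FE1
    fex: int  # FPSCR.FEX


def fp_trap_needed(bits: FPTrapInputs) -> bool:
    # (MSR.FE0 | MSR.FE1) & FPSCR.FEX, evaluated after each instruction
    return bool((bits.fe0 | bits.fe1) & bits.fex)
```

With both FE bits clear the function returns False even when FEX is set, which matches the "traps disabled" case.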
lkcluh-huhn.... and what goes into MSR?10:02
lkcland is it ok to write to any registers?10:02
lkclif so, which ones?10:02
programmerjakemsr -- isn't that just a kernel-accessible register?10:02
lkclit's the most absolutely critical register of the entire Power ISA10:03
programmerjakethe fp pseudo-code will suppress writes if needed10:03
lkclcontrolling in a god-like fashion what happens underneath10:03
lkclit's a peer of PC10:03
programmerjakebut what i mean is msr should be easy to read in the simulator...10:03
lkclso if you set the wrong value it's as if you for example set PC equal to 0xfffffffffff10:03
lkclyes - self.msr10:04
programmerjakeyou don't need to write msr (unless trapping does that)10:04
lkclcorrect - trapping does precisely and exactly that10:04
lkclif you fail to write the correct values into MSR which is the *very definition* of what a trap actually is,10:04
lkclyou are utterly screwed.10:05
lkclsoftware will jump to (say) 0x700 then start executing code that checks the contents of MSR bits and route through to entirely the wrong code10:05
lkcli encountered this multiple times when running both microwatt unit tests and the linux kernel10:05
lkcl(last time was unaligned memory exceptions)10:06
lkcl(before that it was the setting of PR bit and associated bits)10:06
programmerjakewell, for fp unit tests at least, they don't care what msr is since they end right when the trap is taken or not10:06
lkclyou *have* to care10:07
lkclchecking in the unit test will be absolutely essential.10:07
lkclthe "ExpectedState" absolutely must have the expected value of MSR.10:07
lkcl(there's a way to specify in the unit test "if PC=={specific_value} then stop executing"10:07
lkcland that can be set (in this case) to 0x70010:08
programmerjakeso if the trap gives the wrong msr value, that's still fine for the fp unit tests at least. all the other unit tests that specifically test did it trap correctly care, the fp unit tests just care if it trapped or not10:08
lkclthe unit tests are *absolutely 100%* required to "care"10:08
lkclno, it is not ok. really.10:08
lkcllook at the text on "SRR0"10:09
lkcland the page after on "SRR1".10:09
lkcl43 Set to 1 for a Floating-Point Enabled Exception type Program interrupt; otherwise set to 0.10:09
lkclso when specifying what ExpectedState to check for, the unit test will be *required* to set bit 43 to 110:10
lkclthat cannot be ignored!10:10
lkclthe only reason we get away with "ignoring" it at the moment is because we run with all ExpectedState set to a fixed pre-defined value10:11
lkclthat juuuust happens - by a not-so-coincidence - to be in virtually 100% of all unit tests "the right value"10:11
programmerjakemaybe expectedstate needs a fp_trap method that updates msr to have all the right values so all the tests don't have to replicate that10:12
lkcli'd prefer if they were stand-alone explicitly spelled out in each unit test (deliberately)10:13
lkclbut if a unit test *class* turned out to have such common values then calling a function that receives an ExpectedState as a parameter i don't have a problem with10:14
programmerjakeok, though sounds like a pain to get right in every test, since there're no tests to copy from yet10:15
lkclMSR values are different based on different traps. with unit tests being localised within classes, the amortisation of all of those into an fp_trap function would result in making it difficult to review10:15
lkclexactly!! :)10:15
lkclhence why i was grumbling, above!10:15
lkclthis is going to need microwatt source code examination.10:16
lkcland another PIb bit10:16
lkclahh there already exists one!10:16
lkclwho the heck added that? probably me :)10:17
lkclahh it's not MSR that changes, it's the contents of SRR1.10:18
lkcl        self.spr['SRR1'][trap_bit] = 1  # change *copy* of MSR in SRR110:18
lkclso by calling "self.call_trap(0x700, PIb.FP)" that _should_ be sufficient10:19
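A rough sketch of what the discussion above implies a Program interrupt for a Floating-Point Enabled Exception does to the saved state, assuming Power ISA MSB0 bit numbering on 64-bit registers. It is deliberately simplified: it copies all of MSR into SRR1 and does not model the new MSR value, both of which the real specification defines in more detail. The function names and the returned dict are hypothetical and are not the simulator's `call_trap()`.

```python
FP_ENABLED_BIT = 43  # SRR1 bit quoted from the spec in the chat above


def msb0_mask(bit: int, width: int = 64) -> int:
    """Mask for a bit in MSB0 numbering (bit 0 is the most significant)."""
    return 1 << (width - 1 - bit)


def take_program_interrupt(pc: int, msr: int, trap_bit: int,
                           vector: int = 0x700) -> dict:
    srr0 = pc                          # where to resume from
    srr1 = msr | msb0_mask(trap_bit)   # *copy* of MSR with the cause bit set
    return {"SRR0": srr0, "SRR1": srr1, "PC": vector}


state = take_program_interrupt(pc=0x1000, msr=0, trap_bit=FP_ENABLED_BIT)
```

A unit test written in the spirit of the discussion would compare `state["SRR1"]` and `state["PC"]` against explicitly spelled-out expected values rather than only checking that a trap happened.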
programmerjakefeel free to just test `nop` with the fpscr bits set to cause a trap on entry, seems like the easiest way to unit test the fp trap mechanism10:19
lkclapologies it's SRR1 that should be passed in to an ExpectedState10:19
lkclbut msr should also be - just not to the value i was anticipating10:19
programmerjakesince none of the write-to-fpscr instructions have pseudo-code so are likely to be implemented10:20
lkclSRR1[47] can be set to 1 only if the exception is a Floating-Point Enabled Exception and either MSR[FE0 FE1] = 0b01 or 0b10 or MSR[FE0 FE1] has just been changed from 0b00 to a nonzero value. (SRR1[47] is always set to 1 in the last case.)10:20
lkclfer frickin frick's sake10:20
programmerjakenote FE0 FE1 == 0b01 or 0b10 means fp traps may be imprecise, seems easiest to always have precise traps in the simulator10:22
lkclright up to the point where we find that third party programs assume otherwise, yes10:22
lkclok need to get up10:22
programmerjakeif they assume otherwise they're broken since cpus may cause those fp traps to occur at any point before the next fp context synchronizing instruction, including immediately when FPSCR.FEX becomes 110:24
programmerjake(icr if fp context synchronizing insn is the right phrase, but you get the idea)10:25
programmerjakethanks for doing this!10:26
programmerjakenote fp underflow can't actually occur for fcvt*/fmv*, so the main difference I'd need to test is that the destination register overwrite didn't occur, tho now that i think about it, preventing writing to FRT/RT is only necessary when FRT/RT is the same register as the source, which is impossible for fmv/fcvt since they write to a different reg file than they read from10:29
programmerjakeso maybe i can remove the preventing reg write from the pseudocode, which should simplify it a bunch10:30
programmerjakepreventing reg write is still necessary for other fp ops tho, e.g. fsinpi f3, f310:31
lkclactual write to the regfile does not take place unless you explicitly call GPR(nn) <- xx10:41
lkclthe contents of what is *requested* to be written to goes into some local variables within the function that are *returned* by the pseudocode function10:42
lkcland they're *not* stored until do_outregs_nia() is called10:43
lkclwhich goes over them carefully10:43
lkclwhat you do have to watch out for is writing to FPSCR, CR0 etc.10:43
lkclif those are *not supposed to be written to* then it will become necessary to add them to the list of "objects-written-to-that-must-become-return-results"10:44
programmerjakeyeah, i expect the pseudocode to fail with NameError at the autogenerated return, needs fixing10:44
lkcland that's handled by adding them to write_regs10:45
programmerjakeFPSCR and cr1 are specified to be written to10:45
lkclno they're not10:45
programmerjakejust not frt, since that could overlap with the input10:45
lkcl        if name in ['overflow', 'CR0']:10:45
lkcl            self.write_regs.add(name)10:45
lkcllikewise here10:46
lkcl            if name and name in self.gprs:10:46
lkcl                self.write_regs.add(name)  # add to list of regs to write10:46
programmerjakei'm talking about the PowerISA spec, not whatever possibly incorrect thing our simulator currently does10:46
lkcl"specified to be written to" ok --> "must be added to write_regs on-demand"10:47
lkclso are there any circumstances where the pseudocode writes *directly* into CR1?10:47
programmerjakethe fp ops *conditionally* write their dest regs, afaict the simulator doesn't support that yet10:47
programmerjakemaybe fcmp[u/o]?10:48
lkclthat can be "handled" by making sure that the copy of the reg is identical going in10:48
lkclok then CR1 definitely needs adding as a peer of overflow and CR010:48
programmerjakebut that doesn't need conditional writing since writing cr1 can't overwrite any inputs10:48
programmerjakei already did10:49
lkclno, you didn't add it in parser.py10:49
lkcl<lkcl>         if name in ['overflow', 'CR0']:10:49
lkcl<lkcl>             self.write_regs.add(name)10:49
lkcltherefore it will be a local-only variable10:49
lkclthe contents will *never* be returned10:50
programmerjakeah, ok. i added it in caller since i was adding Rc=1 CR1 support anyway10:51
lkcllook at decoder/isa/comparedfixed.py10:51
lkclwait... no.. hang on...10:51
programmerjakei didn't check fcmp[u/o]10:51
lkclah - and/or prefix_codes.py10:51
lkcldecoder/isa/        return (RT, CR0,)10:52
lkcldecoder/isa/                uninit_regs=OrderedSet(), write_regs=OrderedSet(['CR0', 'RT']),10:52
programmerjakeprefix_codes should never change CR110:52
lkcli am explaining in the context of CR010:52
lkclwhich is what we have "working"10:52
lkclso has to encounter CR110:53
lkcl(being written to - not being read)10:53
programmerjakeyeah, i'm already mostly aware how cr0 works in at least.10:53
lkclit has to go, "oh, this is being written to, let me add it to the write-regs set"10:53
lkclthen that ends up with...10:53
lkclso exactly the same thing is needed for CR1.10:53
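The mechanism being described can be sketched as a stand-alone toy: while walking the names a piece of pseudocode assigns to, anything that is a GPR or one of a few special names is recorded as a return result, and everything else stays a local variable. This is not the real parser.py; `collect_write_regs` and its parameters are hypothetical, and including `CR1` in the special list is the change proposed in the chat, not existing behaviour.

```python
def collect_write_regs(assigned_names, gprs,
                       special=("overflow", "CR0", "CR1")):
    """Return the assigned names that must become return results."""
    write_regs = []
    for name in assigned_names:
        is_output = name in special or name in gprs
        if is_output and name not in write_regs:
            write_regs.append(name)  # keep first-seen order, no duplicates
    return write_regs


outputs = collect_write_regs(["RT", "tmp", "CR1", "RT"], gprs={"RT", "RA"})
```

Here `tmp` is dropped because it is neither a register nor a special name, which is the "local-only variable" case lkcl warns about for a CR1 that was never added to the list.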
programmerjakeok, we can do that when we need it, so not now10:54
lkcland if FPSCR is needed to be treated the same way (a return result that must not go into the regfile and "damage" it if an exception occurred)10:54
lkclthen it goes into that list too10:54
programmerjakefpscr does not need to have writes skipped10:55
lkclit just depends on whether the *pseudocode* should be writing FPSCR or whether it should be done as a post-analysis phase explicitly in python code.10:55
lkcli really would prefer that not to be the case10:55
programmerjakebesides fpscr skips all the read/write reg machinery since it's always accessed through self10:55
lkclthat doesn't mean doing so is the right thing to do10:56
lkclit just means that a completely new mechanism would be needed10:56
lkclan entirely new paradigm for "coping" with register contents not being destroyed10:56
lkcland that would be: take a copy of FPSCR before it is destroyed10:56
lkcland have the TRAP restore it10:56
programmerjakebut no restore is necessary10:57
lkclbut that's only if the specification requires that exceptions *not* allow FPSCR to be "damaged"10:57
programmerjakeit isn't destroyed10:57
lkclif however it *does* require being written to then that's fine10:57
lkcl"being written to such that when the trap occurs the trap can read that written FPSCR value"10:58
programmerjakeexceptions happen after pseudocode changes fpscr to the state needed by the exception10:58
lkclit's a communications mechanism10:58
lkclok great - then writing directly to it sounds perfectly fine10:59
lkclwhich is good because it's a pain in the ass :)10:59
lkclputting in explicit differences of behaviour10:59
lkclbut CR1 you will have to watch out for11:00
programmerjakeonly issues: all our proposed fp pseudocode doesn't handle exceptions (except fmv/fcvt and probably fminmax)11:00
programmerjakeand -- svp64 ddff says failed insns don't write outputs, does that include fpscr?11:01
programmerjakesometimes don't write some outputs*11:01
lkclthat's where it becomes Hell11:02
lkclit _shouldn't_ be allowed because when VLi=0 (exclusive) the instruction should never have been allowed to occur11:02
programmerjakei submitted
lkclas in: it should have a "Shadow-Hold" across it, the cancellation pulled, and the instruction treated as if it never even existed in the first place11:03
programmerjakewe need to add all libre-soc fp ops to the see also, unless they never cause fp exceptions11:03
lkclcorrect, dd-ff with vli=0 prohibits "failed" instructions from even existing11:04
lkclthey may be *attempted* but under no circumstances permitted to actually reach *any* regfiles11:04
lkcl(or memory contents)11:04
lkclthey're speculatively-executed (with Shadow-Hold) in other words11:05
programmerjakeok, sounds good then! any includes fpscr11:05
lkclcorrect. FPSCR is "a register"11:05
lkcltherefore it is not permitted to have been written11:05
lkclnor CRs11:05
lkcl*anything* beyond the failed Shadow-Hold point is PROHIBITED11:05
lkclit may be necessary to do that god-awful hack of taking a copy of FPSCR in order to cope11:06
lkclthe alternative is, you guessed it, adding FPSCR to write_regs11:06
lkclwhich will need particular care because analysing the list of return results involves length-checking for routing through to different behaviours11:07
lkcl"if len(results) == 1" in places11:07
programmerjakeit's 3am here, can you add the fp fft ops and other fp libre-soc ops to #1087 thx!11:07
programmerjakegtg to sleep11:08
programmerjakejust see also or blocking as appropriate11:08
lkclwill take a look11:11
programmerjakethx, ttyl11:11
lkclokaaay now need to run hundreds of unit tests - would be better to have a "long" and "short" test set12:18
lkclwith the short one being a few select examples, well-documented.12:19
lkclnooow FPSCR can be handled correctly, prohibiting it from write-back on VLi=012:22
lkclbut otherwise permitted to write12:22
lkclif not modified then the exact same value that was passed *in* will *happen* to be the exact same value *written* into the local-variable called "FPSCR"12:23
lkcl    def op_fcvtfg(self, RB, FPSCR):12:23
lkcl            FPSCR.FR = copy_assign_rhs(self.inc_flag)12:23
lkcl        return (FRT, FPSCR,)12:23
lkclok they seem a little short (for the quantity) but i *think* the unit tests have run12:27
* lkcl re-running everything after that change to and, important to ensure no "damage" occurs12:34
*** jn <jn!~quassel@user/jn/x-3390946> has quit IRC13:34
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has quit IRC13:48
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc13:49
lkclahh, @inject works on helper functions14:43
lkcl(because ISACaller derives from ISACallerHelper)14:43
lkclaaaaa this is hairy15:00
lkcli'm in the middle of fixing adding helper @inject at the same time as doing LD/ST reordering, aaaaaa15:01
lkclok good15:01
lkclbut... holy cow it needs a major hack in pyfnwriter15:01
lkclholy cow big set of changes/simplification of LD/ST15:21
lkclghostmansd[m], that's the LD/ST modes updated to remove Saturation15:42
lkclImm is now remarkably similar (near-identical) to Idx15:42
lkcland they are both a lot simpler15:43
lkclmuch more like a sane ISA - one bit per "thing"15:43
*** jn <jn!> has joined #libre-soc16:04
*** jn <jn!> has quit IRC16:04
*** jn <jn!~quassel@user/jn/x-3390946> has joined #libre-soc16:04
*** jn_ <jn_!> has joined #libre-soc16:09
*** jn_ <jn_!> has quit IRC16:09
*** jn_ <jn_!~quassel@user/jn/x-3390946> has joined #libre-soc16:09
*** jn <jn!~quassel@user/jn/x-3390946> has quit IRC16:09
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has quit IRC16:31
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has joined #libre-soc16:31
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has quit IRC16:41
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has joined #libre-soc16:41
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has quit IRC16:48
*** ghostmansd[m] <ghostmansd[m]!> has joined #libre-soc16:49
*** ghostmansd[m] <ghostmansd[m]!> has quit IRC16:55
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has joined #libre-soc16:56
lkclaaaa /pi is needed in LDST-Indexed17:19
*** Guest34 <Guest34!~Guest34@> has joined #libre-soc18:02
*** Guest34 <Guest34!~Guest34@> has left #libre-soc18:02
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has quit IRC18:14
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has joined #libre-soc18:15
*** sadoon[m]1 <sadoon[m]1!~sadoonalb@2001:470:69fc:105::3:5f7c> has joined #libre-soc18:16
sadoon[m]1Hi guys18:17
sadoon[m]1I truly hope you can read this because setting up a matrix server is such a pain in the ass.18:17
sadoon[m]1If you do, please do let me know so I can stop banging my head xD18:21
jn_sadoon[m]1: hi, message received on the IRC side18:27
jn_so your matrix setup probably works :)18:27
sadoon[m]1Awesome, thanks!18:27
sadoon[m]1I've had an awful day dealing with this18:28
sadoon[m]1Federation is still not properly working even though their tester reports all is fine but to hell with that18:29
sadoon[m]1I moved to new domains and moved my server to POWER, currently in vms on my talos in preparation to move to the tyan later18:29
sadoon[m]1My new email is sadoon at albader dot co18:30
sadoon[m]1The old one still works for now though, might keep it for a year or two18:30
programmerjakelkcl: please confirm you've read and understand -- it is critically important19:39
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@> has quit IRC19:45
*** ghostmansd[m] <ghostmansd[m]!> has joined #libre-soc19:46
programmerjakesadoon you'll need to change your bugzilla email address and mailing list subscriptions19:55
programmerjakeif you want, you can do what lkcl does and have both email addresses subscribed to the mailing list but just disable mail delivery for one of them. this allows you to send mail from both addresses but only receive one copy of mail.20:01
programmerjakethough if you will lose control of your old email address it's important to unsubscribe it to help prevent someone else registering it and pretending to be you20:03
programmerjake(though it doesn't help much since resubscribing is relatively easy)20:03
programmerjakechanging bugzilla is much more important tho so someone can't just take your account over by getting your old email address20:05
programmerjake(i'd guess you already know all of that tho)20:07
*** midnight <midnight!~midnight@user/midnight> has quit IRC20:10
lkclprogrammerjake, i'm really sorry, i can literally see no difference between the two pieces of pseudocode20:50
*** octavius <octavius!> has joined #libre-soc20:55
octaviuslkcl, programmerjake, I apologise for being ignorant of the pseudo-code notation, but are these20:56
octavius   VSR[32xTX+T].dword[1] <- result20:56
octavius   VSR[32xTX+T].dword[2] <- 0x0000_0000_0000_000020:56
octaviusequivalent to:20:56
octavius   RT <- result20:56
octavius#1087 comment #1120:57
lkcleffectively, they achieve the same thing.  one targets VSR the other targets RT.20:57
lkclthat's "no effective difference"20:58
lkcl(it's not the focus - i pointed out that VSR/RT is not the issue - everything else is identical and that's what i am querying: there *is* no visible difference other than that one thing which is not relevant)21:00
lkcloctavius, extraoordinary paaatience needed when doing modifications to the devscripts.  the nix guys took literally over 3 months, i spent weeks on coriolis2 - it's mad.21:04
octaviusNo problem. Now that you've explained the '' dependencies a few times, I at least know what to do :)21:04
lkcljust take the buggers out, they cause so many problems.21:05
lkclif someone doesn't install the dependencies before-hand then they should have installed them manually and/or used the devscripts21:06
programmerjakelkcl: explained what needs changing in
lkclok brilliant, let me take a look21:12
lkclrright.  ok. so it's the fact that the pseudocode is *not* like how it should be that is the problem?21:13
lkclthat still does not in turn say that the *simulator* requires anything different.21:13
lkclthe pattern following FPSCR.FEX (if/else) i am still not seeing as requiring any change to the simulator?21:14
programmerjakei didn't say the simulator must change, i said our pseudocode must change. the simulator may need changes (tbd)21:17
lkclarrrgh ok21:17
lkcli thought you were referring to the existing pseudocode and therefore concluded that you must have meant, "the simulator must compensate"21:18
lkclhow irkesome a misunderstanding.21:18
programmerjakewell, if the simulator incorrectly interprets existing pseudocode (e.g. appendix a) then it needs change too21:19
*** sauce <sauce!> has quit IRC21:46
*** octavius <octavius!> has quit IRC21:56
*** sauce <sauce!> has joined #libre-soc22:35
