nick | message | time |
---|---|---|
programmerjake | got pcdec. to work! will commit after all tests pass | 01:15 |
lkcl | ahh brilliant | 01:18 |
programmerjake | all tests pass! thx for fixing all the other tests :) | 01:22 |
lkcl | very cool instruction/idea | 01:29 |
programmerjake | thx! | 01:29 |
lkcl | bits-unused feeds back. worked that out | 01:29 |
lkcl | what happens if there are more bytes being produced than there are bits in input? | 01:30 |
lkcl | that's a different type of overflow condition from "there are greater than 6-bit-encodings", right? | 01:30 |
programmerjake | that's impossible, it will notice there's no input bits so it'll just stop there, leaving the rest of the bytes in RT set to 0 (means unused) | 01:30 |
lkcl | RT=0 is not "the output was all zero-bytes"? | 01:31 |
programmerjake | RT=0 is "there were no output bytes" since output bytes can't be zero. | 01:32 |
lkcl | ngggh... these aren't actual "data bytes", they're indices-into-the-table-of-output, aren't they? | 01:33 |
lkcl | (something like that) | 01:33 |
programmerjake | they're decoded symbols packed into 1 per byte, symbols can be 1-bit to as-many-as-you-please. pcdec. only handles symbols up to 5-bits (not 6-bits, that was an error) | 01:34 |
programmerjake | symbols are encoded by taking the symbol's bits, packing them from LSB to MSB, then adding a 1-bit, then filling the rest of the byte with zeros | 01:35 |
programmerjake | encoded into bytes in RT ^ | 01:35 |
programmerjake | so they're kinda indices into the table of all possible outputs, but rearranged to just be LSB0 indices into the corresponding 1-bit in tree | 01:36 |
lkcl | really want to see what happens on sv.pcdec. | 01:38 |
lkcl | and if /ff mode helps | 01:38 |
programmerjake | ff mode is the only way it can be vectorized at all... | 01:39 |
lkcl | jooy | 01:39 |
lkcl | /vli mode btw is only possible with testing *just* "eq" bit (ne/eq the only two options) | 01:40 |
programmerjake | icr what vli stands for... | 01:40 |
lkcl | truncates VL *inclusive* | 01:40 |
lkcl | if the test fails VL is truncated normally to *exclude* the failing element | 01:41 |
programmerjake | yeah, we'll want to include the last element | 01:41 |
lkcl | then you can't use /ff=lt or /ff=so | 01:42 |
lkcl | it has to be /ff=RC1/vli which i'll write tomorrow | 01:42 |
programmerjake | I'll move the "output-empty" to lt, and have eq be "we hit any stopping condition" | 01:42 |
lkcl | ack | 01:42 |
programmerjake | next week, when I'm working on it again | 01:43 |
lkcl | also i didn't foresee having this applied to "only-Rc=1" instructions | 01:43 |
lkcl | RC1 mode was only originally designed for instructions that don't have an Rc=1 mode | 01:43 |
lkcl | i'll have to do an XOR of the hard-coded-Rc=1 (csv rc column == 'ONE') /ff=RC1 flag | 01:44 |
lkcl | urrr | 01:44 |
programmerjake | uuh, couldn't it just be sv.pcdec./vli/ff=eq? | 01:44 |
lkcl | nnope. | 01:45 |
lkcl | it's a tri-mode not a dual-mode | 01:45 |
lkcl | (one flag==1 enables/activates another flag) | 01:46 |
lkcl | (i.e. the 2nd flag is *ignored* if the 1st flag == 0) | 01:46 |
programmerjake | uuh, RC1=1 can't be used, since the spec says the results are never stored, only the CR outputs...the whole point of pcdec. is the RT output, without it it's mostly useless | 01:48 |
programmerjake | either that or the spec is unclear | 01:48 |
programmerjake | > Note that when RC1=1 the result elements are never stored, only the CR Fields. | 01:49 |
programmerjake | https://libre-soc.org/openpower/sv/normal/#index5h1 | 01:49 |
programmerjake | imho the spec should be changed to always write outputs for each element up to and including the first one that fails the data-dependent fail-first test, only elements after that one are not executed. VL being set to exclude the failing element should happen after. | 01:51 |
programmerjake | that way, an element is always fully-executed or not executed. not partially-only-writes-CR-executed | 01:52 |
programmerjake | all outputs, RT, CR, OV, etc. | 01:53 |
*** zemaye__ <zemaye__!~zemaye@172.58.107.28> has joined #libre-soc | 03:31 | |
*** zemaye_ <zemaye_!~zemaye@31-209-215-224.dsl.dynamic.simnet.is> has quit IRC | 03:34 | |
*** zemaye_ <zemaye_!~zemaye@31-209-215-224.dsl.dynamic.simnet.is> has joined #libre-soc | 03:52 | |
*** zemaye__ <zemaye__!~zemaye@172.58.107.28> has quit IRC | 03:54 | |
*** lxo <lxo!~lxo@linux-libre.fsfla.org> has joined #libre-soc | 07:18 | |
lkcl | that's in predicate-result mode | 09:25 |
lkcl | or, it's supposed to be... | 09:25 |
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc | 09:43 | |
*** lxo <lxo!~lxo@linux-libre.fsfla.org> has quit IRC | 09:51 | |
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC | 10:00 | |
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc | 10:00 | |
ghostmansd | lkcl, I believe this patch with inversion is wrong, because it makes the whole thing extremely inconsistent. | 10:28 |
ghostmansd | Either you should invert it in predicates, too, or keep it as is. | 10:28 |
ghostmansd | `/ff=nl` seems to give different results than `/m=nl` | 10:29 |
lkcl | ghostmansd, nggh, yyeah.... it means moving things in the spec though | 11:16 |
lkcl | and in hardware, the position of wires is not actually important | 11:16 |
ghostmansd[m] | I don't understand why not keep it the way it was | 11:16 |
ghostmansd[m] | Symmetrical and evident | 11:17 |
lkcl | because it was not according to the spec | 11:17 |
ghostmansd[m] | You mean the order of bits? | 11:17 |
lkcl | what is symmetrical and evident in hardware is not the same as symmetrical and evident in software | 11:17 |
lkcl | whether bits are *shared* between the same wires is more important | 11:17 |
ghostmansd[m] | This means that there are _two_ copies of predicates, one swaps the bits and other doesn't | 11:18 |
lkcl | now, if bits 19-23 were *actually* shared with predicate mask bits, that would matter | 11:18 |
lkcl | bit-ordering in hardware is completely meaningless as far as how they are shown in a specification | 11:19 |
lkcl | i know it's very weird. | 11:19 |
ghostmansd[m] | Yes, extremely | 11:19 |
lkcl | they only matter what they are connected to | 11:19 |
ghostmansd[m] | Ok, so I do need to keep two tables of predicates | 11:20 |
ghostmansd[m] | ? | 11:20 |
ghostmansd[m] | I mean binutils | 11:20 |
ghostmansd[m] | Also, I think a more obvious way to show this difference would be to explicitly fill in the table | 11:20 |
ghostmansd[m] | Without swaps | 11:20 |
lkcl | and document it | 11:20 |
lkcl | (one-line-comment) | 11:21 |
ghostmansd[m] | Yes | 11:21 |
ghostmansd[m] | Could you do it please for pysvp64asm, while I handle binutils? | 11:21 |
lkcl | sure | 11:21 |
ghostmansd[m] | Thank you! | 11:21 |
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC | 11:28 | |
*** tplaten <tplaten!~isengaara@55d45723.access.ecotel.net> has joined #libre-soc | 11:41 | |
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc | 11:50 | |
*** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc | 11:51 | |
ghostmansd | cat /tmp/test.s && SILENCELOG=true pysvp64asm /tmp/test.s /tmp/test.py.s && powerpc64le-linux-gnu-as /tmp/test.py.s -o /tmp/test.o && powerpc64le-linux-gnu-objcopy -Obinary /tmp/test.o /tmp/bin.o && pysvp64dis /tmp/bin.o | 11:51 |
ghostmansd | sv.add./ff=nl/m=nl *3,*7,*11 | 11:51 |
ghostmansd | ec 3f 50 07 sv.add./ff=ge/m=ge *r3,*r7,*r11 | 11:51 |
ghostmansd | 15 12 01 7c | 11:51 |
ghostmansd | That's from master | 11:51 |
ghostmansd | Either dis needs some tuning, or asm does the wrong thing | 11:51 |
ghostmansd | lkcl ^ | 11:51 |
ghostmansd | also, the comment "# decodes "Mode" in similar way to BO field (supposed to, anyway)" is somewhat misleading :-) | 11:51 |
ghostmansd | except for "supposed to" part, perhaps | 11:51 |
ghostmansd | Ah wait, I got it. ge is alias to nl. | 11:52 |
*** tplaten <tplaten!~isengaara@55d45723.access.ecotel.net> has quit IRC | 12:01 | |
ghostmansd | lkcl, so, just to put all dots above all i's: we use the same predicates for ff/pr as for m/dm/sm, but swap the byte order for mask. Is it correct? | 12:13 |
ghostmansd | Tried doing this, but it didn't work. Basically all ff/pr handling is broken. | 12:44 |
*** octavius <octavius!~octavius@227.147.93.209.dyn.plus.net> has joined #libre-soc | 13:08 | |
ghostmansd | The assembly is what's broken | 13:19 |
ghostmansd | `sv.add./ff=ge/m=ge *r3,*r7,*r11` gives this on binutils branch: ec 3f 50 07 | 13:20 |
ghostmansd | On binutils, I get this: e4 3f 50 07 | 13:20 |
ghostmansd | OK it seems this is mode-related. I suspect that it's again this fricking MSB0 order. | 13:36 |
ghostmansd | Yes it was exactly this. | 13:38 |
ghostmansd | OK updated. | 13:42 |
lkcl | yes, have to be careful to get aliases right | 13:50 |
lkcl | ya got there? :) | 13:51 |
lkcl | rebase done btw (tests passed) | 13:51 |
ghostmansd | Oh cool | 13:53 |
ghostmansd | Luke, can we postpone updating the spec on the fly at least for a while? | 13:53 |
ghostmansd | It's really complex to develop binutils when there're changes in spec or svp64asm. | 13:54 |
ghostmansd | I basically have to track all four: specs, pysvp64asm, pysvp64dis and binutils. | 13:54 |
ghostmansd | And some changes are easy to miss. | 13:55 |
ghostmansd | The addition of RS register to OutSel was one of them. | 13:55 |
ghostmansd | I wouldn't even have noticed it unless I had to re-generate the header and the source for a completely unrelated reason. | 13:55 |
lkcl | ah - yeah that one was partly-cosmetic, partly-not | 13:56 |
ghostmansd | I mean, it's difficult to develop further and keep up to date simultaneously. | 13:56 |
lkcl | the new instruction "pcdec." is an overwrite pair (RT,RS) | 13:56 |
lkcl | understood. | 13:57 |
ghostmansd | Thank you! | 13:58 |
ghostmansd | The assembly part is likely totally outdated for branch modes. I guess for CRs too. | 13:58 |
ghostmansd | I think it might be even outdated for pysvp64asm, the code there is so hairy with tons of variables, that I cannot even keep track of how it correlates to the spec. | 13:59 |
ghostmansd | At least disassembly is sufficiently close to the spec, speaking of how it sets bits. | 13:59 |
ghostmansd | But all these consts.py manipulations in pysvp64asm, all these if/else chains, etc., etc., they should eventually be done simpler too. | 14:00 |
lkcl | CR_ops haven't actually been done at all, it's simply a massive coincidence | 14:00 |
lkcl | yes i'd really like sv/trans/svp64.py to be updated | 14:01 |
ghostmansd | We already have tools for these, selectable int and fields, combined together, they make the code look close to spec. | 14:01 |
lkcl | drat | 14:01 |
lkcl | FAIL: test_13_RC1 (__main__.SVSTATETestCase) [0:sv.add/ff=RC1] | 14:01 |
lkcl | - sv.add/ff=RC1 *3,*7,*11 | 14:01 |
lkcl | ? ^^^ | 14:01 |
lkcl | + sv.add/ff=gt *3,*7,*11 | 14:01 |
ghostmansd | sigh | 14:01 |
ghostmansd | That's test_pysvp64dis? | 14:01 |
lkcl | yes | 14:01 |
ghostmansd | Will check. | 14:01 |
ghostmansd | Yeah reproducible | 14:02 |
ghostmansd | I guess this is part of this inv manipulation. | 14:03 |
lkcl | RC1 set when it should not be... or not set | 14:04 |
ghostmansd | sv.add/ff=RC1 *3,*7,*11 | 14:05 |
ghostmansd | e9 3f 40 05 sv.add/ff=gt *r3,*r7,*r11 | 14:05 |
ghostmansd | 14 12 01 7c | 14:05 |
ghostmansd | e9 3f 40 05 sv.add./ff=gt *r3,*r7,*r11 | 14:05 |
ghostmansd | 15 12 01 7c | 14:05 |
ghostmansd | `sv.add./ff=gt *3,*7,*11` is encoded the same way as `sv.add/ff=RC1 *3,*7,*11` | 14:06 |
ghostmansd | sv.add./ff=gt *3,*7,*11 | 14:06 |
ghostmansd | sv.add/ff=RC1 *3,*7,*11 | 14:06 |
ghostmansd | e9 3f 40 05 sv.add./ff=gt *r3,*r7,*r11 | 14:06 |
ghostmansd | 15 12 01 7c | 14:06 |
ghostmansd | e9 3f 40 05 sv.add/ff=gt *r3,*r7,*r11 | 14:06 |
ghostmansd | 14 12 01 7c | 14:06 |
lkcl | ok yep RC1 is not being reported | 14:06 |
ghostmansd | No it would have been reported | 14:06 |
lkcl | sv.add/ff=RC1/vli 3,7,11 | 14:06 |
lkcl | 0b 00 40 05 sv.add/ff=so r3,r7,r11 | 14:06 |
lkcl | 14 5a 67 7c | 14:06 |
ghostmansd | If it had been encoded properly | 14:06 |
lkcl | errmermermerm... | 14:07 |
ghostmansd | Ah wait | 14:07 |
ghostmansd | Rc seems not to be taken into account | 14:07 |
ghostmansd | 14 12 01 7c vs 15 12 01 7c | 14:07 |
ghostmansd | Though it was before this inv crap | 14:07 |
ghostmansd | (or it silently worked) | 14:07 |
ghostmansd | Hm | 14:08 |
lkcl | Rc | 14:08 |
lkcl | 0 | 14:08 |
lkcl | RM | 14:08 |
lkcl | normal: Rc=1: ffirst CR sel | 14:08 |
lkcl | RM | 14:08 |
lkcl | 000000000000000000001011 | 14:08 |
lkcl | RM.mode | 14:08 |
lkcl | 01011 | 14:08 |
lkcl | 27, 28, 29, 30, 31 | 14:08 |
ghostmansd | e9 3f 40 05 sv.add./ff=gt *r3,*r7,*r11 | 14:08 |
ghostmansd | Rc | 14:08 |
ghostmansd | 1 | 14:08 |
ghostmansd | 63 | 14:08 |
ghostmansd | It's taken into account | 14:09 |
ghostmansd | Or, well, it's recognized | 14:09 |
ghostmansd | normal: Rc=1: ffirst CR sel | 14:09 |
ghostmansd | e9 3f 40 05 sv.add/ff=gt *r3,*r7,*r11 | 14:09 |
ghostmansd | RM | 14:10 |
ghostmansd | normal: Rc=1: ffirst CR sel | 14:10 |
ghostmansd | This is wrong | 14:10 |
lkcl | yehyeh. | 14:10 |
ghostmansd | Looks like after that change you did you forgot to update the tables | 14:10 |
ghostmansd | The lookup is wrong | 14:10 |
ghostmansd | Seems like the most rational idea | 14:10 |
lkcl | ohh yeah | 14:11 |
lkcl | in RM.select. | 14:11 |
ghostmansd | Yep. | 14:11 |
lkcl | Rc=0 | 14:11 |
lkcl | it should be going to.... ffrc0 | 14:11 |
lkcl | let me just put a debug-print... | 14:11 |
ghostmansd | I mean value and mask | 14:12 |
ghostmansd | Rc is fine | 14:12 |
ghostmansd | It's 1 for . and 0 otherwise | 14:12 |
ghostmansd | It's this damned inv change | 14:12 |
lkcl | sv.add/ff=RC1/vli 3,7,11 | 14:12 |
lkcl | match 0b10001 0b110001 ffrc1 | 14:12 |
ghostmansd | Hm | 14:13 |
ghostmansd | So Rc is 1? | 14:13 |
ghostmansd | BTW what's search? | 14:13 |
lkcl | no, Rc=false | 14:13 |
ghostmansd | Hm | 14:13 |
ghostmansd | How is it matched then? | 14:13 |
lkcl | ah did you happen to change how Rc is done? | 14:14 |
lkcl | did you remove an __bool__ function? | 14:14 |
ghostmansd | @cached_property | 14:14 |
ghostmansd | def Rc(self): | 14:14 |
ghostmansd | Rc = self.mdwn.operands["Rc"] | 14:14 |
ghostmansd | if Rc is None: | 14:14 |
ghostmansd | return False | 14:14 |
ghostmansd | return bool(Rc.value) | 14:14 |
ghostmansd | self.mdwn.operands["Rc"] | 14:14 |
ghostmansd | This gets SI or None | 14:14 |
ghostmansd | IIRC SI has __bool__ | 14:14 |
lkcl | urr bizarre | 14:15 |
ghostmansd | 1 sec | 14:15 |
ghostmansd | This gets Operand or None, sorry | 14:15 |
lkcl | ok doing "Rc = 1 if Rc else 0" | 14:15 |
ghostmansd | Still this `return bool(Rc.value)` gets SI | 14:15 |
ghostmansd | I get Rc = True and Rc = False for these two instructions | 14:16 |
lkcl | which still doesn't quite work due to "|" with the other table entries | 14:16 |
lkcl | search = ((int(rm.mode) << 1) | Rc) | 14:16 |
lkcl | always sets LSB of that int to 1 | 14:17 |
ghostmansd | Why, if it's bool? | 14:17 |
ghostmansd | Shouldn't it be converted to int implicitly? | 14:17 |
lkcl | because it's not actually a bool i don't think, you return a SelectableInt() *from* __bool__() is that right? | 14:18 |
lkcl | oh wait | 14:18 |
lkcl | hang on | 14:18 |
lkcl | match 0 0b10001 0b110001 ffrc1 | 14:18 |
lkcl | huhn?? | 14:18 |
ghostmansd | print(type(Rc), Rc, bin(search)) | 14:18 |
ghostmansd | <class 'bool'> True 0b10011 | 14:18 |
ghostmansd | e9 3f 40 05 sv.add./ff=gt *r3,*r7,*r11 | 14:18 |
ghostmansd | 15 12 01 7c | 14:18 |
ghostmansd | <class 'bool'> False 0b10010 | 14:18 |
ghostmansd | e9 3f 40 05 sv.add/ff=gt *r3,*r7,*r11 | 14:18 |
ghostmansd | 14 12 01 7c | 14:18 |
lkcl | print ("match", Rc, bin(value), bin(mask), member) | 14:18 |
lkcl | 1 sec | 14:18 |
lkcl | match 0 0b10110 0b10001 0b110001 ffrc1 | 14:19 |
lkcl | if ((value & search) == (mask & search)): | 14:19 |
lkcl | print ("match", Rc, bin(search), bin(value), bin(mask), | 14:19 |
lkcl | member) | 14:19 |
lkcl | i have that wrong, don't i? | 14:20 |
ghostmansd | first, which instruction do you dump? | 14:20 |
lkcl | sigh | 14:20 |
lkcl | that should be value & mask == search & mask | 14:20 |
ghostmansd | I'd have said value & mask | 14:20 |
ghostmansd | yep | 14:20 |
* lkcl face-palm | 14:20 | |
lkcl | RM | 14:20 |
lkcl | normal: Rc=0: ffirst z/nonz | 14:20 |
lkcl | RM | 14:20 |
lkcl | 000000000000000000001011 | 14:20 |
lkcl | all good :) | 14:20 |
* lkcl whistles | 14:20 | |
lkcl | sv.add/ff=RC1/vli 3,7,11 | 14:21 |
lkcl | match 0 0b10110 0b10000 0b110001 ffrc0 | 14:21 |
lkcl | 0b 00 40 05 sv.add/ff=RC1/vli r3,r7,r11 | 14:21 |
ghostmansd | sv.add./ff=gt *3,*7,*11 | 14:21 |
ghostmansd | sv.add/ff=RC1 *3,*7,*11 | 14:21 |
ghostmansd | e9 3f 40 05 sv.add./ff=gt *r3,*r7,*r11 | 14:21 |
ghostmansd | 15 12 01 7c | 14:21 |
ghostmansd | e9 3f 40 05 sv.add/ff=RC1 *r3,*r7,*r11 | 14:21 |
ghostmansd | 14 12 01 7c | 14:21 |
ghostmansd | This works | 14:21 |
ghostmansd | Ah OK you also did this :-) | 14:21 |
ghostmansd | pushed to binutils | 14:22 |
lkcl | why the hell it suddenly stopped working... | 14:22 |
ghostmansd | ¯\_(ツ)_/¯ | 14:23 |
lkcl | sorry, my mistake to fix. | 14:23 |
ghostmansd | not only yours :-) | 14:23 |
ghostmansd | I also copied it to binutils | 14:23 |
ghostmansd | lol | 14:23 |
lkcl | https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=c125201b5ef4e24cec0f02eb111d6a1b80754773 | 14:23 |
ghostmansd | Two pairs of eyes are better they said | 14:23 |
ghostmansd | lol | 14:23 |
ghostmansd | exactly the same commit | 14:24 |
lkcl | two one-eyed kings... | 14:24 |
ghostmansd | I guess I can rebase safely | 14:24 |
ghostmansd | Hm, I'd have written it differently | 14:25 |
ghostmansd | if ((value & mask) == (search & mask)): | 14:25 |
lkcl | hey knock yourself out | 14:25 |
ghostmansd | done | 14:27 |
ghostmansd | OK back to the binutils | 14:28 |
lkcl | ack | 14:28 |
ghostmansd | ah you know what? | 14:28 |
ghostmansd | if ((subtable->value & match) == (subtable->mask & match)) | 14:28 |
lkcl | yaaa? | 14:28 |
lkcl | haha | 14:28 |
ghostmansd | I did it correctly lol | 14:28 |
lkcl | urrr... | 14:28 |
lkcl | nggh yeah | 14:28 |
ghostmansd | ah no | 14:28 |
ghostmansd | lol | 14:28 |
lkcl | honestly i tend to guess these things | 14:29 |
lkcl | urrr... yeah it should also be (subtable->value & subtable->mask) == (match & subtable->mask) | 14:29 |
lkcl | doh :) | 14:29 |
ghostmansd | yep | 14:30 |
ghostmansd | already done :-) | 14:30 |
ghostmansd | you noticed that I had to occupy 1 bit for Rc? | 14:30 |
ghostmansd | in binutils | 14:30 |
ghostmansd | because they don't store it | 14:30 |
lkcl | intriguing | 14:31 |
ghostmansd | we could check for . in the name (we already decoded it in dis), but I feel it's way too fragile | 14:31 |
ghostmansd | we already have crap like andis. | 14:31 |
ghostmansd | where there's no andis | 14:31 |
ghostmansd | So I thought it'd be better to keep what we have in mdwn | 14:32 |
ghostmansd | uint64_t inv = svp64_insn_get_prefix_rm_ldst_imm_prrc0_inv (&svp64->insn); | 14:32 |
ghostmansd | uint64_t els = svp64_insn_get_prefix_rm_ldst_imm_prrc0_els (&svp64->insn); | 14:33 |
ghostmansd | uint64_t RC1 = svp64_insn_get_prefix_rm_ldst_imm_prrc0_RC1 (&svp64->insn); | 14:33 |
ghostmansd | Feel the power of fields lol | 14:33 |
ghostmansd | what for fuck's sake is SEA? | 14:34 |
ghostmansd | Should it be /sea? | 14:34 |
ghostmansd | I mean, at the point when user calls some ld/st instruction, how he affects it to set SEA? | 14:37 |
ghostmansd | how does he affect* | 14:37 |
lkcl | yehyeh | 14:40 |
lkcl | signed effective address | 14:41 |
lkcl | it's for when you do elwidth overrides to below 64-bit | 14:41 |
lkcl | so you have a register RB which is now only 32-bit | 14:41 |
lkcl | it gets added to RA (64-bit) | 14:41 |
lkcl | do you add 32-bit RB to 64-bit RA as signed or unsigned? | 14:42 |
lkcl | both are useful | 14:42 |
lkcl | hence /sea | 14:42 |
lkcl | and yes it hasn't been added in to sv/trans/svp64.py sigh | 14:42 |
ghostmansd[m] | Ok, I'll add it | 14:50 |
ghostmansd[m] | To all of them | 14:50 |
ghostmansd[m] | It'd be simpler to keep track | 14:50 |
ghostmansd[m] | pysvp64asm, pysvp64dis and binutils | 14:50 |
lkcl | ack leave it with you | 15:10 |
* lkcl going to try doing RC1 in ISACaller | 15:10 | |
lkcl | actually get it working, so programmerjake has something for "sv.pcdec./ff=RC1" | 15:11 |
ghostmansd | lkcl, there's (again!) contradiction between the code and the spec | 15:50 |
ghostmansd | is SEA available in simple ld/st idx mode? | 15:50 |
ghostmansd | From the spec, it is | 15:50 |
ghostmansd | From the code, tables consider this to be part of the mask and assume bit 3 of mode to be 0 | 15:51 |
ghostmansd | I fixed the code, cf. binutils branch. Please update the spec otherwise. | 15:52 |
ghostmansd | cf. LD/ST Indexed here: https://libre-soc.org/openpower/sv/ldst/ | 15:53 |
ghostmansd | I think you simply did copy&paste error from LD/ST Immediate in the code | 15:58 |
ghostmansd | Also, how is setting ld/st idx strided mode done? `sv.ldux/sea/ff=RC1 5,6,7` is the only (except for ~RC1) way I could think of. But this will lead to `sv.ldux/sea/sz r5,r6,r7`, since pysvp64asm sets only DZ field. | 16:14 |
ghostmansd | Again, this all is extremely inconsistent and confusing. | 16:14 |
ghostmansd | Otherwise there must be some other way to set sv_mode = 0b01 for ld/st idx. ff explicitly bans src_zero (with totally misleading comment). | 16:16 |
ghostmansd | # "failfirst" modes | 16:16 |
ghostmansd | elif sv_mode == 0b01: | 16:16 |
ghostmansd | assert src_zero == 0, "dest-zero not allowed in failfirst mode" | 16:16 |
ghostmansd | I'll leave this mess to you to sort out. For now only ld/st idx simple mode is supported and test for it is pushed too. | 16:18 |
*** octavius <octavius!~octavius@227.147.93.209.dyn.plus.net> has quit IRC | 16:22 | |
lkcl | 1 sec | 16:53 |
lkcl | ermmm ermermerm... | 16:53 |
lkcl | SEA is *only* available in LD/ST-indexed | 16:54 |
lkcl | i haven't implemented SEA so you're literally the first person to look at it. | 16:57 |
lkcl | fail-first mode is *not* possible in LD/ST-indexed | 16:58 |
ghostmansd[m] | Well, how does one set 0b01 sv_mode? | 17:00 |
ghostmansd[m] | I want ld/st idx strided. It's 0b01. How can I set it? Currently the only option to set sv_mode to 0b01 is ff. | 17:02 |
ghostmansd[m] | And, well, ff conflicts with ld/st idx strided. | 17:02 |
ghostmansd[m] | (not to mention it's not in the table at all). | 17:03 |
ghostmansd[m] | If strided was the only mode to allow SEA, I'd have thought that it's /sea itself to set the mode. But it's not the case, SEA is in simple mode too. | 17:04 |
lkcl | it should be "/els" | 17:04 |
ghostmansd[m] | Sorry, there's no bit in spec named els. | 17:05 |
ghostmansd[m] | That was another guess. | 17:05 |
ghostmansd[m] | If SEA is only in simple, then els will replace SEA in strided mode. | 17:05 |
lkcl | no, you *use* "/els" to set mode=0b01 | 17:05 |
lkcl | see this? | 17:05 |
lkcl | if is_ldst: | 17:06 |
lkcl | # TODO: for now, LD/ST-indexed is ignored. | 17:06 |
lkcl | mode |= ldst_elstride << SVP64MODE.ELS_NORMAL # el-strided | 17:06 |
lkcl | i haven't added it | 17:06 |
lkcl | let me sort that | 17:06 |
ghostmansd[m] | OK | 17:07 |
lkcl | elif encmode == 'els': | 17:08 |
lkcl | ldst_elstride = 1 | 17:08 |
lkcl | + # in indexed mode, set sv_mode=0b01 | 17:08 |
lkcl | + if is_ldst_idx: | 17:08 |
lkcl | + sv_mode = 0b01 | 17:08 |
lkcl | 1 sec | 17:08 |
ghostmansd[m] | Keep in mind that other mode prints /els | 17:08 |
ghostmansd[m] | IIRC normal mode | 17:08 |
lkcl | yes. | 17:08 |
lkcl | you mean ldst-imm | 17:08 |
ghostmansd[m] | Ah yes | 17:08 |
ghostmansd[m] | Sorry | 17:08 |
ghostmansd[m] | Well anything that inherits from ElsBaseRM | 17:08 |
ghostmansd[m] | Or how I called it | 17:08 |
lkcl | ok let's add a test for it... | 17:09 |
lkcl | really should have that check that RA and RB must be scalar, but hey | 17:10 |
lkcl | nggggh | 17:12 |
lkcl | ok done sv/trans/svp64.py | 17:20 |
lkcl | when LDST_IDX is detected, and "/els" is used, that's when mode=0b01 is allowed | 17:22 |
markos | [ OK ] SVP64/VpxVarianceTest.OneQuarter/9 (70950 ms) | 17:29 |
markos | [----------] 40 tests from SVP64/VpxVarianceTest (25578716 ms total) | 17:29 |
markos | [----------] Global test environment tear-down | 17:29 |
markos | [==========] 40 tests from 1 test suite ran. (25578717 ms total) | 17:29 |
markos | [ PASSED ] 40 tests. | 17:29 |
markos | finally | 17:29 |
lkcl | markos, aawesome :) | 17:30 |
lkcl | that took a while | 17:30 |
markos | yeah, and I had to remove the 64x64 blocks and even more reduce the number of iterations, it took more than 24h and it was still doing 64x64 blocks in the morning :D | 17:30 |
markos | so I'm thinking of trimming the variance tests to only include the functions I've done so far -2 more remaining but they're mostly the same stuff- and consider VP9 done and move to VP8 to a slightly more complicated function (quantize) | 17:31 |
markos | is that ok with you? | 17:31 |
markos | have to be afk now, will commit the stuff so far | 17:32 |
lkcl | yep perfect | 17:32 |
ghostmansd | lkcl, just returned to laptop. Thank you for patches, will take a look now and update binutils. | 18:26 |
ghostmansd | lkcl, apparently you didn't merge binutils patches, right? | 18:27 |
ghostmansd | aaah I see | 18:27 |
ghostmansd | OK | 18:27 |
ghostmansd | never mind :-) | 18:27 |
ghostmansd | I think this is really hacky: if sv_mode == 0b01 and is_ldst_idx: | 18:40 |
ghostmansd | Other modes are somewhat "unified", except for perhaps branches. | 18:41 |
ghostmansd | lkcl, I've been thinking that some specifiers should perhaps require setting mode before allowing to use them. | 18:43 |
ghostmansd | Not like "collect all specifiers and potentially set the mode, then finally post-check some stuff like SEA and diagnose that the mode was not set". But, instead, "if we found /sea and mode is still not set, immediately surrender and suggest the correction". | 18:45 |
ghostmansd | Rationale is that this relationship is more obvious, and also makes the code a bit more linear. | 18:46 |
lkcl | yes that makes sense | 18:46 |
ghostmansd | What do you think? | 18:46 |
ghostmansd | OK, good! | 18:47 |
ghostmansd | I have to admit that the assembly part is outdated a lot. | 18:47 |
lkcl | still probably needs either a 2-pass or something | 18:47 |
lkcl | uhhuhn | 18:47 |
ghostmansd | I added zz, will add SEA and other stuff, but this will need refactoring anyway. | 18:47 |
ghostmansd | I don't have time to do it in scope of disassembly. | 18:47 |
ghostmansd | Another problem is that I'd like to refactor commits in a way so that both assembly and disassembly for each specifier appears together with the tests for binutils. | 18:48 |
ghostmansd | And this will take a lot of time and should preferably be done after we refactor our reference assembler. | 18:48 |
lkcl | yeah that makes sense although don't push it unnecessarily | 18:48 |
ghostmansd | So this saga is not over. :-( | 18:49 |
lkcl | joooy | 18:49 |
ghostmansd | What do you mean by pushing unnecessarily? | 18:49 |
ghostmansd | You mean that upstream branch? :-) | 18:49 |
ghostmansd | Removed by Alan | 18:49 |
*** lxo <lxo!~lxo@linux-libre.fsfla.org> has joined #libre-soc | 18:49 | |
lkcl | oh were you referring to binutils? | 18:53 |
lkcl | i thought you mean refactor openpower-isa repo commits | 18:54 |
ghostmansd | ah no | 18:57 |
ghostmansd | I meant binutils | 18:57 |
ghostmansd | currently I have a lot of commits which add specifiers to assembly | 18:57 |
ghostmansd | then some commits which support these in disassembly | 18:58 |
ghostmansd | on the other hand, in disassembly, we support these in per-mode fashion | 18:58 |
ghostmansd | (which frankly we should do in assembly too) | 18:58 |
*** lxo <lxo!~lxo@linux-libre.fsfla.org> has quit IRC | 19:36 | |
*** octavius <octavius!~octavius@227.147.93.209.dyn.plus.net> has joined #libre-soc | 19:40 | |
ghostmansd | I've walked over 38 commits (some are really huge) and synced them to some degree with pysvp64asm (SEA, els, etc.). Stuff I don't sync now includes VLi and branches: these are handled by new fields, and I'll handle them in scope of switching the whole binutils assembly to this mechanism. | 20:10 |
ghostmansd | Upon disassembly, this already proved to be a perfect choice; for assembly, the only change I'd like to have is to have modes enforced (e.g. forbid SEA without /els and non-LDST-idx, allow VLi only for normal failfirst Rc=0, etc.). | 20:12 |
ghostmansd | Tomorrow I hope to complete CR ops and branches in disassembly. | 20:13 |
lkcl | fantastic | 20:39 |
*** zemaye__ <zemaye__!~zemaye@172.58.160.38> has joined #libre-soc | 21:04 | |
*** octavius <octavius!~octavius@227.147.93.209.dyn.plus.net> has quit IRC | 21:06 | |
*** zemaye <zemaye!~zemaye@172.58.30.210> has joined #libre-soc | 21:06 | |
*** zemaye_ <zemaye_!~zemaye@31-209-215-224.dsl.dynamic.simnet.is> has quit IRC | 21:07 | |
*** zemaye__ <zemaye__!~zemaye@172.58.160.38> has quit IRC | 21:09 | |
*** zemaye_ <zemaye_!~zemaye@31-209-215-224.dsl.dynamic.simnet.is> has joined #libre-soc | 21:36 | |
*** zemaye <zemaye!~zemaye@172.58.30.210> has quit IRC | 21:38 | |
*** openpowerbot <openpowerbot!~openpower@94-226-188-34.access.telenet.be> has quit IRC | 21:48 | |
*** openpowerbot <openpowerbot!~openpower@94-226-188-34.access.telenet.be> has joined #libre-soc | 21:48 |
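
The table-lookup fix discussed between 14:16 and 14:25 can be sketched in a few lines of Python. This is only an illustration, not the actual openpower-isa code: the two entries are the `value`/`mask` pairs that appear in the debug prints in the log, the rest of the real table is omitted, and the names `TABLE`, `make_search`, `buggy_match`, `fixed_match` and `select` are hypothetical.

```python
# (value, mask, name) pairs copied from the debug prints in the log.
# The real table has more entries; only these two are reproduced here.
TABLE = [
    (0b010001, 0b110001, "ffrc1"),
    (0b010000, 0b110001, "ffrc0"),
]


def make_search(mode, rc):
    """Combine the 5-bit RM.mode with the Rc bit, as in the log."""
    return (int(mode) << 1) | int(bool(rc))


def buggy_match(search, value, mask):
    # Masks with `search` instead of `mask`: every bit that happens to be
    # zero in `search` is ignored, so unrelated entries can match.
    return (value & search) == (mask & search)


def fixed_match(search, value, mask):
    # Only the bits selected by `mask` take part in the comparison.
    return (value & mask) == (search & mask)


def select(search, match=fixed_match):
    """Return the name of the first table entry that matches, or None."""
    for value, mask, name in TABLE:
        if match(search, value, mask):
            return name
    return None


if __name__ == "__main__":
    # sv.add/ff=RC1/vli: mode 0b01011 with Rc=0 gives search 0b10110
    search = make_search(0b01011, rc=False)
    print(bin(search), select(search, buggy_match), select(search))
```

With the buggy comparison the Rc=0 encoding is picked up by the `ffrc1` entry, which matches the wrong `match ... ffrc1` line in the log; with the corrected comparison the same search value selects `ffrc0`.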
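
The question that `/sea` answers (14:41 to 14:42) is whether a narrowed RB is sign-extended or zero-extended before it is added to the 64-bit RA. The sketch below illustrates only that arithmetic, assuming a 32-bit element-width override on RB; the function names are hypothetical, and the specification remains the authority on the real behaviour (the log notes SEA had not been implemented at this point).

```python
MASK64 = (1 << 64) - 1


def extend(value, width, signed):
    """Truncate `value` to `width` bits, then sign- or zero-extend it."""
    value &= (1 << width) - 1
    if signed and (value >> (width - 1)) & 1:
        value -= 1 << width  # top bit set: interpret as negative
    return value


def effective_address(ra, rb, elwidth=32, sea=False):
    """Hypothetical EA computation: RA plus RB narrowed to `elwidth` bits.

    sea=False treats the narrowed RB as unsigned, sea=True as signed.
    The result wraps to 64 bits.
    """
    return (ra + extend(rb, elwidth, sea)) & MASK64


if __name__ == "__main__":
    ra, rb = 0x1000, 0xFFFFFFFC  # RB is -4 when read as signed 32-bit
    print(hex(effective_address(ra, rb, sea=False)))  # 0x100000ffc
    print(hex(effective_address(ra, rb, sea=True)))   # 0xffc
```

Both results are useful in practice: the unsigned form reaches up to 4 GiB above RA, while the signed form allows negative offsets from RA.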