| nick | message | time |
|---|---|---|
| programmerjake | got pcdec. to work! will commit after all tests pass | 01:15 |
| lkcl | ahh brilliant | 01:18 |
| programmerjake | all tests pass! thx for fixing all the other tests :) | 01:22 |
| lkcl | very cool instruction/idea | 01:29 |
| programmerjake | thx! | 01:29 |
| lkcl | bits-unused feeds back. worked that out | 01:29 |
| lkcl | what happens if there are more bytes being produced than there are bits in input? | 01:30 |
| lkcl | that's a different type of overflow condition from "there are greater than 6-bit-encodings", right? | 01:30 |
| programmerjake | that's impossible, it will notice there's no input bits so it'll just stop there, leaving the rest of the bytes in RT set to 0 (means unused) | 01:30 |
| lkcl | RT=0 is not "the output was all zero-bytes"? | 01:31 |
| programmerjake | RT=0 is "there was no output bytes" since output bytes can't be zero. | 01:32 |
| lkcl | ngggh... these aren't actual "data bytes", they're indices-into-the-table-of-output, aren't they? | 01:33 |
| lkcl | (something like that) | 01:33 |
| programmerjake | they're decoded symbols packed into 1 per byte, symbols can be 1-bit to as-many-as-you-please. pcdec. only handles symbols up to 5-bits (not 6-bits, that was an error) | 01:34 |
| programmerjake | symbols are encoded by taking the symbol's bits, packing them from LSB to MSB, then adding a 1-bit, then filling the rest of the byte with zeros | 01:35 |
| programmerjake | encoded into bytes in RT ^ | 01:35 |
| programmerjake | so they're kinda indices into the table of all possible outputs, but rearranged to just be LSB0 indices into the corresponding 1-bit in tree | 01:36 |
| lkcl | really want to see what happens on sv.pcdec. | 01:38 |
| lkcl | and if /ff mode helps | 01:38 |
| programmerjake | ff mode is the only way it can be vectorized at all... | 01:39 |
| lkcl | jooy | 01:39 |
| lkcl | /vli mode btw is only possible with testing *just* "eq" bit (ne/eq the only two options) | 01:40 |
| programmerjake | icr what vli stands for... | 01:40 |
| lkcl | truncates VL *inclusive* | 01:40 |
| lkcl | if the test fails VL is truncated normally to *exclude* the failing element | 01:41 |
| programmerjake | yeah, we'll want to include the last element | 01:41 |
| lkcl | then you can't use /ff=lt or /ff=so | 01:42 |
| lkcl | it has to be /ff=RC1/vli which i'll write tomorrow | 01:42 |
| programmerjake | I'll move the "output-empty" to lt, and have eq be "we hit any stopping condition" | 01:42 |
| lkcl | ack | 01:42 |
| programmerjake | next week, when I'm working on it again | 01:43 |
| lkcl | also i didn't foresee having this applied to "only-Rc=1" instructions | 01:43 |
| lkcl | RC1 mode was only originally designed for instructions that don't have an Rc=1 mode | 01:43 |
| lkcl | i'll have to do an XOR of the hard-coded-Rc=1 (csv rc column == 'ONE') /ff=RC1 flag | 01:44 |
| lkcl | urrr | 01:44 |
| programmerjake | uuh, couldn't it just be sv.pcdec./vli/ff=eq? | 01:44 |
| lkcl | nnope. | 01:45 |
| lkcl | it's a tri-mode not a dual-mode | 01:45 |
| lkcl | (one flag==1 enables/activates another flag) | 01:46 |
| lkcl | (i.e. the 2nd flag is *ignored* if the 1st flag == 0) | 01:46 |
| programmerjake | uuh, RC1=1 can't be used, since the spec says the results are never stored, only the CR outputs...the whole point of pcdec. is the RT output, without it it's mostly useless | 01:48 |
| programmerjake | either that or the spec is unclear | 01:48 |
| programmerjake | > Note that when RC1=1 the result elements are never stored, only the CR Fields. | 01:49 |
| programmerjake | https://libre-soc.org/openpower/sv/normal/#index5h1 | 01:49 |
| programmerjake | imho the spec should be changed to always write outputs for each element up to and including the first one that fails the data-dependent fail-first test, only elements after that one are not executed. VL being set to exclude the failing element should happen after. | 01:51 |
| programmerjake | that way, an element is always fully-executed or not executed. not partially-only-writes-CR-executed | 01:52 |
| programmerjake | all outputs, RT, CR, OV, etc. | 01:53 |
| | *** zemaye__ <zemaye__!~zemaye@172.58.107.28> has joined #libre-soc | 03:31 |
| | *** zemaye_ <zemaye_!~zemaye@31-209-215-224.dsl.dynamic.simnet.is> has quit IRC | 03:34 |
| | *** zemaye_ <zemaye_!~zemaye@31-209-215-224.dsl.dynamic.simnet.is> has joined #libre-soc | 03:52 |
| | *** zemaye__ <zemaye__!~zemaye@172.58.107.28> has quit IRC | 03:54 |
| | *** lxo <lxo!~lxo@linux-libre.fsfla.org> has joined #libre-soc | 07:18 |
| lkcl | that's in predicate-result mode | 09:25 |
| lkcl | or, it's supposed to be... | 09:25 |
| | *** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc | 09:43 |
| | *** lxo <lxo!~lxo@linux-libre.fsfla.org> has quit IRC | 09:51 |
| | *** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC | 10:00 |
| | *** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc | 10:00 |
| ghostmansd | lkcl, I believe this patch with inversion is wrong, because it makes the whole thing extremely inconsistent. | 10:28 |
| ghostmansd | Either you should invert it in predicates, too, or keep it as is. | 10:28 |
| ghostmansd | `/ff=nl` seems to give different results than `/m=nl` | 10:29 |
| lkcl | ghostmansd, nggh, yyeah.... it means moving things in the spec though | 11:16 |
| lkcl | and in hardware, the position of wires is not actually important | 11:16 |
| ghostmansd[m] | I don't understand why not keep it the way it was | 11:16 |
| ghostmansd[m] | Symmetrical and evident | 11:17 |
| lkcl | because it was not according to the spec | 11:17 |
| ghostmansd[m] | You mean the order of bits? | 11:17 |
| lkcl | what is symmetrical and evident in hardware is not the same as symmetrical and evident in software | 11:17 |
| lkcl | whether bits are *shared* between the same wires is more important | 11:17 |
| ghostmansd[m] | This means that there are _two_ copies of predicates, one swaps the bits and other doesn't | 11:18 |
| lkcl | now, if bits 19-23 were *actually* shared with predicate mask bits, that would matter | 11:18 |
| lkcl | bit-ordering in hardware is completely meaningless as far as how they are shown in a specification | 11:19 |
| lkcl | i know it's very weird. | 11:19 |
| ghostmansd[m] | Yes, extremely | 11:19 |
| lkcl | they only matter what they are connected to | 11:19 |
| ghostmansd[m] | Ok, so I do need to keep two tables of predicates | 11:20 |
| ghostmansd[m] | ? | 11:20 |
| ghostmansd[m] | I mean binutils | 11:20 |
| ghostmansd[m] | Also, I think a more obvious way to show this difference would be to explicitly fill in the table | 11:20 |
| ghostmansd[m] | Without swaps | 11:20 |
| lkcl | and document it | 11:20 |
| lkcl | (one-line-comment) | 11:21 |
| ghostmansd[m] | Yes | 11:21 |
| ghostmansd[m] | Could you do it please for pysvp64asm, while I handle binutils? | 11:21 |
| lkcl | sure | 11:21 |
| ghostmansd[m] | Thank you! | 11:21 |
| | *** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has quit IRC | 11:28 |
| | *** tplaten <tplaten!~isengaara@55d45723.access.ecotel.net> has joined #libre-soc | 11:41 |
| | *** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc | 11:50 |
| | *** ghostmansd <ghostmansd!~ghostmans@broadband-188-32-220-156.ip.moscow.rt.ru> has joined #libre-soc | 11:51 |
| ghostmansd | cat /tmp/test.s && SILENCELOG=true pysvp64asm /tmp/test.s /tmp/test.py.s && powerpc64le-linux-gnu-as /tmp/test.py.s -o /tmp/test.o && powerpc64le-linux-gnu-objcopy -Obinary /tmp/test.o /tmp/bin.o && pysvp64dis /tmp/bin.o | 11:51 |
| ghostmansd | sv.add./ff=nl/m=nl *3,*7,*11 | 11:51 |
| ghostmansd | ec 3f 50 07 sv.add./ff=ge/m=ge *r3,*r7,*r11 | 11:51 |
| ghostmansd | 15 12 01 7c | 11:51 |
| ghostmansd | That's from master | 11:51 |
| ghostmansd | Either dis needs some tuning, or asm does the wrong thing | 11:51 |
| ghostmansd | lkcl ^ | 11:51 |
| ghostmansd | also, the comment "# decodes "Mode" in similar way to BO field (supposed to, anyway)" is somewhat misleading :-) | 11:51 |
| ghostmansd | except for "supposed to" part, perhaps | 11:51 |
| ghostmansd | Ah wait, I got it. ge is alias to nl. | 11:52 |
| | *** tplaten <tplaten!~isengaara@55d45723.access.ecotel.net> has quit IRC | 12:01 |
| ghostmansd | lkcl, so, just to put all dots above all i's: we use the same predicates for ff/pr as for m/dm/sm, but swap the byte order for mask. Is it correct? | 12:13 |
| ghostmansd | Tried doing this, but it didn't work. Basically all ff/pr handling is broken. | 12:44 |
| | *** octavius <octavius!~octavius@227.147.93.209.dyn.plus.net> has joined #libre-soc | 13:08 |
| ghostmansd | The assembly is what's broken | 13:19 |
| ghostmansd | `sv.add./ff=ge/m=ge *r3,*r7,*r11` gives this on binutils branch: ec 3f 50 07 | 13:20 |
| ghostmansd | On binutils, I get this: e4 3f 50 07 | 13:20 |
| ghostmansd | OK it seems this is mode-related. I suspect that it's again this fricking MSB0 order. | 13:36 |
| ghostmansd | Yes it was exactly this. | 13:38 |
| ghostmansd | OK updated. | 13:42 |
| lkcl | yes, have to be careful to get aliases right | 13:50 |
| lkcl | ya got there? :) | 13:51 |
| lkcl | rebase done btw (tests passed) | 13:51 |
| ghostmansd | Oh cool | 13:53 |
| ghostmansd | Luke, can we postpone updating the spec on the fly at least for a while? | 13:53 |
| ghostmansd | It's really complex to develop binutils when there're changes in spec or svp64asm. | 13:54 |
| ghostmansd | I basically have to track all four: specs, pysvp64asm, pysvp64dis and binutils. | 13:54 |
| ghostmansd | And some changes are easy to miss. | 13:55 |
| ghostmansd | The addition of RS register to OutSel was one of them. | 13:55 |
| ghostmansd | I wouldn't even have noticed it unless I had to re-generate the header and the source for a completely unrelated reason. | 13:55 |
| lkcl | ah - yeah that one was partly-cosmetic, partly-not | 13:56 |
| ghostmansd | I mean, it's difficult to develop further and keep up to date simultaneously. | 13:56 |
| lkcl | the new instruction "pcdec." is an overwrite pair (RT,RS) | 13:56 |
| lkcl | understood. | 13:57 |
| ghostmansd | Thank you! | 13:58 |
| ghostmansd | The assembly part is likely totally outdated for branch modes. I guess for CRs too. | 13:58 |
| ghostmansd | I think it might be even outdated for pysvp64asm, the code there is so hairy with tons of variables, that I cannot even keep track of how it correlates to the spec. | 13:59 |
| ghostmansd | At least disassembly is sufficiently close to the spec, speaking of how it sets bits. | 13:59 |
| ghostmansd | But all these consts.py manipulations in pysvp64asm, all these if/else chains, etc., etc., they should eventually be done simpler too. | 14:00 |
| lkcl | CR_ops haven't actually been done at all, it's simply a massive coincidence | 14:00 |
| lkcl | yes i'd really like sv/trans/svp64.py to be updated | 14:01 |
| ghostmansd | We already have tools for these, selectable int and fields, combined together, they make the code look close to spec. | 14:01 |
| lkcl | drat | 14:01 |
| lkcl | FAIL: test_13_RC1 (__main__.SVSTATETestCase) [0:sv.add/ff=RC1] | 14:01 |
| lkcl | - sv.add/ff=RC1 *3,*7,*11 | 14:01 |
| lkcl | ? ^^^ | 14:01 |
| lkcl | + sv.add/ff=gt *3,*7,*11 | 14:01 |
| ghostmansd | sigh | 14:01 |
| ghostmansd | That's test_pysvp64dis? | 14:01 |
| lkcl | yes | 14:01 |
| ghostmansd | Will check. | 14:01 |
| ghostmansd | Yeah reproducible | 14:02 |
| ghostmansd | I guess this is part of this inv manipulation. | 14:03 |
| lkcl | RC1 set when it should not be... or not set | 14:04 |
| ghostmansd | sv.add/ff=RC1 *3,*7,*11 | 14:05 |
| ghostmansd | e9 3f 40 05 sv.add/ff=gt *r3,*r7,*r11 | 14:05 |
| ghostmansd | 14 12 01 7c | 14:05 |
| ghostmansd | e9 3f 40 05 sv.add./ff=gt *r3,*r7,*r11 | 14:05 |
| ghostmansd | 15 12 01 7c | 14:05 |
| ghostmansd | `sv.add./ff=gt *3,*7,*11` is encoded the same way as `sv.add/ff=RC1 *3,*7,*11` | 14:06 |
| ghostmansd | sv.add./ff=gt *3,*7,*11 | 14:06 |
| ghostmansd | sv.add/ff=RC1 *3,*7,*11 | 14:06 |
| ghostmansd | e9 3f 40 05 sv.add./ff=gt *r3,*r7,*r11 | 14:06 |
| ghostmansd | 15 12 01 7c | 14:06 |
| ghostmansd | e9 3f 40 05 sv.add/ff=gt *r3,*r7,*r11 | 14:06 |
| ghostmansd | 14 12 01 7c | 14:06 |
| lkcl | ok yep RC1 is not being reported | 14:06 |
| ghostmansd | No it would have been reported | 14:06 |
| lkcl | sv.add/ff=RC1/vli 3,7,11 | 14:06 |
| lkcl | 0b 00 40 05 sv.add/ff=so r3,r7,r11 | 14:06 |
| lkcl | 14 5a 67 7c | 14:06 |
| ghostmansd | If it had been encoded properly | 14:06 |
| lkcl | errmermermerm... | 14:07 |
| ghostmansd | Ah wait | 14:07 |
| ghostmansd | Rc seems not to be taken into account | 14:07 |
| ghostmansd | 14 12 01 7c vs 15 12 01 7c | 14:07 |
| ghostmansd | Though it was before this inv crap | 14:07 |
| ghostmansd | (or it silently worked) | 14:07 |
| ghostmansd | Hm | 14:08 |
| lkcl | Rc | 14:08 |
| lkcl | 0 | 14:08 |
| lkcl | RM | 14:08 |
| lkcl | normal: Rc=1: ffirst CR sel | 14:08 |
| lkcl | RM | 14:08 |
| lkcl | 000000000000000000001011 | 14:08 |
| lkcl | RM.mode | 14:08 |
| lkcl | 01011 | 14:08 |
| lkcl | 27, 28, 29, 30, 31 | 14:08 |
| ghostmansd | e9 3f 40 05 sv.add./ff=gt *r3,*r7,*r11 | 14:08 |
| ghostmansd | Rc | 14:08 |
| ghostmansd | 1 | 14:08 |
| ghostmansd | 63 | 14:08 |
| ghostmansd | It's taken into account | 14:09 |
| ghostmansd | Or, well, it's recognized | 14:09 |
| ghostmansd | normal: Rc=1: ffirst CR sel | 14:09 |
| ghostmansd | e9 3f 40 05 sv.add/ff=gt *r3,*r7,*r11 | 14:09 |
| ghostmansd | RM | 14:10 |
| ghostmansd | normal: Rc=1: ffirst CR sel | 14:10 |
| ghostmansd | This is wrong | 14:10 |
| lkcl | yehyeh. | 14:10 |
| ghostmansd | Looks like after that change you did you forgot to update the tables | 14:10 |
| ghostmansd | The lookup is wrong | 14:10 |
| ghostmansd | Seems like the most rational idea | 14:10 |
| lkcl | ohh yeah | 14:11 |
| lkcl | in RM.select. | 14:11 |
| ghostmansd | Yep. | 14:11 |
| lkcl | Rc=0 | 14:11 |
| lkcl | it should be going to.... ffrc0 | 14:11 |
| lkcl | let me just put a debug-print... | 14:11 |
| ghostmansd | I mean value and mask | 14:12 |
| ghostmansd | Rc is fine | 14:12 |
| ghostmansd | It's 1 for . and 0 otherwise | 14:12 |
| ghostmansd | It's this damned inv change | 14:12 |
| lkcl | sv.add/ff=RC1/vli 3,7,11 | 14:12 |
| lkcl | match 0b10001 0b110001 ffrc1 | 14:12 |
| ghostmansd | Hm | 14:13 |
| ghostmansd | So Rc is 1? | 14:13 |
| ghostmansd | BTW what's search? | 14:13 |
| lkcl | no, Rc=false | 14:13 |
| ghostmansd | Hm | 14:13 |
| ghostmansd | How is it matched then? | 14:13 |
| lkcl | ah did you happen to change how Rc is done? | 14:14 |
| lkcl | did you remove an __bool__ function? | 14:14 |
| ghostmansd | @cached_property | 14:14 |
| ghostmansd | def Rc(self): | 14:14 |
| ghostmansd | Rc = self.mdwn.operands["Rc"] | 14:14 |
| ghostmansd | if Rc is None: | 14:14 |
| ghostmansd | return False | 14:14 |
| ghostmansd | return bool(Rc.value) | 14:14 |
| ghostmansd | self.mdwn.operands["Rc"] | 14:14 |
| ghostmansd | This gets SI or None | 14:14 |
| ghostmansd | IIRC SI has __bool__ | 14:14 |
| lkcl | urr bizarre | 14:15 |
| ghostmansd | 1 sec | 14:15 |
| ghostmansd | This gets Operand or None, sorry | 14:15 |
| lkcl | ok doing "Rc = 1 if Rc else 0" | 14:15 |
| ghostmansd | Still this `return bool(Rc.value)` gets SI | 14:15 |
| ghostmansd | I get Rc = True and Rc = False for these two instructions | 14:16 |
| lkcl | which still doesn't quite work due to "\|" with the other table entries | 14:16 |
| lkcl | search = ((int(rm.mode) << 1) \| Rc) | 14:16 |
| lkcl | always sets LSB of that int to 1 | 14:17 |
| ghostmansd | Why, if it's bool? | 14:17 |
| ghostmansd | Shouldn't it be converted to int implicitly? | 14:17 |
| lkcl | because it's not actually a bool i don't think, you return a SelectableInt() *from* __bool__() is that right? | 14:18 |
| lkcl | oh wait | 14:18 |
| lkcl | hang on | 14:18 |
| lkcl | match 0 0b10001 0b110001 ffrc1 | 14:18 |
| lkcl | huhn?? | 14:18 |
| ghostmansd | print(type(Rc), Rc, bin(search)) | 14:18 |
| ghostmansd | <class 'bool'> True 0b10011 | 14:18 |
| ghostmansd | e9 3f 40 05 sv.add./ff=gt *r3,*r7,*r11 | 14:18 |
| ghostmansd | 15 12 01 7c | 14:18 |
| ghostmansd | <class 'bool'> False 0b10010 | 14:18 |
| ghostmansd | e9 3f 40 05 sv.add/ff=gt *r3,*r7,*r11 | 14:18 |
| ghostmansd | 14 12 01 7c | 14:18 |
| lkcl | print ("match", Rc, bin(value), bin(mask), member) | 14:18 |
| lkcl | 1 sec | 14:18 |
| lkcl | match 0 0b10110 0b10001 0b110001 ffrc1 | 14:19 |
| lkcl | if ((value & search) == (mask & search)): | 14:19 |
| lkcl | print ("match", Rc, bin(search), bin(value), bin(mask), | 14:19 |
| lkcl | member) | 14:19 |
| lkcl | i have that wrong, don't i? | 14:20 |
| ghostmansd | first, which instruction do you dump? | 14:20 |
| lkcl | sigh | 14:20 |
| lkcl | that should be value & mask == search & mask | 14:20 |
| ghostmansd | I'd have said value & mask | 14:20 |
| ghostmansd | yep | 14:20 |
| | * lkcl face-palm | 14:20 |
| lkcl | RM | 14:20 |
| lkcl | normal: Rc=0: ffirst z/nonz | 14:20 |
| lkcl | RM | 14:20 |
| lkcl | 000000000000000000001011 | 14:20 |
| lkcl | all good :) | 14:20 |
| | * lkcl whistles | 14:20 |
| lkcl | sv.add/ff=RC1/vli 3,7,11 | 14:21 |
| lkcl | match 0 0b10110 0b10000 0b110001 ffrc0 | 14:21 |
| lkcl | 0b 00 40 05 sv.add/ff=RC1/vli r3,r7,r11 | 14:21 |
| ghostmansd | sv.add./ff=gt *3,*7,*11 | 14:21 |
| ghostmansd | sv.add/ff=RC1 *3,*7,*11 | 14:21 |
| ghostmansd | e9 3f 40 05 sv.add./ff=gt *r3,*r7,*r11 | 14:21 |
| ghostmansd | 15 12 01 7c | 14:21 |
| ghostmansd | e9 3f 40 05 sv.add/ff=RC1 *r3,*r7,*r11 | 14:21 |
| ghostmansd | 14 12 01 7c | 14:21 |
| ghostmansd | This works | 14:21 |
| ghostmansd | Ah OK you also did this :-) | 14:21 |
| ghostmansd | pushed to binutils | 14:22 |
| lkcl | why the hell it suddenly stopped working... | 14:22 |
| ghostmansd | ¯\_(ツ)_/¯ | 14:23 |
| lkcl | sorry, my mistake to fix. | 14:23 |
| ghostmansd | not only yours :-) | 14:23 |
| ghostmansd | I also copied it to binutils | 14:23 |
| ghostmansd | lol | 14:23 |
| lkcl | https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=c125201b5ef4e24cec0f02eb111d6a1b80754773 | 14:23 |
| ghostmansd | Two pairs of eyes are better they said | 14:23 |
| ghostmansd | lol | 14:23 |
| ghostmansd | exactly the same commit | 14:24 |
| lkcl | two one-eyed kings... | 14:24 |
| ghostmansd | I guess I can rebase safely | 14:24 |
| ghostmansd | Hm, I'd have written it differently | 14:25 |
| ghostmansd | if ((value & mask) == (search & mask)): | 14:25 |
| lkcl | hey knock yourself out | 14:25 |
| ghostmansd | done | 14:27 |
| ghostmansd | OK back to the binutils | 14:28 |
| lkcl | ack | 14:28 |
| ghostmansd | ah you know what? | 14:28 |
| ghostmansd | if ((subtable->value & match) == (subtable->mask & match)) | 14:28 |
| lkcl | yaaa? | 14:28 |
| lkcl | haha | 14:28 |
| ghostmansd | I did it correctly lol | 14:28 |
| lkcl | urrr... | 14:28 |
| lkcl | nggh yeah | 14:28 |
| ghostmansd | ah no | 14:28 |
| ghostmansd | lol | 14:28 |
| lkcl | honestly i tend to guess these things | 14:29 |
| lkcl | urrr... yeah it should also be (subtable->value & subtable->mask) == (match & subtable->mask) | 14:29 |
| lkcl | doh :) | 14:29 |
| ghostmansd | yep | 14:30 |
| ghostmansd | already done :-) | 14:30 |
| ghostmansd | you noticed that I had to occupy 1 bit for Rc? | 14:30 |
| ghostmansd | in binutils | 14:30 |
| ghostmansd | because they don't store it | 14:30 |
| lkcl | intriguing | 14:31 |
| ghostmansd | we could check for . in the name (we already decoded it in dis), but I feel it's way too fragile | 14:31 |
| ghostmansd | we already have crap like andis. | 14:31 |
| ghostmansd | where there's no andis | 14:31 |
| ghostmansd | So I thought it'd be better to keep what we have in mdwn | 14:32 |
| ghostmansd | uint64_t inv = svp64_insn_get_prefix_rm_ldst_imm_prrc0_inv (&svp64->insn); | 14:32 |
| ghostmansd | uint64_t els = svp64_insn_get_prefix_rm_ldst_imm_prrc0_els (&svp64->insn); | 14:33 |
| ghostmansd | uint64_t RC1 = svp64_insn_get_prefix_rm_ldst_imm_prrc0_RC1 (&svp64->insn); | 14:33 |
| ghostmansd | Feel the power of fields lol | 14:33 |
| ghostmansd | what for fuck's sake is SEA? | 14:34 |
| ghostmansd | Should it be /sea? | 14:34 |
| ghostmansd | I mean, at the point when user calls some ld/st instruction, how he affects it to set SEA? | 14:37 |
| ghostmansd | how does he affect* | 14:37 |
| lkcl | yehyeh | 14:40 |
| lkcl | signed effective address | 14:41 |
| lkcl | it's for when you do elwidth overrides to below 64-bit | 14:41 |
| lkcl | so you have a register RB which is now only 32-bit | 14:41 |
| lkcl | it gets added to RA (64-bit) | 14:41 |
| lkcl | do you add 32-bit RB to 64-bit RA as signed or unsigned? | 14:42 |
| lkcl | both are useful | 14:42 |
| lkcl | hence /sea | 14:42 |
| lkcl | and yes it hasn't been added in to sv/trans/svp64.py sigh | 14:42 |
| ghostmansd[m] | Ok, I'll add it | 14:50 |
| ghostmansd[m] | To all of them | 14:50 |
| ghostmansd[m] | It'd be simpler to keep track | 14:50 |
| ghostmansd[m] | pysvp64asm, pysvp64dis and binutils | 14:50 |
| lkcl | ack leave it with you | 15:10 |
| | * lkcl going to try doing RC1 in ISACaller | 15:10 |
| lkcl | actually get it working, so programmerjake has something for "sv.pcdec./ff=RC1" | 15:11 |
| ghostmansd | lkcl, there's (again!) contradiction between the code and the spec | 15:50 |
| ghostmansd | is SEA available in simple ld/st idx mode? | 15:50 |
| ghostmansd | From the spec, it is | 15:50 |
| ghostmansd | From the code, tables consider this to be part of the mask and assume bit 3 of mode to be 0 | 15:51 |
| ghostmansd | I fixed the code, cf. binutils branch. Please update the spec otherwise. | 15:52 |
| ghostmansd | cf. LD/ST Indexed here: https://libre-soc.org/openpower/sv/ldst/ | 15:53 |
| ghostmansd | I think you simply made a copy&paste error from LD/ST Immediate in the code | 15:58 |
| ghostmansd | Also, how is ld/st idx stride mode set? `sv.ldux/sea/ff=RC1 5,6,7` is the only (except for ~RC1) way I could think of. But this will lead to `sv.ldux/sea/sz r5,r6,r7`, since pysvp64asm sets only the DZ field. | 16:14 |
| ghostmansd | Again, this all is extremely inconsistent and confusing. | 16:14 |
| ghostmansd | Otherwise there must be some other way to set sv_mode = 0b01 for ld/st idx. ff explicitly bans src_zero (with totally misleading comment). | 16:16 |
| ghostmansd | # "failfirst" modes | 16:16 |
| ghostmansd | elif sv_mode == 0b01: | 16:16 |
| ghostmansd | assert src_zero == 0, "dest-zero not allowed in failfirst mode" | 16:16 |
| ghostmansd | I'll leave this mess to you to sort out. For now only ld/st idx simple mode is supported and test for it is pushed too. | 16:18 |
| | *** octavius <octavius!~octavius@227.147.93.209.dyn.plus.net> has quit IRC | 16:22 |
| lkcl | 1 sec | 16:53 |
| lkcl | ermmm ermermerm... | 16:53 |
| lkcl | SEA is *only* available in LD/ST-indexed | 16:54 |
| lkcl | i haven't implemented SEA so you're literally the first person to look at it. | 16:57 |
| lkcl | fail-first mode is *not* possible in LD/ST-indexed | 16:58 |
| ghostmansd[m] | Well, how does one set 0b01 sv_mode? | 17:00 |
| ghostmansd[m] | I want ld/st idx strided. It's 0b01. How can I set it? Currently the only option to set sv_mode to 0b01 is ff. | 17:02 |
| ghostmansd[m] | And, well, ff conflicts with ld/st idx strided. | 17:02 |
| ghostmansd[m] | (not to mention it's not in the table at all). | 17:03 |
| ghostmansd[m] | If strided was the only mode to allow SEA, I'd have thought that it's /sea itself to set the mode. But it's not the case, SEA is in simple mode too. | 17:04 |
| lkcl | it should be "/els" | 17:04 |
| ghostmansd[m] | Sorry, there's no bit in spec named els. | 17:05 |
| ghostmansd[m] | That was another guess. | 17:05 |
| ghostmansd[m] | If SEA is only in simple, then els will replace SEA in strided mode. | 17:05 |
| lkcl | no, you *use* "/els" to set mode=0b01 | 17:05 |
| lkcl | see this? | 17:05 |
| lkcl | if is_ldst: | 17:06 |
| lkcl | # TODO: for now, LD/ST-indexed is ignored. | 17:06 |
| lkcl | mode |= ldst_elstride << SVP64MODE.ELS_NORMAL # el-strided | 17:06 |
| lkcl | i haven't added it | 17:06 |
| lkcl | let me sort that | 17:06 |
| ghostmansd[m] | OK | 17:07 |
| lkcl | elif encmode == 'els': | 17:08 |
| lkcl | ldst_elstride = 1 | 17:08 |
| lkcl | + # in indexed mode, set sv_mode=0b01 | 17:08 |
| lkcl | + if is_ldst_idx: | 17:08 |
| lkcl | + sv_mode = 0b01 | 17:08 |
| lkcl | 1 sec | 17:08 |
| ghostmansd[m] | Keep in mind that other mode prints /els | 17:08 |
| ghostmansd[m] | IIRC normal mode | 17:08 |
| lkcl | yes. | 17:08 |
| lkcl | you mean ldst-imm | 17:08 |
| ghostmansd[m] | Ah yes | 17:08 |
| ghostmansd[m] | Sorry | 17:08 |
| ghostmansd[m] | Well anything that inherits from ElsBaseRM | 17:08 |
| ghostmansd[m] | Or how I called it | 17:08 |
| lkcl | ok let's add a test for it... | 17:09 |
| lkcl | really should have that check that RA and RB must be scalar, but hey | 17:10 |
| lkcl | nggggh | 17:12 |
| lkcl | ok done sv/trans/svp64.py | 17:20 |
| lkcl | when LDST_IDX is detected, and "/els" is used, that's when mode=0b01 is allowed | 17:22 |
| markos | [ OK ] SVP64/VpxVarianceTest.OneQuarter/9 (70950 ms) | 17:29 |
| markos | [----------] 40 tests from SVP64/VpxVarianceTest (25578716 ms total) | 17:29 |
| markos | [----------] Global test environment tear-down | 17:29 |
| markos | [==========] 40 tests from 1 test suite ran. (25578717 ms total) | 17:29 |
| markos | [ PASSED ] 40 tests. | 17:29 |
| markos | finally | 17:29 |
| lkcl | markos, aawesome :) | 17:30 |
| lkcl | that took a while | 17:30 |
| markos | yeah, and I had to remove the 64x64 blocks and reduce the number of iterations even more; it took more than 24h and it was still doing 64x64 blocks in the morning :D | 17:30 |
| markos | so I'm thinking of trimming the variance tests to only include the functions I've done so far -2 more remaining but they're mostly the same stuff- and consider VP9 done and move to VP8 to a slightly more complicated function (quantize) | 17:31 |
| markos | is that ok with you? | 17:31 |
| markos | have to be afk now, will commit the stuff so far | 17:32 |
| lkcl | yep perfect | 17:32 |
| ghostmansd | lkcl, just returned to laptop. Thank you for patches, will take a look now and update binutils. | 18:26 |
| ghostmansd | lkcl, apparently you didn't merge binutils patches, right? | 18:27 |
| ghostmansd | aaah I see | 18:27 |
| ghostmansd | OK | 18:27 |
| ghostmansd | never mind :-) | 18:27 |
| ghostmansd | I think this is really hacky: if sv_mode == 0b01 and is_ldst_idx: | 18:40 |
| ghostmansd | Other modes are somewhat "unified", except for perhaps branches. | 18:41 |
| ghostmansd | lkcl, I've been thinking that some specifiers should perhaps require setting mode before allowing to use them. | 18:43 |
| ghostmansd | Not like "collect all specifiers and potentially set the mode, then finally post-check some stuff like SEA and diagnose that the mode was not set". But, instead, "if we found /sea and mode is still not set, immediately surrender and suggest the correction". | 18:45 |
| ghostmansd | Rationale is that this relationship is more obvious, and also makes the code a bit more linear. | 18:46 |
| lkcl | yes that makes sense | 18:46 |
| ghostmansd | What do you think? | 18:46 |
| ghostmansd | OK, good! | 18:47 |
| ghostmansd | I have to admit that the assembly part is outdated a lot. | 18:47 |
| lkcl | still probably needs either a 2-pass or something | 18:47 |
| lkcl | uhhuhn | 18:47 |
| ghostmansd | I added zz, will add SEA and other stuff, but this will need refactoring anyway. | 18:47 |
| ghostmansd | I don't have time to do it in scope of disassembly. | 18:47 |
| ghostmansd | Another problem is that I'd like to refactor commits in a way so that both assembly and disassembly for each specifier appear together with the tests for binutils. | 18:48 |
| ghostmansd | And this will take a lot of time and should preferably be done after we refactor our reference assembler. | 18:48 |
| lkcl | yeah that makes sense although don't push it unnecessarily | 18:48 |
| ghostmansd | So this saga is not over. :-( | 18:49 |
| lkcl | joooy | 18:49 |
| ghostmansd | What do you mean by pushing unnecessarily? | 18:49 |
| ghostmansd | You mean that upstream branch? :-) | 18:49 |
| ghostmansd | Removed by Alan | 18:49 |
| | *** lxo <lxo!~lxo@linux-libre.fsfla.org> has joined #libre-soc | 18:49 |
| lkcl | oh were you referring to binutils? | 18:53 |
| lkcl | i thought you meant refactor openpower-isa repo commits | 18:54 |
| ghostmansd | ah no | 18:57 |
| ghostmansd | I meant binutils | 18:57 |
| ghostmansd | currently I have a lot of commits which add specifiers to assembly | 18:57 |
| ghostmansd | then some commits which support these in disassembly | 18:58 |
| ghostmansd | on the other hand, in disassembly, we support these in per-mode fashion | 18:58 |
| ghostmansd | (which frankly we should do in assembly too) | 18:58 |
| | *** lxo <lxo!~lxo@linux-libre.fsfla.org> has quit IRC | 19:36 |
| | *** octavius <octavius!~octavius@227.147.93.209.dyn.plus.net> has joined #libre-soc | 19:40 |
| ghostmansd | I've walked over 38 commits (some are really huge) and synced them to some degree with pysvp64asm (SEA, els, etc.). Stuff I don't sync now includes VLi and branches: these are handled by new fields, and I'll handle them in scope of switching the whole binutils assembly to this mechanism. | 20:10 |
| ghostmansd | Upon disassembly, this already proved to be a perfect choice; for assembly, the only change I'd like to have is to have modes enforced (e.g. forbid SEA without /els and non-LDST-idx, allow VLi only for normal failfirst Rc=0, etc.). | 20:12 |
| ghostmansd | Tomorrow I hope to complete CR ops and branches in disassembly. | 20:13 |
| lkcl | fantastic | 20:39 |
| | *** zemaye__ <zemaye__!~zemaye@172.58.160.38> has joined #libre-soc | 21:04 |
| | *** octavius <octavius!~octavius@227.147.93.209.dyn.plus.net> has quit IRC | 21:06 |
| | *** zemaye <zemaye!~zemaye@172.58.30.210> has joined #libre-soc | 21:06 |
| | *** zemaye_ <zemaye_!~zemaye@31-209-215-224.dsl.dynamic.simnet.is> has quit IRC | 21:07 |
| | *** zemaye__ <zemaye__!~zemaye@172.58.160.38> has quit IRC | 21:09 |
| | *** zemaye_ <zemaye_!~zemaye@31-209-215-224.dsl.dynamic.simnet.is> has joined #libre-soc | 21:36 |
| | *** zemaye <zemaye!~zemaye@172.58.30.210> has quit IRC | 21:38 |
| | *** openpowerbot <openpowerbot!~openpower@94-226-188-34.access.telenet.be> has quit IRC | 21:48 |
| | *** openpowerbot <openpowerbot!~openpower@94-226-188-34.access.telenet.be> has joined #libre-soc | 21:48 |
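
The lookup bug chased down between 14:16 and 14:25 in the log can be reproduced in a few lines. What follows is a minimal sketch, not the project's code: the two `(value, mask)` pairs and the `search` value are the ones printed in the log, while the table layout and the function names `build_search`, `matches_buggy` and `matches_fixed` are assumptions made for illustration.

```python
# Hypothetical two-entry table; the value/mask pairs are those printed in the log.
TABLE = [
    ("ffrc1", 0b010001, 0b110001),
    ("ffrc0", 0b010000, 0b110001),
]


def build_search(mode, rc):
    """Pack the 5-bit RM.mode above a single Rc bit, as done in the log."""
    return (int(mode) << 1) | int(bool(rc))


def matches_buggy(search):
    # Comparison as first written: both sides are ANDed with `search`,
    # so the bits selected by `mask` are never actually compared.
    return [name for name, value, mask in TABLE
            if (value & search) == (mask & search)]


def matches_fixed(search):
    # Corrected comparison: only the bits selected by `mask` are compared.
    return [name for name, value, mask in TABLE
            if (value & mask) == (search & mask)]


# sv.add/ff=RC1/vli from the log: RM.mode = 0b01011, Rc = 0
search = build_search(0b01011, False)
print(bin(search), matches_buggy(search), matches_fixed(search))
```

With the comparison as first written, both entries match and whichever comes first in the table wins, which is consistent with `ffrc1` being reported for an Rc=0 instruction. Masking both sides with `mask` leaves only `ffrc0`.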
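
The `/sea` question from 14:41 (is a narrowed RB added to the 64-bit RA as signed or unsigned?) comes down to whether RB is sign-extended before the add. The sketch below only illustrates that arithmetic. It is not taken from ISACaller, where per the log SEA had not been implemented yet, and the names `sign_extend` and `effective_address` as well as the `elwidth` and `sea` parameters are hypothetical.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, width):
    """Interpret the low `width` bits of `value` as a two's-complement number."""
    value &= (1 << width) - 1
    sign_bit = 1 << (width - 1)
    return (value ^ sign_bit) - sign_bit


def effective_address(ra, rb, elwidth=32, sea=False):
    """Add an elwidth-narrowed RB to 64-bit RA, zero- or sign-extending RB."""
    rb &= (1 << elwidth) - 1          # element-width override truncates RB
    if sea:
        rb = sign_extend(rb, elwidth)  # signed effective address
    return (ra + rb) & MASK64


ra, rb = 0x1000, 0xFFFF_FFFC  # RB holds -4 when read as a signed 32-bit value
print(hex(effective_address(ra, rb)))            # unsigned: RA + 0xFFFFFFFC
print(hex(effective_address(ra, rb, sea=True)))  # signed:   RA - 4
```

Both results are useful in practice: the unsigned form treats RB as a large forward offset, while the signed form lets a narrow index register step backwards from RA.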