*** lx0 <lx0!~lxo@gateway/tor-sasl/lxo> has quit IRC | 05:01 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc | 06:52 | |
lkcl | ok :) | 08:53 |
---|---|---|
lkcl | commit it let's have a look | 08:54 |
programmerjake | oh, btw, i realized that using all the pseudocode functions from the vsx section of the powerisa spec might not go over so well, so i'm thinking of rewriting the pseudocode for fcvt[f/t]g | 08:57 |
programmerjake | it would either be based off iirc book 1 appendix A which has suggested fp models, or just write the pseudocode without using existing fp functions | 08:59 |
programmerjake | except SINGLE/DOUBLE of course | 09:00 |
programmerjake | i did at least check that all the vsx functions i used don't refer to any vsx-only registers afaict | 09:01 |
programmerjake | lkcl & others: any comments? | 09:02 |
programmerjake | the vsx/vmx functions i used are in powerisa v3.1b book 1 sections 6.2.1 and 7.6.2.2 | 09:05 |
programmerjake | actually mostly just 7.6.2.2 | 09:07 |
markos | lkcl, https://bugs.libre-soc.org/show_bug.cgi?id=1030 | 09:31 |
markos | running some final tests, I will commit the code so far in a while, refactored the code a bit better so that the constants are not created on every loop iteration but only at the start, so extra speedup | 09:32 |
markos | lkcl, committed | 09:40 |
markos | the smaller function xchacha_hchacha20_svp64 works if used alone, when I enable (in test.c) the encrypt_bytes_svp64 function as well | 09:41 |
markos | then I'm getting failures, but *only* if I enable the quarterround macro call in src/xchacha_encrypt_bytes_svp64.s L62 | 09:42 |
markos | while at the same time I have to disable the quarterround loop in src/xchacha20.c L231 | 09:43 |
markos | at the end of the wrapper I'm displaying both outputs | 09:43 |
markos | and in that case I'm getting the same outputs | 09:43 |
markos | I thought I was doing some register clobbering but I double checked and I don't do that | 09:43 |
markos | so it's something else that I'm missing | 09:44 |
markos | all in all, assuming it's something easy to fix, total code size 252 SLOC, 179 without comments :) | 09:46 |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.164.248> has joined #libre-soc | 12:08 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@176.59.164.248> has quit IRC | 13:55 | |
*** ghostmansd[m] <ghostmansd[m]!~ghostmans@broadband-109-173-83-100.ip.moscow.rt.ru> has joined #libre-soc | 13:56 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has quit IRC | 14:31 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc | 14:31 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has quit IRC | 14:36 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc | 14:36 | |
*** octavius <octavius!~octavius@2a01:4c8:807:3f72:383:6508:2386:a130> has joined #libre-soc | 16:52 | |
octavius | lkcl, do you remember if there's any documented information on the ls180 test chip? The wiki page is a little sparse. | 16:55 |
octavius | It would be nice to get a sample running if it actually works | 16:55 |
*** octavius <octavius!~octavius@2a01:4c8:807:3f72:383:6508:2386:a130> has quit IRC | 17:11 | |
lkcl | markos, if they do exactly the same job you can always run both (macro, non-macro) then diff -u the logs | 17:16 |
markos | lkcl, that's what I'm doing right now | 17:18 |
markos | trying to figure out where it differs | 17:18 |
*** octavius <octavius!~octavius@2a01:4c8:807:3f72:a99:4e65:493e:98dd> has joined #libre-soc | 17:18 | |
markos | lkcl, is there a way to reset svstate for svindex/svshape? | 17:41 |
markos | no, that's not it | 18:35 |
lkcl | just don't use it. you do need to watch out for the "persistence" bit of svremap. | 18:54 |
lkcl | svstate "establishes" the shapes 0-3 | 18:54 |
lkcl | svremap says *which* shape applies to *which register* [and for how long - one instruction or all future instructions] | 18:55 |
markos | trying to understand where it goes wrong, I've filled the place with prints to see what is different in pretty much every step | 19:20 |
markos | I *did* find one mistake I had made, a very sneaky one | 19:20 |
markos | I called the original C function on the same buffer and checked the result | 19:21 |
markos | however, I did not notice that the XChaCha_ctx struct has a "counter" embedded | 19:21 |
markos | so the actual input data in the context was different | 19:22 |
markos | I only figured it out by comparing the register contents with the data from C | 19:22 |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has quit IRC | 19:41 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc | 19:43 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc | 20:00 | |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 20:03 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has quit IRC | 21:31 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc | 21:47 | |
*** gnucode <gnucode!~gnucode@user/jab> has quit IRC | 22:20 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has quit IRC | 22:24 | |
*** lxo <lxo!~lxo@gateway/tor-sasl/lxo> has joined #libre-soc | 22:31 | |
markos | lkcl, getting there: | 22:39 |
markos | msglen: 34 | 22:39 |
markos | correct: | 22:39 |
markos | ee a7 c2 71 19 10 65 69 92 e1 ce d8 16 e2 0e 62 1b 25 17 82 36 71 6a e4 99 f2 97 37 a7 2a fc f8 6c 72 | 22:39 |
markos | svp64: | 22:39 |
markos | ee a7 c2 71 19 10 65 69 92 e1 ce d8 16 e2 0e 62 1b 25 17 82 36 71 6a e4 99 f2 97 37 a7 2a fc f8 00 28 | 22:39 |
programmerjake | looks like just the last two bytes are busted...progress | 22:41 |
markos | yup and just found the problem | 22:41 |
markos | I store 64-bit values, using sv.std | 22:42 |
markos | so I do len >> 3, and for len=34, len >> 3 = 4 -> 32 bytes | 22:43 |
programmerjake | so you forgot the remainder? | 22:44 |
markos | yes | 22:44 |
markos | I wonder if it's proper to just store the next 64-bit word even if not all elements are in the buffer | 22:45 |
programmerjake | you could store to a temp storage location and copy the bytes you need, or, once elwid works you can setvl remainder-len and store bytes from the reg holding the word | 22:50 |
programmerjake | (swapping needed on BE) | 22:50 |
markos | LE assumed for now for simplicity | 22:50 |
markos | for now I'll just store one more word depending on size % 8 | 22:51 |
programmerjake | well as long as you can guarantee you're not buffer overflowing... | 22:52 |
markos | yeah it's not a perfect solution | 22:53 |
markos | once we get elwidth I can do it properly then, I'll just add a TODO in the code | 22:53 |
markos | lol | 23:05 |
markos | no, it was even dumber than that | 23:05 |
markos | the assembly was correct | 23:05 |
markos | in the wrapper I only copied 32 bytes from the pypowersim memory object to the buffer :D | 23:06 |
markos | Cryptographic tests passed | 23:21 |
markos | fucking finally | 23:21 |
programmerjake | yay! | 23:38 |
*** gnucode <gnucode!~gnucode@user/jab> has joined #libre-soc | 23:57 |
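
The remainder handling discussed between 22:43 and 22:52 (store the full 64-bit words, then go through temporary storage and copy only the bytes that are needed) could be sketched in plain C roughly as below. This is an illustration of programmerjake's suggestion only, not the code that was committed: `store_words_with_tail` is a hypothetical name, the keystream is assumed to be held as an array of 64-bit words, and little-endian byte order is assumed, as markos did in the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the xchacha sources): write exactly `len`
 * bytes taken from an array of 64-bit words into `dst`, little-endian,
 * without writing past dst[len - 1]. `words` must hold at least
 * (len + 7) / 8 entries. */
void store_words_with_tail(uint8_t *dst, const uint64_t *words, size_t len)
{
    assert(dst != NULL && words != NULL);

    size_t full = len >> 3; /* whole words: for len = 34 this is 4 (32 bytes) */
    size_t rem  = len & 7;  /* leftover bytes: for len = 34 this is 2 */

    /* Full words: serialize each one byte by byte, least significant first. */
    for (size_t i = 0; i < full; i++) {
        for (size_t b = 0; b < 8; b++)
            dst[i * 8 + b] = (uint8_t)(words[i] >> (8 * b));
    }

    /* Tail: serialize the next word into temporary storage, then copy only
     * the `rem` bytes that belong to the message. */
    if (rem != 0) {
        uint8_t tmp[8];
        for (size_t b = 0; b < 8; b++)
            tmp[b] = (uint8_t)(words[full] >> (8 * b));
        memcpy(dst + full * 8, tmp, rem);
    }
}
```

With a 34-byte message this writes four full words plus two bytes from the fifth, so the output buffer never needs to be padded to a multiple of 8. The same length applies to any wrapper that copies results back out of simulator memory: it should copy `len` bytes, not `(len >> 3) * 8`.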