Simon Layton
899883644f
Fix test failures and warnings
Attention output was in bnij ordering instead of the ijbn ordering that
everything else expects. This was an oversight on my part; the fix keeps
the attention inputs/outputs identical to the original code.
Also moved back from tensor slicing to index_select in rel_shift_bnij to
make the tracer happy.
2019-10-03 12:05:15 -04:00
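
The two changes described in the commit message can be illustrated with a minimal PyTorch sketch. This is not the code from the commit. The helper names `bnij_to_ijbn` and `take_first_klen` are hypothetical, and the reading of the axis letters (b = batch, n = heads, i = query position, j = key position) is an assumption, since the message does not define them.

```python
import torch


def bnij_to_ijbn(x: torch.Tensor) -> torch.Tensor:
    """Reorder a 4-D tensor from (b, n, i, j) to (i, j, b, n).

    Hypothetical helper: the axis meanings are assumed, not taken from the commit.
    """
    return x.permute(2, 3, 0, 1).contiguous()


def take_first_klen(x: torch.Tensor, klen: int) -> torch.Tensor:
    """Keep the first `klen` entries of the last axis using index_select.

    Produces the same values as the slice x[:, :, :, :klen], but expresses
    the selection through an explicit index tensor instead of a slice.
    """
    idx = torch.arange(klen, device=x.device, dtype=torch.long)
    return torch.index_select(x, 3, idx)


if __name__ == "__main__":
    scores = torch.arange(2 * 3 * 4 * 5, dtype=torch.float32).reshape(2, 3, 4, 5)
    print(bnij_to_ijbn(scores).shape)        # shape is (i, j, b, n)
    print(take_first_klen(scores, 3).shape)  # last axis cut down to klen
```

Both helpers leave the values untouched: the first only changes the axis order, and the second matches the plain slice element for element, so either form can be swapped in without changing results.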