9 Commits

Author SHA1 Message Date
thomwolf
8b388827b5 fix #1920 2019-12-05 11:18:43 +01:00
Julien Chaumond
ef1b8b2ae5 [CTRL] warn if generation prompt does not start with a control code
see also https://github.com/salesforce/ctrl/pull/50
2019-10-22 21:30:32 +00:00
Lysandre
777faa8ae7 Fix #1597 2019-10-22 11:26:42 -04:00
thomwolf
177a721205 move back to simple space splitting 2019-10-10 11:45:47 +02:00
thomwolf
43a237f15e switching to moses tokenizer 2019-10-10 10:11:16 +02:00
LysandreJik
036483fae5 Temporary CTRL tokenizer fix 2019-10-09 16:33:15 -04:00
thomwolf
248314772f fix tokenization 2019-10-08 17:19:28 +02:00
thomwolf
03c2c762a6 update tokenizer 2019-10-08 17:12:03 +02:00
keskarnitish
dbed1c5d94 Adding CTRL (squashed commit)
adding conversion script

adding first draft of modeling & tokenization

adding placeholder for test files

bunch of changes

registering the tokenizer/model/etc

tests

change link; something is very VERY wrong here

weird end-of-word thingy going on

i think the tokenization works now ; wrote the unit tests

overall structure works;load w next

the monster is alive!

works after some cleanup as well

adding emacs autosave to gitignore

currently only supporting the 48 layer one; seems to infer fine on my macbook

cleanup

fixing some documentation

fixing some documentation

tests passing?

now works on CUDA also

adding greedy?

adding greedy sampling

works well
2019-10-03 22:29:03 -07:00