* add first generation tutorial
* VisionEncoderDecoder gradient checkpointing
* remove generation
* add tests