    accelerate deepspeed and gradient accumulation integrate (#23236) · a73b1d59
    Sourab Mangrulkar authored
    * mixed precision support via accelerate
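
      Roughly what the mixed-precision path looks like once it goes through accelerate
      instead of hand-rolled AMP code. This is an illustrative sketch, not the Trainer's
      actual implementation; the toy model, data and the "bf16" choice are placeholders:

      ```python
      import torch
      from torch.utils.data import DataLoader, TensorDataset
      from accelerate import Accelerator

      # Run forward/backward under autocast; use "fp16" or "no" depending on hardware.
      accelerator = Accelerator(mixed_precision="bf16")

      model = torch.nn.Linear(8, 2)
      optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
      dataloader = DataLoader(TensorDataset(torch.randn(32, 8), torch.randn(32, 2)), batch_size=4)

      # prepare() moves everything to the right device and wires in autocast / grad scaling.
      model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

      for inputs, targets in dataloader:
          loss = torch.nn.functional.mse_loss(model(inputs), targets)
          accelerator.backward(loss)  # handles loss scaling itself when fp16 is active
          optimizer.step()
          optimizer.zero_grad()
      ```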
    
    * fix issues
    
    * fix for the sharded ddp case
    
    * fix flax and tf failing tests
    
    * refactor the place to create `Accelerator` object
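
      The idea, sketched with a toy trainer and a hypothetical `_create_accelerator`
      helper (the real method name in `Trainer` may differ): build the `Accelerator`
      in exactly one place during `__init__`, so the DDP / FSDP / DeepSpeed handling
      added below all reads from the same object.

      ```python
      from accelerate import Accelerator

      class MiniTrainer:
          """Toy stand-in for transformers.Trainer, only to show the single creation point."""

          def __init__(self, args):
              self.args = args
              # One central place where the Accelerator is constructed; every
              # distributed / precision feature below configures itself from it.
              self.accelerator = self._create_accelerator()

          def _create_accelerator(self):  # hypothetical helper name
              return Accelerator(
                  mixed_precision=getattr(self.args, "mixed_precision", "no"),
              )
      ```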
    
    * move ddp prep to accelerate
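
      Under a multi-process `accelerate launch`, the DDP wrapping now happens inside
      `accelerator.prepare` rather than in Trainer code. A minimal sketch with a toy
      model; single-process runs simply skip the DDP wrapper:

      ```python
      import torch
      from accelerate import Accelerator

      accelerator = Accelerator()
      model = torch.nn.Linear(8, 2)

      # When started with `accelerate launch --num_processes 2 ...` this returns the
      # model wrapped in torch.nn.parallel.DistributedDataParallel on the local device;
      # in a plain single-process run it only moves the model to accelerator.device.
      model = accelerator.prepare(model)

      print(type(model), accelerator.device)
      ```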
    
    * fix 😅
    
    * resolving comments
    
    * move fsdp handling to accelerate
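
      FSDP is now expressed through accelerate's plugin instead of Trainer-local
      wrapping code. A sketch under the assumption that the job is started with
      `accelerate launch` on multiple GPUs; the plugin is left at its defaults
      (they can also come from `accelerate config` / environment variables):

      ```python
      import torch
      from accelerate import Accelerator, FullyShardedDataParallelPlugin

      # Defaults (sharding strategy, auto-wrap policy, offload, ...) are read from the
      # environment set up by `accelerate config`; pass them explicitly here to override.
      fsdp_plugin = FullyShardedDataParallelPlugin()
      accelerator = Accelerator(fsdp_plugin=fsdp_plugin)

      model = torch.nn.Linear(8, 2)
      # Under a distributed launch this returns an FSDP-wrapped module.
      model = accelerator.prepare(model)
      ```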
    
    * fixes
    
    * fix saving
    
    * shift torch dynamo handling to accelerate
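
      TorchDynamo / torch.compile selection likewise moves behind the `Accelerator`
      constructor. A minimal sketch, assuming PyTorch 2.x so the `inductor` backend
      is available:

      ```python
      import torch
      from accelerate import Accelerator

      # dynamo_backend tells accelerate to compile the model inside prepare();
      # "no" disables it, and other backends such as "eager" or "aot_eager" work too.
      accelerator = Accelerator(dynamo_backend="inductor")

      model = torch.nn.Linear(8, 2)
      model = accelerator.prepare(model)  # returns the TorchDynamo-optimized module
      ```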
    
    * shift deepspeed integration and save & load utils to accelerate
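
      With this move, the DeepSpeed configuration is handed to accelerate as a
      `DeepSpeedPlugin` and checkpoint save/load is routed through
      `accelerator.save_state` / `load_state`. A sketch assuming `deepspeed` is
      installed and the script is started with `accelerate launch`; the ZeRO stage,
      directory name and toy model are placeholders:

      ```python
      import torch
      from torch.utils.data import DataLoader, TensorDataset
      from accelerate import Accelerator, DeepSpeedPlugin

      # Roughly the equivalent of a small ds_config: ZeRO stage 2, no offload.
      # A full DeepSpeed JSON config can be supplied instead via hf_ds_config=...
      ds_plugin = DeepSpeedPlugin(zero_stage=2, gradient_accumulation_steps=1)
      accelerator = Accelerator(deepspeed_plugin=ds_plugin)

      model = torch.nn.Linear(8, 2)
      optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
      dataloader = DataLoader(TensorDataset(torch.randn(32, 8), torch.randn(32, 2)), batch_size=4)

      # prepare() calls deepspeed.initialize under the hood and returns the engine-wrapped model.
      model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

      # Saving/loading of model, optimizer and RNG state is delegated to accelerate,
      # which defers to DeepSpeed's own checkpoint engine when it is active.
      accelerator.save_state("ckpt_dir")
      accelerator.load_state("ckpt_dir")
      ```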
    
    * fix accelerate launcher support
    
    * oops
    
    * fix 🐛
    
    * save ckpt fix
    
    * Trigger CI
    
    * nasty 🐛 😅
    
    * as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate
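
      Gradient accumulation follows the same pattern: the step count goes to the
      `Accelerator`, and the `accumulate` context manager decides when gradients are
      synced and when the optimizer really steps (the prepared optimizer turns
      `step()` into a no-op on the in-between micro-batches). Illustrative sketch
      with arbitrary sizes, not the Trainer's loop:

      ```python
      import torch
      from torch.utils.data import DataLoader, TensorDataset
      from accelerate import Accelerator

      accelerator = Accelerator(gradient_accumulation_steps=4)

      model = torch.nn.Linear(8, 2)
      optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
      dataloader = DataLoader(TensorDataset(torch.randn(64, 8), torch.randn(64, 2)), batch_size=4)
      model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

      for inputs, targets in dataloader:
          # Gradients are synced and the optimizer actually steps only every 4th micro-batch.
          with accelerator.accumulate(model):
              loss = torch.nn.functional.mse_loss(model(inputs), targets)
              accelerator.backward(loss)
              optimizer.step()
              optimizer.zero_grad()
          if accelerator.sync_gradients:
              pass  # a "real" optimizer step just happened: step the LR scheduler, log, ...
      ```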
    
    * make tests happy
    
    * quality 
    
    * loss tracked needs to account for grad_acc
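
      The bookkeeping point, in plain arithmetic (illustrative numbers, not the
      Trainer's actual variables): the logged loss must be divided by the
      accumulation step count exactly once, either per micro-batch before summing
      or once at logging time, but not both and not neither.

      ```python
      grad_acc_steps = 4
      micro_losses = [2.0, 1.0, 3.0, 2.0]  # raw losses of the micro-batches in one optimizer step

      # Option A: each micro-batch loss was already scaled by 1/grad_acc_steps before
      # backward, so summing the scaled values gives the mean loss of the optimizer step.
      tracked_a = sum(l / grad_acc_steps for l in micro_losses)   # 2.0

      # Option B: raw losses were accumulated, so divide once when logging.
      tracked_b = sum(micro_losses) / grad_acc_steps              # 2.0

      assert tracked_a == tracked_b == 2.0  # dividing twice (or never) is the bug fixed here
      ```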
    
    * fixing the deepspeed tests
    
    * quality 
    
    * 😅😅😅
    
    * tests 😡
    
    * quality 
    
    * Trigger CI
    
    * resolve comments and fix the issue with the previous merge from the branch
    
    * Trigger CI
    
    * accelerate took over deepspeed integration
    
    ---------
    
    Co-authored-by: Stas Bekman <stas@stason.org>