Commit a73b1d59 authored by Sourab Mangrulkar, committed by GitHub

accelerate deepspeed and gradient accumulation integrate (#23236)

* mixed precision support via accelerate

* fix issues

* fix for the sharded ddp case

* fix flax and tf failing tests

* refactor the place to create `Accelerator` object

* move ddp prep to accelerate

* fix :sweat_smile:

* resolving comments

* move fsdp handling to accelerate

* fixes

* fix saving

* shift torch dynamo handling to accelerate

* shift deepspeed integration and save & load utils to accelerate

* fix accelerate launcher support

* oops

* fix :bug:

* save ckpt fix

* Trigger CI

* nasty :bug: :sweat_smile:

* as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate

* make tests happy

* quality :sparkles:

* loss tracked needs to account for grad_acc

* fixing the deepspeed tests

* quality :sparkles:

* :sweat_smile::sweat_smile::sweat_smile:

* tests :rage:

* quality :sparkles:



* Trigger CI

* resolve comments and fix the issue with the previous merge from branch

* Trigger CI

* accelerate took over deepspeed integration

---------

Co-authored-by: Stas Bekman <stas@stason.org>
parent 88f50a1e
Showing with 167 additions and 164 deletions