Fix how we compute the final non-padding token for ForSequenceClassification models (#35911)
* Fix how we compute the final non-padding token for Gemma (and probably other models)
* .size() -> .shape[]
* Propagating changes to other models
* Propagating changes to other models
* Change it for all ForSequenceClassification models
* Fix batch dim
* More TF fixes
* Copy the TF fix around as well
* Correct layer name for TFCTRL
* Cleaner .to()
* Clean up the nested if-else
* Use argmax() instead of .max().values
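As a rough illustration of the pooling-index change described in the commit list above, the sketch below shows one way to locate the last non-padding token per batch row with `argmax()` over masked position indices rather than `.max().values`. The function name `last_non_pad_token_index` and the toy inputs are placeholders for illustration, not the exact code in the PR.

```python
import torch

def last_non_pad_token_index(input_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    """Return a (batch,) tensor with the index of the last non-padding token per row."""
    # 1 where the token is real, 0 where it is padding
    non_pad_mask = (input_ids != pad_token_id).int()
    # position indices 0..seq_len-1, broadcast across the batch dimension
    token_indices = torch.arange(input_ids.shape[-1], device=input_ids.device)
    # zero out padded positions; argmax then returns the largest surviving index,
    # i.e. the last non-padding position in each sequence
    return (token_indices * non_pad_mask).argmax(dim=-1)

if __name__ == "__main__":
    pad = 0
    ids = torch.tensor([
        [5, 6, 7, pad, pad],   # last real token at index 2
        [5, 6, 7, 8, 9],       # no padding: last real token at index 4
    ])
    print(last_non_pad_token_index(ids, pad))  # tensor([2, 4])
```

One property of the argmax-over-indices formulation is that it gives a valid per-row index whether or not a row contains any padding, which is why a single expression can replace the nested if-else mentioned in the commit list.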