    Add UnivNet Vocoder Model for Tortoise TTS Diffusers Integration (#24799) · 7f6a804d
    dg845 authored
    * initial commit
    
    * Add initial testing files and modify __init__ files to add UnivNet imports.
    
    * Fix some bugs
    
    * Add checkpoint conversion script and add references to transformers pre-trained model.
    
    * Add UnivNet entries for auto.
    
    * Add initial docs for UnivNet.
    
    * Handle input and output shapes in UnivNetGan.forward and add initial docstrings.
    
    * Write tests and make them pass.
    
    * Write docs.
    
    * Add UnivNet doc to _toctree.yml and improve docs.
    
    * fix typo
    
    * make fixup
    
    * make fix-copies
    
    * Add upsample_rates parameter to config and improve config documentation.
    
    * make fixup
    
    * make fix-copies
    
    * Remove unused upsample_rates config parameter.
    
    * apply suggestions from review
    
    * make style
    
    * Verify and add reason for skipped tests inherited from ModelTesterMixin.
    
    * Add initial UnivNetGan integration tests
    
    * make style
    
    * Remove noise_length input to UnivNetGan and improve integration tests.
    
    * Fix bug and make style
    
    * Make UnivNet integration tests pass
    
    * Add initial code for UnivNetFeatureExtractor.
    
    * make style
    
    * Add initial tests for UnivNetFeatureExtractor.
    
    * make style
    
    * Properly initialize weights for UnivNetGan
    
    * Get feature extractor fast tests passing
    
    * make style
    
    * Get feature extractor integration tests passing
    
    * Get UnivNet integration tests passing
    
    * make style
    
    * Add UnivNetGan usage example
    
    * make style and use feature extractor from hub in integration tests
    
    * Update tips in docs
    
    * apply suggestions from review
    
    * make style
    
    * Calculate padding directly instead of using get_padding methods.
    
    * Update UnivNetFeatureExtractor.to_dict to be UnivNet-specific.
    
    * Update feature extractor to support using model(**inputs) and add the ability to generate noise and pad the end of the spectrogram in __call__.
    
    * Perform padding before generating noise to ensure the shapes are correct.
    
    * Rename UnivNetGan.forward's noise_waveform argument to noise_sequence.
    
    * make style
    
    * Add tests to test generating noise and padding the end for UnivNetFeatureExtractor.__call__.
    
    * Add tests for checking batched vs unbatched inputs for UnivNet feature extractor and model.
    
    * Add expected mean and stddev checks to the integration tests and make them pass.
    
    * make style
    
    * Make it possible to use model(**inputs), where inputs is the output of the feature extractor.
    
    * fix typo in UnivNetGanConfig example
    
    * Calculate spectrogram_zero from other config values.
    
    * apply suggestions from review
    
    * make style
    
    * Refactor UnivNet conversion script to use load_state_dict (following persimmon).
    
    * Rename UnivNetFeatureExtractor to UnivNetGanFeatureExtractor.
    
    * make style
    
    * Switch to using torch.tensor and torch.testing.assert_close for testing expected values/slices.
    
    * make style
    
    * Use config in UnivNetGan modeling blocks.
    
    * make style
    
    * Rename the spectrogram argument of UnivNetGan.forward to input_features, following Whisper.
    
    * make style
    
    * Improve padding documentation.
    
    * Add UnivNet usage example to the docs.
    
    * apply suggestions from review
    
    * Move dynamic_range_compression computation into the mel_spectrogram method of the feature extractor.
    
    * Improve UnivNetGan.forward return docstring.
    
    * Update table in docs/source/en/index.md.
    
    * make fix-copies
    
    * Rename UnivNet components to have pattern UnivNet*.
    
    * make style
    
    * make fix-copies
    
    * Update docs
    
    * make style
    
    * Increase tolerance on flaky unbatched integration test.
    
    * Remove torch.no_grad decorators from UnivNet integration tests to try to avoid Flax/TensorFlow test errors.
    
    * Add padding_mask argument to UnivNetModel.forward and add batch_decode feature extractor method to remove padding.
    
    * Update documentation and clean up padding code.
    
    * make style
    
    * make style
    
    * Remove torch dependency from UnivNetFeatureExtractor.
    
    * make style
    
    * Fix UnivNetModel usage example
    
    * Clean up feature extractor code/docstrings.
    
    * apply suggestions from review
    
    * make style
    
    * Add comments for tests skipped via ModelTesterMixin flags.
    
    * Add comment for model parallel tests skipped via the test_model_parallel ModelTesterMixin flag.
    
    * Add # Copied from statements to copied UnivNetFeatureExtractionTest tests.
    
    * Simplify UnivNetFeatureExtractorTest.test_batch_decode.
    
    * Add support for unbatched padding_masks in UnivNetModel.forward.
    
    * Refactor unbatched padding_mask support.
    
    * make style