hjkng/audiogenX

AudioGenX: Text-to-Audio Generation eXplainable AI

Requirements

  • python==3.9
  • spacy==3.5.2
  • torch>=2.1.0
  • torchaudio>=2.1.0
  • transformers>=4.31.0

Refer to requirements.txt for more details.

Usage

  pip install -r requirements.txt
  
  # To use the model in your own project, add the project path to sys.path at the top of your code
  import sys
  project_path = '' # absolute path of your project
  sys.path.append(project_path)
  
  # To generate audio from scratch, run the following script
  python gen_audio.py # generate explanation mask and factual & counterfactual audio

  # To explain generated audio
  python explain.py # explain generated audio & evaluate factual and counterfactual audio
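
The comments above mention an explanation mask and factual and counterfactual audio. As a rough illustration of those terms only, the sketch below assumes the mask holds one importance weight in [0, 1] per text token, that the factual input keeps tokens in proportion to the mask, and that the counterfactual input keeps them in proportion to its complement. The names `apply_mask` and `factual_and_counterfactual` are hypothetical and this is not the code in `mask.py` or `explainer.py`.

```python
from typing import List, Tuple

Vector = List[float]


def apply_mask(embeddings: List[Vector], mask: List[float]) -> List[Vector]:
    """Scale each token embedding by its mask weight."""
    return [[m * x for x in vec] for vec, m in zip(embeddings, mask)]


def factual_and_counterfactual(
    embeddings: List[Vector], mask: List[float]
) -> Tuple[List[Vector], List[Vector]]:
    """Return (factual, counterfactual) inputs for a per-token mask.

    Assumption: factual keeps the tokens the mask marks as important,
    counterfactual keeps the complement (1 - mask).
    """
    if len(embeddings) != len(mask):
        raise ValueError("need exactly one mask weight per token")
    factual = apply_mask(embeddings, mask)
    counterfactual = apply_mask(embeddings, [1.0 - m for m in mask])
    return factual, counterfactual


# Toy example: three tokens with 2-dimensional embeddings.
token_embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
importance_mask = [1.0, 0.0, 0.5]
factual, counterfactual = factual_and_counterfactual(token_embeddings, importance_mask)
```

Under this reading, audio generated from the factual input should stay close to the original audio if the mask is good, while audio from the counterfactual input should drift away from it.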

AudioGen Model Setting

For this project, we initially used AudioGen. Explanation methods have not yet been implemented on MusicGen.

AudioGen
    Default model   : facebook/audiogen-medium
    Duration        : 5 sec
    Top_k sampling  : 250
    sample_rate     : 16000
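
The settings above could be applied with the upstream audiocraft API roughly as follows. This is a minimal sketch, assuming the `audiocraft` package from Facebook Research is installed. The `AUDIOGEN_SETTINGS` dictionary and the `generate_audio` helper are hypothetical names, and the modified `audiocraft/` copy in this repository may expose a different interface.

```python
# Settings copied from the list above.
AUDIOGEN_SETTINGS = {
    "model_name": "facebook/audiogen-medium",
    "duration": 5,          # seconds
    "top_k": 250,           # top-k sampling
    "sample_rate": 16000,   # Hz
}


def generate_audio(descriptions, settings=AUDIOGEN_SETTINGS):
    """Generate one waveform per text description (hypothetical helper)."""
    # Imported lazily so the settings can be inspected without audiocraft installed.
    from audiocraft.models import AudioGen

    model = AudioGen.get_pretrained(settings["model_name"])
    model.set_generation_params(
        use_sampling=True,
        top_k=settings["top_k"],
        duration=settings["duration"],
    )
    # Returns the generated waveforms and the model's sample rate.
    return model.generate(descriptions), model.sample_rate
```

A call such as `generate_audio(["dog barking"])` downloads the pretrained weights on first use, so it needs network access and enough memory for the medium model.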

More details about the audio generation model:

  • AudioGen: A state-of-the-art text-to-sound model.

Metrics

Model performance measure: we used the following objective measure to evaluate the model on a standard audio benchmark:
- Kullback-Leibler divergence between label distributions extracted by a pre-trained audio classifier (PaSST)
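
As an illustration of the measure, the sketch below computes the KL divergence between two label distributions in plain Python. It assumes the classifier outputs have already been turned into probability vectors over the same label set. Which audio plays the role of `p` and which of `q` is an assumption here, and the helper names are hypothetical rather than taken from `explain.py`.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw classifier logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: List[float], q: List[float], eps: float = 1e-12) -> float:
    """KL(p || q) = sum_i p_i * log(p_i / q_i), in nats.

    Terms with p_i == 0 contribute nothing; q_i is clamped to eps
    to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same labels")
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


# Toy label distributions over two classes.
reference = [0.5, 0.5]
generated = [0.9, 0.1]
score = kl_divergence(reference, generated)
```

A lower value means the two label distributions are closer. The divergence is zero when they are identical, and it is not symmetric in `p` and `q`.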

File Structure

AudioGenX/

├── audiocraft/                     
        │
        └── models/                 
               ├── audiogen.py      # AudioGen model
               ├── lm.py            # predict sequence of audio token
               ├── mask.py          # explanation mask 
               └── explainer.py     # generate explanation mask for generated audio                     
├── data/                           # textual prompts for evaluation
├── AudioGenX_demo                  # demo 
├── gen_audio.py                    # generate audios
├── explain.py                      # train explanation mask & evaluate factual and counterfactual audios
├── example/                        # generated factual and counterfactual audios
├── readme.md                       
└── requirements.txt                

License

The AudioGen model and code are from audiocraft by Facebook Research.
See license information in the model card.
