AudioGenX: Text-to-Audio Generation eXplainable AI

Requirements

python==3.9
spacy==3.5.2
torch>=2.1.0
torchaudio>=2.1.0
transformers>=4.31.0

Refer to requirements.txt for more details.

Usage

  pip install -r requirements.txt
  
  # To use the model in your own project, add the project path to sys.path at the top of your script
  import sys
  project_path = ''  # absolute path of your project
  sys.path.append(project_path)
  
  # To generate audio from scratch, run the following script
  python gen_audio.py  # generate the explanation mask and the factual & counterfactual audio

  # To explain generated audio, run the following script
  python explain.py  # explain the generated audio and evaluate the factual and counterfactual audio
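
The scripts above refer to an explanation mask and to factual and counterfactual audio. As a rough illustration of that terminology only, the sketch below assumes a soft mask with one value in [0, 1] per prompt token: the factual input keeps the tokens the mask marks as important, and the counterfactual input keeps the rest. The function name `split_by_mask` and the toy embeddings are hypothetical, and this is not a description of what `explainer.py` or `mask.py` actually do.

```python
from typing import List, Tuple


def split_by_mask(
    embeddings: List[List[float]], mask: List[float]
) -> Tuple[List[List[float]], List[List[float]]]:
    """Scale each token embedding by its mask value (factual) and by
    one minus its mask value (counterfactual). Hypothetical helper."""
    if len(embeddings) != len(mask):
        raise ValueError("one mask value per token is required")
    factual = [[m * x for x in row] for row, m in zip(embeddings, mask)]
    counterfactual = [[(1.0 - m) * x for x in row] for row, m in zip(embeddings, mask)]
    return factual, counterfactual


# Toy example: three tokens with two-dimensional embeddings (made-up values).
embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mask = [1.0, 0.0, 0.5]
factual, counterfactual = split_by_mask(embeddings, mask)
```

In this reading, a good mask is one where the factual input still produces audio close to the original while the counterfactual input does not.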

AudioGen Model Setting

For this project, we initially used AudioGen. Explanation methods have not yet been implemented on MusicGen.

AudioGen
    Default model   : facebook/audiogen-medium
    Duration        : 5 sec
    Top_k sampling  : 250
    sample_rate     : 16000
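
The settings above can be collected in one place, as in the minimal sketch below. The `AudioGenSettings` class and the `load_model` helper are hypothetical names introduced here for illustration. `load_model` assumes the upstream audiocraft API (`AudioGen.get_pretrained` and `set_generation_params`); the copy of audiocraft bundled with this repository may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioGenSettings:
    """Generation settings listed in this README (hypothetical container)."""
    model_name: str = "facebook/audiogen-medium"
    duration_sec: float = 5.0
    top_k: int = 250
    sample_rate: int = 16000

    @property
    def num_samples(self) -> int:
        # Number of waveform samples in one generated clip.
        return int(self.duration_sec * self.sample_rate)


def load_model(settings: AudioGenSettings):
    """Load AudioGen and apply the settings.

    Assumes the upstream audiocraft package is importable; the import is
    done lazily so the settings can be inspected without it.
    """
    from audiocraft.models import AudioGen

    model = AudioGen.get_pretrained(settings.model_name)
    model.set_generation_params(duration=settings.duration_sec, top_k=settings.top_k)
    return model


settings = AudioGenSettings()
```

With the defaults, a 5-second clip at 16000 Hz corresponds to 80000 samples per channel.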

More details about the audio generation model:

AudioGen: A state-of-the-art text-to-sound model.

Metrics

Model performance measures: we used the following objective measure to evaluate the model on a standard audio benchmark:
- Kullback-Leibler Divergence on label distributions extracted from a pre-trained audio classifier (PaSST)
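
For reference, the KL divergence between two label distributions can be computed as in the sketch below. It assumes the classifier outputs raw logits per label and that the first argument is the reference distribution; the logit values are made up, the helper names are hypothetical, and the classifier (PaSST) itself is not invoked here.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert logits to a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: List[float], q: List[float], eps: float = 1e-8) -> float:
    """KL(p || q) = sum_i p_i * log(p_i / q_i), with eps to avoid log(0)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of labels")
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


# Made-up classifier logits for a reference clip and a generated clip.
reference_probs = softmax([2.0, 0.5, -1.0])
generated_probs = softmax([1.5, 1.0, -0.5])
kl = kl_divergence(reference_probs, generated_probs)
```

A value near zero means the two clips receive nearly the same label distribution; larger values mean the classifier labels them more differently.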

File Structure

AudioGenX/
├── audiocraft/
│   └── models/
│       ├── audiogen.py         # AudioGen model
│       ├── lm.py               # predicts the sequence of audio tokens
│       ├── mask.py             # explanation mask
│       └── explainer.py        # generates the explanation mask for generated audio
├── data/                       # textual prompts for evaluation
├── AudioGenX_demo.ipynb        # demo notebook
├── gen_audio.py                # generates audio
├── explain.py                  # trains the explanation mask and evaluates factual and counterfactual audio
├── example/                    # generated factual and counterfactual audio
├── readme.md
└── requirements.txt

License

The AudioGen model and code are from audiocraft by Facebook Research.
See the model card for license information.
