MoEVilBERT


*Figure: Architecture of MoEVilBERT*

MoEVilBERT extends the single-task ViLBERT model with a mixture-of-experts design: a set of expert models forms its base, and their outputs are fused and passed to task-specific networks. The critical component is the gating network, which dynamically generates input-dependent weights for combining the embeddings from each expert; the final embedding is the weighted sum of the expert outputs, as sketched below.
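A minimal PyTorch sketch of this gated fusion, assuming each expert produces a fixed-size embedding. The class and argument names (`ExpertGate`, `pooled_input`, `expert_outputs`) are illustrative and not taken from the repository:

```python
import torch
import torch.nn as nn

class ExpertGate(nn.Module):
    """Input-conditioned gating over expert embeddings (mixture-of-experts fusion).

    Illustrative sketch of the fusion described above, not the repository's code.
    """

    def __init__(self, embed_dim: int, num_experts: int):
        super().__init__()
        # Maps a pooled input representation to one scalar weight per expert.
        self.gate = nn.Linear(embed_dim, num_experts)

    def forward(self, pooled_input: torch.Tensor,
                expert_outputs: torch.Tensor) -> torch.Tensor:
        # pooled_input:   (batch, embed_dim)
        # expert_outputs: (batch, num_experts, embed_dim)
        weights = torch.softmax(self.gate(pooled_input), dim=-1)  # (batch, num_experts)
        # Final embedding is the weighted sum of the expert outputs.
        return torch.einsum("be,bed->bd", weights, expert_outputs)
```

The fused `(batch, embed_dim)` embedding would then feed the task-specific networks; the softmax keeps the per-expert weights non-negative and summing to one.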

The model has 860,386,003 parameters in total, of which only 10,723,795 (about 1.2%) are trainable. This low ratio is achieved by freezing a significant portion of the parameters, including those of the experts.
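A sketch of how such freezing and counting could be done in PyTorch, assuming the model exposes its experts as an `experts` submodule (a hypothetical name):

```python
# Freeze the expert parameters so only the gate and task heads train.
for param in model.experts.parameters():
    param.requires_grad = False

# Count total vs. trainable parameters.
total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"total={total:,} trainable={trainable:,} ({trainable / total:.1%})")
```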
