MoEVilbert

Architecture of MoEVilbert

MoEVilbert extends the single-task ViLBERT model into a mixture-of-experts (MoE) model that uses several expert models as its base. The outputs of these experts are fused and passed to task-specific networks. A critical component is the gating network, which generates a weight for each expert dynamically from the input. The final embedding is the weighted sum of the expert embeddings under those weights.
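The fusion step described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the code in this repository: the function names (`gate_scores`, `softmax`, `fuse_expert_embeddings`), the single linear layer used as the gate, and the softmax normalization of the gate weights are all assumptions made for the example.

```python
import math
from typing import List, Sequence


def gate_scores(features: Sequence[float],
                gate_weights: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical gate: one linear score per expert, computed from the input."""
    return [sum(w * x for w, x in zip(row, features)) for row in gate_weights]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into positive weights that sum to 1 (assumed normalization)."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_expert_embeddings(expert_embeddings: Sequence[Sequence[float]],
                           weights: Sequence[float]) -> List[float]:
    """Weighted sum of the expert embeddings, one weight per expert."""
    dim = len(expert_embeddings[0])
    return [
        sum(w * emb[d] for w, emb in zip(weights, expert_embeddings))
        for d in range(dim)
    ]


# Toy input: two experts, two-dimensional embeddings.
features = [1.0, 0.0]
gate_weights = [[0.0, 0.0], [0.0, 0.0]]      # an untrained gate scores every expert equally
expert_embeddings = [[1.0, 2.0], [3.0, 4.0]]  # stand-ins for the frozen experts' outputs

weights = softmax(gate_scores(features, gate_weights))
fused = fuse_expert_embeddings(expert_embeddings, weights)
print(weights, fused)
```

Because the gate weights depend on the input, different inputs can lean on different experts while the experts themselves stay unchanged.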

The model has 860,386,003 parameters in total, of which only 10,723,795 (about 1.25%) are trainable. This low ratio comes from freezing most of the parameters, including those of the experts.
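The ratio follows directly from the two counts quoted above. The short check below only reproduces that arithmetic; the variable names are illustrative and do not come from the repository.

```python
# Parameter counts as reported in this README.
TOTAL_PARAMS = 860_386_003
TRAINABLE_PARAMS = 10_723_795

# Everything that is not trainable is frozen.
frozen_params = TOTAL_PARAMS - TRAINABLE_PARAMS
trainable_fraction = TRAINABLE_PARAMS / TOTAL_PARAMS

print(f"frozen parameters: {frozen_params:,}")        # frozen parameters: 849,662,208
print(f"trainable share:   {trainable_fraction:.2%}")  # trainable share:   1.25%
```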


NasimPalvayeh/mmf
