This repository was archived by the owner on Oct 3, 2021. It is now read-only.
It's erroneous to say that you can train on your own text with the 774M and 1558M models, because the system won't allow it; only the small and medium models support that. Just FYI. Good work nonetheless. It might work on a paid Google Cloud notebook, but I haven't checked yet. It won't work on Colab.
Hello there,
Currently, this library is based on the original OpenAI gpt-2 repo, so the larger models can only be used for inference, not fine-tuning. With time, I'll branch off from the original and build the entire model from scratch, which will allow fine-tuning of the larger models as well.
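As a back-of-envelope illustration of why fine-tuning the largest checkpoint is hard on a single consumer GPU (assuming fp32 weights and a stock Adam optimizer; the exact footprint depends on the implementation and ignores activations entirely):

```python
# Rough memory estimate for fine-tuning GPT-2 1558M (assumption: fp32, Adam).
params = 1_558_000_000
bytes_fp32 = 4

weights_gb = params * bytes_fp32 / 1e9   # ~6.2 GB just to hold the weights
# Training also needs gradients plus Adam's two moment tensors,
# roughly 4x the weight memory before any activations:
train_gb = weights_gb * 4                # ~24.9 GB

print(round(weights_gb, 1), round(train_gb, 1))
```

That is already past the 12–16 GB of the GPUs Colab hands out, which matches the experience that only the smaller checkpoints fine-tune there.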
Is the problem with fine-tuning the 1558M a GPU processing limitation, or something else? Do you think running your client on an AWS instance with multiple GPUs would be worth the attempt?
The 1558M doesn't make much of a difference from what I experienced. Fine-tuning on the 355M is good enough; so much fanfare for nothing, total publicity stunt. GPT-2 seems pretty much stuck and can't get any better, remarkable enough though. It really depends on the training text you upload, and I don't mean quality. Sometimes it comes out with creative/funky text after as few as 40 training steps, but you sometimes have to jack the temperature up over 1.5. GPT-2 can give you good snippets, but it goes all over the place and can't finish a story.
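For anyone wondering what "jacking up the temperature" actually does: it divides the next-token logits before the softmax, so temperatures above 1.0 flatten the distribution and let unlikely tokens through. A minimal sketch in plain NumPy (not this repo's code, just the standard formula on toy scores):

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature before softmax. T > 1 flattens the
    distribution (more 'creative/funky' samples); T < 1 sharpens it."""
    z = np.asarray(logits, dtype=np.float64) / temperature
    z -= z.max()                      # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = [4.0, 2.0, 0.5]              # toy next-token scores
cool = softmax_with_temperature(logits, 0.5)
hot = softmax_with_temperature(logits, 1.5)
print(cool.max() > hot.max())         # low T concentrates mass on the top token
```

At T=1.5 the runner-up tokens get real probability mass, which is exactly why the output turns funky but also wanders off and can't hold a story together.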
I tried to fine-tune with the Google Cloud console but it gives me an error. I'm in Southeast Asia at the moment, and I don't know what the problem is. I don't use AWS, but you should try to clone it and see what happens. It needs tons of RAM: when you upload previous checkpoints over 1 gig to re-fine-tune, it crashes.