Resize T5 Vocab by dirkgr · Pull Request #5497 · allenai/allennlp · GitHub
This repository was archived by the owner on Dec 16, 2022. It is now read-only.

Resize T5 Vocab #5497

Merged
merged 7 commits into main from ResizeT5 on Dec 8, 2021

Conversation

dirkgr
Member
@dirkgr dirkgr commented Dec 7, 2021

@dirkgr dirkgr requested a review from AkshitaB December 7, 2021 00:53
@dirkgr dirkgr self-assigned this Dec 7, 2021
@HarshTrivedi
Contributor

Thank you! @dirkgr

Contributor
@AkshitaB AkshitaB left a comment


Left a couple of comments; looks fine to me overall. Will resize_token_embeddings be called internally somewhere, or does the user call it? Should there be a docstring for what init_fn is supposed to be? (It wasn't clear to me without reading the code.)


# resize lm head
old_size = self.lm_head.out_features
if old_size == new_size:
Contributor


Minor thing: maybe we can do this check first? It would avoid the two empty calls.

Contributor


Or perhaps not? We want to resize the embedding dim and the output dim to the same new_size. Will the old embedding_dim and the old lm_head.out_features be different?

Member Author


It's a bit of defensive programming. It should never happen, but if the three places where the size of the token embedding matters get out of sync, this call will sync them all back up.
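The pattern under discussion can be sketched roughly as follows. This is a minimal illustration, not the actual allennlp implementation; the free function, its signature, and the default init_fn are all assumptions for the sketch:

```python
import torch
import torch.nn as nn


def resize_lm_head(
    lm_head: nn.Linear,
    new_size: int,
    init_fn=torch.nn.init.normal_,
) -> nn.Linear:
    """Return an ``nn.Linear`` whose output dimension is ``new_size``.

    Rows present in the old head are copied over; any newly added rows are
    filled by ``init_fn`` (an in-place initializer applied to the new weight).
    If the size is unchanged, the old head is returned untouched -- the
    defensive early-return check discussed above.
    """
    old_size = lm_head.out_features
    if old_size == new_size:
        return lm_head  # nothing to do; sizes are already in sync

    new_head = nn.Linear(
        lm_head.in_features, new_size, bias=lm_head.bias is not None
    )
    init_fn(new_head.weight)  # initialize everything, then overwrite old rows
    rows_to_copy = min(old_size, new_size)
    with torch.no_grad():
        new_head.weight[:rows_to_copy] = lm_head.weight[:rows_to_copy]
    return new_head
```

In the PR's setting, the same new_size would also be applied to the token embedding and the model config, so that all three places that encode the vocabulary size stay in sync.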

Member Author
@dirkgr dirkgr left a comment


I added a docstring in 074975f, to hopefully make it clearer what init_fn does.
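For illustration, an init_fn in this style is just a callable that initializes a freshly created weight tensor in place. The function name and the chosen standard deviation below are hypothetical, not taken from the PR:

```python
import torch


def small_normal_init_(tensor: torch.Tensor) -> None:
    """Hypothetical init_fn: fill a new embedding/head weight in place
    with a small-variance normal distribution."""
    with torch.no_grad():
        tensor.normal_(mean=0.0, std=0.02)
```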



@dirkgr
Member Author
dirkgr commented Dec 7, 2021

I also want to point out that I managed to push this and it passed all CI checks the first time 😁.

@dirkgr dirkgr enabled auto-merge (squash) December 7, 2021 22:05
@dirkgr dirkgr merged commit 19f6c8f into main Dec 8, 2021
@dirkgr dirkgr deleted the ResizeT5 branch December 8, 2021 05:18