8000 add min_padding_length to TokenCharactersIndexer (#1954) by WrRan · Pull Request #1967 · allenai/allennlp · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
This repository was archived by the owner on Dec 16, 2022. It is now read-only.

add min_padding_length to TokenCharactersIndexer (#1954) #1967

Merged
merged 8 commits into from
Oct 26, 2018
Merged

add min_padding_length to TokenCharactersIndexer (#1954) #1967

merged 8 commits into from
Oct 26, 2018

Conversation

WrRan
Copy link
Contributor
@WrRan WrRan commented Oct 25, 2018

patch to #1954

Although the parameter min_padding_length is provided a default value (0), I think it may be a better idea with no default value.

@@ -33,10 +36,12 @@ class TokenCharactersIndexer(TokenIndexer[List[int]]):
"""
# pylint: disable=no-self-use
def __init__(self,
min_padding_length: int = 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be added at the end of the argument list, otherwise you'll break existing calls to the constructor (which is why tests are failing). This is also why it needs to have a default value of 0 - we don't want to break backwards compatibility.

And, lastly, it's important for every new piece of functionality in the core library to have a test making sure it does what it's supposed to. Can you add a test making sure that things actually get padded correctly when this parameter is used?

@WrRan
Copy link
Contributor Author
WrRan commented Oct 25, 2018 via email

Copy link
Contributor
@matt-gardner matt-gardner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, it looks very close to done. We just need to switch the line endings back to how they were in the token indexer, and simplify the test a bit.

assert padded == {"char": [[2, 3, 3, 4, 5, 6, 7, 8],
[9, 10, 0, 0, 0, 0, 0, 0],
[11, 12, 4, 10, 13, 14, 4, 0],
[15, 0, 0, 0, 0, 0, 0, 0]]}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 84 through 126 here test functionality that's already tested elsewhere. The only thing we really need to test is the last case that you have, starting at line 128.

key, value = item.keys(), item.values()
assert set([key_padding_lengths]) == set(key)
value_padding_lengths = max(value_padding_lengths, max(value))
assert value_padding_lengths == 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need for lines 130 through 141; you can just go straight to computing the padding, and just have the assert that's on line 145. The rest is all implicitly tested with that one assert.

@@ -1,113 +1,119 @@
from typing import Dict, List
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you ended up changing the line endings on this file to DOS line endings. Can you switch it back?

@WrRan
Copy link
Contributor Author
WrRan commented Oct 26, 2018

OK. PR's coming soon.

Copy link
Contributor
@matt-gardner matt-gardner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0