Conversation
Nice one, a couple of comments but basically LGTM.
allennlp/models/esim.py
Outdated
from allennlp.nn.util import get_text_field_mask, last_dim_softmax, weighted_sum, replace_masked_values
from allennlp.training.metrics import CategoricalAccuracy


class InputVariationalDropout(torch.nn.Dropout):
Could you pull this out into a module under allennlp.modules? It's generally useful (in fact, I was just about to implement the same thing for the dependency parser).
done
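For reference, a minimal sketch of such a module, assuming a 3D (batch, time, features) input; the actual implementation that landed under allennlp.modules may differ in detail:

import torch

class InputVariationalDropout(torch.nn.Dropout):
    """
    Applies the same dropout mask to every time step of a 3D tensor,
    as in variational dropout (Gal and Ghahramani, 2016).
    """
    def forward(self, input_tensor: torch.Tensor) -> torch.Tensor:
        # Sample one mask of shape (batch_size, num_features)...
        ones = input_tensor.data.new_ones(input_tensor.shape[0], input_tensor.shape[-1])
        dropout_mask = torch.nn.functional.dropout(ones, self.p, self.training)
        # ...and broadcast it over the time dimension, so the same
        # features are dropped at every time step of a sequence.
        return dropout_mask.unsqueeze(1) * input_tensor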
allennlp/predictors/esim.py
Outdated
def predict(self, sentence1: str, sentence2: str) -> JsonDict:
    """
    Predicts whether the sentence2 text is entailed by the sentence1 text.
I don't think this predictor is actually needed - it's exactly the same as the decomposable_attention predictor. You can just delete this file, I think. If you wanted to, you could rename the decomposable attention predictor to be called entailment or something more general.
Now I remember why I added it -- the decomposable attention predictor uses premise and hypothesis, but the SNLI / MultiNLI data uses sentence1 and sentence2. This means the existing predictor can't be used with the standard datasets (e.g. to write out predictions to a file for submitting to the leaderboard). Is there an easy way to override the existing tensor names?
The hacky alternative (to my hacky fix that cuts and pastes the decomposable attention predictor and changes the names) is to just modify the incoming JSON to change the keys, then modify it again to restore the original keys.
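A minimal sketch of that key-remapping idea, assuming the Predictor API of the time (where _json_to_instance turns the input JSON into an Instance) and that the decomposable attention predictor reads premise / hypothesis keys; the registered name and class name here are illustrative:

from allennlp.common.util import JsonDict
from allennlp.data import Instance
from allennlp.predictors.decomposable_attention import DecomposableAttentionPredictor
from allennlp.predictors.predictor import Predictor

@Predictor.register("snli-entailment")  # illustrative name
class SnliKeyRemappingPredictor(DecomposableAttentionPredictor):
    def _json_to_instance(self, json_dict: JsonDict) -> Instance:
        # Rename the SNLI-style keys to the ones the existing
        # predictor expects, then delegate to it.
        remapped = {"premise": json_dict["sentence1"],
                    "hypothesis": json_dict["sentence2"]}
        return super()._json_to_instance(remapped)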
@nelson-liu and I were discussing a related issue -- he wanted to use the same model with different dataset readers (which produced differently named fields). We kicked around several bad ideas but never got to any we liked (I don't think).
I'm all for removing the ESIM predictor; I'll do so now and open an issue about providing a general solution.
allennlp/models/esim.py
Outdated
Parameters
----------
input_tensor: torch.FloatTensor
Double backticks around the types here make them render nicely in the docs.
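For example, the parameter line quoted above would become:

input_tensor : ``torch.FloatTensor``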
done
if self.rnn_input_dropout:
    projected_enhanced_premise = self.rnn_input_dropout(projected_enhanced_premise)
    projected_enhanced_hypothesis = self.rnn_input_dropout(projected_enhanced_hypothesis)

v_ai = self._inference_encoder(projected_enhanced_premise, premise_mask)
Marginally more informative variable names would be better here.
These names follow the notation in the original paper, which is helpful if someone wants to align the code with the equations.
* WIP: ESIM model
* WIP: ESIM model for SNLI
* WIP: ESIM
* WIP: ESIM
* WIP: ESIM
* WIP: ESIM
* ESIM model with ELMo
* Add an ESIM predictor that works with SNLI formatted files
* Move ESIM predictor
* Clean up
* Add test for ESIM
* Add predictor for ESIM
* pylint
* pylint
* mypy
* fix the docs
* ESIM predictor
* Add comment to esim training config
* Move InputVariationalDropout
* pylint
* Fix the docs
* fix the docs
* Remove ESIM predictor
* Scrub all of ESIMPredictor
A modified version of the ESIM model used in "Deep contextualized word representations" (http://aclweb.org/anthology/N18-1202)