Implementation of ESIM model by matt-peters · Pull Request #1469 · allenai/allennlp · GitHub
This repository was archived by the owner on Dec 16, 2022. It is now read-only.

Implementation of ESIM model #1469

Merged
merged 28 commits into from
Jul 10, 2018
Conversation

matt-peters
Contributor

A modified version of the ESIM model used in "Deep contextualized word representations" (http://aclweb.org/anthology/N18-1202)

@matt-peters matt-peters requested a review from DeNeutoy July 9, 2018 22:10
@DeNeutoy DeNeutoy left a comment

Nice one; a couple of comments, but basically LGTM.

from allennlp.nn.util import get_text_field_mask, last_dim_softmax, weighted_sum, replace_masked_values
from allennlp.training.metrics import CategoricalAccuracy

class InputVariationalDropout(torch.nn.Dropout):
Contributor

Could you pull this out into a module under allennlp.modules? It's generally useful (in fact, I was just about to implement the same thing for the dependency parser).

Contributor Author

done
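For context, the module under discussion applies "variational" dropout to a 3D input: instead of sampling an independent dropout mask at every timestep, it samples one mask per sequence and reuses it across all timesteps. A minimal sketch of that idea (the real module lives in allennlp.modules; exact details there may differ):

```python
import torch


class InputVariationalDropout(torch.nn.Dropout):
    """Apply dropout to a (batch_size, num_timesteps, embedding_dim) tensor
    using a single mask per sequence, so the same embedding dimensions are
    dropped at every timestep."""

    def forward(self, input_tensor: torch.Tensor) -> torch.Tensor:
        # Sample one (batch_size, embedding_dim) dropout mask per sequence.
        ones = input_tensor.new_ones(input_tensor.shape[0], input_tensor.shape[-1])
        dropout_mask = torch.nn.functional.dropout(ones, self.p, self.training)
        # Broadcast the same mask across the timestep dimension.
        return dropout_mask.unsqueeze(1) * input_tensor
```

Because the mask is shared across timesteps, a recurrent encoder sees a consistent set of dropped dimensions over the whole sequence, which is the behavior the ESIM paper's input dropout relies on.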


def predict(self, sentence1: str, sentence2: str) -> JsonDict:
    """
    Predicts whether sentence2 is entailed by sentence1.
Contributor

I don't think this predictor is actually needed - it's exactly the same as the decomposable_attention predictor. You can just delete this file, I think. If you wanted to, you could rename the decomposable attention predictor to be called entailment or something more general.

Contributor Author

Now I remember why I added it -- the decomposable attention predictor uses premise and hypothesis, but the SNLI / MultiNLI data uses sentence1 and sentence2. This means the existing predictor can't be used with the standard datasets (e.g. to write out predictions to a file for submitting to the leaderboard). Is there an easy way to override the existing tensor names?

Contributor Author

The hacky alternative (to my hacky fix that cuts and pastes the decomposable attention predictor and changes the names) is to modify the incoming JSON to change the keys, then modify it again to restore the original keys.
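A hypothetical sketch of that key-remapping workaround: rename SNLI-style keys to the names the decomposable attention predictor expects on the way in, and map them back on the way out. The key names come from the discussion above; the helper itself is illustrative, not AllenNLP API:

```python
# Field names used by the SNLI / MultiNLI JSON lines vs. the names the
# decomposable attention predictor expects.
SNLI_TO_PREDICTOR = {"sentence1": "premise", "sentence2": "hypothesis"}
PREDICTOR_TO_SNLI = {value: key for key, value in SNLI_TO_PREDICTOR.items()}


def remap_keys(json_dict: dict, mapping: dict) -> dict:
    """Return a copy of json_dict with keys renamed according to mapping;
    keys not in the mapping are kept unchanged."""
    return {mapping.get(key, key): value for key, value in json_dict.items()}
```

Applying `remap_keys` with `SNLI_TO_PREDICTOR` before prediction and `PREDICTOR_TO_SNLI` afterwards round-trips the original keys, which is exactly why it feels hacky: the renaming logic lives outside both the dataset reader and the predictor.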

Contributor

@nelson-liu and I were discussing a related issue -- he wanted to use the same model with different dataset readers (which produced differently named fields). We kicked around several bad ideas but never got to one we liked, I don't think.

Contributor Author

I'm all for removing the ESIM predictor; I'll do so now and open an issue to provide a general solution.


Parameters
----------
input_tensor: torch.FloatTensor
Contributor

double backticks around the types here make them render nicely in the docs.

Contributor Author

done
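To illustrate the suggestion: with double backticks around the type name, the numpydoc-style parameter entry renders as inline code in the generated docs. A hypothetical example (the shape description is for illustration only):

```python
# Illustrative method stub showing the double-backtick type convention
# in a numpydoc-style docstring, as suggested in the review.
def forward(self, input_tensor):
    """
    Parameters
    ----------
    input_tensor : ``torch.FloatTensor``
        The tensor to apply dropout to, assumed here to have shape
        (batch_size, num_timesteps, embedding_dim).
    """
```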

if self.rnn_input_dropout:
projected_enhanced_premise = self.rnn_input_dropout(projected_enhanced_premise)
projected_enhanced_hypothesis = self.rnn_input_dropout(projected_enhanced_hypothesis)
v_ai = self._inference_encoder(projected_enhanced_premise, premise_mask)
Contributor

marginally more informative variable names would be better here

Contributor Author

These names follow the notation in the original paper and are helpful if someone wanted to align the code with equations.

@matt-peters matt-peters merged commit ff41dda into allenai:master Jul 10, 2018
@matt-peters matt-peters deleted the mp/esim2 branch July 10, 2018 00:05
gabrielStanovsky pushed a commit to gabrielStanovsky/allennlp that referenced this pull request Sep 7, 2018
* WIP: ESIM model

* WIP: ESIM model for SNLI

* WIP: ESIM

* WIP: ESIM

* WIP: ESIM

* WIP: ESIM

* ESIM model with ELMo

* Add an ESIM predictor that works with SNLI formatted files

* Move ESIM predictor

* Clean up

* Add test for ESIM

* Add predictor for ESIM

* pylint

* pylint

* mypy

* fix the docs

* ESIM predictor

* Add comment to esim training config

* Move InputVariationalDropout

* pylint

* Fix the docs

* fix the docs

* Remove ESIM predictor

* Scrub all of ESIMPredictor