This repository was archived by the owner on Dec 16, 2022. It is now read-only.
SpanPruner produces -inf scores when an instance has too few spans #1696
Closed
@julianmichael

Description

Describe the bug
When an instance has fewer than num_spans_to_keep total spans in the original text, padding makes its way into the top_span_scores output of SpanPruner with scores of -inf. Even though the top_spans_mask output is correct, this is a problem because multiplying the scores by the mask produces nan in those slots instead of the desired 0.0.
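For what it's worth, this is just IEEE float arithmetic as PyTorch implements it: -inf times 0.0 is nan, so the masked multiply silently poisons the loss. A two-line demonstration, independent of SpanPruner:

import torch

scores = torch.tensor([0.5783, float('-inf')])
mask = torch.tensor([1.0, 0.0])
print(scores * mask)  # tensor([0.5783, nan])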

To Reproduce
In python REPL:

import torch
from allennlp.modules.span_pruner import SpanPruner

emb = torch.ones([1, 2, 1])  # batch size 1, 2 spans, embedding size 1
scorer = torch.nn.Linear(1, 1)
mask = torch.tensor([1, 0]).view(1, 2).float()  # only 1 span is present in the instance
pruner = SpanPruner(scorer)
# Ask to keep 2 spans; forward returns (embeddings, mask, indices, scores).
_, _, _, scores = pruner(emb, mask, 2)
print(scores)

For me, this outputs:

tensor([[[ 0.5783],
         [   -inf]]])

The non-inf number is arbitrary, of course; the -inf in the padded slot is the problem.

Expected behavior
I think in this case we should replace the -infs with -1. Because of this issue I ended up with a loss of nan that I had to debug all the way back to this point. It should be an easy fix in SpanPruner. BTW, there's nothing particular to spans in SpanPruner, is there? Might as well just call it Pruner, right?
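In the meantime, here is a minimal sketch of the caller-side workaround I'm using (sanitize_scores is a hypothetical helper name, not anything in the library), assuming the mask returned by the pruner is correct, which it is in the repro above:

import torch

def sanitize_scores(scores, mask):
    # scores: (batch_size, num_spans_to_keep, 1), possibly containing -inf
    # mask:   (batch_size, num_spans_to_keep), 1.0 for kept spans, 0.0 for padding
    # Fill padded slots with a finite value so that scores * mask gives 0.0, not nan.
    return scores.masked_fill((mask == 0).unsqueeze(-1), -1.0)

With that, scores * mask.unsqueeze(-1) comes out as 0.0 in the padded slots and the loss stays finite.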

System (please complete the following information):

  • OS: Reproduced on macOS 10.13.3 on CPU as well as Ubuntu 16.04.4 LTS on GPU (V100).
  • Python version: 3.6.5 and 3.6.6.
  • AllenNLP version: v0.5.0, but looking at the source code on master I assume this remains an issue.
  • PyTorch version: 0.4.0

Additional context
My guess is this hasn't come up before because span pruning was only used with long texts, where this never happens. It came up for me because I'm using span pruning in a sentence-level model where a few of the sentences have only 2 tokens and are batched with 4-token sentences.
