wikitext-2 is not available anymore · Issue #2247 · pytorch/text · GitHub
wikitext-2 is not available anymore #2247
Open
@huangjia2019

Description

🐛 Bug

Describe the bug

requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip
This exception is thrown by iter of HTTPReaderIterDataPipe(skip_on_error=False, source_datapipe=OnDiskCacheHolderIterDataPipe, timeout=None)
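To confirm that the 403 comes from the server rather than from torchtext's datapipe, the URL can be probed directly. The following is a minimal sketch using only the standard library; `probe_url` and its `opener` parameter are hypothetical names introduced here for illustration and are not part of torchtext.

```python
import urllib.error
import urllib.request

# URL taken from the error message above.
WIKITEXT2_URL = (
    "https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip"
)


def probe_url(url, opener=urllib.request.urlopen, timeout=10):
    """Send a HEAD request and return the HTTP status code.

    Returns None when the host cannot be reached at all.
    `opener` is injectable (an assumption of this sketch) so the
    function can be exercised without network access.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403.
        return err.code
    except urllib.error.URLError:
        # DNS failure, refused connection, and similar transport errors.
        return None
```

Calling `probe_url(WIKITEXT2_URL)` and getting `403` back would indicate that the hosted file itself is refusing requests, independent of torchtext.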

To Reproduce

Steps to reproduce the behavior:

from torchtext.datasets import WikiText2
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator
from torch.utils.data import DataLoader, Dataset

tokenizer = get_tokenizer("basic_english")

train_iter = WikiText2(split='train')
valid_iter = WikiText2(split='valid')

def yield_tokens(data_iter):
    for item in data_iter:
        yield tokenizer(item)

vocab = build_vocab_from_iterator(yield_tokens(train_iter),
                                  specials=["", "", ""])
vocab.set_default_index(vocab[""])
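If a copy of the WikiText-2 training text is already on disk, the vocabulary step above can be approximated without triggering the download. This is a rough sketch under stated assumptions: it uses plain Python rather than torchtext, `simple_tokenize` is a whitespace tokenizer that only approximates `basic_english`, and the `<unk>` special token and all function names are hypothetical placeholders.

```python
from collections import Counter


def simple_tokenize(line):
    """Lowercase and split on whitespace (only approximates basic_english)."""
    return line.lower().split()


def build_vocab(lines, specials=("<unk>",), min_freq=1):
    """Map tokens to indices: specials first, then tokens by frequency."""
    counter = Counter()
    for line in lines:
        counter.update(simple_tokenize(line))

    itos = list(specials)
    # Sort by descending frequency, then alphabetically, so the
    # resulting indices are deterministic.
    for token, freq in sorted(counter.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_freq and token not in specials:
            itos.append(token)
    return {token: index for index, token in enumerate(itos)}


def lookup(vocab, token, default_token="<unk>"):
    """Return the index of a token, falling back to the default token."""
    return vocab.get(token, vocab[default_token])
```

Any iterable of strings works as input, so an open file handle for a locally stored training file can be passed directly to `build_vocab`.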

Expected behavior

A clear and concise description of what you expected to happen.

Screenshots

If applicable, add screenshots to help explain your problem.

Environment

Please copy and paste the output from our environment collection script (or fill out the checklist below manually).

You can get the script and run it with:

wget https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip

  • PyTorch Version (e.g., 1.0):
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, source):
  • Build command you used (if compiling from source):
  • Python version:
  • CUDA/cuDNN version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context

Add any other context about the problem here.
