[Q]: UsageError: Unable to attach to run ... #9948
Comments
Thomas Drayton commented: Thanks for reaching out! I appreciate the detail you've provided regarding the issue you're having. Based on the traceback, it looks like our service is trying to re-attach to the run. If you don't object, would you also mind sharing the versions of darts and PyTorch Lightning you are using, along with the code you use for training and for loading the model?
Thanks in advance! Best, Thomas
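For context on the "attach" wording above, the snippet below is a minimal sketch of wandb's run lifecycle; it is an illustration rather than code from this thread, and the project name is a placeholder.

```python
import wandb

# Illustration only: a run created for training is closed with finish();
# afterwards nothing can attach to that live session anymore.
run = wandb.init(project="my-project")   # "my-project" is a placeholder
run_id = run.id
wandb.finish()

# To interact with that run again, a new session has to be started
# explicitly, e.g. by resuming the same run id:
wandb.init(project="my-project", id=run_id, resume="allow")
wandb.finish()
```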
Hey Thomas,

I am using darts 0.35.0 and PyTorch Lightning 2.5.1.post0. The model is loaded with:

```python
model_best = model.load_from_checkpoint(work_dir=work_dir, model_name=model_name, best=True)
```

where

```python
work_dir = "./models/first_runs/"
model_name = "warm-waterfall-26"
```

The warm-waterfall-26 model is actually located in ./models/first_runs/.

The training is configured as follows: first, all arguments are read in, then the training is started using:

```python
run_name = wandb_go(args)
SAVE = '/models/first_runs/' + run_name + '.pth.tar'
model = define_model(args, run_name)
model.fit(ts_ttrain_list,
          future_covariates=[tcov_train_future] * num_knoten,
          past_covariates=[tcov_train] * num_knoten,
          verbose=True,
          val_series=ts_ttest_list,
          val_future_covariates=[tcov_test_future] * num_knoten,
          val_past_covariates=[tcov_test] * num_knoten
          )
wandb.finish()
```

```python
def wandb_go(args):
    '''Start wandb session with parameters'''
    wandb.init(project=args.project_name, entity="MY_ENTITY", sync_tensorboard=True, config=args)
    name = wandb.run.name
    print("Name of run for wandb: ", name)
    return name
```

```python
def define_model(args, model_name):
    wandb_logger = WandbLogger()
    lr_monitor = LearningRateMonitor(logging_interval='step')
    n_categories = 70    # how many nodes exist
    embedding_size = 70  # embed the categorical variable into a numeric vector of size embedding_size
    categorical_embedding_sizes = {"Knoten": (n_categories, embedding_size)}
    model = TFTModel(input_chunk_length=args.back_window,
                     output_chunk_length=args.horizon,
                     hidden_size=args.hidden,
                     lstm_layers=args.lstm_layers,
                     num_attention_heads=args.att_heads,
                     full_attention=args.full_att,
                     dropout=args.dropout,
                     batch_size=args.batch_size,
                     n_epochs=args.epochs,
                     likelihood=args.likelihood,
                     loss_fn=args.loss,
                     lr_scheduler_cls=args.decay_lr_class,
                     lr_scheduler_kwargs={"gamma": 0.1},
                     random_state=args.rand,
                     force_reset=True,
                     log_tensorboard=True,
                     save_checkpoints=True,
                     model_name=model_name,
                     categorical_embedding_sizes=categorical_embedding_sizes,
                     work_dir="./models/first_runs",
                     pl_trainer_kwargs={
                         "accelerator": "gpu",
                         "devices": -1,
                         "logger": [wandb_logger],
                         "callbacks": [lr_monitor]
                     })
    return model
```

I hope this helps. Thank you for your assistance.
Hey everyone,

I have trained a TFT model using WandB, which worked just fine. But when I try to predict using the trained model, I get `UsageError: Unable to attach to run ...`.

Has anyone encountered a similar error or know how to fix this?

I am using wandb version 0.19.11 with Python 3.12.10.

A small example of how I try to make predictions: I prepare my data for the predictions and then try to make the predictions using:
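(A minimal sketch of such a darts prediction call, reusing the series and covariate names from the training code above; the horizon `n_pred` is a placeholder rather than the exact value from the script.)

```python
# Hypothetical sketch; names are reused from the training code above and
# n_pred is a placeholder forecast horizon.
model_best = TFTModel.load_from_checkpoint(
    model_name="warm-waterfall-26",
    work_dir="./models/first_runs/",
    best=True,
)

pred_list = model_best.predict(
    n=n_pred,
    series=ts_ttest_list,
    past_covariates=[tcov_test] * num_knoten,
    future_covariates=[tcov_test_future] * num_knoten,
)
```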
The entire traceback looks like the following: