[SQS] Cancel pending and future tasks on shutdown #12228
Conversation
LocalStack Community integration with Pro: 2 files, 2 suites, 1h 15m 16s ⏱️. Results for commit d74089c. ♻️ This comment has been updated with latest results.
```python
if self.is_shutdown:
    raise Empty
```
Is there a specific reason why we need this at the beginning here as well?
Just allows for an earlier exit if we enter the poll loop in the time between `shutdown()` being called and `get()` being called. Just a very minor optimisation.
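For illustration, a minimal sketch of what such an early-exit check could look like; the class and attribute names here are assumptions for the sketch, not the actual provider code:

```python
import threading
from collections import deque
from queue import Empty


class InterruptibleQueue:
    """Sketch: a queue whose get() unblocks once shutdown() is called."""

    def __init__(self):
        self.is_shutdown = False
        self._items = deque()
        self._cond = threading.Condition()

    def get(self, timeout=None):
        # Early exit: shutdown() may already have been called in the time
        # between the caller deciding to poll and entering this method.
        if self.is_shutdown:
            raise Empty
        with self._cond:
            if not self._items:
                self._cond.wait(timeout)
            # Re-check after waking: shutdown() notifies the condition
            # to release any consumers blocked in wait().
            if self.is_shutdown or not self._items:
                raise Empty
            return self._items.popleft()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()

    def shutdown(self):
        self.is_shutdown = True
        with self._cond:
            self._cond.notify_all()
```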
We are, for example, also not raising exceptions on timeouts < 0 here, but that might not be too important.
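For comparison, the stdlib `queue.Queue` does validate negative timeouts, which is the behaviour being alluded to:

```python
import queue

q = queue.Queue()
try:
    q.get(timeout=-1)  # stdlib Queue rejects negative timeouts
except ValueError as err:
    print(err)  # "'timeout' must be a non-negative number"
```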
```diff
@@ -192,7 +194,7 @@ def __init__(self, num_thread: int = 3):
         )

     def shutdown(self):
-        self.executor.shutdown(wait=False)
+        self.executor.shutdown(wait=True, cancel_futures=True)
```
Did you check if we get any issues when doing a `wait=True` here? I am fine with cancelling the futures, but this could block the rest of the shutdown quite early (depending on the shutdown order), in contrast to at the very end of the interpreter shutdown.
I noticed these dispatched jobs were running while the SQS provider was already supposed to have shut down, and while other services were in the process of shutting down. The `wait=True` was to prevent this behaviour, meaning fewer exceptions being raised and no botocore retrying happening simultaneously with LS exiting.
If we're OK with the dispatched jobs running while other services are shutting down, then I'm happy to make this false again, since the `cancel_futures` behaviour is definitely more important.
It is fine, I am just worried about it blocking too long without a timeout after which we call it quits.
Fair. Let's allow the interpreter to tear these down instead. I'll change it back to `wait=False`.
I think this is innocent enough, LGTM!
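To make the trade-off discussed above concrete, here is a small self-contained demo of `ThreadPoolExecutor.shutdown` with `cancel_futures=True` (available since Python 3.9). This is an illustration of the stdlib semantics, not the provider code:

```python
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=1)
futures = [executor.submit(time.sleep, 1) for _ in range(5)]
time.sleep(0.1)  # let the first task start running

start = time.monotonic()
# cancel_futures=True drops the tasks still sitting in the work queue;
# wait=True additionally blocks until the already-running task finishes,
# which is the blocking behaviour the review was concerned about.
executor.shutdown(wait=True, cancel_futures=True)
print(f"shutdown blocked for ~{time.monotonic() - start:.1f}s")  # ~0.9s

print([f.cancelled() for f in futures])  # [False, True, True, True, True]
```

With `wait=False` instead, `shutdown()` returns immediately and the running task is left for interpreter teardown, which is the compromise the thread settled on.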
Motivation
Under high volumes of SQS receive/send/delete calls, the internal SQS `ThreadPoolExecutor`s can execute tasks long after the service has completed its shutdown/stop lifecycle hooks. That is, when we try to stop the SQS service, currently running and pending future tasks are not cleared from the executor pool, so they can still be executed long after the service has signalled that it stopped.
This PR ensures that all enqueued tasks are cancelled on shutdown.
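A minimal reproduction of the underlying problem (illustrative only, not the provider code): with a plain `shutdown(wait=False)` and no `cancel_futures`, tasks already queued keep executing after the call returns:

```python
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=1)
for i in range(3):
    executor.submit(lambda n=i: (time.sleep(0.5), print(f"task {n} ran")))

executor.shutdown(wait=False)  # returns immediately...
print("shutdown() returned")
# ...but all three queued tasks still run to completion afterwards,
# mirroring how SQS tasks outlived the service's stop hook.
time.sleep(2)
```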
Changes
Updated the `CloudWatchDispatcher.shutdown` and `MessageMoveTaskManager.close` methods to cancel all pending/future tasks that are currently enqueued for execution.
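A hedged sketch of what the change amounts to in those methods; the constructor and method bodies here are assumptions based on the diff above, not the exact provider code:

```python
from concurrent.futures import ThreadPoolExecutor


class MessageMoveTaskManager:
    """Illustrative only: an executor owner whose close() cancels queued tasks."""

    def __init__(self, num_thread: int = 3):
        self.executor = ThreadPoolExecutor(max_workers=num_thread)

    def close(self):
        # Per the review thread: cancel everything still queued, but do not
        # block on currently running tasks (wait=False), leaving those to
        # interpreter teardown.
        self.executor.shutdown(wait=False, cancel_futures=True)
```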