Engine: encourage better garbage collection #932

josephjclark · 2025-04-30T15:21:28Z

Short Description

This PR fixes an issue in the engine where state references are not garbage collected, and so memory leaks over time.

This is a probable cause of at least some lost runs.

Fixes #826

Implementation Details

After a heck of a lot of investigation I've worked out that the suspected memory leak flagged by @taylordowns2000 is real. I've also worked out that it comes from the engine: if you cut out child processes and the runtime and just statically return a large object from each run, the leak still occurs.

What seems to be happening is:

- For every run we build a context object, which tracks a bunch of data for and about that run
- This is designed to be a temporary object, dropped as soon as the run completes
- When a run completes, we take the final state object returned by the workflow and write it to this context
- And for whatever reason, this object does not get unreferenced and so does not get garbage collected

The fix - the dumbest, easiest fix I can find - is to force that object to be unreferenced after the workflow has finished.
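As a rough illustration of the pattern described above (not the engine's actual code), the sketch below writes a final state onto a per-run context and then drops the reference once the result has been handed off. Every name here (`RunContext`, `completeRun`, `retained`) is hypothetical, and the `retained` array is only a stand-in for whatever is keeping the context reachable, which this PR does not identify.

```typescript
// Hypothetical shapes: none of these names come from the engine's source.
type State = Record<string, unknown>;

interface RunContext {
  id: string;
  state?: State; // final state is written here when the run completes
}

// Stand-in for any long-lived structure that keeps contexts reachable.
const retained: RunContext[] = [];

function completeRun(context: RunContext, finalState: State): State {
  context.state = finalState;
  retained.push(context);

  // Take a local copy of the reference to hand back to the caller...
  const result = context.state;

  // ...then force the context to let go of it, so the (possibly large)
  // state object can be collected even if the context itself lingers.
  delete context.state;

  return result;
}
```

With this shape, a context that outlives its run only costs a small object rather than the full final state.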

I've added a new test:mem script to the engine. This creates an engine with quite a low memory limit, and then infinitely runs a job with a large-ish payload.

Against main the test fails (heap error) on my machine after about 20 runs.

At the time of writing, on this branch, I've just crossed ~~2000~~ ~~3000~~ ~~4000~~ 5000 runs (my poor CPU!)
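The shape of such a stress test might look something like the sketch below. This is an assumption-laden illustration, not the `test:mem` script itself: `makePayload`, `executeRun`, and `stress` are hypothetical names, `executeRun` is a stand-in for a real engine run, the loop is bounded rather than infinite, and the low memory limit is assumed to be imposed from outside the loop.

```typescript
// Hypothetical stress loop; the real test:mem script is not shown here.
type Payload = { data: string[] };

// Build a large-ish payload (the size is an arbitrary choice for illustration).
function makePayload(items: number): Payload {
  return { data: Array.from({ length: items }, (_, i) => `item-${i}`) };
}

// Stand-in for executing one run; resolves with the run's final state.
async function executeRun(payload: Payload): Promise<Payload> {
  return payload;
}

// Run sequentially, holding no reference to earlier results, so that any
// growth in memory must come from whatever executes the run.
async function stress(maxRuns: number, items: number): Promise<number> {
  let completed = 0;
  while (completed < maxRuns) {
    const result = await executeRun(makePayload(items));
    if (result.data.length !== items) throw new Error('unexpected payload');
    completed += 1;
  }
  return completed;
}
```

Under a tight heap limit, a leaking implementation of `executeRun` would be expected to crash after a small number of iterations, while a non-leaking one keeps going.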

AI Usage

Please disclose how you've used AI in this work (it's cool, we just want to know!):

You can read more details in our Responsible AI Policy

josephjclark added 2 commits April 30, 2025 16:10

engine: add memor test

4af1a94

engine: encourage gc on state objects

71a89c2

taylordowns2000 added this to v2 Apr 30, 2025

github-project-automation bot moved this to New Issues in v2 Apr 30, 2025

josephjclark force-pushed the garbage-collect-engine-state branch from 8f4e0fc to 71a89c2 Compare April 30, 2025 15:34

josephjclark added 5 commits April 30, 2025 16:36

package lock

e9a4a21

types

745af03

engine: don't write state to context at all

bc82181

engine: better test pattern

a106a41

versions: worker@1.13.4

269d630

theroinaochieng moved this from New Issues to DevX Backlog in v2 May 6, 2025

theroinaochieng moved this from DevX Backlog to In progress in v2 May 6, 2025

josephjclark merged commit dfc16a9 into main May 6, 2025
10 checks passed

josephjclark deleted the garbage-collect-engine-state branch May 6, 2025 17:59

github-project-automation bot moved this from In progress to Done in v2 May 6, 2025
