fix: Deadlock during cost calculation #173

dadrus · 2025-05-16T17:41:50Z

Unfortunately, the implementation I did in #152 can cause a deadlock during cost calculation:

The update method of the Item acquires the write lock of its RW lock and then calls item.calculateCost(item). The cost function, if configured, may use the getter functions available on the item object. All of these getters acquire the read lock of the same RW lock, however, so the call blocks forever and we have a deadlock.
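A minimal sketch of the locking pattern described above, assuming a simplified, hypothetical `item` type rather than the library's actual `Item`. It uses `TryRLock` (Go 1.18+) so that the example reports the conflict instead of hanging:

```go
package main

import (
	"fmt"
	"sync"
)

// item is a hypothetical stand-in for the cache item discussed here.
type item struct {
	mu    sync.RWMutex
	value string
}

// Value mimics a public getter that takes the read lock.
func (i *item) Value() string {
	i.mu.RLock()
	defer i.mu.RUnlock()
	return i.value
}

// canReadDuringUpdate holds the write lock, as update does, and reports
// whether a read lock could be acquired at that point. A plain RLock
// would block forever here; TryRLock lets the example return instead.
func canReadDuringUpdate(i *item) bool {
	i.mu.Lock()
	defer i.mu.Unlock()

	if i.mu.TryRLock() {
		i.mu.RUnlock()
		return true
	}
	return false
}

func main() {
	it := &item{value: "hello"}
	fmt.Println("read lock available during update:", canReadDuringUpdate(it)) // false
	fmt.Println("getter works after update:", it.Value() == "hello")           // true
}
```

A cost callback that calls `Value()` while the write lock is held would hit the same conflict, except that `RLock` waits instead of returning `false`.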

This PR updates the implementation of the Item so that the item cost is calculated on a snapshot rather than on the item that is being updated.
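One way such a snapshot could work is sketched below. This is an assumption about the approach, not the code from this PR: the hypothetical `item` type, its `costFn` field, and the way the snapshot is built are all illustrative. The snapshot carries its own, unlocked mutex, so getters called on it do not contend with the lock held by `update`.

```go
package main

import (
	"fmt"
	"sync"
)

// item is a hypothetical, simplified cache item.
type item struct {
	mu     sync.RWMutex
	value  string
	cost   uint64
	costFn func(*item) uint64 // optional cost callback (assumed name)
}

// Value is a getter that takes the read lock.
func (i *item) Value() string {
	i.mu.RLock()
	defer i.mu.RUnlock()
	return i.value
}

// Cost returns the last calculated cost.
func (i *item) Cost() uint64 {
	i.mu.RLock()
	defer i.mu.RUnlock()
	return i.cost
}

// update stores the new value and recalculates the cost on a snapshot.
func (i *item) update(value string) {
	i.mu.Lock()
	defer i.mu.Unlock()

	i.value = value
	if i.costFn == nil {
		return
	}

	// The snapshot is a separate object with a fresh mutex, so the
	// callback may call its getters while i.mu is still write-locked.
	snapshot := &item{value: value}
	i.cost = i.costFn(snapshot)
}

func main() {
	it := &item{costFn: func(s *item) uint64 { return uint64(len(s.Value())) }}
	it.update("hello")
	fmt.Println(it.Cost()) // 5
}
```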

The test has been updated as well to use a public method. Without the update to the Item implementation, the corresponding test runs into a deadlock.


coveralls · 2025-05-16T17:43:54Z

Pull Request Test Coverage Report for Build 15447941855

Details

15 of 15 (100.0%) changed or added relevant lines in 1 file are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.001%) to 99.733%

Totals
Change from base Build 15260220298:	0.001%
Covered Lines:	747
Relevant Lines:	749

💛 - Coveralls

dadrus · 2025-05-16T17:50:16Z

@swithek: Would you please review?

swithek · 2025-05-21T16:14:04Z

@dadrus thanks for the PR. I'll try to review it by the end of this week.

davseby · 2025-05-25T15:26:40Z

item_test.go

@@ -152,7 +152,7 @@ func Test_Item_update(t *testing.T) {
 			uc: "with version calculation and version tracking",
 			opts: []itemOption[string, string]{
 				withVersionTracking[string, string](true),
-				withCostFunc[string, string](func(item *Item[string, string]) uint64 { return uint64(len(item.value)) }),
+				withCostFunc[string, string](func(item *Item[string, string]) uint64 { return uint64(len(item.Value())) }),


I believe we should modify the CostFunc signature. Can we introduce a new type called CostItem[K, V], which would be passed to the cost function instead of re-using *Item[K,V]? The feature hasn't been released yet, so we should be fine making this change.

I was thinking about something similar, but was reluctant to implement it right away. My idea, however, was (and still is) not to introduce a special CostItem, but to move the content-related attributes into an internal struct. That way we could use atomics instead of locks in the update and getter functions. What do you think about this idea?
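The atomics idea could look roughly like the following sketch. All names here (`content`, `state`, `costFn`) are hypothetical and chosen for illustration only; it assumes Go 1.19+ for `atomic.Pointer`. The content-related attributes live in one immutable struct that is swapped atomically, so getters never take a lock and a cost callback can call them freely.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// content groups the attributes that change together on update.
type content[V any] struct {
	value V
	cost  uint64
}

// item is a hypothetical cache item whose content is swapped atomically.
type item[K comparable, V any] struct {
	key    K
	state  atomic.Pointer[content[V]]
	costFn func(*item[K, V]) uint64 // optional cost callback (assumed name)
}

// Value is a lock-free getter.
func (i *item[K, V]) Value() V {
	if c := i.state.Load(); c != nil {
		return c.value
	}
	var zero V
	return zero
}

// Cost is a lock-free getter.
func (i *item[K, V]) Cost() uint64 {
	if c := i.state.Load(); c != nil {
		return c.cost
	}
	return 0
}

// update builds the new content, evaluates the callback against a
// temporary item that already sees the new value, and then publishes
// value and cost together with a single atomic store.
func (i *item[K, V]) update(value V) {
	next := &content[V]{value: value}
	if i.costFn != nil {
		tmp := &item[K, V]{key: i.key}
		tmp.state.Store(next)
		next.cost = i.costFn(tmp)
	}
	i.state.Store(next)
}

func main() {
	it := &item[string, string]{
		key:    "id",
		costFn: func(s *item[string, string]) uint64 { return uint64(len(s.Value())) },
	}
	it.update("hello")
	fmt.Println(it.Value(), it.Cost()) // hello 5
}
```

The sketch ignores the other item fields (expiration, version), which would need the same treatment or separate atomics.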

I agree with @davseby. The current implementation of CostFunc would work better if instead of passing the whole Item, we passed just the Item.Value() along with the key. The CostItem[K, V] would be a wrapper around them without any extra data fields (e.g., expiration time, version etc.). This way we wouldn't have to deal with any deadlocks or snapshot modes.

As an idea, the snapshot mode is quite good. However, it introduces some new complexity that's not needed by the current state of this library. Besides, that would also contribute to the general maintenance overhead.

So to sum up, the fix/changes for this would be:

// add CostItem, it would act as a special snapshot for the cost func
type CostItem[K comparable, V any] struct {
	Key   K
	Value V
}

// adjust the CostFunc
type CostFunc[K comparable, V any] func(item *CostItem[K, V]) uint64

// adjust item.calculateCost() calls
item.mu.Lock()
defer item.mu.Unlock()

costItem := &CostItem[K, V]{
	Key:   item.key,
	Value: item.value,
}
item.calculateCost(costItem)
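The suggestion above can be put together into a runnable sketch. `CostItem` and `CostFunc` follow the shapes proposed in the comment; the surrounding `item` type, its fields, and the `Cost` getter are simplified assumptions, not the library's real implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// CostItem carries only the data the cost function is meant to see.
type CostItem[K comparable, V any] struct {
	Key   K
	Value V
}

// CostFunc calculates the cost of an item from its key and value.
type CostFunc[K comparable, V any] func(item *CostItem[K, V]) uint64

// item is a hypothetical, simplified cache item.
type item[K comparable, V any] struct {
	mu            sync.RWMutex
	key           K
	value         V
	cost          uint64
	calculateCost CostFunc[K, V]
}

// update stores the new value and recalculates the cost. The callback
// receives a plain struct without methods, so it has no way to reach
// the locked item.
func (i *item[K, V]) update(value V) {
	i.mu.Lock()
	defer i.mu.Unlock()

	i.value = value
	if i.calculateCost == nil {
		return
	}

	costItem := &CostItem[K, V]{Key: i.key, Value: i.value}
	i.cost = i.calculateCost(costItem)
}

// Cost returns the last calculated cost.
func (i *item[K, V]) Cost() uint64 {
	i.mu.RLock()
	defer i.mu.RUnlock()
	return i.cost
}

func main() {
	it := &item[string, string]{
		key: "id",
		calculateCost: func(ci *CostItem[string, string]) uint64 {
			return uint64(len(ci.Key) + len(ci.Value))
		},
	}
	it.update("hello")
	fmt.Println(it.Cost()) // 7
}
```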

@dadrus is this something you could add to your PR? I could open a new PR if you feel like this is out of scope here.

Sure. I'm at a conference the entire week, though, and I'm not sure whether I'll be able to get to it before I'm back home.

@davseby, @swithek: The new commits implement your suggestion. Please take a look.

I've passed CostItem by value to guarantee that no memory is allocated on the heap. That said, the object would not escape either if the signature of the calculateCost function took a pointer to CostItem, at least as long as the user of the library does not store the object for some reason.
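Whether the pointer variant allocates depends on the compiler's escape analysis and inlining decisions, so it is easiest to measure. The sketch below uses `testing.AllocsPerRun` from the standard library to compare both signatures; `costByValue`, `costByPointer`, and `measure` are hypothetical helpers, and the printed numbers may differ between Go versions.

```go
package main

import (
	"fmt"
	"testing"
)

// CostItem mirrors the shape proposed in this thread.
type CostItem[K comparable, V any] struct {
	Key   K
	Value V
}

// costByValue takes the CostItem by value.
func costByValue(item CostItem[string, string]) uint64 {
	return uint64(len(item.Key) + len(item.Value))
}

// costByPointer takes a pointer to the CostItem.
func costByPointer(item *CostItem[string, string]) uint64 {
	return uint64(len(item.Key) + len(item.Value))
}

// measure reports the average number of heap allocations per call for
// both variants. The callbacks are passed as function values, similar
// to how a cache would store a user-supplied cost function.
func measure(
	byValue func(CostItem[string, string]) uint64,
	byPointer func(*CostItem[string, string]) uint64,
) (valueAllocs, pointerAllocs float64) {
	var sink uint64
	valueAllocs = testing.AllocsPerRun(1000, func() {
		sink += byValue(CostItem[string, string]{Key: "id", Value: "hello"})
	})
	pointerAllocs = testing.AllocsPerRun(1000, func() {
		sink += byPointer(&CostItem[string, string]{Key: "id", Value: "hello"})
	})
	_ = sink // keep the results alive so the calls are not optimized away
	return valueAllocs, pointerAllocs
}

func main() {
	v, p := measure(costByValue, costByPointer)
	fmt.Printf("by value: %.0f allocs/op, by pointer: %.0f allocs/op\n", v, p)
}
```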

I think that makes sense. What do you think @davseby?

swithek

Looks good, one small change is needed. Also, you might need to merge/rebase your branch on the latest v3 branch, because there have been some changes there that might affect some of your code (at least the tests).

swithek · 2025-06-04T14:57:01Z

README.md

-        ttlcache.WithMaxCost[string, string](5120, func(item *ttlcache.Item[string, string]) uint64 {
-            return uint64(size.Of(item))
+        ttlcache.WithMaxCost[string, string](5120, func(item ttlcache.CostItem[string, string]) uint64 {
+            // The cache maintains internal structures averaging ~144 bytes per entry.


Although it's really nice that you thoroughly explain the internal sizes, I think making these internal structures part of the cost calculation might lead to unexpected results and feels a bit like an anti-pattern. One of the reasons why I suggested using a simple CostItem struct here instead of the whole Item pointer is because some item metadata fields, like the expiration time or version, might change without triggering a cost recalculation. This means that the most accurate calculation of an item's cost would be the one that uses only its key and value (aka, the "accessible" data).

So my suggestion here would be to remove the "176" and the explanation (but again, the explanation and your effort are really appreciated).
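A cost function that uses only the key and the value, as suggested above, might look like the following sketch. `CostItem` is declared locally here so that the example has no external dependency, and `keyValueCost` is a hypothetical name; in the README the function would be passed to `ttlcache.WithMaxCost` as shown in the diff.

```go
package main

import "fmt"

// CostItem mirrors the shape proposed in this thread; it is declared
// locally so the example compiles on its own.
type CostItem[K comparable, V any] struct {
	Key   K
	Value V
}

// keyValueCost derives the cost from the key and value only. Metadata
// such as the expiration time or version is deliberately left out,
// because it can change without triggering a cost recalculation.
func keyValueCost(item CostItem[string, string]) uint64 {
	return uint64(len(item.Key) + len(item.Value))
}

func main() {
	cost := keyValueCost(CostItem[string, string]{Key: "user:1", Value: "alice"})
	fmt.Println(cost) // 11
}
```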

Sure, I'll update it asap. One small comment from my side regarding

"... some item metadata fields, like the expiration time or version, might change without triggering a cost recalculation"

That's definitely true, but it has no effect on the amount of the used memory ;)

The latest commit includes the update, along with a comment to help manage expectations. Let me know if you'd prefer it removed as well.

By the way, the updates from the v3 branch were already included in b5bd076, and there have been no new updates since then.

dadrus added 3 commits May 16, 2025 18:30

not locking item when calling functions in the cost calculation callback

036f171

changed test cost calculation callback to use a public method of an item instead of a private property

b27d431

ensuring atomicy

cd4dda2

dadrus mentioned this pull request May 16, 2025

In-memory cache can deadlock under concurrent use dadrus/heimdall#2466

Closed

3 tasks

swithek mentioned this pull request May 24, 2025

Feature request: Expose Item cost #159

Open

davseby reviewed May 25, 2025


dadrus added 3 commits June 3, 2025 10:01

implementation updated according to the PR comments

4542755

readme updated

5a4cb48

Merge branch 'v3' into fix/cost_calc_deadlock

b5bd076

dadrus requested review from swithek and davseby June 3, 2025 08:27

swithek requested changes Jun 4, 2025


readme updated

d5df875

dadrus requested a review from swithek June 4, 2025 16:51
