8000 store_in(MemoryType::Stack) should use alloca if the size is small by abadams · Pull Request #6289 · halide/Halide · GitHub

store_in(MemoryType::Stack) should use alloca if the size is small #6289


Merged (12 commits, Oct 11, 2021)

Conversation

abadams (Member) commented Oct 5, 2021:

If an allocation for a Func marked as store_in(MemoryType::Stack) is only made once (e.g. because it's compute_at just inside a parallel loop), then we currently just call malloc, and there's no win from placing it on the stack.

This PR changes the behavior of MemoryType::Stack to call alloca instead when the size is small. This helps a lot in the single-use case.

It's the same pseudostack logic as in master. It allocates on first use (even if inside a loop), and reallocates only if the required size grows. Stack is never freed. If the cumulative stack usage for all reallocations for one Func exceeds our can_fit_on_stack threshold, we switch to heap for all future reallocations of this Func.
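The policy described above can be sketched as follows. This is an illustrative C++ sketch, not Halide's actual runtime code: the struct, function, and threshold names are hypothetical, and malloc stands in for the real alloca-based stack allocation so the snippet is portable.

```cpp
#include <cstdlib>
#include <cstddef>

// Hypothetical sketch of the pseudostack policy described above.
struct PseudostackSlot {
    void *ptr = nullptr;    // current allocation for this Func
    size_t size = 0;        // size of the current allocation
    size_t cumulative = 0;  // total bytes of (re)allocations so far
    bool on_heap = false;   // once set, all future reallocations use the heap
};

constexpr size_t kCanFitOnStack = 16 * 1024;  // assumed threshold, for illustration

// Allocate on first use; reallocate only if the required size grows.
void *pseudostack_get(PseudostackSlot &slot, size_t needed) {
    if (slot.ptr != nullptr && slot.size >= needed) {
        return slot.ptr;  // existing block is big enough: no reallocation
    }
    slot.cumulative += needed;
    if (slot.cumulative > kCanFitOnStack) {
        // Cumulative stack usage for this Func exceeded the threshold, so
        // switch to heap for this and all future reallocations.
        slot.on_heap = true;
    }
    // Stack memory is never freed; freeing of previous heap blocks is
    // elided in this sketch.
    slot.ptr = malloc(needed);  // stands in for alloca when !slot.on_heap
    slot.size = needed;
    return slot.ptr;
}
```

Note the key property: a request that fits in the current block returns immediately with no allocation at all, which is the common case inside a loop.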

The following three runtimes are from a test of the single-use case. In the first case we statically know the allocation size, so we can just size the original stack frame appropriately. In the second case we use the new behavior in this PR and subtract a computed amount from the stack pointer. In the third case we call malloc, as we would on master.

Constant-sized stack allocation: 0.165252
Use alloca: 0.184218
Use malloc: 0.215683

The runtimes are close together because allocating isn't the only thing this pipeline does. Treat the first of the three numbers as the baseline cost of everything other than allocation; after subtracting it from the other two, the new behavior is about 2.5x faster than the old behavior at the cost of allocating.
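The claimed speedup can be recomputed from the three timings above by treating the constant-size stack case as the cost of everything except allocation (names below are illustrative):

```cpp
// Timings from the PR description, in seconds.
constexpr double kConstantStack = 0.165252;  // baseline: no allocation cost
constexpr double kUseAlloca = 0.184218;
constexpr double kUseMalloc = 0.215683;

// Ratio of malloc's allocation cost to alloca's allocation cost, after
// subtracting the shared baseline from both measurements.
inline double allocation_speedup() {
    return (kUseMalloc - kConstantStack) / (kUseAlloca - kConstantStack);
}
```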

Those numbers are from Linux. On macOS, malloc is slower and the win is 7x.

dsharletg (Contributor) left a comment:

Thanks for doing this!

Value *size_zero = ConstantInt::get(size_type, 0);
Value *alloca_size = builder->CreateSelect(returned_null, llvm_size, size_zero);
// Allocate it. It's zero most of the time.
Value *stack_ptr = builder->CreateAlloca(i8_t->getPointerTo(), alloca_size);
Contributor:

Is alloca(0) guaranteed to be a no-op? The docs say: https://llvm.org/docs/LangRef.html#id203

Allocating zero bytes is legal, but the returned pointer may not be unique.

So it's allowed to be a no-op, but not sure if that will always be the case...?

Member:

From that explanation, I would expect it to return the current stack pointer without doing anything else. Whether that is a no-op or not is debatable, but I think that serves the purpose here. I gather the worry is that it may instead allocate some smallest size or something?

Contributor:

I think maybe we should just go ahead with this and assume that it is a no-op. I can't imagine any target doing something else.

abadams (Member Author):

Putting the alloca behind a branch instead was cheaper anyway in the common case of not reallocating.
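The "alloca behind a branch" shape can be sketched in C++, with malloc standing in for alloca so the snippet is portable (the function and parameter names are hypothetical, not the actual generated code):

```cpp
#include <cstdlib>
#include <cstddef>

// Sketch of the branch described above: rather than always executing an
// alloca whose size is select'd down to 0 on the common path, only execute
// the allocation on the rare path where the pseudostack had no usable block.
void *get_buffer(bool returned_null, void *existing_ptr, size_t size) {
    if (returned_null) {
        // Rare path: (re)allocate. In the actual generated code this is an
        // alloca; malloc stands in for it here.
        return malloc(size);
    }
    // Common path: reuse the existing block; no allocation executes at all,
    // which sidesteps the alloca(0) question entirely.
    return existing_ptr;
}
```

This also avoids relying on any particular target's treatment of a zero-sized alloca, which the thread above notes is only loosely specified.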

@@ -262,6 +262,7 @@ bool function_takes_user_context(const std::string &name) {

bool can_allocation_fit_on_stack(int64_t size) {
user_assert(size > 0) << "Allocation size should be a positive number\n";
// Should match the threshold defined in runtime/pseudostack.cpp
alexreinking (Member) commented Oct 6, 2021:

I think it would be worthwhile to create a runtime/constants.h that's just a bunch of #defines or constexprs or something that records constants that need to be kept consistent between the compiler and the runtime.
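A minimal version of the suggested header might look like the following. The file name comes from the comment above; the namespace, constant name, and value are illustrative assumptions, not Halide's actual definitions.

```cpp
// runtime/constants.h -- sketch of the proposed shared-constants header.
// The constant below would replace the "should match the threshold" comment
// by being included from both the compiler and runtime/pseudostack.cpp.
#ifndef HALIDE_RUNTIME_CONSTANTS_H
#define HALIDE_RUNTIME_CONSTANTS_H

#include <cstdint>

namespace Halide {
namespace Runtime {
namespace Constants {

// Allocations at or below this size may be placed on the (pseudo)stack.
// (16 KB is an assumed value, for illustration.)
static constexpr int64_t maximum_stack_allocation_bytes = 16 * 1024;

}  // namespace Constants
}  // namespace Runtime
}  // namespace Halide

#endif  // HALIDE_RUNTIME_CONSTANTS_H
```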

abadams (Member Author):

Agreed, but can you think of any others to put in there? I've been hunting and coming up blank.

alexreinking (Member) commented Oct 7, 2021:

Maybe the target features enum belongs there? There are also a few places where we include all of HalideRuntime.h for a single macro (e.g. HALIDE_ALWAYS_INLINE). I still think it's worthwhile even for one constant, though, since it's a path towards enforcing consistency of this and future constants in code, rather than via a reminder in a comment.

steven-johnson (Contributor):
Lots of failures, apparently

abadams (Member Author) commented Oct 8, 2021:
Builds are clean. PTAL.

// of total stack used per Func, we bail and start making heap
// allocations instead.

// The following would use 200 mb of stack if we just kept
Contributor:

200mb of stack ought to be enough for anybody

@abadams abadams merged commit e058532 into master Oct 11, 2021