vm.max_map_count growing steadily when vm.overcommit_memory is 2 #1328
Thank you for this great bug report and diagnosis. I don't think the current behavior was really chosen per se; it's just an emergent property of a combination of things we don't test very well. I think this needs some amount of philosophizing. (It might have to wait for @interwq before anyone can take a thorough look; we're stretched a little thin for now.) Ironically, one thing that might help is disabling the purging settings.
@davidtgoldblatt: thanks for looking into it and for the suggestion to disable purging. With jemalloc 5.1.0 this indeed makes the number of mappings grow much more slowly, but the overall number still increases over time. It is in the hundreds rather than in the tens of thousands now.
Just a side note regarding the kernel, if needed: this behavior could also be observed with kernels 3.10.x and 4.4.x.
For reference, here is a "fix" for 5.1.0:

```diff
diff --git a/3rdParty/jemalloc/v5.1.0/src/pages.c b/3rdParty/jemalloc/v5.1.0/src/pages.c
index 26002692d6..3fbad076ad 100644
--- a/3rdParty/jemalloc/v5.1.0/src/pages.c
+++ b/3rdParty/jemalloc/v5.1.0/src/pages.c
@@ -23,7 +23,7 @@ static size_t os_page;
 #ifndef _WIN32
 # define PAGES_PROT_COMMIT (PROT_READ | PROT_WRITE)
-# define PAGES_PROT_DECOMMIT (PROT_NONE)
+# define PAGES_PROT_DECOMMIT (PROT_READ | PROT_WRITE)
 static int mmap_flags;
 #endif
 static bool os_overcommits;
```

And for 5.0.1:

```diff
diff --git a/3rdParty/jemalloc/v5.0.1/src/pages.c b/3rdParty/jemalloc/v5.0.1/src/pages.c
index fec64dd01d..733652adf3 100644
--- a/3rdParty/jemalloc/v5.0.1/src/pages.c
+++ b/3rdParty/jemalloc/v5.0.1/src/pages.c
@@ -20,7 +20,7 @@ static size_t os_page;
 #ifndef _WIN32
 # define PAGES_PROT_COMMIT (PROT_READ | PROT_WRITE)
-# define PAGES_PROT_DECOMMIT (PROT_NONE)
+# define PAGES_PROT_DECOMMIT (PROT_READ | PROT_WRITE)
 static int mmap_flags;
 #endif
 static bool os_overcommits;
```

These "fixes" prevent the endless growth of mappings.
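To illustrate the mechanism behind these "fixes", here is a small standalone sketch (not jemalloc code; the sizes are arbitrary) of why giving a sub-range of a mapping a different protection such as PROT_NONE splits one VMA into several, while keeping the protection identical lets the kernel keep (or merge back to) a single VMA. jemalloc's decommit path uses mmap with MAP_FIXED and PROT_NONE over the range, which has a similar VMA-splitting effect to the mprotect call used here for simplicity.

```cpp
// Illustrative demo only: count /proc/self/maps lines before and after
// punching a PROT_NONE hole into an anonymous read/write mapping.
#include <sys/mman.h>
#include <unistd.h>
#include <fstream>
#include <iostream>
#include <string>

static long count_mappings() {
    std::ifstream maps("/proc/self/maps");
    long n = 0;
    std::string line;
    while (std::getline(maps, line)) ++n;
    return n;
}

int main() {
    const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    const size_t len = 64 * page;
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return 1;

    std::cout << "mappings before: " << count_mappings() << "\n";

    // "Decommit" a range in the middle with a different protection:
    // the single VMA is split into three.
    mprotect(static_cast<char*>(p) + 16 * page, 16 * page, PROT_NONE);
    std::cout << "mappings after PROT_NONE hole: " << count_mappings() << "\n";

    // Restoring the original protection typically lets the kernel merge
    // the adjacent VMAs back into one.
    mprotect(static_cast<char*>(p) + 16 * page, 16 * page,
             PROT_READ | PROT_WRITE);
    std::cout << "mappings after restore: " << count_mappings() << "\n";

    munmap(p, len);
    return 0;
}
```

This is why keeping PAGES_PROT_DECOMMIT at PROT_READ | PROT_WRITE in the patches above avoids the endless VMA growth, at the cost described later in the thread.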
@interwq Hello, is there any chance of getting this fixed for milestone 5.2.0?
@egaudry: there doesn't seem to be a straightforward fix I can think of right now. Like David mentioned, the current behavior under no overcommit isn't particularly optimized, as the environments we work with usually have overcommit enabled. I wasn't able to change the overcommit setting on my dev box somehow. Can you help try one more thing: running with malloc_conf retain:false?
Do you mean turning off retain? I'm not sure about the retain option: as @jsteemann observed, any option that merely reduces the number of mappings would not avoid the issue in the long term (i.e. with numerous allocations and/or a long-living process).
I meant that turning off retain is worth trying, since it could affect the number of mappings even in the long term. Plus, I believe the option was designed more with overcommit in mind; it may affect the number of mappings negatively with no overcommit.
Thanks, I will have a look (I thought this was a compile-time parameter).
Do I need to rebuild jemalloc? I should have stated that I am using (and need to stick to) version 4.5.0 here.
I tried retain:false at runtime via MALLOC_CONF; no rebuild was needed for me.
Jan, you are right: I relaunched my test with the current master branch and I was able to use retain:false at runtime. Unfortunately, it didn't solve (or not sufficiently) the large map-count issue I observed with one of our test cases.
@egaudry: that is also what I had observed before.
My main concern is that our users are reluctant to make such a change, or are not in a position to request it (e.g. cluster/centralized computing resources shared by different software), and I cannot reasonably expect them to switch to a more permissive overcommit mode.
@egaudry: yes, I understand this.
@interwq Qi, I hope our feedback can help. Please let us know if there is another test we can perform.
@jsteemann @egaudry thanks for all your feedback and help testing these cases! We did discuss this in our last team meeting; however, no straightforward fix came to mind. One thing that can for sure alleviate the issue is using a larger page size, e.g. building jemalloc with a larger --with-lg-page value. My best suggestion for right now is to combine that larger page size with the runtime options discussed above (retain and the dirty/muzzy decay settings); a sketch of such a configuration follows below.
For the long term, it's unclear whether we will focus on reducing the number of mappings without overcommit. On one hand, it's probably fair to consider this a limitation of the Linux kernel (requiring a max-mapping limit / suffering big performance degradation as mappings grow); IIRC FreeBSD doesn't have such issues. On the other hand, we have already spent effort working around this (i.e. the retain feature, but obviously only for the overcommit case). Let us know if the config above solves it for you, or how far it goes.
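As a concrete illustration of combining these suggestions, the sketch below shows one way an application could select them, using jemalloc's documented malloc_conf global (equivalently, the MALLOC_CONF environment variable). The option values are illustrative assumptions for the no-overcommit case, not an official recommendation from this thread.

```cpp
// Hypothetical combination for the no-overcommit case, building on the
// suggestions in this thread. Values are illustrative assumptions.
//
// Build-time part (larger jemalloc page size), e.g.:
//   ./configure --with-lg-page=16    // 64 KiB jemalloc pages instead of 4 KiB
//
// Run-time part: jemalloc reads the application-defined global `malloc_conf`
// (or the MALLOC_CONF environment variable) during initialization.
extern "C" const char* malloc_conf =
    "retain:false,"         // do not retain unused virtual memory ranges
    "dirty_decay_ms:-1,"    // -1 disables purging of dirty pages entirely...
    "muzzy_decay_ms:-1";    // ...fewer PROT_NONE holes, but memory is not returned to the OS

int main() {
    // Application code; jemalloc picks up malloc_conf when it initializes.
    return 0;
}
```

As observed earlier in the thread, disabling purging trades mapping growth for memory that is not returned to the OS promptly.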
@interwq Qi, this configuration indeed offers a solution (at least on the specific test case I'm using). I do understand that having vm.overcommit_memory=2 might not be really relevant nowadays, so I won't argue further for a fix; I will instead rely on this configuration when needed. The downside of this solution is that for an external user (i.e. one not aware of jemalloc's behavior) looking at VmRSS and VmSize, it will be difficult to understand when memory goes back to the system (because it relies on jemalloc 5's muzzy/dirty decay-time behavior, but that is out of scope here). Thank you all for your feedback and the solutions offered.
@egaudry: glad it worked for you. Re: the time-based decay, we did observe efficiency wins on the vast majority of our binaries. Given that memory reuse is usually very frequent, in theory we should only start purging memory after the workload has finished or been reduced; time-based decay does that a lot better than the previous ratio-based decay, which assumes a fixed ratio. However, I also understand that memory not returned to the OS immediately may cause some confusion, especially in micro-benchmarks (we got quite a few questions on that front). We had some discussion about combining time-based decay with ratio-based decay; however, the exact approach is still a bit unclear. Please feel free to share your use cases / thoughts, or ask for features, in those discussions.
Is there an open issue tracking the problem that jemalloc in its default configuration, on a system with overcommit disabled (vm.overcommit_memory=2), will exhaust the default mapping limit under normal usage patterns? I'm having trouble finding the decay-algorithm discussions that @interwq mentioned as the right place to pursue this further. I also don't understand jemalloc well enough to grasp how even the best decay algorithm would prevent eventually hitting the mapping limit: eventually unused pages will still decay, and mappings will split, right? If they split more often than they recombine, eventually the limit will be hit. We ship jemalloc in the release binaries of our scientific computing project, with the result that on some high-performance computing systems where memory overcommit has been disabled by cluster administrators, our software crashes because it stops being able to allocate memory. Is the recommended solution to add the tuning described above (larger page size and decay settings) to our builds?
(Sorry, I accidentally posted and then deleted a half-done comment.) Reopening, since I don't think we yet have a good general solution to this class of issues, even though the original question seems solved; there's more left to do here. It may be the case that recent changes have helped some (e.g. opt.oversize_threshold). Even better would be to turn off retain for oversized allocations, even if it's on for smaller ones (which can't be done as a tuning change, I think; it needs a little bit of extra jemalloc code written). I don't know that there's something that would make us consider this problem "solved". Fundamentally, saying so with confidence would need production performance testing across a range of applications, and I'm not sure that the core dev team has the ability to do that sort of "testing in anger", given the sorts of prod systems we touch day-to-day (e.g. none of us work on HPC scientific computing applications, and so we can't form and test guesses on which configurations work well there). We'd definitely be receptive to PRs updating configuration settings / tweaking allocation strategies for those cases.
@adamnovak I'm still puzzled by the fact that people tend to believe that disabling overcommit and/or limiting max_map_count is the way to go on computing nodes. I know for a fact that it is pretty difficult to get sysadmins to change those settings (mainly because they have been using them for more than a decade), but limiting virtual memory does not make much sense in the HPC world... If you consider that, for instance, CUDA will reserve virtual memory equal to the physical memory detected on the host when starting, the problem becomes broader too.
@egaudry I can't necessarily explain why someone would want to disable overcommit either. The best I can come up with is that they want jobs to fail fast at allocation time, rather than after wasting a bunch of cluster time filling in the pages that did happen to fit in memory. My project's immediate users are the scientists who sometimes get handed clusters with overcommit off, not the people who decided to adopt that setting, so I need to provide at least passable, if not particularly performant, behavior in that environment. As for limiting max_map_count, the default limit on my workstation is 65530, without me having done anything to reduce it. So I don't think that people are choosing to limit it so much as that they haven't thought to increase it.
Like mentioned, there is currently no plan to focus on the no-overcommit case. @adamnovak: the page-size + decay tuning should alleviate the issue. For the decay settings, you can also run the binary with the MALLOC_CONF environment variable instead of rebuilding.
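If it is unclear whether a given MALLOC_CONF value was picked up, a small check like the following (a sketch assuming a jemalloc 5.x build without a function prefix, using the mallctl() API and standard opt.* keys) can read the effective options back at startup:

```cpp
// Reads back a couple of jemalloc option values to verify what actually
// took effect (e.g. after setting the MALLOC_CONF environment variable).
#include <jemalloc/jemalloc.h>
#include <sys/types.h>
#include <cstdio>

int main() {
    bool retain = false;
    size_t sz = sizeof(retain);
    if (mallctl("opt.retain", &retain, &sz, nullptr, 0) == 0) {
        std::printf("opt.retain = %s\n", retain ? "true" : "false");
    }

    ssize_t dirty_decay_ms = 0;
    sz = sizeof(dirty_decay_ms);
    if (mallctl("opt.dirty_decay_ms", &dirty_decay_ms, &sz, nullptr, 0) == 0) {
        std::printf("opt.dirty_decay_ms = %zd\n", dirty_decay_ms);
    }
    return 0;
}
```

Running it as, for example, MALLOC_CONF="retain:false,dirty_decay_ms:-1" ./a.out should print the overridden values.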
(Deployment log excerpt in which Redis points at this issue.) A Rails 8.0.1 deploy failed during boot with ActiveSupport::MessageEncryptor::InvalidMessage; the Redis server started in the same container logged:

8:M 28 Feb 2025 07:42:36.988 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can can also cause failures without low memory condition, see jemalloc/jemalloc#1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

The deploy itself ultimately failed its health check: "Error: target failed to become healthy within configured timeout (30s)".
Currently, the CI outputs the following warning when running the Redis tests:

> 752:M 06 May 2025 05:31:27.212 # Server initialized
> 3752:M 06 May 2025 05:31:27.212 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can can also cause failures without low memory condition, see jemalloc/jemalloc#1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

Not a big fan of warnings on CI.
Original issue description:
In an application that links jemalloc statically, I am seeing an ever-increasing number of memory mappings for the process, which eventually runs into the vm.max_map_count limit. The overcommit_memory setting is 2, so there is no overcommitting.

It seems that jemalloc reads the overcommit setting at startup and later takes it into account when "returning" memory. When overcommit_memory is set to 2, it seems to call mmap on the returned range with a protection of PROT_NONE. This punches holes into existing mappings, so the kernel splits them and creates more of them. That would not be a problem if it happened only rarely, but we have several use cases in which it happens so often that even increasing vm.max_map_count to tens of millions does not help much.

I have created a (contrived) standalone test program which shows the behavior; I hope it is somewhat deterministic so others can reproduce it. It can be compiled with g++ and run directly; a sketch of this kind of reproducer is included below. The program allocates memory of pseudo-random sizes and frees some of it, from a few parallel threads. Each thread caps the amount of memory it keeps allocated, so the program should not leak. Each thread writes some values to std::cout; the only interesting figure to look at is the reported "mappings" value, which is calculated as the number of lines in /proc/self/maps. That is not 100% accurate, but should be a good-enough approximation.

The problem is that when overcommit_memory is set to 2, the number of mappings grows dramatically, with both jemalloc 5.0.1 and jemalloc 5.1.0. A "fix" for the problem is to apply the PAGES_PROT_DECOMMIT patch shown in the comments above (replacing PROT_NONE with PROT_READ | PROT_WRITE). This makes the test program run with a very low number of memory mappings. It is obviously not a good fix, because it leaves the memory around with read & write access allowed, so please consider it just a demo.
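The original reproducer was not preserved in this thread; the following is a minimal sketch of the kind of program described above. Thread count, size ranges, and iteration counts are illustrative guesses, not the original values, and it needs to be linked against jemalloc (e.g. with -ljemalloc or via LD_PRELOAD) to show the effect.

```cpp
// Sketch of a reproducer: a few threads allocate pseudo-random sizes,
// keep a bounded working set, and periodically report the number of
// lines in /proc/self/maps.
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <mutex>
#include <random>
#include <string>
#include <thread>
#include <utility>
#include <vector>

static long count_mappings() {
    std::ifstream maps("/proc/self/maps");
    long n = 0;
    std::string line;
    while (std::getline(maps, line)) ++n;
    return n;
}

static std::mutex io_mutex;

static void worker(unsigned seed, std::size_t max_live_bytes, long iterations) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> size_dist(16, 2u << 20);  // 16 B .. 2 MiB
    std::vector<std::pair<void*, std::size_t>> live;
    std::size_t live_bytes = 0;

    for (long i = 0; i < iterations; ++i) {
        std::size_t sz = size_dist(rng);
        live.emplace_back(std::malloc(sz), sz);
        live_bytes += sz;

        // Free random blocks until back under the per-thread cap,
        // so the program itself does not grow without bound.
        while (live_bytes > max_live_bytes && !live.empty()) {
            std::size_t victim = rng() % live.size();
            live_bytes -= live[victim].second;
            std::free(live[victim].first);
            live[victim] = live.back();
            live.pop_back();
        }

        if (i % 100000 == 0) {
            std::lock_guard<std::mutex> lock(io_mutex);
            std::cout << "iteration " << i
                      << " live bytes " << live_bytes
                      << " mappings " << count_mappings() << std::endl;
        }
    }
    for (auto& p : live) std::free(p.first);
}

int main() {
    const int num_threads = 4;                // illustrative
    const std::size_t max_live = 256u << 20;  // 256 MiB per thread, illustrative
    const long iterations = 1000000;          // illustrative
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t)
        threads.emplace_back(worker, 1234u + t, max_live, iterations);
    for (auto& th : threads) th.join();
    std::cout << "final mappings: " << count_mappings() << std::endl;
    return 0;
}
```

With overcommit disabled, the reported mappings count is expected to climb steadily over the run; with the patch above applied, it stays roughly flat.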
I think it would be good to make jemalloc more usable with an overcommit_memory setting of 2. Right now it is kind of risky to use, because applications may quickly hit the default vm.max_map_count value of about 65K, and even increasing that setting does not help much: the number of mappings keeps growing over time, which means long-running server processes can hit the threshold eventually, even if it has been raised. I guess the current implementation is the way it is for a reason, so you may be reluctant to change it. However, it would be good to suggest how to avoid this behavior on systems that don't use overcommit and where vm settings cannot be adjusted. Could an option be added to jemalloc to adjust the commit/decommit behavior in this case, when explicitly configured as such? I think this would help plenty of users, as I have seen several issues in this repository that may have the same root cause. The last one I checked was #1324.
Thanks!
(btw., although I don't think it makes any difference: the above was tried on Linux kernel 4.15, both on bare metal and on an Azure cloud instance; the compilers in use were g++-7.3.0 and g++-5.4.0)