Reload all scrape pools concurrently #16595
Conversation
At the moment all scrape pools that need to be reloaded are reloaded one by one. While reloads are ongoing mtxScrape is locked. For each pool that's being reloaded we need to wait until all targets are updated. This whole process can take a while, and the more scrape pools there are to reload, the longer it takes. At the same time all pools are independent and there's no real reason to do them one by one. Reload each pool in a separate goroutine so we finish the config reload as soon as possible and unlock mtxScrape.
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
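For illustration only (this is not the actual patch), a minimal, self-contained Go sketch of the approach: fan the per-pool reloads out into goroutines and wait for all of them, so the total reload time is bounded by the slowest pool rather than the sum over all pools. The reloadPool helper and the durations are hypothetical stand-ins for scrapePool.reload().

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// reloadPool is a hypothetical stand-in for scrapePool.reload(); here it just
// sleeps to mimic waiting for a pool's targets to pick up the new config.
func reloadPool(name string, d time.Duration) {
	time.Sleep(d)
	fmt.Println("reloaded", name)
}

func main() {
	// Illustrative per-pool reload durations (not real numbers).
	pools := map[string]time.Duration{
		"job-a": 2 * time.Second,
		"job-b": 5 * time.Second,
		"job-c": 1 * time.Second,
	}

	start := time.Now()
	var wg sync.WaitGroup
	for name, d := range pools {
		wg.Add(1)
		go func(name string, d time.Duration) {
			defer wg.Done()
			reloadPool(name, d)
		}(name, d)
	}
	wg.Wait()

	// Concurrently this takes roughly the slowest pool (~5s) instead of the
	// sum of all pools (~8s) that a sequential loop would pay.
	fmt.Println("reload finished in", time.Since(start).Round(time.Second))
}
```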
A bit more context: while
thanks for this, code lgtm.
But I'd be interested in some numbers if you could share: for how many pools/targets do you need to wait, and for how long? Any profiling, and which operations are the heaviest?
Given that stop() and reload() themselves spawn routines, I don't know at which scale this could be problematic...
I noticed this issue because scrapes against the affected instance were failing, and config reload was taking almost 6 minutes there.
When I triggered a similar config change after applying this patch it all went down to 19 seconds.
This particular issue was with a config file that had 44 scrape jobs defined, each with a different number of targets (3 to 30), but they all share the same pattern where the targets are read from a JSON file but there's also a
thanks for the context. That's a good improvement. For such a config, 6 minutes is insane, but 19 seconds still feels like a lot.
That's interesting. I would expect only the target/scraping-related routes to be blocked, but not
It's different each time but I could trigger it by simply re-ordering elements of
Some of the exposed metrics describe scrape pools: see https://github.com/prometheus/prometheus/blob/main/scrape/metrics.go#L300-L302
Also: ApplyConfig() is a big function that does a lot. We could make the locks there more granular, since the lock doesn't need to be held the whole time, but doing so would make the exported metrics update less atomically. Right now you get metrics from before the reload or after the reload, so everything in a metrics response is consistent. If we make locking inside ApplyConfig more granular, then scraping Prometheus while ApplyConfig is running might return metrics from scrape pools that are being removed, etc. That seemed risky to me, so I didn't do it here.
I could make the locks more granular, which would make ApplyConfig() less blocking, but at the cost of /metrics responses potentially showing not-100%-consistent data if a scrape happens halfway through the ApplyConfig run. What do you think @machine424?
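To make the /metrics blocking concrete: a generic illustration (not the actual Prometheus code) of how a collector whose callback needs the same mutex that ApplyConfig holds will stall the /metrics handler until the reload finishes. The metric name and the mtxScrape variable here are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// mtxScrape stands in for the scrape manager's mutex.
var (
	mtxScrape sync.Mutex
	pools     = 3
)

func main() {
	reg := prometheus.NewRegistry()

	// The gauge's value is computed at scrape time while holding mtxScrape,
	// so a GET /metrics blocks for as long as a reload holds the lock.
	reg.MustRegister(prometheus.NewGaugeFunc(
		prometheus.GaugeOpts{Name: "scrape_pools_sketch"},
		func() float64 {
			mtxScrape.Lock()
			defer mtxScrape.Unlock()
			return float64(pools)
		},
	))

	// Simulate a slow ApplyConfig holding the lock for 10 seconds.
	go func() {
		mtxScrape.Lock()
		defer mtxScrape.Unlock()
		fmt.Println("simulated reload in progress...")
		time.Sleep(10 * time.Second)
	}()

	http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	http.ListenAndServe(":9999", nil)
}
```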
Thanks for the details. Yes, relaxing the present "consistency" model will certainly come at a cost, but blocking the endpoint for 19 seconds isn't ideal either. I was able to reproduce this.
There's always at least one unhealthy target in each pool, and that slows down each scrape pool's stop() when calling reload()->restartLoops(), as we need to wait for all scrapes to complete before a scrape pool is fully stopped. This adds a few seconds of delay when stopping each pool, and with enough pools, stopping them one by one adds up to a big delay. Reloading all pools at the same time means we just wait for the slowest one.
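Rough arithmetic, just to illustrate the point with the numbers from this thread: with 44 pools and a few seconds of stop/restart time per pool, a sequential reload pays roughly the sum of those delays (on the order of 44 × ~8 s ≈ 6 minutes), while a concurrent reload pays roughly the maximum, i.e. the slowest single pool (the ~19 s observed above).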
Anything else I can/should do here?
As I mentioned above, I agree that the change will help speed things up. Again, before opting for concurrency, I wanted to know (since it's not clear if you've verified this) whether we can avoid that delay altogether. It turns out that, prior to #7752, there was no wait and the reload was abrupt, with confusing side effects. If concurrency isn't enough, we could always consider restoring that behavior, without the side effects in #7097, behind a flag (we could use something like https://pkg.go.dev/context#CancelCauseFunc to identify a cancellation caused by a reload). Also, with a bit more logic, we could be selective about which changes actually require restarting a scrape loop; for post-scrape operations like metric relabeling (which don't really influence the scrape), we could just update the existing loop with the new config...
That said, let's go with this change for now. Since we'll be spawning a goroutine per target/loop, have you monitored how it behaves on some instances with "many" targets to reload?
(Regarding the /metrics and other target-related APIs stalling, there was an attempt to address that in #8104, reverted in #8176.) Anyway, we'll need to either relax consistency or keep an accessible copy of the needed data and swap it in; both options come at a cost. Let's continue the discussion somewhere else.
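For reference, a minimal sketch of the context.CancelCauseFunc idea mentioned above (errReload and scrapeLoop are hypothetical names, not Prometheus code): the loop can tell a reload-triggered stop apart from any other cancellation via context.Cause.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errReload is a hypothetical sentinel marking "stopped because of a config reload".
var errReload = errors.New("scrape loop stopped for config reload")

func scrapeLoop(ctx context.Context) {
	<-ctx.Done()
	if errors.Is(context.Cause(ctx), errReload) {
		fmt.Println("stopped for reload: skip the abrupt-stop side effects")
		return
	}
	fmt.Println("stopped for another reason:", context.Cause(ctx))
}

func main() {
	ctx, cancel := context.WithCancelCause(context.Background())
	go scrapeLoop(ctx)

	// On config reload, cancel with a cause so the loop knows why it stopped.
	cancel(errReload)
	time.Sleep(100 * time.Millisecond)
}
```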
To an extent; this PR was raised to fix an issue we had where we would get alerts about a few busy Prometheus instances being down because they were timing out scrape requests and so were marked as down. We've been running with this patch for a while now, and since deploying it we've stopped getting these alerts while not seeing any other issues.
Thanks, let's give this a try then.