Optimize completion query, benchmarks, JobInsertFullMany #904
Conversation
Force-pushed from 41e32e3 to 673bd71
```go
func (e *Executor) JobInsertFullMany(ctx context.Context, params *riverdriver.JobInsertFullManyParams) ([]*rivertype.JobRow, error) {
	// TODO: Implement this
	return nil, nil
}
```
I wasn't totally sure what to do here and didn't want to rabbithole on it while getting the rest of this working. Looks like I just need to loop over single inserts as is done with JobInsertFastMany?
Yeah, there's no way to do batch insertions right now, so the right answer for this one is just to iterate in the driver and call into the normal JobInsertFull a bunch of times.
Force-pushed from f62cb06 to dbbedfe
Force-pushed from dbbedfe to e8c9f77
@brandur down to one failing sqlite test that seems to be an issue with timezone conversions. You might have a better idea of what's going on there?
It looks like you got it right? Usually this stems from use of:

```go
return &rivertype.JobRow{
	ID:           internal.ID,
	Attempt:      max(int(internal.Attempt), 0),
	AttemptedAt:  attemptedAt,
	AttemptedBy:  internal.AttemptedBy,
	CreatedAt:    internal.CreatedAt.UTC(),
	EncodedArgs:  internal.Args,
	Errors:       errors,
	FinalizedAt:  finalizedAt,
	Kind:         internal.Kind,
	MaxAttempts:  max(int(internal.MaxAttempts), 0),
	Metadata:     internal.Metadata,
	Priority:     max(int(internal.Priority), 0),
	Queue:        internal.Queue,
	ScheduledAt:  internal.ScheduledAt.UTC(),
	State:        rivertype.JobState(internal.State),
	Tags:         internal.Tags,
	UniqueKey:    internal.UniqueKey,
	UniqueStates: uniquestates.UniqueBitmaskToStates(uniqueStatesByte),
}, nil
```

Postgres has timezones in its timestamps, so it can tolerate a time being put in with one timezone, keep track of that timezone, and then compare it to another time in a different zone, but SQLite is much less tolerant of all of that.
Good stuff man. The new query seems easier to read anyway; fewer CTEs is always better.
No, actually, it just passes in CI because it's on UTC :) I fixed it now locally and in CI.

I found a handful of places in the driver that were sending timestamps to the DB without converting them to UTC, including the helpers.
@brandur I don't think there's any risk to merging here w/ the sqlite changes I made but you should definitely check them out and see if I did something dumb or unnecessary, or if we're missing test coverage and I fixed real issues.
I for sure had to make the change for …
```diff
@@ -196,7 +196,7 @@ func (e *Executor) JobCancel(ctx context.Context, params *riverdriver.JobCancelP
 	return dbutil.WithTxV(ctx, e, func(ctx context.Context, execTx riverdriver.ExecutorTx) (*rivertype.JobRow, error) {
 		dbtx := templateReplaceWrapper{dbtx: e.driver.UnwrapTx(execTx), replacer: &e.driver.replacer}

-		cancelledAt, err := params.CancelAttemptedAt.MarshalJSON()
+		cancelledAt, err := params.CancelAttemptedAt.UTC().MarshalJSON()
```
Hmm, the only thing is that although Postgres has a timestamptz type, that doesn't extend to jsonb fields, so if we're UTCing everything in SQLite here, then we should probably be doing the same in Postgres too, I think, right?
That said, I think we should've avoided UTCing everything in the first place. Most people already operate in UTC, but in case they don't, we shouldn't force them to use it.
It might be too late for Postgres in that it's possibly breaking, but given SQLite is brand new, there might still be time. I'll take a pass to see if I can vet it.
To recap: in text mode, Postgres handles timestamptz by converting everything to UTC prior to storing it in the DB, based on session-level settings. In binary mode, the client must do this conversion itself before sending the values over the wire.
In SQLite, however, there is no proper datetime type like timestamptz, though there are some helpers for comparing and manipulating appropriately formatted strings. It looks like these do actually support timezones.
What I was seeing prior to 7129534 was a test that failed because a time was being provided in my local UTC-5 zone, and after JobInsertFullMany that same time value came back with a 0 offset for UTC, i.e. off by 5 hours. So somewhere along the line we were losing the timezone info and not converting correctly. If you revert that commit locally, you should see similar failures. I left it in a separate commit in this PR specifically to make that easy if you want to play around with it.
Oh yeah, all that makes sense. As written, moving everything to UTC was the right move given the current framework.
I was commenting on this line in particular though, which stores a timestamp to a JSON blob (metadata -> cancel_attempted_at); in that case you just have a normal JSON string in both databases, i.e. timestamptz does not activate in Postgres.
Ahh gotcha. Yeah, I think in this specific case, since that's an internal system-level timestamp, it's probably fine to keep it all in UTC.
I meant to do this as part of #904 because the `riverdrivertest.go` file is already super long, and the last thing we need is to add another growing group of benchmarks to it.
As part of a recent experiment, I was paying a lot of attention to the job completion query. I did some analysis of it and figured out that this refactoring can speed it up by about 15%. It also appears to increase throughput significantly.
In the process of figuring this out I decided to start a driver benchmark suite similar to our test suites. I also needed access to JobInsertFullMany to insert many jobs in a variety of states, so I pulled that over from Pro.

Query stats
Collected stats for benchmark runs, resetting between each.
master
With this change
Benchmark
These benchmarks are completing 2000 jobs using a variety of job states to hit all clauses in the query.
On master
With this change
riverbench
The exact numbers vary as I run this several times (as always), but the pattern is consistent: this change bumps throughput by around 8-10k jobs/sec.
On master
With change: