Preliminary "foot in the door" SQLite support by brandur · Pull Request #870 · riverqueue/river · GitHub

Preliminary "foot in the door" SQLite support #870

Merged · brandur merged 1 commit into master from brandur-sqlite on May 14, 2025

Conversation

brandur
Contributor
@brandur brandur commented May 1, 2025

Here, add an implementation for SQLite. The new driver passes the entire
driver test suite (including some newly added tests), and checks out
against a new end-to-end suite of client tests targeting it. I think at
this point we can consider it largely functional, albeit with minimal
real-world vetting: at least alpha quality that we can start putting in
front of users.

The main reason this turned out to be quite an effort is that SQLite,
although nominally supporting a lot of the same syntax Postgres does,
has a lot of unexpected limitations and quirks that each need to be
tracked down and worked around separately. For example:

  • SQLite doesn't really have data types. Everything is either an
    integer, real, text, or blob, and anything more sophisticated has to
    be shoehorned into one of those types. For example, there's no
    boolean. Despite the recent fanfare, there isn't even a jsonb type;
    jsonb values get pushed into a blob.

  • The most annoying missing data type is a timestamp. Date/times are
    stored as either Unix integers or strings, and SQLite provides a
    number of obnoxious, error-prone built-ins to work with them. (A
    sketch of the kind of workaround this forces follows this list.)

  • No arrays.

  • No data modification operations in CTEs. (Although I'm counting my
    lucky stars that CTEs are supported at all.)

  • No listen/notify.
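
To make the timestamp point concrete, here's a minimal sketch (not River's actual driver code) of the kind of workaround this forces: store a Go time.Time as RFC 3339 text via database/sql and the modernc.org/sqlite driver, then parse it back on read. The event table is hypothetical.

    package main

    import (
        "database/sql"
        "fmt"
        "time"

        _ "modernc.org/sqlite" // registers the driver under the name "sqlite"
    )

    func main() {
        db, err := sql.Open("sqlite", ":memory:")
        if err != nil {
            panic(err)
        }
        defer db.Close()

        // No timestamp type, so the column is plain TEXT and values are
        // serialized/parsed at the application layer.
        if _, err := db.Exec(`CREATE TABLE event (id INTEGER PRIMARY KEY, created_at TEXT NOT NULL)`); err != nil {
            panic(err)
        }
        if _, err := db.Exec(`INSERT INTO event (created_at) VALUES (?)`,
            time.Now().UTC().Format(time.RFC3339Nano)); err != nil {
            panic(err)
        }

        var raw string
        if err := db.QueryRow(`SELECT created_at FROM event ORDER BY id LIMIT 1`).Scan(&raw); err != nil {
            panic(err)
        }
        createdAt, err := time.Parse(time.RFC3339Nano, raw)
        if err != nil {
            panic(err)
        }
        fmt.Println(createdAt)
    }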

Aside from that, we had to get some testing infrastructure in place, and
our testing has traditionally been quite tied to Postgres, which would've
made a lot of changes necessary. However, the schema additions in #848
help make this quite a bit easier via the driver-based TestSchema and
TestTx functions, and because they raise schemas automatically, they
don't complicate test setup instructions with any extra steps.

A new TestDriverRiverSQLite test has been added to run the full
barrage of driver tests for SQLite. I've also added a new family of
TestClientWithDriver tests that are similar to TestDriver* in that
they run for each of our supported drivers, but in this case they do a
series of end-to-end tests to more fully exercise the entire River
client. I added a respectable set of base client driver tests that walk
each driver through all basic functionality (including features like
JobList that are at higher risk of behaving differently from one
database to the next), but as with most tests, they could stand to be
extended further, which I'm sure we'll get to in the future.

An example test for SQLite has been added, demonstrating the use of
River with a SQLite driver and in-memory SQLite database.
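
For flavor, here's a rough sketch of the shape such usage takes. river.NewClient, river.Config, river.QueueConfig, and river.QueueDefault are River's real API; the exact riversqlite.New constructor and the preconfigured workers value are assumptions based on the pattern the other drivers follow.

    db, err := sql.Open("sqlite", ":memory:") // modernc.org/sqlite imported for side effects elsewhere
    if err != nil {
        log.Fatal(err)
    }

    // riversqlite.New wrapping a *sql.DB is assumed by analogy with the other
    // River drivers; check the driver's own docs for the real constructor.
    client, err := river.NewClient(riversqlite.New(db), &river.Config{
        Queues: map[string]river.QueueConfig{
            river.QueueDefault: {MaxWorkers: 10},
        },
        Workers: workers, // a *river.Workers with job workers already registered
    })
    if err != nil {
        log.Fatal(err)
    }
    if err := client.Start(ctx); err != nil {
        log.Fatal(err)
    }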

A few caveats:

  • I've found that the combination of SQLite's reduced capabilities and
    sqlc's bugginess [1] makes batch insert/update operations basically
    impossible (I tried every workaround I could possibly think of), so
    these are done in loops of individual operations instead; see the
    sketch after this list. I think this is okay for now though. For
    one, sqlc will hopefully get these limitations fixed eventually, and
    for two, SQLite databases will often be running locally, meaning the
    round trip cost per operation is much lower than what we'd see in a
    hosted Postgres somewhere.

  • It's not totally clear that having a SQLite implementation will be
    "worth it enough" in the long run given that it will add some
    difficulty to non-trivial future database-related operations. My hope
    is that as it's in a prerelease state, we can gauge how bad it is to
    keep up to date. If it feels like way more effort than it's worth, we
    can still axe it before it ever becomes a functional driver.
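
As a rough illustration of the loop approach from the first caveat (a sketch only; the column list is simplified and isn't River's real insert query), individual statements can run inside a single transaction, which keeps per-operation overhead tolerable when the database is local:

    // jobs is a hypothetical slice of rows waiting to be inserted.
    tx, err := db.BeginTx(ctx, nil)
    if err != nil {
        return err
    }
    defer tx.Rollback()

    stmt, err := tx.PrepareContext(ctx, `INSERT INTO river_job (kind, args) VALUES (?, ?)`)
    if err != nil {
        return err
    }
    defer stmt.Close()

    // One statement per row rather than a single batch statement.
    for _, job := range jobs {
        if _, err := stmt.ExecContext(ctx, job.Kind, job.Args); err != nil {
            return err
        }
    }
    return tx.Commit()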

[1] sqlc-dev/sqlc#3802 (comment)

@brandur brandur force-pushed the brandur-sqlite branch 6 times, most recently from 0bc580b to ddb3277 on May 1, 2025 05:53
@travisby
travisby commented May 1, 2025

Just to highlight: I'm super stoked for this.

I know that the river recommendation is to spin up postgres for tests, but I really like (external) dependency-less tests. I have some pretty gross/extensive mocks for testing and for a "local development" version of our app.

This is going to let me do sooo much cleanup, thanks! Hoping that

It's not totally clear that having a SQLite implementation will be "worth it enough"

we find enough use cases for that not to be the case! I understand my more niche example is likely not enough reason on its own!

@brandur
Contributor Author
brandur commented May 2, 2025

@travisby Great! Thanks for the feedback. We'll try to get this over the line.

@brandur brandur force-pushed the brandur-sqlite branch 6 times, most recently from 8e8cb6c to bfaa56e on May 3, 2025 21:00
@brandur
Contributor Author
brandur commented May 3, 2025

Alright, I added implementations for all of river_queue as well, along with river_client + river_client_queue, although those aren't actually used and we'll probably delete them. That leaves only river_job.

I ran into a bit of a problem in that I'm pretty sure sqlc + SQLite + use of jsonb is completely broken, and had to implement everything using normal json for now (which TBH, probably isn't that much worse). I opened a discussion about that on sqlc, but given some of the track record of other questions/comments I've left on sqlc issues over the last couple weeks, I'm not super optimistic that it'll go anywhere:

sqlc-dev/sqlc#3953

I did fix at least one existing bug during the implementation too, namely that unpausing a river_queue would also target an already unpaused queue, despite comments in tests clearly indicating that this was unintentional (and missing assertions to verify the no-op). Kind of a nice side benefit.

@brandur
Contributor Author
brandur commented May 6, 2025

@bgentry Hey, how would you feel about me bringing this in? I know it's a little unconventional, but I'm now really close — I got almost all of river_job implemented, and am down to only two queries left. Unfortunately it's the two worst ones, and I'll have to do some work at the client level for sure to make sure we have adequate test coverage, but I'm feeling pretty good that this is going to work.

I ended up adding a bunch of new test cases in the driver tests to help tease out problems with SQLite. Unfortunately that makes the PR more prone to collision, but on the plus side, the added coverage will help tease out any other potential problems that might still exist in #883.

If brought into mainline, I figure I'll be able to keep chipping away at it and probably have the whole thing working within a week or two.

@brandur brandur force-pushed the brandur-sqlite branch 5 times, most recently from 78c1733 to d05bc02 on May 11, 2025 01:21
@@ -715,7 +725,7 @@ func NewClient[TTx any](driver riverdriver.Driver[TTx], config *Config) (*Client
// we're actually going to be working jobs (as opposed to just enqueueing
// them):
if config.willExecuteJobs() {
if !driver.HasPool() {
if !driver.PoolIsSet() {
Contributor Author

I ended up renaming this so that it sorts more closely with PoolSet in the driver files.

@brandur brandur force-pushed the brandur-sqlite branch 3 times, most recently from 5acd3af to a02c9cc on May 11, 2025 01:29
@brandur brandur requested a review from bgentry May 11, 2025 01:32
@brandur brandur force-pushed the brandur-sqlite branch 4 times, most recently from 8515ecc to cec4162 on May 11, 2025 08:25
Contributor
@bgentry bgentry left a comment

made it through again 😅

CHANGELOG.md Outdated
@@ -7,6 +7,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added

- Preliminary River driver for SQLite (`riverdriver/riversqlite`). This driver seems to produce good results as judged by the test suite, but so far as minimal real world vetting. Try it and let us know how it works out. [PR #870](https://github.com/riverqueue/river/pull/870).
Contributor

Suggested change
- Preliminary River driver for SQLite (`riverdriver/riversqlite`). This driver seems to produce good results as judged by the test suite, but so far as minimal real world vetting. Try it and let us know how it works out. [PR #870](https://github.com/riverqueue/river/pull/870).
- Preliminary River driver for SQLite (`riverdriver/riversqlite`). This driver seems to produce good results as judged by the test suite, but so far has minimal real world vetting. Try it and let us know how it works out. [PR #870](https://github.com/riverqueue/river/pull/870).

Contributor Author

Thx. Fixed.

@@ -1997,12 +2014,16 @@ func (c *Client[TTx]) JobList(ctx context.Context, params *JobListParams) (*JobL
}
params.schema = c.config.Schema

if c.driver.DatabaseName() == "sqlite" && params.metadataFragment != "" {
return nil, errors.New("JobListResult.Metadata is not supported on SQLite")
Contributor

The UI doesn't have metadata filters yet, but when I add args + metadata filters this will break. We may need to expose a flag in the UI's API for whether these filters should be enabled.

Contributor Author

Let's discuss this more, but I'm kind of thinking we should switch to JSON Path for that stuff. It's more general and can be supported by multiple databases.

@@ -12,16 +12,27 @@ require (
github.com/stretchr/testify v1.10.0
go.uber.org/goleak v1.3.0
golang.org/x/mod v0.24.0
modernc.org/sqlite v1.37.0
Contributor

Double-checking that none of this will get pulled in to a non-sqlite app's deps. Like if I switch the demo project to use this version, does it even download sqlite at all?

Contributor Author

Doh, damn, good catch on that. Yeah, that would've added modernc as a dep everywhere.

Basically what happened is that as I was trying to test things, I'd added an import ( _ "modernc.org/sqlite" ) to riversharedtest. It wasn't needed by the end, but because of the underscore import, goimports can't detect that it's not being used, and therefore left it in.

There is a DBPoolSQLite function in there, but it only needs the sql package. The caller tests must be importing modernc.org/sqlite, but that's in the test assembly.

I took out the import and tidied. This dependency is gone now.
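
For reference, the leftover looked something like this (illustrative): a blank import exists only for its side effect of registering the driver, so goimports has no way to tell it's unneeded and leaves it alone.

    import (
        // Blank import: registers the "sqlite" database/sql driver as a side
        // effect, so tooling won't remove it even when nothing references it.
        _ "modernc.org/sqlite"
    )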

Comment on lines +49 to +52
// TODO(brandur): Remove this before first release including SQLite driver. It's
// needed temporarily because there's no riversqlite tag to target (the one
// referenced above is fake and does not exist).
replace github.com/riverqueue/river/riverdriver/riversqlite => ../river/riverdriver/riversqlite
Contributor

Highlighting just to not forget this before release

Contributor Author

Yeah, +1. I have a big warning banner at the top of my TODO list as a reminder to myself to remove it.

// it back early means that a detach can no longer be run on it. It
// doesn't matter much if an extra schema is attached until the tests
// finish running.
// require.NoError(t, exec.SchemaAttach(ctx, &riverdriver.SchemaAttachParams{Schema: schemaVersion2}))
Contributor

making sure you meant to leave this in here

Contributor Author

Ah nope. That was part of a previous attempt at getting things to work and is no longer needed. I also removed one other similar block in this file that was also commented out.

@brandur brandur force-pushed the brandur-sqlite branch 6 times, most recently from 02e5a41 to d359c14 on May 14, 2025 01:17
@brandur
Contributor Author
brandur commented May 14, 2025

Thanks Blake!

Okay, I changed my mind a bit and am going to merge this in after all. I do still want to try to get an improvement for jsonb in there, but even if I pull that off, the sqlc release cycles are so long that it's going to take some time. I also realized that since json and jsonb are both actually blob in SQLite, there won't be any required DDL for anyone who made a table with json. Given the driver is still pretty alpha, I think I'll just add a note in the changelog with some UPDATE statements that'll convert json to jsonb instead of doing a full migration.
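
As a sketch of what such a conversion might look like (an assumption, not the final changelog wording; it relies on SQLite's jsonb() function, available since 3.45, and the column list is illustrative):

    // One-off conversion of JSON text columns to SQLite's jsonb encoding,
    // instead of a full schema migration. Columns shown are illustrative.
    if _, err := db.ExecContext(ctx,
        `UPDATE river_job SET args = jsonb(args), metadata = jsonb(metadata)`,
    ); err != nil {
        return err
    }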

I also added SQLite support to the River CLI to make sure people trying it out have the ability to run migrations and everything.

@brandur brandur merged commit d04b82a into master May 14, 2025
10 checks passed
@brandur brandur deleted the brandur-sqlite branch May 14, 2025 01:22
brandur added a commit that referenced this pull request May 16, 2025
…ut metadata

This one follows up #870 to add an optimization for job completion where
we separate out the most common case of setting jobs to `completed`
without any metadata required and update all of them in a simplified
batch query, then do the rest of the completions afterwards.

In any non-degenerate queue, most completions will be setting success
states, so this should help with real-world uses, but it also helps us
significantly improve SQLite's benchmarking numbers.

Here's a new benchmark run where throughput is ~4x what it was doing
before and roughly on par with Postgres:

    $ go run ./cmd/river bench --database-url "sqlite://:memory:" --num-total-jobs 1_000_000
    bench: jobs worked [          0 ], inserted [    1000000 ], job/sec [        0.0 ] [0s]
    bench: jobs worked [      88218 ], inserted [          0 ], job/sec [    44109.0 ] [2s]
    bench: jobs worked [      91217 ], inserted [          0 ], job/sec [    45608.5 ] [2s]
    bench: jobs worked [      88858 ], inserted [          0 ], job/sec [    44429.0 ] [2s]
    bench: jobs worked [      77219 ], inserted [          0 ], job/sec [    38609.5 ] [2s]
    bench: jobs worked [      82045 ], inserted [          0 ], job/sec [    41022.5 ] [2s]
    bench: jobs worked [      84052 ], inserted [          0 ], job/sec [    42026.0 ] [2s]
    bench: jobs worked [      72028 ], inserted [          0 ], job/sec [    36014.0 ] [2s]
    bench: jobs worked [      90047 ], inserted [          0 ], job/sec [    45023.5 ] [2s]
    bench: jobs worked [      88875 ], inserted [          0 ], job/sec [    44437.5 ] [2s]
    bench: jobs worked [      89240 ], inserted [          0 ], job/sec [    44620.0 ] [2s]
    bench: jobs worked [      88842 ], inserted [          0 ], job/sec [    44421.0 ] [2s]
    bench: jobs worked [      59359 ], inserted [          0 ], job/sec [    29679.5 ] [2s]
    bench: total jobs worked [    1000000 ], total jobs inserted [    1000000 ], overall job/sec [    42822.8 ], running 23.35203575s

Here's a normal non-memory file-based database:

    $ go run ./cmd/river bench --database-url "sqlite://./sqlite/bench.sqlite3" --num-total-jobs 1_000_000
    bench: jobs worked [          0 ], inserted [    1000000 ], job/sec [        0.0 ] [0s]
    bench: jobs worked [      83657 ], inserted [          0 ], job/sec [    41828.5 ] [2s]
    bench: jobs worked [      76648 ], inserted [          0 ], job/sec [    38324.0 ] [2s]
    bench: jobs worked [      88036 ], inserted [          0 ], job/sec [    44018.0 ] [2s]
    bench: jobs worked [      75473 ], inserted [          0 ], job/sec [    37736.5 ] [2s]
    bench: jobs worked [      82604 ], inserted [          0 ], job/sec [    41302.0 ] [2s]
    bench: jobs worked [      84048 ], inserted [          0 ], job/sec [    42024.0 ] [2s]
    bench: jobs worked [      85508 ], inserted [          0 ], job/sec [    42754.0 ] [2s]
    bench: jobs worked [      90580 ], inserted [          0 ], job/sec [    45290.0 ] [2s]
    bench: jobs worked [      83568 ], inserted [          0 ], job/sec [    41784.0 ] [2s]
    bench: jobs worked [      86062 ], inserted [          0 ], job/sec [    43031.0 ] [2s]
    bench: jobs worked [      88508 ], inserted [          0 ], job/sec [    44254.0 ] [2s]
    bench: jobs worked [      75308 ], inserted [          0 ], job/sec [    37654.0 ] [2s]
    bench: total jobs worked [    1000000 ], total jobs inserted [    1000000 ], overall job/sec [    42331.9 ], running 23.622860125s

The improved benchmarks only work for fixed job burndown mode (with the
`--num-total-jobs` option) because inserting jobs is still pretty slow,
since it's still done one by one.

Once again, I'm pretty sure I'll be able to land some SQLite fixes
that'll make batch operations possible using `json_each`, and then we
should be able to make all normal operations batch-wise. That'll take
some time though, and we can get this optimization out in time for the
initial SQLite release.
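
For context on the json_each idea above, here's a hedged sketch of how a batch insert might eventually look: a single JSON array parameter is expanded server-side, so many rows land in one statement. json_each and json_extract are real SQLite functions, but the column list and payload shape are simplifications, not River's eventual implementation.

    payload, err := json.Marshal([]map[string]any{
        {"kind": "email", "args": map[string]any{"to": "user@example.com"}},
        {"kind": "email", "args": map[string]any{"to": "other@example.com"}},
    })
    if err != nil {
        return err
    }

    // One statement inserts every element of the JSON array.
    if _, err := db.ExecContext(ctx, `
        INSERT INTO river_job (kind, args)
        SELECT json_extract(value, '$.kind'), json_extract(value, '$.args')
        FROM json_each(?)`,
        string(payload),
    ); err != nil {
        return err
    }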