-
-
Notifications
You must be signed in to change notification settings - Fork 152
[wip] Revamp codegen #314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[wip] Revamp codegen #314
Conversation
Wow! It's taken me a few days to work through this (and likely will a few more...) but I am really looking forward to getting this in. |
because it's not defined correctly in stripe's own spec, the code generation doesn't correctly define stuff like https://stripe.com/docs/api/prices/create#create_price-currency_options, which should be edit: also cashapp is missing from the payment method enums, which i also suspect is a problem on stripe's end edit: sorry about spamming edits, just want to document whatever issues i find;
because edit: it does seem to be something generally wrong with the |
@mzeitlin11 hi i'm interested in filling out the |
Hi @databasedav, good timing, was just hoping to start getting back to this soon! I need to refamiliarize myself with what's going, rebase, and clean stuff up. Will probably be hard to work off this branch since lots of changes still required that will determine how |
Wip Revert gen changes wip Revert gen changes Revert gen changes REF: split out code writing more Add back generated files Keep boolean params out of constructor [wip] Revamp codegen Wip [wip] Revamp codegen Remove files to ignore Remove more accidently added Remove unused deleted Remove redundant prelude [skip ci] Remove some unused codegen wip add pre rebase
Status Update: Main outstanding TODO's: Crates / Features
Pagination
Miniserde
Deduplication
|
Hey! Thanks for continuing on this. I am back from holidays today 🏄 and plan on addressing all your questions with some ideas. |
Ok! I will have comments about the openapi crate as well soon... CratesI am going to find some time to update my code-splitting PR and see if I can apply the same analysis to this PR and see if there are any clean boundaries between modules. Pagination
I think this is OK, we have enough hacks in the previous impl so I don't mind leaving this and coming back to it later if necessary since the scope of the 'damage' is probably quite limited.
That makes sense to me. Happy to brainstorm implementations if you want. MiniserdeGreat! Also happy to help prototype on this. My intuition says it will be fairly straight forward to just spit out a manual trait impl for enums. DedupeThere are a few interesting parts here, but I think this should be tackled later. Some points to consider:
|
Sounds great! Will push up an initial implementation of this soonish (hopefully), but that will be more about setting up the structure of having the codegen understand different crates and crate deps than finding smart module boundaries. Initial efforts gave some annoying cycles like in your PR, due to cycles from cases like
combined with
Without ending up with just a giant The annoyance with that (other than more codegen complexity) is that right now requests are namespaced as
which would have to change if EDIT: this kind of approach would also have the benefit of solving the webhook deserialization problem by keeping all the event types in the same crate so webhooks could be enabled without requiring everything else |
Nice. There is of course also the possibility of not splitting at all, since the code size with miniserde is so so much better, but let's give it a shot and see if we can find something that works. |
Added some code messing around with type deduplication within files. Definitely needs cleanup / docs (as does everything else), but wanted to include a POC since it influences compile time. Need to finish off Could also introduce granular features to help compile time pretty easily (since we have ideas of dependencies and features could just gate the related Related to #314 (comment), cc @arlyon if you have any thoughts about the
|
Hi there! My company uses async-stripe in our backend. We're only using a small fraction of the Stripe API. So, I'd be happy to test out migrating breaking changes for a realistic use-case (I suspect there's a fair number of other users who also only use a small part of the API). IMO, breaking changes are OK as long as there's a good migration guide. I think Clap 4.0 had a great migration guide as did Dropshot 9.0 here -- explaining how to migrate is good, breaking changes have to happen eventually. And especially if there's a good reason to make them -- I think many users would be thrilled by the compile time improvements, even if it requires some manual migrations. If this PR is in an OK state, I'd be happy to start a WIP branch on my work's backend and report what my migration experience was like. |
@adamchalmers thanks for volunteering to help out. I think the general structure of the codegen is mostly there. We (well, @mzeitlin11) may move some things around internally to try to break the code up better and reduce the size of the compilation units and, as you saw, there is still the open question about which syntax to use for generating the requests. If you are OK with these changes down the line then go ahead. Some feedback from an integration branch would be great. Speaking of which @mzeitlin11 do you think it would be handy to schedule a synchronous call soon to hash out some of the remaining items? You can book something here https://usemotion.com/meet/alexander-lyon/meeting |
Started work on it. One note so far is that this PR requires https://github.com/mzeitlin11/async-stripe/blob/types/async-stripe/Cargo.toml#L110-L111 But the I'd suggest allowing users to use time-core 0.1.x in the final PR. |
Booked time for next week! Crate splitting is now implemented with this initial structure: Compile Time Notes
The majority of the code (so largest compile time costs) then lives in the other generated crates since the request-related structs are the majority of what we generate. Since requests don't depend on each other, feature-gating different API resources should be straightforward, without having to worry about incompatible feature sets. Once those request-related features are added, the compile time should shrink toward how long TODOS
Open Questions
Miniserde Notes
|
Pagination added back (closes #283, closes #246). Slightly different organization, now implemented on both
can do
And can also paginate any
|
Added back webhook handling in new crate Also migrated the examples into standalone example binaries in This is readyish for review / testing (With caveat that these changes require much more testing / docs before being ready for production. @adamchalmers any feedback on issues / annoyances using for real-world use case would be super helpful). I'd still plan to make more smaller changes around naming + deduplicating some generated types (one example right now is recognizing shared types between component definitions and requests, e.g. Also if there are any opinions on MSRV - once that's settled can remove the |
Built(ish) on #311. This initially started as an attempt to better types involved in the codegen, but turned into more of a codegen refactor to handle (almost) all of the OpenAPI spec.
form/multipart
format)Also, all the ext files are no longer needed!
Left to do in this PR
new
) for parameters, figure out how to handle lifetimes / borrowed data without making type definitions more complexDelete
requests with body params. Instead of the currently separatedelete
anddelete_query
methods, would be nice to combine thedelete / post
functions into some builder style system so that delete and post can be handled identically.x-stripeResource
can be used to organize around a structure whereinner_classes
can just be in a module with their parent, making the structure much clearer (along with crate-splitting / features). For example,account
and friends could becomeinstead of the current
. We could then avoid exporting everything from
resources
and have some namespacing.Open Questions
Handling Breaking Changes
Features / crate-splitting
--graph
argument to the codegen which writes a graphviz-compatible outputs of the component dependency graph instead of generating code (but that will hopefully become much clearer if the file organization mentioned in TODOs is implemented. Right now it includes all components so is very hard to learn from).My hope is that we could define features along the lines of feature
x
includes objectsaccount
,customer
, featurey
includes componentbalance, ...
. Dependency tracking would then enable understanding which features necessarily enable other features to fix #137. With a split into crates, this would make clear what cross-crate deps must need to be.Hopefully, circular dependencies could be handled by just mandating that some specific items live in some crate
stripe-core
.Improve handling of unions
With respect to the compile time pains mentioned above, another solution is to continue exploring usage of something like
miniserde
to improve compile times. Right now the unions are the main deterrent to trying this since they are not supported. Currently, we usedserde(untagged)
for these, but I think for most of these unions of object cases we could use theobject
field as a discriminant which hopefully allowsserde
to write more efficient (and smaller) implementations with better error messages. On theminiserde
front, this would allow a reasonable automated generation of adeserialize
impl through a structure likeWhen writing other unions, we often end up with a structure like
This happens because we only generate simple enums without fields or enums with fields. It would be great to simplify this representation into
which would generate less code (hopefully help with compile time on its own) and make implementing
Deserialize
for unions easierThe other common union pattern that appears in the OpenAPI spec is a union of something like
Account
andDeletedAccount
(which is returned by deleting account and some fetching methods). This was previously handled by adding adeleted
field to top-level components with aDeleted
counterpart and hardcoding the return value of aDELETE
request as aDeleted<>
struct parameterized by an id type. This doesn't seem to match the spec however - anAccount
should never have adeleted
field and the return value of a deleting an account can also be theAccount
itself, theDeleted<>
loses a bunch of information and forces hardcoding the request return value. However the automatically generated representationis a breaking change. Deserialization could be done easily by using presence of the
deleted: true
field, but having to match on variantsNaming request functions
Naming requests is now harder since handling everything means fewer assumptions can be made. For now, this PR just names based on the request "operationId" field since that ensures unique names, but it gives pretty verbose / unexplanatory names. Not sure of best solutions here, best ideas I have are to perhaps just have some simple heuristic-based naming for common patterns (create / delete, etc.) then maybe just manually provide name mappings for other requests? (a pain, but there really aren't that many requests).
Handling duplicate generated content (but different name)
The OpenApi schema can only avoid duplicating content by using reference links. Especially in the case of request parameters, this is not done.
For example, the enum specifying webhook enabled events (which is huge) is specified in 2 separate parameters. The content is identical, but since they are not linked in the spec, they are created for each parameter with a different name following the spec. This was a problem as well before, but some cases were explicitly handled by adding a manual ext file and a mapping of field name / object to that specific type name. This works, but doesn't generalize and has the added issue that manually written types can fall out of date easily (for example the
ApiVersion
enum).Since this PR introduces strict definitions of the content of types we generate, we can detect duplicates like these. The question becomes
The main benefit of deduplicating would be (hopefully generating enough less code to improve compile time) and giving a better user experience. However, deduplicating a generated type will usually be a breaking change since the paths will change
(more questions/edits coming soon :))