8000 Editions support · Issue #746 · protobuf-c/protobuf-c · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Editions support #746

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
mkruskal-google opened this issue Jan 10, 2025 · 8 comments
Open

Editions support #746

mkruskal-google opened this issue Jan 10, 2025 · 8 comments

Comments

@mkruskal-google
Copy link

Hey folks,

I work on protobuf and wrote a lot of the new editions code. We'd be interested in getting support added to protobuf-c, but it looks like you hit various issues in #711. I also came across https://groups.google.com/g/protobuf-c/c/do1XdobUJxs related to this.

Is there some channel where we can discuss these issues more? I wasn't able to even find an email address to contact maintainers, and the google group seems pretty dead. So this seemed like the best way to reach out :)

Also note: idk if you found this or not, but I wrote up some documentation for plugin owners: https://protobuf.dev/editions/implementation/

@edmonds
Copy link
Member
edmonds commented Jan 10, 2025

Hi, Mike!

We don't really have any channels other than the mailing list, and the GitHub issues/pulls/discussions. Myself and @lipnitsk are the current maintainers of the GitHub repository. I'd be happy to email you if you want to discuss offline, or on this issue is fine?

Dave Benson is the original author of protobuf-c and he still pops up on the mailing list from time to time. Ilya and I sort of took over the maintainership of the project about ten years ago and migrated the code from Google Code to GitHub, modernized the build system, etc. If I recall correctly, Dave Benson's original goal was to contribute the protobuf-c code generator and C runtime library to the upstream (Google) protobuf project, but it wasn't accepted and Dave released it as an independent project.

It would be great to have more help keeping protobuf-c working with the latest protobuf releases (while still maintaining compatibility with the protobuf versions shipped with Ubuntu LTS / RHEL / Debian stable, etc.), even if we don't necessarily support the latest and greatest features.

Thanks!

@edmonds
Copy link
Member
edmonds commented Jan 12, 2025

I think I figured out what was going on in #711. Would you mind reviewing 8000 a4d0480 and seeing if that's the right way for our CodeGenerator to report that it supports proto2 and proto3, only? Apparently I wasn't replicating the build failures because I was using protoc --c_out= which invokes our protoc-gen-c compiler plugin, while the build failures that people were reporting were being caused by protoc-c, which is a legacy binary that links in libprotoc and calls CommandLineInterface::RegisterGenerator().

Basically:

  uint64_t GetSupportedFeatures() const { return 0; }
  Edition GetMinimumEdition() const { return Edition::EDITION_PROTO2; }
  Edition GetMaximumEdition() const { return Edition::EDITION_PROTO3; }

@mkruskal-google
Copy link
Author
mkruskal-google commented Jan 14, 2025

Yep, that's how you tell protoc you don't support editions or proto3-optional. The GetMinimumEdition/GetMaximumEdition overrides should be optional too, the intention was that pre-existing plugins wouldn't report that they supported editions until they explicitly override these methods. Note: --experimental_editions will disable these protoc checks to allow for early testing with editions.

I don't mind discussing here if you want. Getting editions support for your gencode should be pretty straightforward (especially since you use our C++ plugin framework), the tricky case is runtime reflection and dynamic messages. From what I can tell you do support reflection, but don't do dynamic messages. In that case, the big question is how to pass the appropriate feature settings to the runtime descriptors. The simplest thing would be to embed relevant resolved features into the gencode from your plugin (where we provide access through CodeGenerator).

Looking through your code, I noticed the following quirks related to the 2023 features:

  • syntax: you're using the legacy descriptor APIs we added as a temporary measure. Code generators actually have privileged access in order to validate their own features, and you can access this information via CodeGenerator::GetEdition (EDITION_PROTO2 and EDITION_PROTO3 are the legacy editions). Similarly CodeGenerator::GetResolvedSourceFeatures will get you the fully resolved features of any descriptor. In principle you shouldn't need either of these unless you want to leverage editions for your plugin, and any use before then would point to some non-conformance in editions support.
  • delimited: it looks like you currently crash for proto2 group fields. In edition 2023 we removed group syntax, but left the DELIMITED feature to support group encoding. Adding support for this likely isn't too bad, but as a quick fix you might want to fail more smoothly (right now it looks like it would dereference NULL and hopefully crash)
  • enum type: I don't see any behavior difference in enums between proto2 and proto3, so I assume you treat all enums as closed (proto2 behavior). Open enums are the new 2023 default, and behave a lot better under schema changes. If I'm wrong and you treat all enums as open (proto3 behavior), there's precedent to just leave it as-is
  • presence: It looks like you don't support proto3-optional, but ignoring that (which is safe since you don't advertise support to protoc today anyway) many of your syntax checks can just be replaced by a call to FieldDescriptor::has_presence. This returns false for repeated or implicit presence (proto3 non-optional) fields.
  • packed: you're currently checking descriptor_->options().packed() to see what was specified in the .proto file. This doesn't work under edition 2023, so we recommend FieldDescriptor::is_packed. In 30.x and beyond we'll be stripping this field option from all descriptors to avoid misuse.
  • utf8: I couldn't find any utf8 validation. In principle proto3 is supposed to enforce utf8 validity on string fields. Under editions we moved this behavior change from proto2 to a feature

@mkruskal-google
Copy link
Author
mkruskal-google commented Jan 14, 2025

Apparently I wasn't replicating the build failures because I was using protoc --c_out= which invokes our protoc-gen-c compiler plugin, while the build failures that people were reporting were being caused by protoc-c, which is a legacy binary that links in libprotoc and calls CommandLineInterface::RegisterGenerator().

I don't follow the difference between these two things. protoc-c/main.cc looks like a normal plugin binary, which links in libprotoc and calls CommandLineInterface::RegisterGenerator. Is there a second binary somewhere I'm missing? This line looks like one is just a softlink for the other

@edmonds
8000 Copy link
Member
edmonds commented Jan 14, 2025

Hi, Mike!

Thanks for all these details! I will take a closer look at your first message later, but regarding:

Apparently I wasn't replicating the build failures because I was using protoc --c_out= which invokes our protoc-gen-c compiler plugin, while the build failures that people were reporting were being caused by protoc-c, which is a legacy binary that links in libprotoc and calls CommandLineInterface::RegisterGenerator().

I don't follow the difference between these two things. protoc-c/main.cc looks like a normal plugin binary, which links in libprotoc and calls CommandLineInterface::RegisterGenerator. Is there a second binary somewhere I'm missing? This line looks like one is just a softlink for the other

protoc-c and protoc-gen-c are the same binary with different behavior depending on argv[0]. If invoked as the standalone command protoc-c we do the RegisterGenerator logic, if invoked as protoc-gen-c by protoc we behave as a normal protoc plugin. Some of our reverse dependencies still invoke us as protoc-c because originally that was the only way to invoke our code generator.

I was finding that protoc-c and protoc-gen-c were behaving differently in this comment here:

#711 (comment)

I don't know why the behavior is different, but that's what made me miss the behavior difference originally, because I was only testing with protoc --c_out=.

@edmonds
Copy link
Member
edmonds commented Jan 14, 2025
  • syntax: you're using the legacy descriptor APIs we added as a temporary measure.

Which APIs are we using are legacy?

@mkruskal-google
Copy link
Author

protoc-c and protoc-gen-c are the same binary with different behavior depending on argv[0]. If invoked as the standalone command protoc-c we do the RegisterGenerator logic, if invoked as protoc-gen-c by protoc we behave as a normal protoc plugin. Some of our reverse dependencies still invoke us as protoc-c because originally that was the only way to invoke our code generator.

I was finding that protoc-c and protoc-gen-c were behaving differently in this comment here:

#711 (comment)

I don't know why the behavior is different, but that's what made me miss the behavior difference originally, because I was only testing with protoc --c_out=.

Ah yea sorry, I see what it's doing now :P. The protoc-c standalone binary acts like protoc, with protoc-gen-c as a "builtin" generator. I didn't actually realize anyone was doing this, and it's possible the enforcement logic is slightly off in that case. At least in main right now, I think it would correctly fail if either GetSupportedFeatures or GetMaximumEdition signaled that editions weren't supported. It's possible an earlier iteration assumed all built-in generators supported editions though, so it's safer to set both

@mkruskal-google
Copy link
Author
mkruskal-google commented Jan 14, 2025
  • syntax: you're using the legacy descriptor APIs we added as a temporary measure.

Which APIs are we using are legacy?

This is referring to FieldSyntax. Pretty much every direct inspection of syntax or edition signals some editions-hostile code that needs to be migrated to a feature check. AFAICT all of these can be replaced by FieldDescriptor::has_presence. This method is pretty old (~2020), but if you want to support older versions of protobuf you can have a macro switch between field->has_presence() and a condition on field->file()->syntax()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants
0