-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Introduce custom sympy srepr parser [stable/2.0] #14023
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce custom sympy srepr parser [stable/2.0] #14023
Conversation
This commit introduces a custom parser to QPY for parameter expression payloads that were generated using sympy. Prior to QPY version 10 this was the only way we supported serializing parameter expressions in QPY. For QPY version 10, 11, and 12 sympy could optionally be used if the payload was generated explicitly to not use symengine (in qiskit 1.0 it defaulted to use symengine). This serialization format relied on sympy to generate a string representation of the expression which we then put in the payload. On deserialization we called sympy's `parse_expr()` function which internally is calling sympy's `sympify()` internally. Sympy documents that `sympify()` relies on Python's `eval()` for string input and should not be used with untrusted input. But `parse_expr()` didn't have such a warning (at the time, I plan to contribute adding one), so using this function provides an avenue for arbitrary code execution during QPY deserialization. This commit fixes this issue by writing a custom parser for the string repesentation in a QPY payload based on python's ast module. This parser walks the abstract syntax tree and builds the sympy expression object as it it goes. It is restricted to the operations that `ParameterExpression` supports and if any part of the string tries to use functionality outside that set it will error.
One or more of the following people are relevant to this code:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I just have two minor comments and then I think it's ready.
Is the plan to also add version changes to this PR and use it to release like in #14022? |
Let me do that now and apply your suggestions, that will save us some CI time. |
Co-authored-by: Elena Peña Tapia <57907331+ElePT@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PGO fails when we build qiskit-cext both as cdylib in addition to rlib, due to missing Python symbols. This could be due to us not setting the pyo3/extension-module correctly; which we want enabled when we compile the crate for use in the Python extension crate qiskit-pyext, but don't want for the standalone cdylib usage. This PR changes to only building the rlib per default, meaning that building the qiskit-pyext crate (and running PGO) will not build the cdylib. To build the cdylib for C standalone mode, one has to request that crate type via cargo rustc .... --crate-type cdylib. We're investigating further solutions but this now fixes PGO and the C standalone tests pass as well.
Summary
This commit introduces a custom parser to QPY for parameter expression payloads that were generated using sympy. Prior to QPY version 10 this was the only way we supported serializing parameter expressions in QPY. For QPY version 10, 11, and 12 sympy could optionally be used if the payload was generated explicitly to not use symengine (in qiskit 1.0 it defaulted to use symengine).
This serialization format relied on sympy to generate a string representation of the expression which we then put in the payload. On deserialization we called sympy's parse_expr() function which internally is calling sympy's sympify() internally. Sympy documents that sympify() relies on Python's eval() for string input and should not be used with untrusted input. But parse_expr() didn't have such a warning (at the time, I plan to contribute adding one), so using this function provides an avenue for arbitrary code execution during QPY deserialization.
This commit fixes this issue by writing a custom parser for the string representation in a QPY payload based on python's ast module. This parser walks the abstract syntax tree and builds the sympy expression object as it it goes. It is restricted to the operations that ParameterExpression supports and if any part of the string tries to use functionality outside that set it will error.
Details and comments