8000 Make preparser can handle large(4301+ digits) integers by soon-haari · Pull Request #40178 · sagemath/sage · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Make preparser can handle large(4301+ digits) integers #40178

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: develop
Choose a base branch
from

Conversation

soon-haari
Copy link
@soon-haari soon-haari commented May 28, 2025

I'm pretty sure sage wants to use all kinds of numbers without losing precision, or errors, since I can see sys.set_int_max_str_digits(0) in all.py.

However, even with the limit set to 0 (unlimited) in all.py which is called during import, it still cannot handle 4301+ digits integers, because Python refuses to run if there's 4301+ digit number in the first place.

For example, the following sage code has the following error (the 11...11 is actually 5000 digits of 1):

long.sage

a = 11...11
sage long.sage
  File "/home/soon_haari/mysage/long.sage.py", line 6
    _sage_const_11...11
SyntaxError: Exceeds the limit (4300 digits) for integer string conversion: value has 5000 digits; use sys.set_int_max_str_digits() to increase the limit - Consider hexadecimal for huge integer literals to avoid decimal conversion limits.

In more details, 11...11 parses to Integer(11...11) where 11...11 in Integer is supposed to be <class 'int'> in Python.
I suggest it parses to Integer('11...11') instead.

Funnily enough, real/complex numbers like 1.1 or 1j or even 1.1j have no problem due to long digits, because it parses to RealNumber('1.1'), ComplexNumber(0, '1'), ComplexNumber(0, '1.1') which use string as arguments already. I suggest the same fix.

This fixes #40179.

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation and checked the documentation preview.

⌛ Dependencies

@soon-haari
Copy link
Author

Here's the corresponding issue:
#40179

@vincentmacri
Copy link
Contributor

Have you tested the performance impact of this on integers under 4301 digits? When I test it, Integer('1234567890') is much slower than Integer(1234567890).

Instead, I think you should only parse it as a string if it's over the limit that int can handle.

Make sure to also test that this works with octal, hex, and binary values.

@soon-haari
Copy link
Author
soon-haari commented May 28, 2025

Have you tested the performance impact of this on integers under 4301 digits? When I test it, Integer('1234567890') is much slower than Integer(1234567890).

Instead, I think you should only parse it as a string if it's over the limit that int can handle.

Make sure to also test that this works with octal, hex, and binary values.

Hello, thanks for the suggestion! I now made it to only add quotes when the length is longer than 4300.

For octal, hex, and binary values, it is on a completely different elif branch:

sage/src/sage/repl/preparse.py

Lines 1315 to 1316 in aaf84e1

elif len(num) < 2 or num[1] in 'oObBxX':
num_make = "Integer(%s)" % num
so it's kept used without quotes. Which works fine without length limit of 4300 since the base is the power of 2 for all '0b...', '0o...', '0x...'.
(python/cpython#95778 : the mitigation of length limit 4300 at the first place is to prevent O(bitlength^2) integer calculation, however its O(bitlength) for base2, 8, 16.)

edit) apparently Python already use better algorithm than O(n^2)? But anyways, just wanted to mention that it doesn't affect base of power of 2.

Copy link
Contributor
@vincentmacri vincentmacri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the description of the quotes parameter in the docstring to mention that it's also used for Integer when the integer is given in base 10 and is too large for the python limit.

Add a TESTS block to the function docstring and add a test there that makes sure your change fixes the bug you're trying to fix. I'm suggesting to make a TESTS block rather than add it to the EXAMPLES block because examples are included in the generated documentation, and this is kind of a weird esoteric thing that doesn't need to be in the generated documentation (and it will take up too much space in the documentation if it were included).

Looks good otherwise.

@vincentmacri
Copy link
Contributor

Since you're a first-time contributor the CI won't automatically run without someone approving it. I don't have permissions to approve CI runs. For the sake of saving time, please run the Sage test suite locally and make sure all tests are passing with your changes before we ping someone else to approve the CI run.

@soon-haari
Copy link
Author
soon-haari commented May 29, 2025

👀 Thanks, I updated the docs!

One question though, I'm not quite sure how should I do all tests, running sage -t --all still fails lot's of tests with the develop branch, is it normal behavior? Or does it all pass in the ideal environment?

So I couldn't verify if all the tests pass yet.

Also, the following part is the test i added:

sage/src/sage/repl/preparse.py

Lines 1192 to 1197 in f43e339

sage: preparse_numeric_literals("1" * 4300) == f"Integer({"1" * 4300})"
True
sage: preparse_numeric_literals("1" * 4301) == f"Integer({"1" * 4301})"
False
sage: preparse_numeric_literals("1" * 4301) == f"Integer('{"1" * 4301}')"
True

sage/src/sage/repl/preparse.py

Lines 1263 to 1266 in f43e339

sage: preparse_numeric_literals("1" * 4301, quotes=None) == f'Integer(str().join(map(chr, {[49] * 4301})))'
True
sage: preparse_numeric_literals("1" * 4301, quotes="'''") == f"Integer('''{"1" * 4301}''')"
True

But I noticed they both are still included when I ran sage -t ./sage/src/sage/repl/preparse.py so didn't bother to add TEST:: block. Please tell me if you still think it would be better to add the TEST:: block!

@user202729
Copy link
Contributor

A few failures are expected since the test suite is somewhat flaky. (I remember reading this in the documentation but can't find it now ><) But if too many fails, either you discovered some bugs or something is wrong on your machine.

TESTS:: (note the S) is a special marker that it excludes the following content from being rendered in the documentation. Otherwise all examples are tested as well. (that's why they're called doctest) So up to you whether you think they're useful to be included in the documentation.

@vincentmacri
Copy link
Contributor
vincentmacri commented May 29, 2025

👀 Thanks, I updated the docs!

One question though, I'm not quite sure how should I do all tests, running sage -t --all still fails lot's of tests with the develop branch, is it normal behavior? Or does it all pass in the ideal environment?

So I couldn't verify if all the tests pass yet.

Most of the tests should pass, maybe only a handful should fail. Make sure the tests for the function you are changing pass, and that no related functions are breaking.

You should also make an example .sage file and look at the generated .sage.py that it outputs when you run sage on it, making sure that small integers aren't converted to strings and large ones are, and that binary, hex, and octal integers still work as they should. That file doesn't need to be committed though.

@@ -1253,6 +1260,10 @@ def preparse_numeric_literals(code, extract=False, quotes="'"):
'RealNumber(str().join(map(chr, [51, 46, 49, 52])))'
sage: preparse_numeric_literals('5j', quotes=None)
'ComplexNumber(0, str().join(map(chr, [53])))'
sage: preparse_numeric_literals("1" * 4301, quotes=None) == f'Integer(str().join(map(chr, {[49] * 4301})))'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests that specifically address a bug should provide a link to the bug they are testing a fix for. Add a brief explanation of what this test is doing before hand and use :issue:`40179` to add a link to it.

Copy link
Contributor
@vincentmacri vincentmacri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document what bug you're testing.

@vincentmacri
Copy link
Contributor

Also please edit your PR description to contain the phrase "fixes #40179" in order to make sure that the issue is closed once this is merged.

@vincentmacri
Copy link
Contributor

LGTM. @tscrim can you approve a CI run?

soon-haari and others added 2 commits May 31, 2025 11:53
Co-authored-by: Travis Scrimshaw <clfrngrown@aol.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects 4BBD
None yet
Development

Successfully merging this pull request may close these issues.

Sage doesn't support 4301+ digit decimal integers
4 participants
0