Make preparser can handle large(4301+ digits) integers #40178

soon-haari · 2025-05-28T17:52:20Z

I'm pretty sure sage wants to use all kinds of numbers without losing precision, or errors, since I can see sys.set_int_max_str_digits(0) in all.py.

However, even with the limit set to 0 (unlimited) in all.py which is called during import, it still cannot handle 4301+ digits integers, because Python refuses to run if there's 4301+ digit number in the first place.

For example, the following sage code has the following error (the 11...11 is actually 5000 digits of 1):

long.sage

a = 11...11

sage long.sage
  File "/home/soon_haari/mysage/long.sage.py", line 6
    _sage_const_11...11
SyntaxError: Exceeds the limit (4300 digits) for integer string conversion: value has 5000 digits; use sys.set_int_max_str_digits() to increase the limit - Consider hexadecimal for huge integer literals to avoid decimal conversion limits.

In more details, 11...11 parses to Integer(11...11) where 11...11 in Integer is supposed to be <class 'int'> in Python.
I suggest it parses to Integer('11...11') instead.

Funnily enough, real/complex numbers like 1.1 or 1j or even 1.1j have no problem due to long digits, because it parses to RealNumber('1.1'), ComplexNumber(0, '1'), ComplexNumber(0, '1.1') which use string as arguments already. I suggest the same fix.

This fixes #40179.

📝 Checklist

The title is concise and informative.
The description explains in detail what this PR is about.
I have linked a relevant issue or discussion.
I have created tests covering the changes.
I have updated the documentation and checked the documentation preview.

⌛ Dependencies

soon-haari · 2025-05-28T18:06:35Z

Here's the corresponding issue:
#40179

vincentmacri · 2025-05-28T22:27:06Z

Have you tested the performance impact of this on integers under 4301 digits? When I test it, Integer('1234567890') is much slower than Integer(1234567890).

Instead, I think you should only parse it as a string if it's over the limit that int can handle.

Make sure to also test that this works with octal, hex, and binary values.

soon-haari · 2025-05-28T22:57:19Z

Have you tested the performance impact of this on integers under 4301 digits? When I test it, Integer('1234567890') is much slower than Integer(1234567890).

Instead, I think you should only parse it as a string if it's over the limit that int can handle.

Make sure to also test that this works with octal, hex, and binary values.

Hello, thanks for the suggestion! I now made it to only add quotes when the length is longer than 4300.

For octal, hex, and binary values, it is on a completely different elif branch:

sage/src/sage/repl/preparse.py

Lines 1315 to 1316 in aaf84e1

    
           elif len(num) < 2 or num[1] in 'oObBxX': 
        
               num_make = "Integer(%s)" % num

so it's kept used without quotes. Which works fine without length limit of 4300 since the base is the power of 2 for all '0b...', '0o...', '0x...'.
(python/cpython#95778 : the mitigation of length limit 4300 at the first place is to prevent O(bitlength^2) integer calculation, however its O(bitlength) for base2, 8, 16.)

edit) apparently Python already use better algorithm than O(n^2)? But anyways, just wanted to mention that it doesn't affect base of power of 2.

vincentmacri

Update the description of the quotes parameter in the docstring to mention that it's also used for Integer when the integer is given in base 10 and is too large for the python limit.

Add a TESTS block to the function docstring and add a test there that makes sure your change fixes the bug you're trying to fix. I'm suggesting to make a TESTS block rather than add it to the EXAMPLES block because examples are included in the generated documentation, and this is kind of a weird esoteric thing that doesn't need to be in the generated documentation (and it will take up too much space in the documentation if it were included).

Looks good otherwise.

vincentmacri · 2025-05-28T23:31:31Z

Since you're a first-time contributor the CI won't automatically run without someone approving it. I don't have permissions to approve CI runs. For the sake of saving time, please run the Sage test suite locally and make sure all tests are passing with your changes before we ping someone else to approve the CI run.

soon-haari · 2025-05-29T01:47:18Z

👀 Thanks, I updated the docs!

One question though, I'm not quite sure how should I do all tests, running sage -t --all still fails lot's of tests with the develop branch, is it normal behavior? Or does it all pass in the ideal environment?

So I couldn't verify if all the tests pass yet.

Also, the following part is the test i added:

sage/src/sage/repl/preparse.py

Lines 1192 to 1197 in f43e339

    
                   sage: preparse_numeric_literals("1" * 4300) == f"Integer({"1" * 4300})" 
        
                   True 
        
                   sage: preparse_numeric_literals("1" * 4301) == f"Integer({"1" * 4301})" 
        
                   False 
        
                   sage: preparse_numeric_literals("1" * 4301) == f"Integer('{"1" * 4301}')" 
        
                   True

sage/src/sage/repl/preparse.py

Lines 1263 to 1266 in f43e339

    
                   sage: preparse_numeric_literals("1" * 4301, quotes=None) == f'Integer(str().join(map(chr, {[49] * 4301})))' 
        
                   True 
        
                   sage: preparse_numeric_literals("1" * 4301, quotes="'''") == f"Integer('''{"1" * 4301}''')" 
        
                   True

But I noticed they both are still included when I ran sage -t ./sage/src/sage/repl/preparse.py so didn't bother to add TEST:: block. Please tell me if you still think it would be better to add the TEST:: block!

user202729 · 2025-05-29T09:51:46Z

A few failures are expected since the test suite is somewhat flaky. (I remember reading this in the documentation but can't find it now ><) But if too many fails, either you discovered some bugs or something is wrong on your machine.

TESTS:: (note the S) is a special marker that it excludes the following content from being rendered in the documentation. Otherwise all examples are tested as well. (that's why they're called doctest) So up to you whether you think they're useful to be included in the documentation.

vincentmacri · 2025-05-29T16:01:03Z

👀 Thanks, I updated the docs!

One question though, I'm not quite sure how should I do all tests, running sage -t --all still fails lot's of tests with the develop branch, is it normal behavior? Or does it all pass in the ideal environment?

So I couldn't verify if all the tests pass yet.

Most of the tests should pass, maybe only a handful should fail. Make sure the tests for the function you are changing pass, and that no related functions are breaking.

You should also make an example .sage file and look at the generated .sage.py that it outputs when you run sage on it, making sure that small integers aren't converted to strings and large ones are, and that binary, hex, and octal integers still work as they should. That file doesn't need to be committed though.

vincentmacri · 2025-05-29T16:13:54Z

src/sage/repl/preparse.py

@@ -1253,6 +1260,10 @@ def preparse_numeric_literals(code, extract=False, quotes="'"):
        'RealNumber(str().join(map(chr, [51, 46, 49, 52])))'
        sage: preparse_numeric_literals('5j', quotes=None)
        'ComplexNumber(0, str().join(map(chr, [53])))'
+        sage: preparse_numeric_literals("1" * 4301, quotes=None) == f'Integer(str().join(map(chr, {[49] * 4301})))'


Tests that specifically address a bug should provide a link to the bug they are testing a fix for. Add a brief explanation of what this test is doing before hand and use :issue:`40179` to add a link to it.

vincentmacri

Document what bug you're testing.

vincentmacri · 2025-05-29T16:14:59Z

Also please edit your PR description to contain the phrase "fixes #40179" in order to make sure that the issue is closed once this is merged.

vincentmacri · 2025-05-30T23:56:05Z

LGTM. @tscrim can you approve a CI run?

src/sage/repl/preparse.py

Co-authored-by: Travis Scrimshaw <clfrngrown@aol.com>

update preparser

45aaaa9

github-actions bot added the s: needs review label May 28, 2025

soon-haari mentioned this pull request May 28, 2025

Sage doesn't support 4301+ digit decimal integers #40179

Open

1 task

unify quote usage

49ff108

vincentmacri added s: needs work and removed s: needs review labels May 28, 2025

add length condition

aaf84e1

github-actions bot added s: needs review and removed s: needs work labels May 28, 2025

vincentmacri suggested changes May 28, 2025

View reviewed changes

vincentmacri added s: needs work and removed s: needs review labels May 28, 2025

update docs

f43e339

github-actions bot added s: needs review and removed s: needs work labels May 29, 2025

vincentmacri reviewed May 29, 2025

View reviewed changes

vincentmacri suggested changes May 29, 2025

View reviewed changes

add issue link

50c9eb1

tscrim reviewed May 30, 2025

View reviewed changes

src/sage/repl/preparse.py Outdated Show resolved Hide resolved

soon-haari and others added 2 commits May 31, 2025 11:53

Update src/sage/repl/preparse.py

de3a3da

Co-authored-by: Travis Scrimshaw <clfrngrown@aol.com>

fix typo

ee21b0c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Make preparser can handle large(4301+ digits) integers #40178

Make preparser can handle large(4301+ digits) integers #40178

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Make preparser can handle large(4301+ digits) integers #40178

Are you sure you want to change the base?

Make preparser can handle large(4301+ digits) integers #40178

Uh oh!

Conversation

Uh oh!

📝 Checklist

⌛ Dependencies

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!