Multiplication and division by 2^n are much slower than left/right shifts on ARM Linux · Issue #556 · aleaxit/gmpy · GitHub
Multiplication and division by 2^n are much slower than left/right shifts on ARM Linux #556
Open
@l1t1

Description

from gmpy2 import mpz

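def halley(digits):
    # Halley iteration for sqrt(2) in fixed point; scaling uses * and // by BASE.
    BASE = mpz(2)**int(digits / 0.301)  # power-of-two scale (alternative: mpz(10)**digits)
    m, n = mpz(9369319), mpz(6625109)  # m/n is a rational starting guess near sqrt(2)
    x = (m * BASE) // n
    for _ in range(int(mpz(digits).bit_length() // 2 + 2)):
        x_sq = (x * x) // BASE
        numerator = 2 * (2 * BASE - x_sq)  # 2*(2 - x^2)
        denominator = 3 * x_sq + 2 * BASE  # 3x^2 + 2
        correction = BASE + (numerator * BASE // denominator)  # 1 + numerator/denominator
        x = (x * correction) // BASE

    return (x * mpz(10)**digits) // BASE  # rescale from binary to decimal digits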

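def halley2(digits):
    # Same iteration as halley(), but scaling by BASE uses << and >> instead of * and //.
    bin_shift = int(digits / 0.301)  # number of fractional bits
    BASE = mpz(2)**bin_shift
    DEC_BASE = mpz(10)**digits
    m, n = mpz(9369319), mpz(6625109)  # m/n is a rational starting guess near sqrt(2)
    x = (m * BASE) // n

    for _ in range(int(mpz(digits).bit_length() // 2 + 2)):
        x_sq = (x * x) >> bin_shift
        numerator = ((BASE << 1) - x_sq) << 1  # 2*(2 - x^2)
        denominator = 3 * x_sq + (BASE << 1)  # 3x^2 + 2
        correction = BASE + (numerator * BASE) // denominator  # 1 + numerator/denominator
        x = (x * correction) >> bin_shift

    return (x * DEC_BASE) >> bin_shift  # rescale from binary to decimal digits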
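
The two functions above differ only in how they scale by BASE: one multiplies and floor-divides by 2^n, the other shifts by n bits. A minimal sketch of why the results should agree for non-negative values is given here. The helper name `shifts_match` and the sample values are hypothetical, and the fallback to plain `int` is an assumption added only so the sketch runs where gmpy2 is not installed.

```python
# Sketch: for non-negative integers, scaling by 2**k matches shifting by k bits.
try:
    from gmpy2 import mpz  # assumption: gmpy2 is installed
except ImportError:
    mpz = int  # fallback so the sketch still runs with plain Python ints


def shifts_match(a, k):
    """Return True if multiply/floor-divide by 2**k agree with << k and >> k."""
    a = mpz(a)
    base = mpz(2) ** k
    return (a * base == a << k) and (a // base == a >> k)


# Hypothetical sample operands and shift counts.
samples = [0, 1, 12345, 9369319 * 6625109, 3**200]
print(all(shifts_match(a, k) for a in samples for k in (1, 7, 64, 1000)))
```

Since the arithmetic is equivalent, any runtime gap between halley and halley2 should come from how the operations are carried out, not from what they compute.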

On x64 Linux, halley(1000000) runs as fast as halley2(1000000), but on ARM Linux halley runs much slower than halley2.
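
To narrow down where the time goes, the individual operations can be timed outside the iteration. The following is a rough sketch, not a rigorous benchmark: `time_op`, `compare`, the operand sizes, and the repeat count are all hypothetical choices, and the fallback to plain `int` is an assumption so the script runs without gmpy2. It reports no expected numbers, since those depend on the platform and the GMP build.

```python
import time

try:
    from gmpy2 import mpz  # assumption: gmpy2 is installed
except ImportError:
    mpz = int  # fallback so the sketch still runs with plain Python ints


def time_op(fn, repeats=5):
    """Return the best wall-clock time in seconds over several calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare(bits=200_000):
    """Time multiply/floor-divide by 2**bits against the equivalent shifts."""
    base = mpz(2) ** bits
    x = mpz(3) ** (bits // 2)  # arbitrary large operand (hypothetical choice)
    y = x * x                  # precomputed so only the scaling step is timed
    return {
        "x * base": time_op(lambda: x * base),
        "x << bits": time_op(lambda: x << bits),
        "y // base": time_op(lambda: y // base),
        "y >> bits": time_op(lambda: y >> bits),
    }


if __name__ == "__main__":
    for label, seconds in compare().items():
        print(f"{label:12s} {seconds:.6f} s")
```

Running this on both the x64 and the ARM machine would show whether the multiplication, the division, or both account for the gap.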
