fastcalls by telamon · Pull Request #2 · holepunchto/simdle-native · GitHub

fastcalls #2


Closed
wants to merge 8 commits into from

@telamon telamon commented Apr 20, 2025

I ported the original macros to C++ templates. I also checked the compiler output
and verified that the templates produce nearly the same instructions as the macros.

Caveat: unless the use case involves one large typed array,
the overhead of repeated tiny native calls quickly adds up
and simdle-universal/fallback.js will be faster.
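
For readers unfamiliar with this kind of port, here is a minimal sketch of what moving a per-element macro to a template can look like. It is not the code from this PR: every name in it (DEFINE_UNARY_OP, unary_op, PopCount, LeadingZeros, popcount_scalar, clz_scalar) is hypothetical, the kernels are plain scalar loops rather than SIMD, and reading "cnt" as population count and "clz" as count-leading-zeros is an assumption based on the benchmark names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Portable scalar helpers (hypothetical; the real kernels may use SIMD).
template <typename T>
constexpr T popcount_scalar(T x) {
  T n = 0;
  while (x != 0) {
    x = static_cast<T>(x & (x - 1));  // clear the lowest set bit
    n++;
  }
  return n;
}

template <typename T>
constexpr T clz_scalar(T x) {
  T n = 0;
  T mask = static_cast<T>(T(1) << (sizeof(T) * 8 - 1));  // top bit of T
  while (mask != 0 && (x & mask) == 0) {
    n++;
    mask = static_cast<T>(mask >> 1);
  }
  return n;  // equals the bit width of T when x == 0
}

// Macro style: one function is stamped out per (name, type, expression).
#define DEFINE_UNARY_OP(NAME, T, EXPR)                         \
  inline void NAME(const T *src, T *dst, std::size_t len) {    \
    for (std::size_t i = 0; i < len; i++) {                    \
      const T x = src[i];                                      \
      dst[i] = (EXPR);                                         \
    }                                                          \
  }

DEFINE_UNARY_OP(cnt_u16_macro, std::uint16_t, popcount_scalar(x))

// Template style: the operation is a type, the element type is deduced.
struct PopCount {
  template <typename T>
  static T apply(T x) { return popcount_scalar(x); }
};

struct LeadingZeros {
  template <typename T>
  static T apply(T x) { return clz_scalar(x); }
};

template <typename Op, typename T>
inline void unary_op(const T *src, T *dst, std::size_t len) {
  assert(src != nullptr && dst != nullptr);
  for (std::size_t i = 0; i < len; i++) {
    dst[i] = Op::apply(src[i]);
  }
}
```

A call such as unary_op<PopCount>(src, dst, len) does the same per-element work as cnt_u16_macro(src, dst, len). With optimizations enabled, a compiler will typically inline Op::apply, which is one plausible reason the two forms end up with nearly the same instructions. The sketch says nothing about the call overhead in the caveat above: that cost is paid at the JavaScript-to-native boundary on every call, regardless of how the loop is written.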

$ bare bench.mjs
# cnt-16
    # ops 2.74e+6 avg 2742256 total 5487255 runtime 2001
    # ops 2.66e+6 avg 2702742 total 10816375 runtime 4002
    # ops 2.74e+6 avg 2715479 total 16301021 runtime 6003
    # ops 2.76e+6 avg 2724424 total 20000000 runtime 7341
    # ops 1.44e+7 avg 14357502 total 20000000 runtime 1393
    # native 2.76e+6 vs. js 1.44e+7; -80.74%
ok 1 - cnt-16 # time = 8736ms
# clz-32
    # ops 1.38e+5 avg 137653 total 275444 runtime 2001
    # ops 1.39e+5 avg 138290 total 553435 runtime 4002
    # ops 1.38e+5 avg 138097 total 828996 runtime 6003
    # ops 1.40e+5 avg 138427 total 1000000 runtime 7224
    # ops 2.92e+6 avg 2923977 total 1000000 runtime 342
    # native 1.40e+5 vs. js 2.92e+6; -95.21%
ok 2 - clz-32 # time = 7567ms

@telamon telamon marked this pull request as ready for review April 22, 2025 16:21
@telamon telamon closed this Jun 2, 2025