[LA64_DYNAREC]Add basic avx support for la64. #2745
Conversation
👍 Thank you for working on this. At first glance, it looks promising. But it's still a big PR, so I need to find a couple of hours to review, which will be early next week.
I should've started the AVX support on RV64 too...
Feel free to review this.
That's an impressive PR! Thank you for your involvement in box64 development.
Overall, this looks good. I left some comments. Please do smaller PRs next time!
Updated the code.
} else {
    VINSGR2VR_W(q0, ed, 0);
}
YMM0(gd);
I mean that YMM0 and XVXOR_V both set the high 128 bits to zero, and both cause a pending store (of the high 128 bits) in the current code, so they are equivalent for now.
XVXOR_V would be better when the register is reused at 256-bit width by later instructions in the same block.
But to use XVXOR_V, you need to mark the register as 256-bit wide, otherwise the cleared high bits will be ignored by the current code.
Maybe we can do this optimization later, once most of the AVX implementation has landed.
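A minimal sketch of that idea, assuming a hypothetical avxcache_set_width() helper (not the actual box64 API):

```c
// Clearing with XVXOR_V zeroes all 256 bits of the native register,
// but that only pays off if the width tracker is told about it.
XVXOR_V(q0, q0, q0);               // q0[255:0] = 0
avxcache_set_width(dyn, gd, 256);  // hypothetical helper: mark gd as live
                                   // at 256-bit width so the known-zero
                                   // upper 128 bits aren't discarded
```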
        VEXTRINS_D(q0, q2, 0);
    }
} else {
    VXOR_V(q0, q0, q0);
See the comment on VMOVD.
If this is ready for another round, please use the re-request review button.
Another round.
} else {
    VINSGR2VR_W(q0, ed, 0);
}
YMM0(gd);
Sorry, I don't understand. My suggestion here is to change line 153 to XVXOR_V and change line 159 to zero_upper = 0, so we don't have to clear the upper 128 bits again later.
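A hedged sketch of the suggested shape, reconstructed from the diff context above (the surrounding code is assumed, not verbatim from the PR; zero_upper is the variable named in the comment):

```c
} else {
    XVXOR_V(q0, q0, q0);     // line 153: zero all 256 bits up front
    VINSGR2VR_W(q0, ed, 0);  // then insert the 32-bit value into lane 0
}
zero_upper = 0;              // line 159: replaces YMM0(gd); the upper
                             // 128 bits are already known to be zero
```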
        VEXTRINS_D(q0, q2, 0);
    }
} else {
    VXOR_V(q0, q0, q0);
Could you just repeat the comment? I can't find that easily via the web UI...
Sorry, I must have selected "Comment" instead of "Request changes".
So there are only a few pending comments left. After we've resolved all of those, I'll do a full review again. @phorcys
* basic infra for avx
* some basic ops for avx: VMOVDQU/VMOVDQA/VMOVUPS/VMOVAPS/VMOVUPD/VMOVAPD, VZEROUPPER/VZEROALL, VMOVD/VMOVSD/VMOVSS, VINSERTF128/VINSERTI128/VEXTRACTF128/VEXTRACTI128, VBROADCASTSS/VBROADCASTSD/VBROADCASTF128
I've applied all the requested fixes, please review them again when you have free time. Thanks. @ksco
This pull request adds basic AVX support for la64.
LoongArch LASX is a 256-bit SIMD extension, so this patch uses native 256-bit registers and instructions to implement AVX instructions.
The Box64 interpreter has split YMM storage (x64emu_t.ymm[] only keeps the upper 128 bits of each YMM register). Because of that, in this patch a 256-bit AVX register push/pop can't use xvld/xvst; it uses vld,vld+xvpermi.q (load) and vst,xvpermi.q+vst (store) instead, as sketched below.
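For illustration, a hedged sketch of the 256-bit load path under that split layout (macro names mirror the LASX instructions; register names and offsets are assumptions, not the PR's actual code):

```c
// Assemble one guest YMM value into a single LASX register when the
// emulator keeps the halves apart: xmm[idx] = low 128, ymm[idx] = high 128.
VLD(q_lo, xEmu, offsetof(x64emu_t, xmm[idx]));  // low 128 bits
VLD(q_hi, xEmu, offsetof(x64emu_t, ymm[idx]));  // high 128 bits
XVPERMI_Q(q_lo, q_hi, 0x02);  // keep q_lo[127:0], set
                              // q_lo[255:128] = q_hi[127:0]
```

The store path mirrors this: vst the low half, xvpermi.q the high lane down, then vst the high half.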
Unlike arm64's YMM zero-tracking method, this patch uses an avxcache_t to store each AVX register's width and whether its upper 128 bits are zero-filled.
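As a rough illustration, an entry in such a cache might carry just two facts per register (the field and type names below are illustrative, not the actual avxcache_t definition):

```c
#include <stdint.h>

// Hypothetical per-YMM-register tracking entry: the width at which the
// native LASX register is currently live, and whether its upper 128 bits
// are known to be zero, so redundant clears can be skipped.
typedef struct {
    uint8_t width;       // 128 or 256
    uint8_t zero_upper;  // 1 = upper 128 bits known zero
} avxcache_entry_t;
```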
Register convention:
It relies on the x64 compiler's automatic vzeroupper/vzeroall generation; no handling for AVX-SSE transition penalties is implemented yet.