add x86_64 optimization to Makefile by joneqdaniel · Pull Request #402 · ec-/Quake3e

joneqdaniel · 2026-05-19T15:14:07Z

use -march=native if not cross-compiling
use -mfpmath=sse if target is x86_64
add -O3 -ftree-vectorize -fopenmp -fopenmp-simd to OPTIMIZE
add -fdata-sections -ffunction-sections to RELEASE_CFLAGS

ensiform · 2026-05-20T02:04:32Z

What's the rationale behind adding OpenMP et al.?

ec- · 2026-05-20T02:34:10Z

Please provide a rationale and a performance comparison.

joneqdaniel · 2026-05-20T06:40:22Z

The reason for using OpenMP is that it offers platform-independent SIMD across instruction sets and compilers.
OpenMP is the only platform-independent SIMD mechanism that is supported by MSVC compilers.

Compile the following arbitrary-length SIMD dot product on any target platform that supports OpenMP, using any supported compiler.

OpenMP Compilers & Tools

Compile with CFLAGS="-march=native -ftree-vectorize -fopenmp -fopenmp-simd -std=c2y", adding an optimization level from -O0 to -O3, and view the assembler generated with -S for different target platforms and optimization levels.

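#include <stdio.h>    /* printf */
#include <stdlib.h>   /* exit, EXIT_SUCCESS */
#include <stdint.h>
#include <stddef.h>   /* size_t */
#include <stdbool.h>

#include "q_shared.h" /* vec_t; evec(N,T) is presumably provided by this PR's changes */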

#ifdef __clang__
#define addr(a) (__typeof__((a)[0])*) __builtin_addressof((a))
#else
#define addr(a) &(a)[0]
#endif

/* dup - non-parallel memcpy supporting array and vector types */
#define dup(n,dst,src) \
({ \
    _Pragma("omp simd") \
    for(size_t i = 0; i < n; i++) \
        (dst)[i] = (src)[i]; \
})

/* dot - non-parallel scalar product supporting array and vector types */
#define dot(n,a,b,i,t) \
({ \
    t dst = (t)(i); /* i is the initial value of the accumulator */ \
    _Pragma("omp simd reduction(+:dst)") \
    for(size_t j = 0; j < (n); j++) \
        dst += (t)(a)[j] * (t)(b)[j]; \
    dst; \
})

int main(int argc, char** argv)
{
        evec(4,vec_t) a = { 0, 1, 2, 3 };
        vec_t b[3];
        dup(3,b,a);
        vec_t c = dot(3,a,b,0,vec_t);
        printf("%f\n", c);
        exit(EXIT_SUCCESS);
}
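For readers without the PR branch at hand, the same reduction can be sketched without q_shared.h or the statement-expression macros. This is only an illustration: `dot_n` is a hypothetical helper name, and `vec_t` is assumed to be a plain `float` as in stock q_shared.h.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: vec_t is a plain float, as in stock q_shared.h. */
typedef float vec_t;

/* Hypothetical helper, not part of the PR: dot product of two
 * n-element arrays, written as an ordinary function. */
static vec_t dot_n(size_t n, const vec_t *a, const vec_t *b)
{
    vec_t sum = 0.0f;

    assert(a != NULL && b != NULL);

    /* Only honoured when built with -fopenmp or -fopenmp-simd;
     * other builds ignore the pragma (possibly with a warning). */
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];

    return sum;
}
```

Building this with and without -fopenmp-simd and diffing the -S output is one way to see whether the pragma changes anything beyond what -O3 -ftree-vectorize already does.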

ec- · 2026-05-20T08:41:06Z

Did you try compiling the code from the PR?
Compilation errors aside, passing -march=native is simply not acceptable, because it will make release binaries incompatible with A LOT of systems, for example if they are compiled on a host with AVX512.
Also, I'd like to see performance measurements for actual Quake3e code.
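The compatibility concern can be made visible with the predefined macros that GCC and Clang set for each instruction-set extension they are allowed to assume. The following is a minimal sketch; `isa_assumed` is a hypothetical helper and not part of Quake3e.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of Quake3e: lists the x86 SIMD
 * extensions the compiler was told it may assume, based on the
 * predefined macros of GCC and Clang. */
static const char *isa_assumed(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    buf[0] = '\0';
#if defined(__SSE2__)
    strncat(buf, "sse2 ", size - strlen(buf) - 1);
#endif
#if defined(__AVX__)
    strncat(buf, "avx ", size - strlen(buf) - 1);
#endif
#if defined(__AVX2__)
    strncat(buf, "avx2 ", size - strlen(buf) - 1);
#endif
#if defined(__AVX512F__)
    strncat(buf, "avx512f ", size - strlen(buf) - 1);
#endif
    /* Non-x86 targets, or compilers without these macros, end up here. */
    if (buf[0] == '\0')
        strncat(buf, "none", size - 1);
    return buf;
}
```

A default x86_64 build would typically report only sse2, while a -march=native build reports whatever the build host happens to support, which is exactly what ends up baked into the release binary.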

joneqdaniel · 2026-05-20T09:36:59Z

> passing -march=native is simply not acceptable because it will make release binaries incompatible for A LOT of systems if compiled with AVX512 on a host for example.

Adding target-platform-dependent SIMD restrictions to RELEASE_CFLAGS is recommended, e.g. -mno-avx512f.

ec- · 2026-05-20T10:11:18Z

> Adding target platform dependent SIMD restrictions to the RELEASE_CFLAGS is recommended, e.g. -mno-avx512f

You're just changing the ceiling, while the floor is at SSE2 for x86_64; the same goes for non-x86 architectures.

Please fix the compilation errors first; do not submit non-working code here.

KG7x · 2026-05-20T10:50:37Z

ioquake/ioq3#869
ioquake/ioq3#870

You can try compiling with a local build before merging.

joneqdaniel · 2026-05-20T11:07:39Z

> You can try compiling with a local build before merging.

Only the .gitignore is usable at these links.

joneqdaniel · 2026-05-20T11:09:04Z

The workflow should complete now that the #warning has been removed.

ec- · 2026-05-20T11:14:26Z

It breaks the VC2005 build as well, though that build is not covered by GitHub Actions.

Also, could you measure the actual performance difference?

joneqdaniel · 2026-05-20T11:51:51Z

I've checked in-game performance with an Intel UHD Graphics 630 and an NVIDIA GeForce RTX 2060/PCIe/SSE2 card, using the opengl1 renderer on map Q3DM1.

Intel/NVIDIA framerate comparison:

High: 23/140

seta r_fbo 1
seta r_vbo 1
seta r_bloom 1
seta r_hdr 1
seta r_ext_supersample 1
seta r_ext_multisample 8

Low: 500/200

seta r_fbo 0
seta r_vbo 0
seta r_bloom 0
seta r_hdr 0
seta r_ext_supersample 0
seta r_ext_multisample 0

FBO/VBO only: 300/200

seta r_fbo 1
seta r_vbo 1
seta r_bloom 0
seta r_hdr 0
seta r_ext_supersample 0
seta r_ext_multisample 0

I'll get +20 with using the aligned one.
I still need to compare the vector type differences and write a profiling benchmark.
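As a starting point for such a benchmark, a timing harness could look roughly like the sketch below. It is not Quake3e code: `bench_dot` is a hypothetical name, the workload is a plain dot product, and `clock()` measures CPU time rather than frame rate.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical micro-benchmark, not Quake3e code: runs `reps` dot
 * products of length n and returns the elapsed CPU time in seconds.
 * The accumulated result is stored in *sink so the work stays
 * observable to the caller. */
static double bench_dot(const float *a, const float *b,
                        size_t n, int reps, float *sink)
{
    volatile float acc = 0.0f;
    clock_t start, end;

    assert(a != NULL && b != NULL && sink != NULL && reps > 0);

    start = clock();
    for (int r = 0; r < reps; r++) {
        float sum = 0.0f;
        for (size_t i = 0; i < n; i++)
            sum += a[i] * b[i];
        acc += sum;
    }
    end = clock();

    *sink = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Comparing the returned time across builds with different OPTIMIZE flags gives a per-flag delta, though an optimizer may still hoist the loop-invariant inner sum, so varying the input between repetitions or timing an in-game timedemo run is likely more representative.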

joneqdaniel · 2026-05-20T11:54:27Z

> It breaks VC2005 build as well

Please submit the error log and post a link to it.
Could you please trigger the workflow run again, so that I can check that the MSVC builds other than VC2005 compile?

ec- · 2026-05-20T15:22:30Z

Please test changes with your own GitHub Actions / compilers first; I will not approve test runs anymore.

ec- · 2026-05-20T15:33:43Z

> I'll get +20 with using the aligned one.

+20 from what baseline, and what do those "High: 23/140" numbers mean?
Did you measure the difference that comes from -O3 -mfpmath=sse + forced inlining alone?

ensiform · 2026-05-20T16:39:44Z

FYI, OpenMP support is excluded from the 2005 and 2008 Express editions, and it may need an extra download with the regular editions too.

joneqdaniel · 2026-05-20T17:55:30Z

Yes, Visual Studio 2005/2008 Express Editions need the extra download, while the Standard Editions include it.
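Because OpenMP availability differs per toolchain, one way to keep such loops compiling everywhere is to hide the pragma behind a macro that expands to nothing when SIMD directives are unavailable. A minimal sketch follows; `SIMD_LOOP`, `USE_OMP_SIMD` and `copy_n` are hypothetical names, and the check assumes the compiler reports its OpenMP version through `_OPENMP` (201307 corresponds to OpenMP 4.0, which introduced `simd`).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro names. USE_OMP_SIMD is an opt-in switch for
 * builds that enable SIMD directives without defining _OPENMP. */
#if (defined(_OPENMP) && _OPENMP >= 201307) || defined(USE_OMP_SIMD)
#define SIMD_LOOP _Pragma("omp simd")
#else
#define SIMD_LOOP /* no OpenMP 4.0 SIMD: plain scalar loop */
#endif

/* Hypothetical helper: element-wise copy with an optional SIMD hint. */
static void copy_n(size_t n, float *dst, const float *src)
{
    assert(dst != NULL && src != NULL);

    SIMD_LOOP
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```

With this shape, a toolchain that lacks the OpenMP component still builds the same source, just without the vectorization hint.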

joneqdaniel · 2026-05-21T14:34:17Z

Please approve another workflow run.
I'll look into coding a benchmark example soon.

KG7x · 2026-05-21T15:33:53Z

> Please approve another workflow run. I'll look into coding a benchmark example soon.

You haven't even launched Actions in your GitHub repository once in all this time...
How can you confirm that you fixed this and that there will be no warnings or errors?

joneqdaniel · 2026-05-21T15:56:46Z


 #if defined( _MSC_VER ) && _MSC_VER >= 1400 // MSVC++ 8.0 at least
 #define OS_STRING "win_msvc"
+#define ID_INLINE __forceinline __flatten inline


Fixed misspelled MSVC syntax here!
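For comparison, a per-compiler forced-inline macro is commonly written along the lines below. This is a sketch only: `SKETCH_INLINE` and `unit_to_byte` are hypothetical names chosen to avoid clashing with the real `ID_INLINE`, and how `__flatten` is defined is not shown in the excerpt above, so it is left out.

```c
#include <assert.h>

/* Hypothetical macro name, not the project's ID_INLINE. */
#if defined(_MSC_VER)
#define SKETCH_INLINE static __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_INLINE static inline __attribute__((always_inline))
#else
#define SKETCH_INLINE static inline
#endif

/* Hypothetical helper: converts a normalized colour channel in
 * [0, 1] to a byte, rounding to the nearest value. */
SKETCH_INLINE unsigned char unit_to_byte(float x)
{
    assert(x >= 0.0f && x <= 1.0f); /* caller passes a normalized value */
    return (unsigned char)(x * 255.0f + 0.5f);
}
```

Keeping each compiler's keyword in its own branch avoids combining `__forceinline` with `inline` on one line, which is the kind of mix that tends to break older MSVC versions.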

> Wow, now start building it in your repo and check it

No, there are no Actions workflow builds possible in a fork.

None of the scripts are set up.

Ah, I see the run workflow now.

- assure ID_INLINE inlines everything
- add vec(4,byte) and reindent not dependent on ts
- inline ColorBytes and move to q_shared.h
- add avec(N,T) and use it for avec3_t/avec5_t
- add clang support
- add evec(N,T) SIMD vector extension type
- use __typeof__ instead of typeof

ec- · 2026-05-21T19:18:37Z

Sorry, that's enough

add x86_64 optimization to Makefile

346757a

lndpj added 2 commits May 21, 2026 18:44

use #define instead of typedef for _MSC_VER

79494a0

ec- closed this May 21, 2026
