The Dark Mod Forums

Pull Request: C++14 check for CMake + fix for ThirdParty/devil



Hello everyone! 🙂 I made two changes in the source code. Could you please take a look?

 

- CMake: added a check that the compiler supports the C++14 standard

CXX_STANDARD supported values are 98, 11, 14, 17 and 20.

Ref: https://cmake.org/cmake/help/latest/prop_tgt/CXX_STANDARD.html

 

- ThirdParty/devil: fix for MCST lcc compiler

The LCC-Win32 and MCST lcc compilers define the same identifier (LCC).
A check for the MCST Elbrus 2000 (e2k) architecture has been added.

Ref: https://en.wikipedia.org/wiki/Elbrus_2000

 

The patch with my changes is attached.

C++14_check_for_CMake_+_fix_for_ThirdParty-devil.patch


Hello!

I never imagined I would run into anything Elbrus-related in person 🤩
I must admit I have more questions than I should 😁

 

Regarding the changes:
I like how compact they are, but I think I have problems with both of them.

The check for the C++ standard is both unnecessary and incorrect.
Support for C++14 in the Visual C++ compiler is not tied to the -std=c++14 flag (which it does not support, I guess). I think CMake added special keywords for the C++ standard exactly for this reason: different compilers implement this setting differently. And according to the documentation, CXX_STANDARD_REQUIRED should already produce an error if C++14 is not supported. So I'd rather not add an explicit check and more code.
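In CMake terms, the portable approach described above looks roughly like this (the target name tdm is a placeholder, not the project's actual target):

```cmake
# CMake translates the C++14 requirement into whatever flag the active
# compiler understands (-std=c++14 for GCC/Clang, nothing for MSVC).
# CXX_STANDARD_REQUIRED turns a missing implementation into a hard
# error at configure/build time, so no manual check is needed.
set_target_properties(tdm PROPERTIES
    CXX_STANDARD 14
    CXX_STANDARD_REQUIRED ON
    CXX_EXTENSIONS OFF)
```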

By the way, does LCC support C++? And C++14 in particular?
Did you manage to build the engine in the end?

The situation with il.h is more complicated.
This header is merely a build artifact produced by conan.
So if this change is applied as is, it will persist until someone decides to rebuild DevIL and commits new artefacts, which probably won't happen soon.
Moreover, fixing this in the DevIL repository also won't help: DevIL has recently migrated from C to C++, which I personally don't like, so we stay on an older version.
I had an idea of wiping DevIL out of TDM, but that won't be easy.

6 hours ago, stgatilov said:

The check for the C++ standard is both unnecessary and incorrect.
Support for C++14 in the Visual C++ compiler is not tied to the -std=c++14 flag (which it does not support, I guess). I think CMake added special keywords for the C++ standard exactly for this reason: different compilers implement this setting differently. And according to the documentation, CXX_STANDARD_REQUIRED should already produce an error if C++14 is not supported. So I'd rather not add an explicit check and more code.

Understood. I did not know that such a check had appeared in CMake 🙂

6 hours ago, stgatilov said:

By the way, does LCC support C++? And C++14 in particular?

The current version of the MCST lcc compiler (version 1.24.10) fully supports the C++03, C++11, and C++14 standards and has experimental support for C++17. This compiler is nominally compatible with g++-7.3.0. By default the compiler has the -std=gnu++14 mode enabled (C++14 with GNU extensions). Also, it can compile code that uses Intel intrinsics 😉

6 hours ago, stgatilov said:

Did you manage to build the engine in the end?

No, the compilation only got to 88% 😅

After fixing the /ThirdParty/artefacts/devil/include/IL/il.h file, there was a problem with the /idlib/math/Simd.cpp file: it contains the line "#include <cpuid.h>", and there is no such header file on e2k.

Can you modify this file, similar to how it is done in RBDOOM-3-BFG (and add a "USE_INTRINSICS" option)?

Ref: https://github.com/RobertBeckebans/RBDOOM-3-BFG/blob/master/neo/idlib/math/Simd.cpp

RBDOOM-3-BFG compiles on e2k without problems (and runs very well).

7 hours ago, stgatilov said:

So if this change is applied as is, it will persist until someone decides to rebuild DevIL and commits new artefacts, which probably won't happen soon.

Yes, such a change in the source code would be very helpful for e2k.


Which GPU exactly does it have? Which version of OpenGL does it support?

Do I understand it right that you have physical access to the machine, and in theory can run the game (if everything is OK)?

This is not the best time to work on this, due to the ongoing beta for the new release. But maybe I could try to fix the build myself, if it doesn't take too much time. Although I don't think any changes will get into trunk before 2.08 is out.


And by the way, are you trying to compile it for the x86 architecture or for the Elbrus architecture?
The Doom 3 code was pretty much tied to the specifics of x86 architecture, and I guess nobody had time to clean this up.

 

25 minutes ago, stgatilov said:

Which GPU exactly does it have? Which version of OpenGL does it support?

Do I understand it right that you have physical access to the machine, and in theory can run the game (if everything is OK)?

I have access to an "Elbrus 801-PC". It comes with an AMD Radeon R5 230 graphics card (OpenGL 4.1). But various other AMD Radeon graphics cards can be used (e.g. a Radeon RX 580 with OpenGL 4.5 and Vulkan).

Ref: http://www.ineum.ru/elbrus_801-pc_gen4

23 minutes ago, stgatilov said:

And by the way, are you trying to compile it for the x86 architecture or for the Elbrus architecture?

I have successfully compiled RBDOOM-3-BFG for the e2k architecture, and I have made some changes to the game's source code to add initial support for this architecture.

Ref: https://github.com/RobertBeckebans/RBDOOM-3-BFG/pull/432

33 minutes ago, stgatilov said:

The Doom 3 code was pretty much tied to the specifics of x86 architecture, and I guess nobody had time to clean this up.

ARM64/Aarch64 architecture support has recently been added (for the Nvidia Jetson board).

Ref: https://github.com/RobertBeckebans/RBDOOM-3-BFG/pull/473

32 minutes ago, r.a.sattarov said:

ARM64/Aarch64 architecture support has recently been added (for the Nvidia Jetson board).

I meant "nobody had time to clean it up in TDM".


Meanwhile, I have managed to build TDM for the Linux/Elbrus platform.
We have not tried to run it yet, but that is planned for the near future.
 

I'm attaching a full patch against current SVN.
I will not commit changes to ThirdParty/artefacts/ffmpeg/* and ThirdParty/artefacts/ogg/*, but will commit changes in ThirdParty/artefacts/devil/include/IL/il.h.
Also, keep in mind that ThirdParty/custom/openal/conanfile.py.original is a new file --- its only function is to mark TDM-specific changes to the original recipe.


Note that I did not do anything specific to Elbrus; I only added missing checks for the x86 architecture and removed one weird check for the LCC compiler.
I also fixed some generic bugs (e.g. wrong filename case on Linux --- already reported by someone).
Also, I extended the third-party build system; now it is possible (although not very straightforward) to use it to build TDM for any architecture. I hope the guy with the 64-bit PowerPC will also be pleased 😁

 

@duzenko, @cabalistic, I wonder what you think about it.
Can I merge it to trunk? Should I merge it to the release branch?

 

 

elbrus.patch


Don't really have an opinion on this - prior to this thread I hadn't even heard of Elbrus :)

I don't see anything problematic in the patch. If it helps, by all means.

4 hours ago, stgatilov said:

Meanwhile, I have managed to build TDM for the Linux/Elbrus platform.
We have not tried to run it yet, but that is planned for the near future.
 

I'm attaching a full patch against current SVN.
I will not commit changes to ThirdParty/artefacts/ffmpeg/* and ThirdParty/artefacts/ogg/*, but will commit changes in ThirdParty/artefacts/devil/include/IL/il.h.
Also, keep in mind that ThirdParty/custom/openal/conanfile.py.original is a new file --- its only function is to mark TDM-specific changes to the original recipe.


Note that I did not do anything specific to Elbrus; I only added missing checks for the x86 architecture and removed one weird check for the LCC compiler.
I also fixed some generic bugs (e.g. wrong filename case on Linux --- already reported by someone).
Also, I extended the third-party build system; now it is possible (although not very straightforward) to use it to build TDM for any architecture. I hope the guy with the 64-bit PowerPC will also be pleased 😁

 

@duzenko, @cabalistic, I wonder what you think about it.
Can I merge it to trunk? Should I merge it to the release branch?

 

 

elbrus.patch 15.07 kB · 2 downloads

Sure, I'd love to see some performance benchmarks on that platform.

Hopefully it could be a good stepping stone for a potential ARM build (Chromebooks, Raspberry Pi, a hypothetical macOS on ARM, etc.)


The built binary seems to work as expected on the Elbrus machine (with RX250 graphics).

The performance is pretty bad: on default settings it gives about 30 FPS at the start of The New Job.
We will look further into what the culprit is, although I suspect it is the CPU/memory.

If you ask me, the platform runs pretty well overall, but it is not a competitor to commodity x86 in terms of performance 😞
It's quite enough for office work, but games are another thing.

3 hours ago, stgatilov said:

The built binary seems to work as expected on the Elbrus machine (with RX250 graphics).

The performance is pretty bad: on default settings it gives about 30 FPS at the start of The New Job.
We will look further into what the culprit is, although I suspect it is the CPU/memory.

If you ask me, the platform runs pretty well overall, but it is not a competitor to commodity x86 in terms of performance 😞
It's quite enough for office work, but games are another thing.

com_smp?

r_showsmp?

Any Linux CPU benchmark to compare?

5 hours ago, duzenko said:

com_smp?

Enabled.

Quote

r_showsmp?

Frontend.

Quote

Any Linux CPU benchmark to compare?

Hard to say. On some simple things like sorting/binary search/whatever it was 2-6 times slower than my Ryzen (single core).

 

The first run was with shadow maps. It seems that switching to stencil shadows improved performance a lot.

4 hours ago, stgatilov said:

Enabled.

Frontend.

Hard to say. On some simple things like sorting/binary search/whatever it was 2-6 times slower than my Ryzen (single core).

 

The first run was with shadow maps. It seems that switching to stencil shadows improved performance a lot.

Curious thing

What's the clock speed?

Any SIMD support?

6 minutes ago, duzenko said:

What's the clock speed?

1200 MHz according to the specs. If you can read Russian, there is a link in this thread.

Quote

Any SIMD support?

It is a VLIW architecture in itself, and probably there is SIMD for some less-than-64-bit types in addition to that.

The compiler supports Intel SSE intrinsics properly, although the result is not SSE instructions, of course.
We have yet to see whether these intrinsics do more harm than good.
In my experience, going from scalar code to intrinsics rarely improves performance. Speaking of TDM, com_forceGenericSimd seems to have no effect on performance (it runs with SSE2 SIMD by default).

9 minutes ago, stgatilov said:

1200 MHz according to the specs. If you can read Russian, there is a link in this thread.

It is a VLIW architecture in itself, and probably there is SIMD for some less-than-64-bit types in addition to that.

The compiler supports Intel SSE intrinsics properly, although the result is not SSE instructions, of course.
We have yet to see whether these intrinsics do more harm than good.
In my experience, going from scalar code to intrinsics rarely improves performance. Speaking of TDM, com_forceGenericSimd seems to have no effect on performance (it runs with SSE2 SIMD by default).

It says 250 GFLOPS on 8 cores, or ~30 GFLOPS per core 😕

Something's not right

Link to post
Share on other sites

You mean the 10 GFLOPS lost to rounding? 😂

This is a VLIW CPU: its instructions are statically grouped into "packs" or "runs", and each pack/run gets executed in parallel (scheduling is static). To achieve that many computations you have to keep doing many independent computations all the time. In the best case, only dgemm from BLAS3 achieves such performance. Ordinary programs leave most of the execution units unused due to instruction dependencies and mostly sequential code. It is like AMD vs NVIDIA: AMD GPUs are VLIW, they have insane peak performance, and miners love them, but NVIDIA GPUs work faster in games, although they are pretty useless for mining.

Aside from limited instruction-level parallelism, there is also memory bandwidth. Perhaps it is not very good on the test machine.

 

Anyway, I have committed the fixes everywhere.
I'm afraid it won't help the guy with PowerPC; more thorough cleaning would be necessary for that.

