MoroseTroll posted August 18, 2016:

Quoting an earlier post: "The idea behind Bulldozer was sound. Moving floating-point operations to the GPU is what should be done... If you remember the old Nvidia AMD motherboards (nForce), that was one of the things it did (reduced CPU workload by intercepting instructions at the chipset and pre-processing them seamlessly... No 'evangelize the coders and rewrite' needed)."

I'm afraid you're a bit confused. Yes, GPUs are fantastically fast in terms of raw calculation, but they have a different and very difficult-to-learn programming paradigm. Yes, you can make GPUs work for you, but you'll have to learn OpenCL/CUDA/DirectCompute or write very specific shaders in GLSL/HLSL. Yes, there are frameworks that can make your program run on both CPU and GPU, but they are JIT (just-in-time) translators, not compilers. Also, a GPU, even one on the same die as the CPU, has a very limited connection speed to the CPU, so even if you write code that can use the GPU at full speed, you'll suffer long stalls whenever you move a chunk of data from CPU to GPU and vice versa. Admittedly, Direct3D 12 and Vulkan (did you see how amazingly it works in Doom 4?) can reduce that latency almost to zero, but again, you'd have to 1) learn these APIs and 2) persuade TDM players to migrate to DX12/Vulkan-capable graphics cards (i.e. GeForce 600+ and Radeon HD 7700+).

About nVidia's nForce DASP: it was just a cache-prefetch system, no more. There was no instruction interception, only a few tables of memory addresses processed by a reasonably smart analyzer inside the chipset. That's it. Every modern CPU already has something like that built in.
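To give a concrete flavour of the paradigm described above, here is a minimal host-side sketch of offloading one trivial operation via OpenCL (error checking omitted; it assumes an OpenCL 1.2 runtime and at least one GPU device, and the kernel and buffer size are arbitrary examples). Note how much setup surrounds the two explicit copies across the CPU/GPU boundary, and that the kernel source is JIT-compiled for whatever GPU happens to be present:

```cpp
// Minimal OpenCL host sketch: scale a buffer of floats on the GPU.
// Error checking omitted for brevity; kernel name and sizes are illustrative.
#include <CL/cl.h>
#include <cstdio>
#include <vector>

int main() {
    const char* src =
        "__kernel void scale(__global float* v, float k) {"
        "    size_t i = get_global_id(0);"
        "    v[i] *= k;"
        "}";

    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, nullptr);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);

    cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, nullptr);
    cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, nullptr);

    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, nullptr, nullptr);
    clBuildProgram(prog, 1, &device, nullptr, nullptr, nullptr);  // JIT-compiled for this GPU
    cl_kernel kernel = clCreateKernel(prog, "scale", nullptr);

    std::vector<float> data(1 << 20, 1.0f);
    size_t bytes = data.size() * sizeof(float);

    // The explicit CPU -> GPU copy the post is talking about.
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE, bytes, nullptr, nullptr);
    clEnqueueWriteBuffer(queue, buf, CL_TRUE, 0, bytes, data.data(), 0, nullptr, nullptr);

    float k = 2.0f;
    clSetKernelArg(kernel, 0, sizeof(buf), &buf);
    clSetKernelArg(kernel, 1, sizeof(k), &k);
    size_t global = data.size();
    clEnqueueNDRangeKernel(queue, kernel, 1, nullptr, &global, nullptr, 0, nullptr, nullptr);

    // ...and the GPU -> CPU copy back.
    clEnqueueReadBuffer(queue, buf, CL_TRUE, 0, bytes, data.data(), 0, nullptr, nullptr);

    printf("data[0] = %f\n", data[0]);

    clReleaseMemObject(buf);
    clReleaseKernel(kernel);
    clReleaseProgram(prog);
    clReleaseCommandQueue(queue);
    clReleaseContext(ctx);
    return 0;
}
```

Compare that with the one-line loop doing the same thing on the CPU; the gap in effort is exactly the "evangelize the coders and rewrite" cost the quoted post hoped to avoid.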
nbohr1more posted August 18, 2016:

Shouldn't the instruction decoder be able to hardcode the API translation needed to feed CPU floating-point requests to the GPU? I get that application designers need to rewrite to get around the quirks of hardware access via the host OS, but if you are in control of the actual hardware, the actual instruction code, the CPU driver stack, the chipset driver stack, and the GPU driver stack, are you telling me that you cannot make data routing better than someone without all that access who has to work from the outside via Vulkan or DX12?
lowenz posted August 18, 2016:

Zen is so Intel "Core" inspired: "AMD stated in the brief that power consumption and efficiency was constantly drilled into the engineers, and as explained in previous briefings, there ends up being a tradeoff between performance and efficiency about what can be done for a number of elements of the core (e.g. 1% performance might cost 2% efficiency)."
lowenz posted August 18, 2016:

Quoting nbohr1more: "Fingers crossed for the Zen APU releases. Maybe someone in the engineering department will sneak this idea back into the architecture unbeknownst to the management at AMD (and do it right this time)."

LOL. Good old romantic nbohr! The "suddenly unemployed engineer hero".
Bikerdude posted August 18, 2016:

Quoting an earlier post: "It seems they are just gonna mimic Intel and hope they can compete on price and efficiency. (Good luck.) Fingers crossed for the Zen APU releases. Maybe someone in the engineering department will sneak this idea back into the architecture unbeknownst to the management at AMD (and do it right this time). Still, it looks like good news for price/performance competition in the PC market for a little while."

Does this mean the current issue, where an AMD CPU paired with an nVidia GPU results in crap performance in idTech 4, won't be an issue with Zen? I really want the option to get away from Intel (due to the boot issue and the management chip).
nbohr1more posted August 18, 2016:

Right, Zen should run idTech 4 much better than the current AMD Bulldozer-based CPUs. That'll be pretty cool to see. It should also run the Wii/GameCube emulator Dolphin much better, along with any old engine that is CPU-bound (Oblivion, Skyrim...).
MoroseTroll posted August 18, 2016:

Quoting nbohr1more: "Shouldn't the instruction decoder be able to hardcode the API translation needed to feed CPU floating-point requests to the GPU?"

There are at least two things that make the CPU<->GPU conversation extremely difficult. First, every GPU generation, even from a single vendor, has its own instruction set (ISA). For example, nVidia's Pascal is not equal to Maxwell; they are similar, but not the same. So if you write low-level code for Maxwell, you'll have to adapt it for Pascal, too. And for Kepler. And for Fermi. Should I continue? The same goes for AMD: GCN 4th gen, 3rd gen, 2nd gen, 1st gen, VLIW4, VLIW5...

Second, when you communicate with the GPU, you go through the PCI-e interface: you fill a structure in RAM with the data you need and then call the driver to transfer it to the GPU. That takes thousands of CPU clocks even for a simple command (say, glBegin) and tens or even hundreds of thousands of CPU clocks to upload a texture. Why? Because PCI-e always runs at a fixed frequency declared by the PCI-SIG (yes, PCI-e comes in different versions and lane counts, but those don't change the situation drastically). Of course, with an integrated GPU the latencies could be much lower, but only if the vendor has built a very fast channel between the CPU and GPU on the same die. I'm not sure how fast those channels are in modern AMD and Intel APUs (CPU + GPU), but I doubt they are much faster than PCI-e. Why would those vendors bother to invent a fast channel just for integrated graphics?

Quoting nbohr1more: "I get that application designers need to rewrite to get around the quirks of hardware access via the host OS, but if you are in control of the actual hardware, the actual instruction code, the CPU driver stack, the chipset driver stack, and the GPU driver stack, are you telling me that you cannot make data routing better than someone without all that access who has to work from the outside via Vulkan or DX12?"

Um... I was saying that Vulkan and DX12 are the best ways to make the GPU run at full speed in the current situation, with current hardware. Could Intel and/or AMD create a chip with a half-new or totally new design, where GPU = FPU and every program can use that feature? Yes, they could, but they never will. The reason is simple: binary compatibility. Like I said earlier, every GPU generation has its own ISA, different from the previous one. The differences can be small or big, but they are always there. If you freeze your GPU ISA, you'll have to live with it for a dozen years, which means your rivals can keep optimizing their GPU architectures while you can't. That is a straight road to losing your market share and then Chapter 11.
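To put rough numbers on that transfer cost, one can time a texture upload. A sketch, assuming GLFW is installed for context creation and that the driver has actually finished the transfer by the time glFinish() returns (drivers are free to defer and optimize, so treat the result as an indication rather than a benchmark):

```cpp
// Rough timing of a 64 MiB texture upload from system RAM to the GPU.
// Not a rigorous benchmark: glFinish() is used as a crude "transfer done" fence.
#include <GLFW/glfw3.h>
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
    if (!glfwInit()) return 1;
    glfwWindowHint(GLFW_VISIBLE, GLFW_FALSE);            // off-screen context
    GLFWwindow* win = glfwCreateWindow(64, 64, "upload", nullptr, nullptr);
    if (!win) return 1;
    glfwMakeContextCurrent(win);

    // 4096x4096 RGBA8 = 64 MiB of texel data living in system RAM.
    const int dim = 4096;
    std::vector<unsigned char> pixels(size_t(dim) * dim * 4, 128);

    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);

    auto t0 = std::chrono::steady_clock::now();
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, dim, dim, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, pixels.data());
    glFinish();                                          // wait for the driver/GPU
    auto t1 = std::chrono::steady_clock::now();

    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    printf("64 MiB texture upload took %.2f ms\n", ms);

    glfwDestroyWindow(win);
    glfwTerminate();
    return 0;
}
```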
nbohr1more posted August 18, 2016:

Those are good points, but I would suggest they make the APU a core product: then they could build that fast interconnect, and it would make their cores more competitive in scientific GP-GPU fields. It would let them tier calculations by requirement: quick turnaround for branchy work goes to the local floating-point units, while large datasets with lots of SIMD go to dedicated GPU work. I thought that was originally the point of Torrenza? Heterogeneous architecture?
lowenz posted August 18, 2016:

AMD has HSA to share resources between GPU and CPU: https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture
lowenz posted August 18, 2016:

Quoting MoroseTroll: "[...] If you freeze your GPU ISA, you'll have to live with it for a dozen years, which means your rivals can keep optimizing their GPU architectures while you can't. That is a straight road to losing your market share and then Chapter 11."

That's why LLVM (https://en.wikipedia.org/wiki/LLVM) and intermediate representations (https://en.wikipedia.org/wiki/Intermediate_representation) are the future. From the LLVM article: "At version 3.4, LLVM supports many instruction sets, including ARM, Qualcomm Hexagon, MIPS, Nvidia Parallel Thread Execution (PTX; called NVPTX in LLVM documentation), PowerPC, AMD TeraScale, AMD Graphics Core Next (GCN), SPARC, z/Architecture (called SystemZ in LLVM documentation), x86/x86-64, and XCore. Some features are not available on some platforms. Most features are present for x86/x86-64, z/Architecture, ARM, and PowerPC."
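A tiny illustration of the "one IR, many back ends" idea (a sketch assuming the LLVM C++ development headers and libraries are installed; the function being built is an arbitrary example): the module below is target-independent, and it is the back end chosen later that lowers it to x86-64, ARM, NVPTX, AMDGPU, and so on.

```cpp
// Build a trivial add() function as LLVM IR in memory and print it.
// Which machine code it eventually becomes is decided by whichever
// target back end lowers the IR, not by this code.
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"

int main() {
    llvm::LLVMContext ctx;
    llvm::Module mod("demo", ctx);
    llvm::IRBuilder<> builder(ctx);

    // i32 add(i32 a, i32 b) { return a + b; }
    auto* i32 = builder.getInt32Ty();
    auto* fnType = llvm::FunctionType::get(i32, {i32, i32}, /*isVarArg=*/false);
    auto* fn = llvm::Function::Create(fnType, llvm::Function::ExternalLinkage,
                                      "add", &mod);
    auto* entry = llvm::BasicBlock::Create(ctx, "entry", fn);
    builder.SetInsertPoint(entry);

    auto argIt = fn->arg_begin();
    llvm::Value* a = &*argIt++;
    llvm::Value* b = &*argIt;
    builder.CreateRet(builder.CreateAdd(a, b, "sum"));

    mod.print(llvm::outs(), nullptr);   // dump the target-independent IR
    return 0;
}
```

Feeding the printed IR to llc with different -march targets is what turns the same module into code for different ISAs.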
MoroseTroll posted August 19, 2016:

Quoting nbohr1more: "Those are good points, but I would suggest they make the APU a core product: then they could build that fast interconnect, and it would make their cores more competitive in scientific GP-GPU fields. It would let them tier calculations by requirement: quick turnaround for branchy work goes to the local floating-point units, while large datasets with lots of SIMD go to dedicated GPU work."

I hope some day this will happen, maybe even with Zen-based APUs, but I personally wouldn't hold my breath. Some paradigms are too tough to spread all over the world.

Quoting nbohr1more: "I thought that was originally the point of Torrenza? Heterogeneous architecture?"

AFAIK, Torrenza was a failure. HSA is a very good idea, but right now, AFAIK, the technology works on AMD hardware only, if we're talking about the x86 world. Will Intel and/or nVidia ever support HSA? I'm not sure, but if they don't, HSA will be limited to the mobile market and very rarely used on x86 (at least in the client sector; I'm not sure about servers and workstations).

Quoting lowenz: "That's why LLVM and intermediate representations are the future."

Maybe, maybe not. How many universal applications do you know, written on top of LLVM, that can run straight out of the box on many modern OSes (Windows, macOS, Linux, Android, iOS)? How fast do they perform on different hardware? How bug-free are they? I know these are perhaps not simple questions. Sure, LLVM is quite a cool thing, but that alone is not enough to become a de facto standard. If it happens, it would be nice.

About IR: the simpler the IR, the simpler its reverse engineering. I know there are obfuscation technologies, but I think many big companies would prefer to stay with their old-fashioned x86 code just to make sure nobody learns their algorithms.
lowenz posted August 19, 2016:

Quoting MoroseTroll: "Sure, LLVM is quite a cool thing, but that alone is not enough to become a de facto standard."

LLVMPipe is terrific. I can run the old Unreal in OpenGL 2.1 with the *main processor* as the GPU at 25-30 FPS.
jaxa (thread author) posted August 19, 2016:

Quoting MoroseTroll: "HSA is a very good idea, but right now, AFAIK, the technology works on AMD hardware only, if we're talking about the x86 world. Will Intel and/or nVidia ever support HSA? I'm not sure, but if they don't, HSA will be limited to the mobile market and very rarely used on x86 (at least in the client sector; I'm not sure about servers and workstations)."

Even if AMD's desktop offerings died in a fire, AMD has a lot of leverage: it makes the CPUs and GPUs in the Xbox One and PlayStation 4. This should also be the case with the upgraded mid-cycle 4K/VR versions coming within the next year or two.
MoroseTroll posted August 20, 2016:

Quoting lowenz: "LLVMPipe is terrific. I can run the old Unreal in OpenGL 2.1 with the *main processor* as the GPU at 25-30 FPS."

Details, please.

Quoting jaxa: "Even if AMD's desktop offerings died in a fire, AMD has a lot of leverage: it makes the CPUs and GPUs in the Xbox One and PlayStation 4. This should also be the case with the upgraded mid-cycle 4K/VR versions coming within the next year or two."

Sure. But the current game consoles are totally different ecosystems. While the PS4 and XBO use AMD x86 CPUs and Radeon GPUs, they have different OSes, libraries, and so on. You can make your game console's architecture as weird as you want (ask anyone who coded for the PS3 and you'll hear a lot of swearing), but if you try to do something like that on the PC market, be ready to lose.
jaxa (thread author) posted August 20, 2016:

Quoting MoroseTroll: "But the current game consoles are totally different ecosystems. While the PS4 and XBO use AMD x86 CPUs and Radeon GPUs, they have different OSes, libraries, and so on."

The PS4 and Xbone have more in common with the PC than ever before, because they are all x86 platforms.
lowenz posted August 21, 2016:

Quoting MoroseTroll: "Details, please."

1. Download this package: http://download.qt.io/development_releases/qtcreator/4.1/4.1.0-rc1/installer_source/windows_vs2013_32/qtcreator.7z
2. Extract opengl32sw(.dll) and rename it to opengl32(.dll).
3. Test an OpenGL 2.1 application.

It's Mesa's implementation of OpenGL (Gallium3D), compiled to run on an x86 processor ("SoftPipe"), with LLVM/Clang optimizations on top ("LLVMPipe").
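If you want to confirm which OpenGL implementation an application actually picked up after the DLL swap, a quick sketch (assuming GLFW for context creation; with Mesa's software rasterizer the renderer string normally mentions "llvmpipe"):

```cpp
// Print the vendor/renderer/version strings of whatever OpenGL
// implementation the process loaded.
#include <GLFW/glfw3.h>
#include <cstdio>

int main() {
    if (!glfwInit()) return 1;
    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 2);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 1);
    GLFWwindow* win = glfwCreateWindow(64, 64, "glinfo", nullptr, nullptr);
    if (!win) return 1;
    glfwMakeContextCurrent(win);

    printf("GL_VENDOR:   %s\n", (const char*)glGetString(GL_VENDOR));
    printf("GL_RENDERER: %s\n", (const char*)glGetString(GL_RENDERER));
    printf("GL_VERSION:  %s\n", (const char*)glGetString(GL_VERSION));

    glfwDestroyWindow(win);
    glfwTerminate();
    return 0;
}
```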
MoroseTroll posted August 21, 2016:

Quoting jaxa: "The PS4 and Xbone have more in common with the PC than ever before, because they are all x86 platforms."

So what? The XBO's DirectX differs enough from the Windows one, and the PS4's GNM and GNMX are neither DirectX, nor OpenGL, nor Vulkan. Neither console has been cracked to the point of running its games directly under Windows on a PC. So it doesn't matter whether they are built on x86 or not: they are still too different from Windows.

Like I said, AMD won't take the risk of investing in and building a chip that nobody else would support, because it already has enough bitter experience with that: 3DNow!, Enhanced 3DNow!, SSE4a, MaSSE, IBS, XOP, LWP, FMA4, TBM. How many applications do you know that support these extensions? How much do they benefit from that support?

lowenz: Got it. Frankly, I expected something... different, say, a universal application that can run on both CPU and GPU, compiled by some LLVM-based framework.
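As an aside on why such vendor-specific extensions see little uptake: every application that wants them has to probe CPUID at run time and ship a fallback path for CPUs that lack them. A sketch using the GCC/Clang cpuid.h helper; the bit positions are quoted from memory for AMD's extended feature leaf, so verify them against the AMD programmer's manual before relying on them.

```cpp
// Probe AMD-specific ISA extensions via CPUID leaf 0x80000001.
// Bit positions below are as I recall the AMD Fn8000_0001 ECX/EDX flags.
#include <cpuid.h>   // GCC/Clang builtin wrapper
#include <cstdio>

static bool bit(unsigned reg, int n) { return (reg >> n) & 1u; }

int main() {
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx)) {
        printf("extended CPUID leaf not available\n");
        return 1;
    }
    printf("SSE4a : %d\n", bit(ecx, 6));
    printf("XOP   : %d\n", bit(ecx, 11));
    printf("FMA4  : %d\n", bit(ecx, 16));
    printf("TBM   : %d\n", bit(ecx, 21));
    printf("LWP   : %d\n", bit(ecx, 15));
    printf("IBS   : %d\n", bit(ecx, 10));
    printf("3DNow!: %d\n", bit(edx, 31));
    return 0;
}
```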
Oldjim posted August 24, 2016:

Rather than start a new thread: as will have been noticed from my appalling results with Deus Ex: Mankind Divided, I have decided to get a new graphics card, and despite being a long-time Nvidia user I have decided to go with a Radeon RX 470 4GB from Gigabyte. Any thoughts on whether I have made the right decision?

The things that swayed me towards Gigabyte were their excellent UK RMA system and the fact that the card has a metal backplate: http://forums.hexus.net/graphics-cards/204247-graphics-card-warranties-lets-see-how-good-they-really-look.html

The warranty situation had MSI as an option, but this rear-side temperature rather put me off: http://www.kitguru.net/components/graphic-cards/zardon/msi-rx-470-gaming-x-8g-review/28/
lowenz posted August 24, 2016:

The temperature criterion is always a double-edged sword: is the reading high because the power-delivery circuitry is bad, or because the plate is actually doing a good job of drawing heat away?
Bikerdude posted August 25, 2016:

In DX12/Vulkan the RX 480 is showing in various benchmarks as faster than the 1060, but the 1060 is faster in DX11 and below, and if the trend I have seen with idTech 4 holds, the 1060 will be faster there as well. If you do end up getting an RX 480, I will be very interested to see how TDM performs on it; I can only hope the OpenGL performance issue has been fixed in this generation compared to the previous R9 series.
Destined posted August 25, 2016:

I built a new PC at the beginning of the year and got a GeForce GTX 960. While I am pleased with the performance, I experience random freezes of the PC. Usually it is not even a BSOD but a complete freeze. Still, I got a crash dump file once or twice, and when I googled the error code I found that apparently the current version of DirectX and the latest GeForce drivers are not really compatible. At least, I found a couple of complaints about freezing computers that were attributed to an incompatibility between the two, and I strongly suspect my freezes stem from there. Speaking from that experience alone, I would say Radeon is currently definitely the better choice.
Oldjim posted August 25, 2016:

Slight change of plan: this is a real RX 470 killer, https://www.overclockers.co.uk/sapphire-radeon-rx-480-nitro-4096mb-gddr5-pci-express-rgb-graphics-card-gx-37d-sp.html, and the RX 480 4GB knocks spots off the RX 470.
Bikerdude posted August 25, 2016:

@Jim, found you a better deal: £199 with free shipping, https://www.overclockers.co.uk/xfx-radeon-rx-480-rs-edition-4096mb-gddr5-pci-express-graphics-card-with-backplate-gx-23c-xf.html

Also, XFX have updated their graphics card warranty terms since I last looked; they now do a 3+2-year plan. I have asked Overclockers to email me the UK RMA address details for XFX, should you need to send the card back after the first year has passed.
Oldjim posted August 25, 2016:

I may have found a similar one, except I am waiting for them to come back on the warranty period: they say 24 months while the other suppliers say 36 months. https://www.scan.co.uk/products/nda-4gb-sapphire-radeon-rx-480-nitroplus-14nm-polaris-pcie-30-7000mhz-gddr5-1208mhz-gpu-1306mhz-boos

I wouldn't touch XFX with a bargepole; their RMA system, as previously reported on the Scan forums, is an absolute disaster.
Bikerdude posted August 25, 2016:

Yeah, that's why I asked Overclockers for that info. And on the subject of warranties, it's why I have always paid a bit extra and gone for MSI or Gigabyte.