The Dark Mod Forums
cabalistic

Testers and reviewers wanted: BFG-style vertex cache


Feel free to experiment with it, but make sure to do very careful GPU profiling on the results, as the performance impacts of switching from glMapBufferRange to glBufferSubData in this context are completely unclear to me...

Also note that the current implementation dynamically resizes the dynamic buffers if needed. If you do want to store static data in the same buffer, you'd need to reupload that in those cases and thus would need to keep that data around.


Having all static geometry duplicated thrice in VBOs would mean a serious increase in video memory requirements... which is not cool, I guess...

On 8/12/2019 at 7:51 PM, stgatilov said:

Having all static geometry duplicated thrice in VBOs would mean a serious increase in video memory requirements... which is not cool, I guess...

Indeed. I was thinking about a single VBO subdivided into static and 3 dynamic pages.
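
To make the single-VBO idea concrete, here is a minimal sketch of the offset arithmetic, assuming one static region at the start of the buffer followed by three equally sized dynamic pages that are cycled per frame. The struct and its member names are hypothetical and not taken from the TDM source; real code would also have to deal with alignment and resizing.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout: [ static | dynamic page 0 | dynamic page 1 | dynamic page 2 ]
struct VertexCacheLayout {
    std::size_t staticBytes;  // size of the static region at the start of the buffer
    std::size_t pageBytes;    // size of each dynamic per-frame page
    static constexpr int kNumPages = 3;  // three dynamic pages, as proposed above

    // Total size the single VBO would need.
    std::size_t totalBytes() const {
        return staticBytes + kNumPages * pageBytes;
    }

    // Byte offset of the dynamic page that a given frame number writes to.
    std::size_t pageOffset(unsigned frame) const {
        return staticBytes + (frame % kNumPages) * pageBytes;
    }

    // Byte offset inside the buffer for an allocation made within a frame's page.
    std::size_t dynamicOffset(unsigned frame, std::size_t offsetInPage) const {
        assert(offsetInPage < pageBytes);  // must stay inside the page
        return pageOffset(frame) + offsetInPage;
    }
};
```

With this layout the static data is stored once, and only the dynamic part is paid for three times.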

On 8/4/2019 at 5:41 PM, cabalistic said:

Actually, I may have been mistaken. Reading this (https://www.khronos.org/opengl/wiki/Buffer_Object#Mapping), it may be impossible to use a single VBO for GL3, because if a buffer is mapped by glMapBufferRange, then simultaneously reading from it (i.e. rendering) is apparently not allowed, not even from the regions which are not mapped.

This is solved with persistent mapping, but that's a GL4 feature and thus not something we can do in the core of the engine. Alternatively, you'd have to work with a system RAM shadow copy in the frontend and transfer via glBufferSubData after every frame, which would cost RAM, and the performance implications are unclear to me.
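
A rough sketch of the shadow-copy alternative, under some loud assumptions: the frontend appends into a plain system RAM buffer, and once per frame the used range is handed to an upload callback. The `UploadFn` callback is a hypothetical stand-in for a call like glBufferSubData so that the sketch compiles without GL headers; `ShadowBuffer` and its methods are invented for illustration and are not the TDM implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for glBufferSubData(target, offset, size, data).
using UploadFn = std::function<void(std::size_t offset, std::size_t size, const void* data)>;

class ShadowBuffer {
public:
    explicit ShadowBuffer(std::size_t capacity) : shadow_(capacity), used_(0) {}

    // Frontend side: copy data into the RAM shadow copy and return its offset.
    std::size_t alloc(const void* data, std::size_t size) {
        assert(used_ + size <= shadow_.size());  // no resizing in this sketch
        std::memcpy(shadow_.data() + used_, data, size);
        const std::size_t offset = used_;
        used_ += size;
        return offset;
    }

    // End of frame: upload everything written this frame in one call, then reset.
    void commit(const UploadFn& upload) {
        if (used_ > 0) {
            upload(0, used_, shadow_.data());
        }
        used_ = 0;
    }

private:
    std::vector<unsigned char> shadow_;  // the system RAM copy (this is the extra RAM cost)
    std::size_t used_;                   // bytes written in the current frame
};
```

The extra RAM mentioned above is the `shadow_` vector; whether one large upload per frame is fast enough is exactly the open performance question.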

So I still think solving this at the draw call level is the safer approach ;)

Did this in svn.

Only the persistent mapping path for now. I will add a fallback for incompatible drivers later.

The extension seems to be supported by all GL4 drivers, even though it only became core in 4.4.


Problem: crash on x64 after map load inside nVidia driver. x86 just works.

Puzzle: qglFlushMappedBufferRange does not seem to be doing anything. x86 works great without it. x64 still crashes after map load.

No crash on Intel. :(

Edit: NM, driver bug. Went away after driver update.

As for flushing, should we not unmap the buffer every frame?


Unmapping is unnecessary with persistent buffers. That's their whole spiel - they are persistently mapped ;)

As for flushing - if you created and mapped the buffer with the GL_MAP_COHERENT_BIT, then you don't need to flush explicitly, as it's taken care of automatically. If you don't, you have to flush so that OpenGL knows parts of the buffer were touched and need to be synchronized before the next GL read from that area.
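
As a small illustration of the two options, here is a sketch of how the flag sets could be chosen. The bit values are the ones defined by OpenGL 4.4 / ARB_buffer_storage, copied in so the snippet compiles without GL headers; the struct and function names are hypothetical. Note that GL_MAP_FLUSH_EXPLICIT_BIT is only a mapping flag, so it goes to glMapBufferRange and not to glBufferStorage.

```cpp
#include <cassert>
#include <cstdint>

// Values as defined by OpenGL; duplicated here only to avoid needing GL headers.
constexpr std::uint32_t kMapWriteBit         = 0x0002;  // GL_MAP_WRITE_BIT
constexpr std::uint32_t kMapFlushExplicitBit = 0x0010;  // GL_MAP_FLUSH_EXPLICIT_BIT
constexpr std::uint32_t kMapPersistentBit    = 0x0040;  // GL_MAP_PERSISTENT_BIT
constexpr std::uint32_t kMapCoherentBit      = 0x0080;  // GL_MAP_COHERENT_BIT

// Hypothetical helper type: the flags for both calls plus the resulting obligation.
struct PersistentMapFlags {
    std::uint32_t storage;    // would be passed to glBufferStorage
    std::uint32_t map;        // would be passed to glMapBufferRange
    bool needsExplicitFlush;  // caller must call glFlushMappedBufferRange after writing
};

inline PersistentMapFlags choosePersistentMapFlags(bool coherent) {
    PersistentMapFlags f{};
    f.storage = kMapWriteBit | kMapPersistentBit;
    f.map = kMapWriteBit | kMapPersistentBit;
    if (coherent) {
        // Coherent: writes become visible to GL without an explicit flush.
        f.storage |= kMapCoherentBit;
        f.map |= kMapCoherentBit;
    } else {
        // Non-coherent: written ranges are flushed explicitly by the caller.
        f.map |= kMapFlushExplicitBit;
    }
    f.needsExplicitFlush = !coherent;
    assert((f.storage & kMapFlushExplicitBit) == 0);  // not a valid storage flag
    return f;
}
```

Either way, you still need fences or multiple pages so that you don't overwrite data the GPU is still reading.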

See here: https://www.khronos.org/opengl/wiki/Buffer_Object#Persistent_mapping


Added GL3 fallback. It's visibly slower on heavy maps like Perilous Refuge, but still playable (persistent mapping vs GL3 fallback):

nVidia 1050: 80 vs 56 fps.

Intel 630: 26 vs 24 fps.

Do we want to duplicate static data on GL3 GPUs so that we can use mapping with a single VBO? At least for indices, which are more dynamic than vertices.

I'm a bit sad that glBufferSubData is this slow. I expected higher speed on nVidia.

It feels like the driver pages the changed buffer portion in on a per-drawcall basis.

Or maybe add one more 'back' buffer, map it, and then just qglCopyBufferSubData from it to main VBO? Must be faster than stalling everything with what I have now. Then we'll have two dynamic frames + back buffer, instead of three frames as now. Although that's really how the driver should be doing it already.

 


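Temp buffer turns out to be even slower:

		// Upload this frame's data into the temporary buffer.
		qglBindBuffer( bufferType, tempBuff );
		qglBufferData( bufferType, length, mapBuff, bufferUsage );
		// Copy it on the GL side into the main VBO at the current map offset.
		qglBindBuffer( GL_COPY_READ_BUFFER, tempBuff );
		qglBindBuffer( GL_COPY_WRITE_BUFFER, bufferObject );
		qglCopyBufferSubData( GL_COPY_READ_BUFFER, GL_COPY_WRITE_BUFFER, 0, lastMapOffset, length );
		// Orphan the temporary buffer (still bound to bufferType) for the next upload.
		qglBufferData( bufferType, length, NULL, bufferUsage );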

 


I was able to get the GL3 speed on par with persistent mapping.

Guess how? Pass GL_STATIC_DRAW to qglBufferData.


Is it possible to get a sneak-peek build to test on top of TDM 2.07?

Edited by lowenz

Task is not so much to see what no one has yet seen but to think what nobody has yet thought about that which everybody see. - E.S.


I can give you a download link but beware - its glprogs folder is not compatible with 2.07. If you have a 'regular' TDM installation with pk4's, you can simply delete that later and go back to 2.07.


Go for the link! :D Thanks ;)



Crash here too!


3 minutes ago, lowenz said:

Crash here too!

Try

r_usePersistentMapping 0

but what about GPU model and driver version?
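
For context, a sketch of how such a switch might pick between the two upload paths discussed in this thread. This assumes the cvar simply gates the persistent mapping path and that the engine falls back to glBufferSubData when GL_ARB_buffer_storage is missing; the enum and function names are hypothetical and not taken from the TDM source.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class UploadPath { PersistentMapping, BufferSubData };

// True if the driver's extension list contains the given name.
inline bool hasExtension(const std::vector<std::string>& extensions, const std::string& name) {
    assert(!name.empty());
    return std::find(extensions.begin(), extensions.end(), name) != extensions.end();
}

// usePersistentMappingCvar mirrors r_usePersistentMapping (0 = off, nonzero = on).
inline UploadPath chooseUploadPath(bool usePersistentMappingCvar,
                                   const std::vector<std::string>& extensions) {
    if (usePersistentMappingCvar && hasExtension(extensions, "GL_ARB_buffer_storage")) {
        return UploadPath::PersistentMapping;
    }
    // GL3 fallback: either the user disabled the fast path or the driver lacks it.
    return UploadPath::BufferSubData;
}
```

So setting the cvar to 0 forces the fallback even on a driver that advertises the extension, which is useful for ruling out driver bugs in the persistent path.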

2 hours ago, duzenko said:

https://drive.google.com/file/d/0B9OoHSmkeSeNZWdyZFliQkNsVTA/view?usp=sharing

It crashes for me on one PC. If it does for you as well, post your gpu model and driver version.

On the x64 exe there is no crash for me, but when playing Volta 2 the screen is mostly dark: I can see some AI walking around and some objects and effects like fog, but no worldspawn geometry (brushes).

On the x32 exe I don't even see the menu; I only hear the music and the mouse interaction sounds.

GPU AMD R9 270X 8GB

Driver 19.9.2

CON_DUMP.txt

57 minutes ago, duzenko said:

Try


r_usePersistentMapping 0

but what about GPU model and driver version?

On x64 exe that solved the problem, but not on x32 one.

Btw, I used to have a water bug in Volta 2; with this exe the water now looks fine. I don't know if that is related to the cvar above or not.

3 minutes ago, HMart said:

On x64 exe that solved the problem, but not on x32 one.

The test package only includes the x64 notools exe.  The exe's you got left from 2.07 are not compatible with 2.08 glprogs.

23 minutes ago, duzenko said:

The test package only includes the x64 notools exe.  The exe's you got left from 2.07 are not compatible with 2.08 glprogs.

Are you sure? I used your exes. There are two in the 7zip file above, one called TheDarkModNoTools.exe and the other TheDarkModx64NoTools.exe; I assumed the former was 32 bits.

34 minutes ago, HMart said:

Are you sure? I used your exes. There are two in the 7zip file above, one called TheDarkModNoTools.exe and the other TheDarkModx64NoTools.exe; I assumed the former was 32 bits.

Oh man, there's two of them

The 32 bit is lost in time - it's an April build. Good old times when exes were just 5 MB in size...


Maybe my crash is due to something totally different.....can't even start normal TDM  😐 (yes, I deleted glprogs)

Installed today WinDBG.....

 

Edited by lowenz

22 hours ago, duzenko said:

Oh man, there's two of them

The 32 bit is lost in time - it's an April build. Good old times when exes were just 5 MB in size...

And I should know that why? IMO you need to start using smileys, because I don't know if I should take that reply the wrong way or not, especially today, when my day job was so stressful and I'm not in a good mood.

I was wrong to mention 32 bits, but why call one x64 and not the other when both are 64 bits? That way there would have been no confusion on my part.  🙄

But IMO the discussion at hand should not have been this but the testing of the exes. I still don't know if my testing helped you or not.

10 hours ago, HMart said:

 

But IMO the discussion at hand should not have been this but the testing of the exes. I still don't know if my testing helped you or not.

It helped, on multiple levels.

I have made some changes, but I think I'd better test it myself on a Radeon first.

And I should know that why? IMO you need to start using smileys, because I don't know if I should take that reply the wrong way or not, especially today, when my day job was so stressful and I'm not in a good mood.

I was wrong to mention 32 bits, but why call one x64 and not the other when both are 64 bits? That way there would have been no confusion on my part.  🙄

I suppose written text is always ambiguous like that. I was being ironic about my own silliness, but you took it as directed at you.

