Testers and reviewers wanted: BFG-style vertex cache


234 replies to this topic

#151 nbohr1more

nbohr1more

    Darkmod PR, Wordsmith

  • Development Role
  • 8927 posts

Posted 07 July 2018 - 01:24 PM

I doubt that. The backend is not responsible for making decisions about what to render, that's the frontend's job.


Do you have a savegame or video for me, so that I can actually see what you are talking about?


Quicksave:
 
https://www.dropbox....save_0.zip?dl=0
 
I guess I was thinking of the distance checking in the portal culling routine in the renderer, but that technically counts as frontend too. I should end the bad habit of calling the renderer "backend" and the game "frontend"...
Please visit TDM's IndieDB site and help promote the mod:

http://www.indiedb.c...ds/the-dark-mod

(Yeah, shameless promotion... but traffic is traffic folks...)

#152 duzenko

duzenko

    Advanced Member

  • Active Developer
  • 1433 posts

Posted 08 July 2018 - 01:22 AM

Question: are we mapping the entire VBO, and what is the driver doing behind the scenes?

Is it copying our entire vertex data (static and dynamic) each frame back and forth to system memory and then to VRAM?



#153 cabalistic

cabalistic

    Member

  • Development Role
  • 375 posts

Posted 08 July 2018 - 03:45 AM

The static vertex cache is only copied once, after level load. The dynamic vertex cache is copied each frame, but only those parts that were actually used. I used glFlushMappedBufferRange to mark the parts that need to be copied.
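
A rough sketch of the dynamic-cache idea, in plain C++ with no GL context. Allocations bump a cursor inside one frame's block, and at the end of the frame only the range that was actually handed out is reported for flushing. `FrameBuffer`, `Reserve`, `UsedRange` and `Reset` are hypothetical names, not TDM code. In a real renderer the reported range would presumably be passed to `glFlushMappedBufferRange` on a buffer mapped with `GL_MAP_FLUSH_EXPLICIT_BIT`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for one frame's block of the dynamic vertex cache.
struct FrameBuffer {
    std::size_t capacity = 0; // bytes available this frame
    std::size_t used = 0;     // bytes handed out so far

    // Returns true and writes the start offset on success; false if it does not fit.
    bool Reserve(std::size_t bytes, std::size_t& offset) {
        if (bytes > capacity - used) return false;
        offset = used;
        used += bytes;
        return true;
    }

    // (offset, length) of the part written this frame, i.e. what needs flushing.
    std::pair<std::size_t, std::size_t> UsedRange() const { return {0, used}; }

    // Start the next frame with an empty block.
    void Reset() { used = 0; }
};
```

The unused tail of the block is never reported, so it would not have to be transferred.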



#154 lowenz

lowenz

    Advanced Member

  • Member
  • 1839 posts

Posted 08 July 2018 - 10:03 AM

Don't know if it's FBO related or Vertex Cache related!

 

Accountant 1 sewers:

 

The_Dark_Modx64_2018_07_08_16_18_37_412.


Task is not so much to see what no one has yet seen but to think what nobody has yet thought about that which everybody see. - E.S.


#155 duzenko

duzenko

    Advanced Member

  • Active Developer
  • 1433 posts

Posted 17 July 2018 - 02:28 PM

I had to disable the 32-byte alignment of static vertex data.

It's the only way to make it work with glDrawElementsBaseVertex.

Hope you don't mind.



#156 cabalistic

cabalistic

    Member

  • Development Role
  • 375 posts

Posted 17 July 2018 - 02:41 PM

Mh, that has the potential to screw things up in subtle ways. I have it working in my GL4 experiments with BaseVertex and similar functions while still keeping alignment. So just dropping the alignment is probably not the desired solution... IIRC I may have increased the alignment to another multiple or so.



#157 cabalistic

cabalistic

    Member

  • Development Role
  • 375 posts

Posted 17 July 2018 - 02:54 PM

See here: https://github.com/f...r/VertexCache.h

 

I don't remember the details anymore, but there are certain alignment requirements on these buffers, so removing the alignment is potentially harmful, even though I could spot no obvious errors in the current SVN version.

 

Btw, could I convince you to rename your depth drawing function to something other than _Multi? There is an actual set of glMultiDraw... functions, which I've used in my GL4 experiments to draw depth and stencil with just one or two draw calls total. It's confusing to see that nomenclature used for something else :D



#158 cabalistic

cabalistic

    Member

  • Development Role
  • 375 posts

Posted 17 July 2018 - 03:15 PM

Trying to remember the details. Actually, I think there are no hard requirements on alignment from OpenGL side. But you do want the buffer to be 16-byte aligned for SSE copies. And you need common multiples for all the kinds of data stored in the VBO, so that BaseVertex works for all of them. We have idDrawVert_t and shadowCache_t, I think. In any case, in the end the least common multiple for all of those turned out to be 240, as I set it in the branch I linked above.
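
The arithmetic behind the common multiple can be sketched as follows (C++17). The element sizes of 60 and 16 bytes are assumptions for illustration, not values taken from this thread, and `AlignUp` and `IsValidBaseVertexOffset` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric> // std::lcm (C++17)

// Assumed sizes, for illustration only.
constexpr std::size_t kDrawVertSize   = 60; // assumed size of the draw vertex type
constexpr std::size_t kShadowVertSize = 16; // assumed size of the shadow vertex type
constexpr std::size_t kSseAlign       = 16; // alignment wanted for SSE copies

// Smallest alignment that is a multiple of every size above.
constexpr std::size_t kCacheAlign =
    std::lcm(std::lcm(kDrawVertSize, kShadowVertSize), kSseAlign);

// Round x up to a multiple of a; also works for non-power-of-two a.
constexpr std::size_t AlignUp(std::size_t x, std::size_t a) {
    return (x + a - 1) / a * a;
}

// A byte offset can serve as a base vertex only if it is a whole number of elements.
constexpr bool IsValidBaseVertexOffset(std::size_t offset, std::size_t stride) {
    return offset % stride == 0;
}
```

With these assumed sizes the common multiple comes out as 240, and every 240-aligned offset divides evenly by both element sizes. A 32-byte-aligned offset such as 608 is not a multiple of 60, which is one plausible reason plain 32-byte alignment clashes with base-vertex drawing.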



#159 duzenko

duzenko

    Advanced Member

  • Active Developer
  • 1433 posts

Posted 18 July 2018 - 12:33 AM

Right you are, a common multiple should work as well. Are you sure copying extra bytes can't result in a sudden AccessViolation now that we've switched to malloc?

Where can I see your use of multiDraw's? Could not find it easily in the VR mod repo.



#160 cabalistic

cabalistic

    Member

  • Development Role
  • 375 posts

Posted 18 July 2018 - 01:50 AM

No, it's fine. The alignment is already calculated into the requested size when requesting chunks from the vertex cache (static or dynamic). If the size exceeds the available remaining buffer size, the request fails. So we never write beyond what we have allocated.

 

For the GL4 stuff, either look at the 'gl4renderer' branch in https://github.com/fholger/thedarkmod or at the 'gl4stereo' branch in https://github.com/f...r/thedarkmodvr. The latter is my most recent work, but is probably not usable without VR. Specifically, you should look at GL4_MultiDrawDepth (https://github.com/fholger/thedarkmod/blob/gl4renderer/renderer/gl4/GL4Depth.cpp) or GL4_MultiDrawStencil (https://github.com/fholger/thedarkmod/blob/gl4renderer/renderer/gl4/GL4Interactions.cpp). They work by filling a buffer with all the draw parameters for all the objects that should be drawn and then passing those parameters to OpenGL with a single call to a gl4MultiDraw... command. It practically removes any CPU or driver overhead in drawing (most of) depth and stencil, which is pretty cool. If you're interested in the details and other optimizations, search for the keyword AZDO (approaching zero driver overhead).
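
As a minimal sketch of "filling a buffer with all the draw parameters", here is how such a command list could be built on the CPU. The command struct follows the layout that `glMultiDrawElementsIndirect` reads from the indirect buffer. `SurfaceDraw` and `BuildCommands` are hypothetical names and are not taken from the linked branches, which may well be organised differently.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Per-draw record layout read by glMultiDrawElementsIndirect.
struct DrawElementsIndirectCommand {
    std::uint32_t count;         // number of indices
    std::uint32_t instanceCount; // 1 for non-instanced draws
    std::uint32_t firstIndex;    // offset into the index buffer, in indices
    std::int32_t  baseVertex;    // added to every index before vertex fetch
    std::uint32_t baseInstance;
};

// Hypothetical per-surface draw info; not a TDM type.
struct SurfaceDraw {
    std::uint32_t indexCount;
    std::uint32_t firstIndex;
    std::int32_t  baseVertex;
};

// Turns a list of surfaces into one contiguous array of draw commands.
std::vector<DrawElementsIndirectCommand> BuildCommands(
        const std::vector<SurfaceDraw>& surfaces) {
    std::vector<DrawElementsIndirectCommand> cmds;
    cmds.reserve(surfaces.size());
    for (const SurfaceDraw& s : surfaces) {
        cmds.push_back({s.indexCount, 1u, s.firstIndex, s.baseVertex, 0u});
    }
    // The array would then be uploaded to a GL_DRAW_INDIRECT_BUFFER and
    // submitted with a single multi-draw call.
    return cmds;
}
```

The per-object cost on the CPU side is then one small struct write instead of one driver call.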

 

However, please note that these techniques are not compatible with the existing TDM renderer, so please don't waste your time with them right now. The reason is that they absolutely require GLSL 4 shaders. And in these, any of the fixed function stuff is removed and no longer accessible, including the fixed function matrix pipeline via ftransform. Instead, you have to calculate your own model/view/projection matrices and pass them to the shaders as uniforms when needed. Unfortunately, doing that will result in ever-so-slightly differing results from what the fixed-function GL stuff does. So if you mix those modern shaders with the existing ARB/GLSL 1 shaders, you'll get a lot of z-fighting and other subtle precision issues. They are just not compatible. That's why I'm so interested in moving to a pure GL3 renderer with all of the fixed-function stuff removed. Because then you can just add these GL4 optimizations as an option on top. :)
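
For illustration, computing a combined matrix on the CPU for upload as a uniform could look roughly like this. It is a sketch with hypothetical helper names, using column-major storage as OpenGL expects. The result would typically be uploaded with something like `glUniformMatrix4fv`.

```cpp
#include <array>
#include <cassert>

using Mat4 = std::array<float, 16>; // column-major: element (row, col) is at [col * 4 + row]

// Returns a * b, i.e. the transform that applies b first and then a.
Mat4 Multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) {
                sum += a[k * 4 + row] * b[col * 4 + k];
            }
            r[col * 4 + row] = sum;
        }
    }
    return r;
}

Mat4 Identity() {
    return {1.0f, 0.0f, 0.0f, 0.0f,
            0.0f, 1.0f, 0.0f, 0.0f,
            0.0f, 0.0f, 1.0f, 0.0f,
            0.0f, 0.0f, 0.0f, 1.0f};
}

Mat4 Translation(float x, float y, float z) {
    Mat4 m = Identity();
    m[12] = x; m[13] = y; m[14] = z; // last column holds the translation
    return m;
}

// Combined matrix intended to be passed to the shader as one uniform.
Mat4 ModelViewProjection(const Mat4& projection, const Mat4& view, const Mat4& model) {
    return Multiply(projection, Multiply(view, model));
}
```

Because the products are rounded in floating point on the CPU, the transformed positions need not match bit for bit what the fixed-function path produces, which is where the depth mismatches described above come from.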



#161 duzenko

duzenko

    Advanced Member

  • Active Developer
  • 1433 posts

Posted 18 July 2018 - 02:21 AM

I actually aim to have gl4MultiDrawXXX in the end, but as you pointed out, it's not compatible with the current TDM version. That is why the function has that name, even though it's not using any of the MultiDrawXXX calls yet.

I'm going to be changing one small thing after another for a while, like I did last year with texgens (which was painful enough).

The Z-fighting is an unexpected bump in the road though. Will have to think about it when it comes to that. I was going to start adding a modelView matrix uniform to the shaders soon.

 

As for the memory thing, I meant this

tri.ambientCache = vertexCache.AllocStaticVertex( tri.verts, ALIGN( tri.numVerts * sizeof( tri.verts[0] ), VERTEX_CACHE_ALIGN ) );

Suppose the size of verts is 600 bytes. Am I correct that you want to pass align(600) = 608 bytes here? Do you actually try to copy 608 bytes from tri.verts? What if some of the extra 8 bytes are in "inaccessible" memory?
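
One way to avoid reading past the end of the source would be to reserve the aligned size in the cache but copy only the real size, filling the padding with zeros. The following is a sketch under that assumption. `StaticCache`, `AllocAndCopy` and `AlignUp` are hypothetical names and not the actual vertex cache API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Round x up to a multiple of a.
constexpr std::size_t AlignUp(std::size_t x, std::size_t a) {
    return (x + a - 1) / a * a;
}

// Hypothetical cache: a byte array plus a bump cursor.
struct StaticCache {
    std::vector<std::uint8_t> storage;
    std::size_t used = 0;
};

// Reserves AlignUp(srcBytes, align) bytes in the cache but reads only srcBytes
// from the source; the padding is zero-filled instead of being over-read.
// Returns false if the aligned size does not fit in the remaining space.
bool AllocAndCopy(StaticCache& cache, const void* src, std::size_t srcBytes,
                  std::size_t align, std::size_t& outOffset) {
    const std::size_t padded = AlignUp(srcBytes, align);
    if (padded > cache.storage.size() - cache.used) return false;
    std::uint8_t* dst = cache.storage.data() + cache.used;
    std::memcpy(dst, src, srcBytes);                   // never touches src beyond srcBytes
    std::memset(dst + srcBytes, 0, padded - srcBytes); // padding bytes
    outOffset = cache.used;
    cache.used += padded;
    return true;
}
```

With the 600-byte example, 608 bytes are reserved in the cache while only 600 bytes are read from the source.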



#162 cabalistic

cabalistic

    Member

  • Development Role
  • 375 posts

Posted 18 July 2018 - 02:56 AM

Ah, you mean on the source side. You're right, that's potentially bogus. Will have to think about that...

 

As for the multiDraw stuff: My advice would be to not waste your time right now. As I said, I've already done much of this work, so you're essentially duplicating effort. And since it can't be integrated into the current renderer, it will have to wait for the GL3 push (after 2.07). So I'd advise to wait until that time and then do it together :)



#163 duzenko

duzenko

    Advanced Member

  • Active Developer
  • 1433 posts

Posted 18 July 2018 - 07:18 AM

Makes sense.

However, what I could do now is create the infrastructure for multiple renderer support. We'll have to have both renderers coexisting for quite some time until the map-level ARB programs are replaced. (And for testing's sake as well.)



#164 duzenko

duzenko

    Advanced Member

  • Active Developer
  • 1433 posts

Posted 18 July 2018 - 07:48 AM

Looking at the gl4renderer branch though, it's like I can copy paste it to TDM and it will just work.



#165 stgatilov

stgatilov

    Lead Programmer

  • Active Developer
  • 986 posts

Posted 18 July 2018 - 07:59 AM

Makes sense.

However what I could do now is to create infrastructure for multiple renderer support. We'll have to have both renderers coexisting for quite some time until the map-level ARB programs are replaced. (And for testing sake as well)

Why do I have a feeling that creating infrastructure for multiple renderers will have about the same impact on the code and its stability as simply moving to GL3 renderer?...



#166 cabalistic

cabalistic

    Member

  • Development Role
  • 375 posts

Posted 18 July 2018 - 08:29 AM

Looking at the gl4renderer branch though, it's like I can copy paste it to TDM and it will just work.

Well, yeah... except this is basically my first attempt at this, when I didn't really know what I was doing. I'd kind of like to refactor a few things. Besides, copying and pasting it now is not really useful, imho. Even behind its cvar switch, it does not belong in 2.07.

 

@stgatilov: I kind of have the same feeling...



#167 nbohr1more

nbohr1more

    Darkmod PR, Wordsmith

  • Development Role
  • 8927 posts

Posted 18 July 2018 - 09:30 AM

Strange little circle.

 

Once upon a time, Doom 3 had multiple rendering backends: ARB, Cg, ARB2, NV20, R200, etc. (structure)

 

Then via "code cleanup" there were no backend switches in draw_common (structure removed) but instead a single cvar r_useGLSL switches between ARB2 and GLSL.

 

Now to try new backends, we must "create a structure" to isolate the changes from the current backends.

 

(This is why I favored Pat Raynor's initial approach for just adding the new GLSL backend to the list, instead of what we currently do.)

 

;)



#168 duzenko

duzenko

    Advanced Member

  • Active Developer
  • 1433 posts

Posted 18 July 2018 - 10:11 AM

What "ARB, Cg, ARB2, NV20, R200" all did was interactions (but using legacy ftransform's, etc). Now we have to do everything in GL4.

Adding the new GLSL backend is essentially what I did, except it's an "if" instead of the old "switch".

Just adding the new GL4 interactions backend will achieve nothing but depth fighting (per above).



#169 nbohr1more

nbohr1more

    Darkmod PR, Wordsmith

  • Development Role
  • 8927 posts

Posted 18 July 2018 - 10:29 AM

Well I guess that's the problem with the idea behind "draw_common".

 

GL versions can have radical changes where even the fundamental aspects of setting up a scene have no "common" denominator.

 

I guess the best you can do is make conceptual abstractions in draw_common that get filled-in for each backend.

E.g., move more of the scene setup out of draw_common.



#170 duzenko

duzenko

    Advanced Member

  • Active Developer
  • 1433 posts

Posted 18 July 2018 - 10:33 AM

Exactly, but we don't need the Doom3 backends for that. In fact, they would only add more clutter.



#171 cabalistic

cabalistic

    Member

  • Development Role
  • 375 posts

Posted 18 July 2018 - 10:41 AM

Well, the best you can do would be an object-oriented encapsulation for the renderer so that you can actually swap the whole implementation during initialisation and don't have to clutter the code with conditionals on cvars. This is more or less what the id dev did for his Vulkan renderer for Doom3 BFG. It would also give us a fairly clean way to implement a base GL3 renderer and then an optional GL4 renderer on top of that if the hardware supports it. This is the direction I would probably favour if I refactor my GL4 experiments for a future TDM integration.
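
A minimal sketch of what such an encapsulation might look like. All class and function names here are hypothetical and are not taken from TDM or from the Doom 3 BFG Vulkan renderer. The backend is picked once at initialisation, and the GL4 variant builds on the GL3 one.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical interface that the rest of the engine would talk to.
class RenderBackend {
public:
    virtual ~RenderBackend() = default;
    virtual std::string Name() const = 0;
    virtual void DrawDepth() = 0;
};

// Baseline implementation.
class GL3Backend : public RenderBackend {
public:
    std::string Name() const override { return "GL3"; }
    void DrawDepth() override { /* e.g. one draw call per surface */ }
};

// Builds on the GL3 path and overrides only what GL4 can do better.
class GL4Backend : public GL3Backend {
public:
    std::string Name() const override { return "GL4"; }
    void DrawDepth() override { /* e.g. batched multi-draw submission */ }
};

// Chosen once during initialisation, so draw code needs no cvar conditionals.
std::unique_ptr<RenderBackend> CreateBackend(bool hardwareSupportsGL4) {
    if (hardwareSupportsGL4) return std::make_unique<GL4Backend>();
    return std::make_unique<GL3Backend>();
}
```

Calling code would hold a single `RenderBackend` pointer and never needs to know which implementation sits behind it.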

 

As for the old renderer: I think I would favour developing the new renderer on a separate branch or on Github until it is basically feature-complete, and then merging it back to trunk to replace the old renderer, instead of having both renderers side by side for an unspecified amount of time. Keeping both is only going to complicate testing, and once the new renderer is "ready", maintaining both becomes difficult anyway. For example, any custom shader for fan missions would have to be implemented for both, which does not seem like an appealing prospect.

 

But anyway, this is all far in the future and will require some careful planning at the appropriate time. Let's concentrate on 2.07, first :)


  • stgatilov and nbohr1more like this

#172 duzenko

duzenko

    Advanced Member

  • Active Developer
  • 1433 posts

Posted 19 July 2018 - 02:03 AM

About 2.07.

What do we do about the occasional missing ambientCache and the geometry flickering?


  • stgatilov likes this

#173 cabalistic

cabalistic

    Member

  • Development Role
  • 375 posts

Posted 19 July 2018 - 02:11 AM

Haven't seen any geometry flickering. Please point me to a reproducible example. As for the ambientCache misses: ideally find the root cause and fix it :)



#174 duzenko

duzenko

    Advanced Member

  • Active Developer
  • 1433 posts

Posted 19 July 2018 - 03:25 AM

Is there some VBO status tracking cvar like we have with e.g. r_showprimitives or r_debugRenderToTexture?

 

Geometry (mostly animated models) misses ambientCache occasionally, which makes the model go missing for one or a few frames (thus the flickering).

Just noclip to the bad guy's room in Closemouthed Shadows and watch him for a minute.

I believe it happens to all AI models every now and again.


  • stgatilov likes this

#175 stgatilov

stgatilov

    Lead Programmer

  • Active Developer
  • 986 posts

Posted 19 July 2018 - 08:40 AM

Haven't seen any geometry flickering. Please point me to a reproducible example. As for the ambientCache misses: ideally find the root cause and fix it :)

I see guards disappearing every time a bunch of ambientCache miss warnings appear.

This happens randomly, but with annoying regularity.

 

Link to video. Flicker happens near the end of it.





