The Dark Mod Forums

Testers and reviewers wanted: BFG-style vertex cache


cabalistic


 

 

Quite puzzling to see it faster on your system

 

What is that decal material exactly?

 

@nbohr1more, do we want to switch shadows off for those decals?

?

 

Single pass IS meant to be faster but it shows those alpha issues.

Edited by lowenz

Task is not so much to see what no one has yet seen but to think what nobody has yet thought about that which everybody see. - E.S.


?

 

Single pass IS meant to be faster but it shows those alpha issues.

It's a performance tuning thing meant to reveal materials that cast shadows while they should not

Otherwise we'd never know that decal is casting shadows and stealing GPU power.


This blending issue "candle light vs water" can't be solved, right?

Which water material is it? It's hard to tell from the shot whether it's an alpha sorting problem or a postprocess shader problem in the tradition of http://bugs.thedarkmod.com/view.php?id=3879

Some things I'm repeatedly thinking about...

 

- louder scream when you're dying


That looks like an old bug not related to the BFG vertex cache beta, it's the same with glare particles and fog. Blend add on blend add or something, bleh, I don't remember. It is annoying to look at for sure. Duzenko actually added the afterfog sorting material keyword for 2.06 per my bug report of it, but 1) relevant glare particles haven't been updated with it (and it is not clear whether or not to change their behaviour now, to begin with) and 2) even if you put afterfog on that particle, I doubt it will solve the issue. Fwiw the keyword more or less does its job, it makes particles sort properly with foglights... last time I tested it, which was not on this beta.

 

It's a performance tuning thing meant to reveal materials that cast shadows while they should not

Otherwise we'd never know that decal is casting shadows and stealing GPU power.

 

I may well be misunderstanding, but you're making it sound like it's purely for diagnostics. If so, why was it made on by default? I'll make my implicit point in my previous posts explicit: it looks broken, so the singlepass mode should either be fixed or be set back to 0 as default (if that's how it's meant to work).

My FMs: The King of Diamonds (2016) | Visit my Mapbook thread sometimes! | Read my tutorial on Image-Based Lighting Workflows for TDM!

 

 


The reason I asked about heathaze is that I'm under the impression that using _currentRender will make a material postprocess even if that's not the sort order you've stipulated. I'm not sure the distortion fixes for solid foreground objects would have done much about that in other contexts (would a glow particle be writing to the depth buffer?).

Edited by VanishedOne


 

That is not a decal-specific bug, every alpha surface goes opaque, including brushwork and models.

 

I fixed a few decals that should've been set to noshadows anyway...

 

But Spooks' issue with "all alpha testing" is a problem and may need a better approach.

Please visit TDM's IndieDB site and help promote the mod:

 

http://www.indiedb.com/mods/the-dark-mod

 

(Yeah, shameless promotion... but traffic is traffic folks...)


@cabalistic, could you shed some light on when the vertex cache needs an update?

E.g. at the very start of Closemouthed shadows I can see prelight shadows being re-uploaded every frame, like this:

 	TheDarkModNoTools.exe!idVertexCache::ActuallyAlloc(geoBufferSet_t & vcs={...}, const void * data=0x19b239b0, int bytes=10320, cacheType_t type=CACHE_VERTEX) Line 361	C++
 	[Inline Frame] TheDarkModNoTools.exe!idVertexCache::AllocVertex(const void * bytes, int) Line 106	C++
 	[Inline Frame] TheDarkModNoTools.exe!R_CreatePrivateShadowCache(srfTriangles_s *) Line 74	C++
>	TheDarkModNoTools.exe!R_AddLightSurfaces() Line 831	C++
 	TheDarkModNoTools.exe!R_RenderView(viewDef_s & parms={...}) Line 1123	C++
 	TheDarkModNoTools.exe!idRenderWorldLocal::RenderScene(const renderView_s & renderView={...}) Line 726	C++
 	TheDarkModNoTools.exe!idPlayerView::SingleView(idUserInterface * hud, const renderView_s * view, bool drawHUD=true) Line 535	C++
 	TheDarkModNoTools.exe!idPlayerView::RenderPlayerView(idUserInterface * hud=0x19bbbaa8) Line 997	C++
 	TheDarkModNoTools.exe!idGameLocal::Draw(int clientNum=0) Line 3670	C++
 	TheDarkModNoTools.exe!idSessionLocal::Draw() Line 2556	C++
 	[Inline Frame] TheDarkModNoTools.exe!idSessionLocal::DrawFrame() Line 3026	C++
 	TheDarkModNoTools.exe!idSessionLocal::ActivateFrontend() Line 3109	C++
 	TheDarkModNoTools.exe!idRenderSystemLocal::EndFrame(int * frontEndMsec=0x00000000, int * backEndMsec=0x00000000) Line 613	C++
 	TheDarkModNoTools.exe!idSessionLocal::UpdateScreen(bool outOfSequence=false) Line 2651	C++
 	TheDarkModNoTools.exe!idCommonLocal::Frame() Line 2478	C++
 	TheDarkModNoTools.exe!WinMain(HINSTANCE__ * hInstance, HINSTANCE__ * hPrevInstance=0x00000000, char * lpCmdLine=0x047b5a54, int nCmdShow=10) Line 1363	C++

Generally, the vertex cache only has two modes: static or per-frame. So either something is in the static cache (thus valid for the entire duration of play until level load happens), or it's only valid for the next frame and needs to be reuploaded thereafter.

 

Shadows, in general, are currently not in the static cache. I initially tried to imitate BFG and move certain shadows into the static cache, but it caused problems when loading save games that I couldn't figure out, and it didn't seem to make any difference whatsoever to the framerate.
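To make the static/per-frame distinction concrete, here is a minimal sketch of a two-mode cache. It is not TDM's idVertexCache: `TwoModeCache`, `CacheHandle`, `AllocStatic`, `AllocFrame`, `EndFrame` and `LoadLevel` are all hypothetical names, the storage is plain system RAM rather than a VBO, and the frontend/backend frame offset is collapsed into a single frame counter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical sketch, not the engine's idVertexCache.
enum class CacheMode { Static, PerFrame };

struct CacheHandle {
    CacheMode mode;
    std::size_t offset;
    std::size_t size;
    std::uint32_t stamp;  // level number for Static, frame number for PerFrame
};

class TwoModeCache {
public:
    TwoModeCache(std::size_t staticBytes, std::size_t frameBytes)
        : staticData_(staticBytes), frameData_(frameBytes) {}

    // Uploaded once; stays valid until the next LoadLevel().
    CacheHandle AllocStatic(const void* data, std::size_t bytes) {
        assert(staticUsed_ + bytes <= staticData_.size());
        std::memcpy(staticData_.data() + staticUsed_, data, bytes);
        CacheHandle h{CacheMode::Static, staticUsed_, bytes, level_};
        staticUsed_ += bytes;
        return h;
    }

    // Valid only for the current frame; must be re-uploaded afterwards.
    CacheHandle AllocFrame(const void* data, std::size_t bytes) {
        assert(frameUsed_ + bytes <= frameData_.size());
        std::memcpy(frameData_.data() + frameUsed_, data, bytes);
        CacheHandle h{CacheMode::PerFrame, frameUsed_, bytes, frame_};
        frameUsed_ += bytes;
        return h;
    }

    bool IsValid(const CacheHandle& h) const {
        return h.stamp == (h.mode == CacheMode::Static ? level_ : frame_);
    }

    // Discards all per-frame data by rewinding the frame allocator.
    void EndFrame() { ++frame_; frameUsed_ = 0; }

    // Discards everything, including static data.
    void LoadLevel() { ++level_; staticUsed_ = 0; EndFrame(); }

private:
    std::vector<unsigned char> staticData_;
    std::vector<unsigned char> frameData_;
    std::size_t staticUsed_ = 0;
    std::size_t frameUsed_ = 0;
    std::uint32_t level_ = 0;
    std::uint32_t frame_ = 0;
};
```

In this model a shadow cache allocated with `AllocFrame` goes stale at `EndFrame()`, which is why it shows up as a fresh allocation every frame in a stack trace like the one above.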


Hmm.

 

I may be misremembering, but I seem to recall that someone determined that the "pre-light" shadows were not that useful from a performance standpoint, so the only shadow mode is "turboshadows", which does the extrusion on the GPU. Perhaps that change broke the relationship to the static cache?

I guess, in ideal terms, you would:

 

1) During DMAP, mark vertices for static casting

2) During map load, run the GPU extrusion routine for static geometry and store it in its own cache block

3) Run the standard dynamic light extrusion and cache it to standard cache

4) Cycle cache validation checks for the data in 3

 

(In the legacy code, DMAP pre-extrudes the static shadow data and stores it in the map format.

I believe that it's slower to load that pre-extruded data than to do the GPU extrusion during the map loading.)

 

One looming issue with this optimization is that we are moving to a predominantly modular "model" based

workflow where the associated entities will always be candidates for shadow casting changes.

(Unless there is some entity flag that makes the target entity ignore noshadows calls from game or script?)

Therefore these static cached shadows would be a very small amount of the overall scene data.


  • 3 weeks later...

CopyBuffer (BufferObject.cpp)

  1. Can we always call the SIMDProcessor->Memcpy? If there's a problem with Memcpy then we should just fix it there?
  2. Currently all shadow verts and interaction indices get re-uploaded to VBO every frame. How can we reuse "static" shadow and interaction data?

What exactly is your concern? I don't understand the question.

If they are truly static (as in, static over the whole level load), you could move them to the static buffer. However, I tried that for static shadow caches, and it caused some issues and did absolutely nothing for the performance, so I dropped it. If they are only static for a certain time, then you can't reuse them. Keep in mind that the original code did that, and the new vertex cache is faster. If you truly want to save some transfers, the best improvement would be to implement GPU skinning, imho.

 


1. We already have the SIMD memory copy code, so why duplicate it?

 

2. My concern is that there are memory copies every frame that are obviously redundant. I'm talking about static shadows and interactions, i.e. when the light and the occluder have not moved for a long time (most often since map load). I assume they don't show up as significant in any profiling you did, but it's still a problem for IGPs, and it will cost you more as the frontend is parallelized further.


It was part of the original BFG vertex cache code that I copied. It is an SSE2 optimized copy, and at the time at least TDM did not have an SSE2 memcpy, so I left it in. Feel free to replace it if there is no performance loss.

Like I said, the vertex cache only has two modes - static or dynamic. Static means it must be uploaded at level load and cannot be updated at all after. If the interactions you have in mind fall into that category, you can try to move them there. But if not, it's better not to bother. Implementing a semi-static system carries a massive cost of its own, and it is very likely you will end up with worse performance due to the overhead. At the very least you will make the code a lot more complicated, which is not a good thing for further parallelization.

 


I think the current SIMD memory copy is weird.

Using MMX for copying memory is stupid these days, and in 64-bit mode it is excluded anyway (no MMX).

 

Someone should replace the MMX version with the SSE one.

To be honest, I think custom memcpy should accept a flag which tells if the copy should be cached or not (or implement two versions).

SSE has ordinary writes and "streaming" writes, and which is faster depends on circumstances (and the first one is faster in most cases).
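A rough sketch of what such a flag could look like, using SSE2 intrinsics. `CopyHint` and `CopyBytes` are made-up names, not part of TDM's SIMD processor, and the sketch assumes an x86 target with SSE2. Whether the streaming path is actually faster has to be measured for the buffer sizes in question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <emmintrin.h>  // SSE2 intrinsics (x86 only)

// Hypothetical helper, not the engine's SIMDProcessor->Memcpy.
enum class CopyHint { Cached, Streaming };

inline void CopyBytes(void* dst, const void* src, std::size_t bytes, CopyHint hint) {
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);

    if (hint == CopyHint::Streaming) {
        // Streaming stores require a 16-byte aligned destination,
        // so copy the unaligned head with ordinary writes first.
        std::size_t head = (16 - (reinterpret_cast<std::uintptr_t>(d) & 15)) & 15;
        if (head > bytes) head = bytes;
        std::memcpy(d, s, head);
        d += head; s += head; bytes -= head;

        // Non-temporal stores bypass the cache for the bulk of the data.
        for (; bytes >= 16; d += 16, s += 16, bytes -= 16) {
            __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(s));
            _mm_stream_si128(reinterpret_cast<__m128i*>(d), v);
        }
        _mm_sfence();  // order the streaming stores before later writes
    }

    // Cached path, and the remaining tail of the streaming path.
    std::memcpy(d, s, bytes);
}
```

The cached path simply defers to `std::memcpy`; the streaming path is the kind of copy one might try for data that is written once and not read back by the CPU.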


  • 7 months later...

Keep in mind there are three dynamic vertex buffers (one for frontend updates, one for backend draw calls, one for GL driver sync), so you'd have to replicate the static data three times.
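As an illustration only, the rotation of three dynamic buffers could be modelled like this. `FrameBufferRing` and `ReplicatedBytes` are hypothetical names, and the exact assignment of roles to indices is an assumption, not taken from the engine.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kNumFrameBuffers = 3;

// Hypothetical model of three rotating dynamic buffers.
struct FrameBufferRing {
    std::size_t writeIndex = 0;  // buffer the frontend fills this frame

    std::size_t FrontendIndex() const { return writeIndex; }
    // Buffer filled one frame ago: the backend issues draw calls from it.
    std::size_t BackendIndex() const {
        return (writeIndex + kNumFrameBuffers - 1) % kNumFrameBuffers;
    }
    // Buffer filled two frames ago: the GL driver may still be reading it.
    std::size_t DriverIndex() const {
        return (writeIndex + kNumFrameBuffers - 2) % kNumFrameBuffers;
    }
    void Advance() { writeIndex = (writeIndex + 1) % kNumFrameBuffers; }
};

// Memory needed if data that never changes had to live in every dynamic buffer.
constexpr std::size_t ReplicatedBytes(std::size_t staticBytes) {
    return staticBytes * kNumFrameBuffers;
}
```

With this layout, any data kept in the dynamic buffers across frames has to exist once per buffer, so a 10320-byte shadow cache like the one in the stack trace would occupy 30960 bytes in total.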

I think it is theoretically possible to use a single large buffer instead of the three separate targets, but it does complicate certain things and is not a trivial change.

What I think would be easier to do is to separate draw calls by static/dynamic nature and issue the static calls first (per pipeline stage), which would give you almost the same benefit. In fact, for some of the GL3/GL4 optimizations I did in experiments I already did that.
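A minimal sketch of that ordering, assuming each draw call already knows which kind of buffer its vertices live in. `DrawCall`, `usesStaticBuffer` and `OrderStaticFirst` are hypothetical names; the real surface structures look different.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical draw call record for one pipeline stage.
struct DrawCall {
    int surfaceId;
    bool usesStaticBuffer;  // true if the vertices live in the static buffer
};

// Moves static-buffer draws to the front while keeping the relative
// order within each group, so existing sorting inside a group survives.
inline void OrderStaticFirst(std::vector<DrawCall>& calls) {
    std::stable_partition(calls.begin(), calls.end(),
                          [](const DrawCall& c) { return c.usesStaticBuffer; });
}
```

Calling `OrderStaticFirst` once per stage would mean the buffer binding only switches once between the static and the dynamic group. `std::stable_partition` is used rather than `std::partition` so that any existing ordering inside each group is preserved.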


It's one thing to sort ambient surfaces by VBO, but what about the per-light surface chains? They are already grouped by light and by the noselfshadow flag, as well as by translucency, and they need multiple passes for shadows and interactions.

So for me, having a single big VBO for vertices feels like the lesser pain.


Actually, I may have been mistaken. Reading this (https://www.khronos.org/opengl/wiki/Buffer_Object#Mapping), it may be impossible to use a single VBO for GL3, because if a buffer is mapped by glMapBufferRange, then simultaneously reading from it (i.e. rendering) is apparently not allowed, not even from the regions which are not mapped.

This is solved with persistent mapping, but that's a GL4 feature and thus not something we can do in the core of the engine. Alternatively, you'd have to work with a system RAM shadow copy in the frontend and transfer via glBufferSubData after every frame, which would cost RAM, and the performance implications are unclear to me.

So I still think solving this at the draw call level is the safer approach ;)

