The Dark Mod Forums

cabalistic

Development Role
  • Posts

    1579
  • Joined

  • Last visited

  • Days Won

    47

Everything posted by cabalistic

  1. It was part of the original BFG vertex cache code that I copied. It is an SSE2 optimized copy, and at the time at least TDM did not have an SSE2 memcpy, so I left it in. Feel free to replace it if there is no performance loss. Like I said, the vertex cache only has two modes - static or dynamic. Static means it must be uploaded at level load and cannot be updated at all after. If the interactions you have in mind fall into that category, you can try to move them there. But if not, it's better not to bother. Implementing a semi-static system carries a massive cost of its own, and it is very likely you will end up with worse performance due to the overhead. At the very least you will make the code a lot more complicated, which is not a good thing for further parallelization.
  2. What exactly is your concern? I don't understand the question. If they are truly static (as in, static over the whole level load), you could move them to the static buffer. However, I tried that for static shadow caches, and it caused some issues and did absolutely nothing for the performance, so I dropped it. If they are only static for a certain time, then you can't reuse them. Keep in mind that the original code did that, and the new vertex cache is faster. If you truly want to save some transfers, the best improvement would be to implement GPU skinning, imho.
  3. What's the baseline fps you get without it on that map? The enhancement is only effective if frontend and backend take roughly the same amount of time. There are situations where this is not the case; in those instances the enhancement will not help.
  4. Generally, the vertex cache only has two modes: static or per-frame. So either something is in the static cache (thus valid for the entire duration of play until level load happens), or it's only valid for the next frame and needs to be reuploaded thereafter. Shadows, in general, are currently not in the static cache. I initially tried to imitate BFG and move certain shadows into the static cache, but it caused problems when loading save games that I couldn't figure out, and it didn't seem to make any difference whatsoever to the framerate.
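The two validity modes described above can be sketched roughly as follows. This is a minimal illustration, not the actual TDM or BFG code: `CacheHandle`, its fields, and `IsCurrent` are hypothetical names, and the real handle encoding may differ.

```cpp
#include <cstdint>

// Hypothetical handle layout for illustration; not the actual TDM/BFG encoding.
struct CacheHandle {
    bool isStatic;              // uploaded once at level load
    std::uint32_t frameNumber;  // frame the data was uploaded for (per-frame only)
};

// A static handle stays valid until the next level load; a per-frame handle
// is valid only for the frame it was uploaded for and must be re-uploaded after.
bool IsCurrent(const CacheHandle& h, std::uint32_t currentFrame) {
    return h.isStatic || h.frameNumber == currentFrame;
}
```

There is deliberately no third "valid for a while" state in this model, which matches the point that data is either static for the whole level or re-uploaded every frame.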
  5. Stability (and speed) is very likely to improve with 2.07. Right now, most of the trouble people might experience is probably due to the two threads both calling into OpenGL. 2.07 introduces the vertex cache enhancements, which remove the need for the frontend thread to access OpenGL. As mentioned, the multi-core enhancement is really just a dual-core enhancement at the moment, splitting the renderer into a "frontend" and "backend". Frontend is basically the CPU path collecting what and how to render, whereas the backend is the path actually issuing OpenGL render commands. I did experiment with splitting up the frontend further over multiple cores, and I do have some working code lying around that improves frontend speed. However, the gain is not as spectacular, and it currently does not help, because in those locations where it actually helps, the backend uses at least as much time, and so speeding up the frontend without also speeding up the backend does not result in any net FPS gain. When we get around to overhauling the GL backend, this might be worth reinvestigating.
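Why a faster frontend alone gives no net FPS gain can be seen with a simplified cost model. This is only a sketch under the assumption that the two threads overlap perfectly and synchronization costs nothing; the function names are made up for illustration.

```cpp
#include <algorithm>

// Simplified model (an assumption, ignoring sync overhead): per-frame cost in ms.

// Single-threaded: frontend and backend run one after the other.
double SerialFrameMs(double frontendMs, double backendMs) {
    return frontendMs + backendMs;
}

// Frontend/backend split: both run concurrently, so the slower one sets the pace.
double ParallelFrameMs(double frontendMs, double backendMs) {
    return std::max(frontendMs, backendMs);
}
```

With 8 ms each, the split halves the frame time compared to the serial case. Cutting the frontend to 4 ms while the backend stays at 8 ms still leaves the frame at 8 ms in this model, and if one side dominates from the start, the split gains very little.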
  6. Well, my near-term roadmap currently looks like this:
       • fix one outstanding bug to get the preliminary port to 2.06/2.07 working
       • get UI elements, including the menu, to render to a separate plane so that you can actually see and read them in VR
       • decouple vertical mouse movement from the head motion, as it's incredibly confusing
     After that, I'll see if I can quickly add some sort of "pointer" like a 3D crosshair to give an indication of where the mouse is currently aiming. This isn't just important for bow and arrow, but also for basic things like looting and picking up stuff, because that's still bound to the mouse aim. As for roomscale: it's always been my distant goal to make this a full roomscale experience, but it is a very long road. And performance is still a huge concern. There is one major change and a few quick wins that are specific to VR that I'm going to try to implement that will probably help a little. But even then there are going to be maps which are not playable in VR, and probably few will be playable without reprojection. Some further performance optimisations will have to go through improvements in the base TDM project, so that will take a while, and it won't do miracles, either. I'll just have to see what's possible in what timespan. So for a first step, I'm just trying to make it a decent seated experience, because I figure that's better than nothing. Whether roomscale will ever truly happen remains to be seen.
  7. Haven't looked at fhDoom's approach. But a class-based design is the goal, yes. It will especially come in handy to cleanly separate the optional GL4 improvements.
  8. Keep in mind, please, that I also do have some early prototypes lying around for a move to GL3/4 with significantly reduced state changes. I'd suggest postponing any such experiments until after 2.07 and then coordinating our efforts towards the GL upgrade. Because otherwise we're just duplicating each other's work...
  9. Afaik, we don't have any problems with AA and FBOs currently; that's all been resolved. And yes, we do want to encapsulate them in their own class. In fact, I already created a prototype for this which has already been discussed in the internal dev forum, so you might be a little late to the party. But it's been postponed till after 2.07 and is planned for the major renderer upgrade to GL3. Right now, this change is just too risky for a release aimed at bugfixing...
  10. Yeah, that's the modern tiled/clustered forward+ type of renderer. Not doable while we still have ARB shaders around
  11. I have started porting over the VR changes to the latest trunk version. The initial merge is done, but it's going to take a couple days to fix the remaining issues and get it to actually work again. @Samson: I don't know what time zone you're in (western Europe here), but perhaps we can find some time on a weekend or so to chat or something. Then I could walk you through the current state of things, the particular challenges compared to Doom3 and we could perhaps decide how to proceed? Let me know if you're interested.
  12. The point is that a Vulkan renderer is just not very realistic right now. It would be a complete rewrite of the backend, and no, it doesn't help much that there is an implementation for Doom3 BFG. Keep in mind that TDM is based on the old Doom3 engine, not BFG. There are significant differences in the renderer so that a port is not nearly as straightforward as you'd think. Besides, so far TDM has always tried to support as much older hardware as possible, so a Vulkan-only renderer is not going to happen any time soon. That would mean it would need to be maintained in addition to a GL renderer, which means additional maintenance overhead for an already fairly small team. What is much more realistic and something I'm personally pushing for is upgrading the current GL renderer to GL 3. This does mean getting rid of the remaining ARB shaders (which is tricky, because missions can in principle ship their own ARB shaders which are outside the direct control of the game engine) and removing the remaining fixed function stuff, which is still a lot of work. But once that is done, adding in optional GL4 optimizations based on availability is much simpler to do and maintain than a Vulkan renderer at the moment.
  13. You don't need Vulkan to reduce the draw calls; much of that is possible with modern OpenGL, as well. I've already done some experiments on it. However, the major blocker right now is that TDM is still stuck on a very old GL version using ARB shaders, and these are not compatible at all with those modern techniques. So first we need to migrate *all* shaders to a newer GLSL version, and *then* we can potentially optimise some aspects with GL4+ techniques. That is a far goal, but nothing that's going to happen any time soon
  14. Yes, but I'm afraid it's quite outdated by now. TDM has moved on quite a bit since then, and I have a couple of branches lying around with partial experiments, some of which I want to consolidate. Unless you are bored to death right now and desperately looking for work, I'd suggest waiting until I've at least updated the code base to the current TDM version. There is one thing left for me to do on the base game concerning FBOs, though, before that really makes sense.
  15. Thanks. Though I have to point out that adding roomscale is actually the majority of the work, and that's not even been started. So it's a long way to go. As for differences between the eyes, for the shadows at least there is a workaround. See here: https://github.com/fholger/thedarkmodvr/issues/4
  16. Mh, I have to go through it carefully again to understand the impact. However, keep in mind that we must absolutely, positively, make sure that the backend is reading caches that are not at the same time updated by the frontend for the next frame. That was the whole reason why I had to add the separate backend geo in the first place (or why BFG copies those cache handles). Possibly your copy solves that, but we better be sure about it... Don't think it's going to do anything for performance, though.
  17. Uhm, but there are still two geos in the struct?! You just made one not a pointer. I'm not really sure what benefits or drawbacks that brings (without thinking about it some more), but I'm not seeing a fundamental change here. The BFG engine actually does not replicate the whole geo structure for the backend, it merely copies the caches. I think it might be better to try and use that approach, as well. I did not initially, because I did not fully understand the implications. BFG works only by caches, whereas the original Doom3 engine did not, so it seemed risky to do it that way. But I think we are at a point where the cache is also always guaranteed to be set in the backend, so the backend no longer needs access to the full geo. By the way, imho you should remove code that you replace, not comment it out. It makes reading the diffs harder, and it leaves a lot of dead code in the repository over time. Source control exists for a reason
  18. Yeah, and I'm not against it, just no need to leave a comment
  19. @revelator: Just a quick remark, I don't think it's necessary that you flag every place you replaced a 0 or NULL with nullptr with a comment. In my personal opinion, comments should be written to help developers understand the *current* state of the code, not its history. The reason is that in most instances, developers won't know the history, so the comment is fairly useless to them. Code also evolves fairly quickly, and these comments are then often left untouched and completely outdated. Commit messages are the right place to comment the history, imho. (Also, your comments are somewhat misleading. They make it sound as if the replaced 0 or NULL were a bug in the code that needed to be addressed. That is not the case. It is safe and well defined to assign 0 to a pointer. It's just no longer best practice, because the nullptr keyword introduced with C++11 has the advantage that it isn't ambiguous and can thus prevent potential subtle bugs if function signatures change or are overloaded.)
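To illustrate the overload point, here is a small standalone sketch. The `Describe` overloads are hypothetical and not taken from the TDM code base; they only show how a literal 0 and nullptr resolve differently once an integer overload exists.

```cpp
#include <string>

// Hypothetical overload pair, for illustration only (not TDM code).
std::string Describe(int)         { return "int"; }
std::string Describe(const char*) { return "pointer"; }

// The literal 0 is an int first, so the int overload wins even if the
// caller meant "no pointer".
std::string CallWithZero()    { return Describe(0); }

// nullptr converts only to pointer types, so the pointer overload is chosen.
std::string CallWithNullptr() { return Describe(nullptr); }
```

Assigning 0 to a pointer variable remains well defined; the difference only shows up in situations like this one, where overload resolution has to pick between an integer and a pointer parameter.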
  20. If you enable multi-core support in 2.06, then TDM uses two threads which both have their own OpenGL context and talk to the driver. It's little wonder that disabling threaded optimizations would be hurtful in that scenario. Might be more interesting to try that with one of the experimental 2.07 builds, though. Those have eliminated the secondary context, so they might behave differently.
  21. Can't speak about two years ago, but the site currently explicitly lists OpenGL support for the frame analyzer, at least...
  22. com_smp no longer relies on the fence sync, because the frontend no longer accesses OpenGL functions. Still, fence syncs are core in GL 3.2, and even the old-generation Intel cards which only go up to GL 3.1 support them via extension. So the case you describe just doesn't exist. And there's no way the fence sync could be implemented so poorly that it is worse than glFinish.
  23. By the way, this is a perfect situation for a graphical profiler and debugger. You can't use nSight with an Intel GPU, obviously, but there is an Intel equivalent: https://software.intel.com/en-us/gpa I have no experience with it, but you might want to give it a try to gain insight into why that scene is slow on your GPU.
  24. The situation has a very simple answer: your GPU can't keep up. What happens is that the backend submits OpenGL calls for one frame, then finishes that frame. The OpenGL driver, being asynchronous, accepts those requests and sends them to the GPU, which will try its best to render the commands. But the asynchronous nature means that, in principle, the CPU is free to do other stuff. In particular, you can start submitting draw calls for the second frame. However, at the end of the second frame, the triple-buffered vertex cache needs to switch back to the first buffer. And it can only do that safely if the GPU is finished rendering from it. This is what the fence sync enforces. If the fence sync blocks, then it means that the GPU is lagging more than a full frame behind. The fence sync itself is innocent, it's just that there are too many draw commands in the driver's queue and the GPU wasn't fast enough to process them. Imho, in this situation the only sensible thing to do is really to wait for the GPU to catch up. You don't want to run farther away from the GPU because that would just increase the discrepancy further and further. If you want to improve framerates in these situations, you should not look at the CPU, but at shaders (or rather, at the number and complexity of draw calls issued).
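The blocking behaviour described above can be modelled without any OpenGL at all. The following is a toy simulation under stated assumptions, not the TDM vertex cache: `BufferRing` and its members are hypothetical, frames are simply numbered from 0, and the "fence" is just the number of the last frame submitted from a buffer. In a real renderer the corresponding calls would be along the lines of glFenceSync and glClientWaitSync.

```cpp
#include <array>
#include <cstdint>

// Toy model, not TDM code: a ring of vertex buffers, each guarded by a "fence"
// recording the last frame that the GPU was asked to render from it.
struct BufferRing {
    static constexpr int kBuffers = 3;
    std::array<std::int64_t, kBuffers> fenceFrame{{-1, -1, -1}};  // -1: never used
    std::int64_t gpuCompletedFrame = -1;  // last frame the simulated GPU finished

    // True if the CPU would have to block before reusing the buffer for `frame`,
    // i.e. the GPU has not yet finished the frame last rendered from it.
    bool MustWait(std::int64_t frame) const {
        return fenceFrame[frame % kBuffers] > gpuCompletedFrame;
    }

    // Record that `frame` was submitted from its buffer (where a fence is placed).
    void Submit(std::int64_t frame) { fenceFrame[frame % kBuffers] = frame; }
};
```

If frames 0 to 2 have been submitted and the GPU has finished none of them, `MustWait(3)` is true: the ring has wrapped around onto a buffer the GPU still needs. The wait is a symptom of the GPU lagging behind, not a cost added by the fence itself.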
  25. I'm just concerned that we already have way too many cvars affecting rendering behaviour, and this could be confusing for some players experimenting with them. At the very least, you might want to add to the description that this should not be messed with. May I ask what exactly you want to test without fence syncs?