greebo Posted February 25, 2022

I'm opening this topic to summarise the technical changes that have been made to DR's renderer and to get some feedback from my fellow coders. I'd love to get a peer review of the code changes, but going through them by looking at a pull request of that renderer branch would be a terrible experience, I assume, so instead I'd like to give an overview of what is done differently now.

General things to know about DR's renderer

DarkRadiant needs to support three different render views or modes: orthographic view, editor preview (fullbright) and lighting preview. Each of them has very different needs, but the lit preview is the most complex one, since it ideally should resemble what the TDM engine produces.

Apart from the obvious things like brush faces and model geometry, the renderer needs to support drawing editor-specific things like path connection lines, light volumes, manipulators (like the rotation widget) or patch vertices. Nodes can be selected, which makes them appear highlighted: they display a red overlay and a white outline in the camera preview, whereas the orthoview outlines selected items with a thicker red dashed line.

DarkRadiant cannot specialise its renderer on displaying triangles only. Path lines, for instance, use GL_LINE_STRIPs; single brush faces (windings) use GL_POLYGON for their outline (triangulating brush faces in the ortho view, or in the camera when selected, introduces a lot of visual noise; we just want the outline); patches want their control mesh rendered using GL_QUADS. Model surfaces (like .ASE and .LWO models), on the other hand, use GL_TRIANGLES all the way.

Almost every object in DarkRadiant is mutable and can change its appearance as authors manipulate the scene. CPU-intensive optimisations like generating visportal areas are not really an option for DR, since the scene can fundamentally change between operations.

The Renderer before the changes

DR's rendering used to work like this: all the visible scene nodes (brushes, patches, entities, models, etc.) were collected. They were visited and asked to forward any Renderable object they'd like to display to a provided RenderableCollector. The collector class (as part of the frontend render pass) sorted these renderables into their shaders (materials). So at the end of the frontend pass, every shader held a list of objects it needed to display.

The back end renderer sorted all the material stages by priority and asked each of them to render the objects that had been collected, by calling their OpenGLRenderable::render() method. After all objects had rendered their geometry, the shader objects were emptied for the next frame.

Culling of invisible objects was done by sorting objects into an Octree (which is a good choice for ortho view culling); some additional culling happened in the render methods themselves (both frontend and backend calls).
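To make that old flow a bit more concrete: each scene object implemented its own render() method and issued its GL calls right there. The snippet below is purely illustrative (simplified class and member names, not the actual DR code):

```cpp
#include <GL/glew.h> // assuming a GLEW-style GL setup
#include <array>
#include <vector>

// Illustrative only: roughly how a front-end object used to draw itself when
// the back end invoked its render() method.
class OpenGLRenderable
{
public:
    virtual ~OpenGLRenderable() = default;
    virtual void render() const = 0;
};

class RenderablePathLine : public OpenGLRenderable
{
public:
    void render() const override
    {
        // Immediate-mode GL issued from within the scene object itself,
        // including its own state changes.
        glColor3d(1.0, 1.0, 0.0);
        glBegin(GL_LINE_STRIP);

        for (const auto& point : _points)
        {
            glVertex3d(point[0], point[1], point[2]);
        }

        glEnd();
    }

private:
    std::vector<std::array<double, 3>> _points;
};
```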
The problems at hand

Doing the same work over and over again: it's rare that all the objects in the scene change at once. Usually prefabs are moved around, faces are textured, brushes are clipped. When flying through a map using the camera view, or when shifting the ortho view around, the scene objects are unchanged for quite a number of frames.

Separation of concerns: every renderable object in the scene implemented its own render() method that invoked the corresponding openGL calls. There was legacy-style glBegin/glEnd rendering (used for path nodes), glDrawElements, glCallList, plus state changes like enabling arrays and setting up blend modes or colours. These are render calls that should rather be performed by the back end renderer, and should not be the responsibility of, say, a BrushNode.

Draw Calls: since every object submitted its own geometry, there was no way to group the calls. A moderately sized map features more than 50k brush faces and about half as many patch surfaces. Rendering the whole map can easily add up to about 100k draw calls, with each draw call submitting just 4 vertices (using GL_POLYGON).

Inconsistent Vertex Data: since each object did its rendering on its own, it was free to choose which format to store its data in. Some stored just the vertex's 3D coordinates, some added colour information, some used full-featured vertices including normals and tangents.

State Changes: since every object was handled individually, the openGL state could change back and forth in between a few brush windings. An entity can influence the shader passes by altering e.g. the texture matrix, so each renderable of the same material triggered a re-evaluation of the material stage, leading to a massive amount of openGL state changes. Then again, a lot of brushes and patches belong to worldspawn, which never does anything like this, but optimisation was not possible since the backend knew nothing about that.

Lighting mode rendering: lighting mode had a hard time figuring out which objects were actually hit by a given light entity, and the object-to-entity relationship was tough to handle in the back end. Compared to how idTech4 or the TDM engine handles things, DR was doing it the other way round. Lighting mode rendering was part of the "solid render" mode, which caused quite a few if/else branches in the back end render methods. Lighting mode and fullbright mode are fundamentally different, yet they were using the same frontend and backend methods.

The Goals

openGL calls moved to the backend: no (frontend) scene object should be bothered with how it is going to be rendered. Everything in terms of openGL is handled by the back end.

Reduced number of draw calls: many objects use the same render setup, the same material, are children of the same parent entity, and are even in almost the same 3D location. Windings need to be grouped and submitted in a single draw call wherever possible. The same goes for other geometry.

Vertex data stored in a central memory chunk: provide an infrastructure to store all the objects in a single chunk of memory. This will enable us to transition to storing all the render data in one or two large VBOs.

Support object changes: if everything is to be stored in a contiguous memory block, how do we go about changing, adding and removing vertex data? Changing geometry (and also material changes, as when texturing brushes) is a common use-case and it must be fast.

Support oriented model surfaces: many map objects are influenced by their parent node's orientation, like a torch model surface that is rotated by the "rotation" spawnarg of its parent entity. A map can feature a lot of instances of the same model, and the renderer needs to support that use-case. Brush windings and patches, on the other hand, are never oriented; they always use world coordinates.
Unified vertex data format: everything that is submitted as renderable geometry to the back end must define its vertex data in the same format. The natural choice would be the ArbitraryMeshVertex type that has been around for a while.

All in all, get closer to what the TDM engine is doing: by doing all of the above, we put ourselves in a position to port more engine render features over to DR, maybe even add a shadow implementation at some point.
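For illustration, a unified render vertex along the lines of ArbitraryMeshVertex bundles everything a surface might need into one struct. The field list below is an approximation for the sake of the argument, not the exact DR definition:

```cpp
// Approximation of a unified render vertex; the real ArbitraryMeshVertex
// may differ in field names, types and ordering.
struct Vector2 { double x, y; };
struct Vector3 { double x, y, z; };
struct Vector4 { double x, y, z, w; };

struct RenderVertex
{
    Vector3 vertex;    // position (world space, or local space for model surfaces)
    Vector3 normal;    // needed for the lighting preview
    Vector2 texcoord;  // diffuse/bump/specular UVs
    Vector3 tangent;   // tangent space for bump mapping
    Vector3 bitangent;
    Vector4 colour;    // per-vertex colour, replacing ad-hoc glColor calls
};
```

With every renderable using the same layout, the back end can describe the whole buffer to openGL once instead of each object setting up its own arrays.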
greebo Posted February 25, 2022

Fullbright Approach

The front-end render pass is reduced to a single call: onPreRender(const VolumeTest&). When invoked, each node has the chance to check for any updates that have happened since the last frame, like material changes, changed texture coordinates or new target lines.

Nodes are no longer submitting renderables to a collector. Instead, they grab a reference to the Shader from the RenderSystem (like before) and attach their geometry to it. The geometry stays attached to the shader until it is updated or removed by the node during a future onPreRender call, or until the node is removed from the scene.

Shaders provide a specialised API for the most common use cases: an API for brush windings (IWindingRenderer), an API for general-purpose geometry (path boxes, target lines, vertices, quads) called IGeometryRenderer, and an API for triangulated, oriented surfaces (models) called ISurfaceRenderer. The nodes don't know how the shader deals with their data, but they receive a numeric Slot Handle that allows them to update or remove their geometry later. These IWhateverRenderer implementations are designed to internally combine as many objects as possible.

There is no longer a distinction between orthoview rendering and camera rendering (renderWireframe and renderSolid are gone). It's all about the shaders: they know whether they are suitable for rendering in one of these view types, or both.

The Shader implementations provide a drawSurfaces() method that is invoked by a shader pass during the back end rendering phase. It sets up the glEnableClientState() calls and submits the data through glDrawElements.

Windings

To achieve fewer draw calls, all windings of a given size (more than 90% of the faces have 4 vertices) are packed together into a single CompactWindingVertexBuffer that stores all windings of that material in one large, indexed vertex array. Winding removal and re-addition is fast: the buffer keeps track of empty slots and is able to re-fill them quickly with a new winding of the same size. Index generation uses a templated WindingIndexer class that creates indices for GL_LINES, GL_POLYGON and GL_TRIANGLES (a sketch of this indexing idea follows below). It is up to the Shader to decide which indexing method is used: orthoview shaders use GL_LINES, while the camera preview uses GL_TRIANGLES. Every winding is specified in world coordinates.

Geometry

This is the API used by patches, entity boxes, light volumes, vertices, etc. Objects can choose the GeometryType they are rendering: Lines, Points, Triangles or Quads. The Shader internally sorts the objects into separate buffers per primitive type, to submit a single draw call for all the objects sharing the same type. All geometry uses world coordinates.

Surfaces

This API is similar to the Geometry API, but here no data is actually submitted to the shader. Instead, IRenderableSurface objects are attached to the shader, which provide a getSurfaceTransform() method that is used to set up the model matrix before submitting the draw calls. Surface vertices are specified in local coordinates.

Highlighting

The shader API provides an entry point to render a single object when it is selected. This is going to be much slower than the regular draw calls, but the assumption is that only a small portion of all map objects is selected at any one time.
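Here is that sketch of the indexing idea mentioned under Windings: a minimal illustration of how line and triangle indices can be derived for a fixed-size convex winding. It mirrors the concept behind the WindingIndexer, not its actual code:

```cpp
#include <cstdint>
#include <vector>

// Generates indices for one winding with 'size' vertices, starting at
// 'firstVertex' within the shared vertex buffer. Concept sketch only.

// GL_LINES: one line segment per edge, closing the loop at the end.
std::vector<std::uint32_t> generateLineIndices(std::uint32_t firstVertex, std::uint32_t size)
{
    std::vector<std::uint32_t> indices;
    indices.reserve(size * 2);

    for (std::uint32_t i = 0; i < size; ++i)
    {
        indices.push_back(firstVertex + i);
        indices.push_back(firstVertex + (i + 1) % size); // wrap around to close the polygon
    }

    return indices;
}

// GL_TRIANGLES: fan triangulation around the first vertex,
// which is valid because windings are convex polygons.
std::vector<std::uint32_t> generateTriangleIndices(std::uint32_t firstVertex, std::uint32_t size)
{
    std::vector<std::uint32_t> indices;
    indices.reserve((size - 2) * 3);

    for (std::uint32_t i = 1; i + 1 < size; ++i)
    {
        indices.push_back(firstVertex);
        indices.push_back(firstVertex + i);
        indices.push_back(firstVertex + i + 1);
    }

    return indices;
}
```

For a quad at buffer offset 0 this yields the edge pairs (0,1) (1,2) (2,3) (3,0) for GL_LINES and the triangles (0,1,2) (0,2,3) for GL_TRIANGLES.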
Vertex Storage

While the data is now stored in the shader, it's still in main RAM. No VBOs have been used yet; that would be a logical next optimisation step.

Results

With the above changes, the number of draw calls in a fairly sized map went from 80k down to a few hundred. While the first attempts at combining the brushes doubled the frame rate of my benchmark map (using the same position and view angles, drawing it 100 times), this later went down to a 30% speed improvement after migrating the model surfaces. It turns out that rendering the models using display lists is really fast, but it violated the principle of moving the calls to the backend. It also has to be taken into account that after the changes the vertex data is still stored in main memory, not in a VBO.
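To summarise what the attachment API described in this post looks like from a node's point of view, here is an abridged interface sketch. The signatures are approximations for illustration; the real IWindingRenderer declarations differ in detail:

```cpp
#include <cstdint>
#include <vector>

// Stand-in for the unified vertex type sketched earlier in the thread.
struct RenderVertex
{
    double position[3];
    double normal[3];
    double texcoord[2];
};

// Abridged sketch of the winding-oriented shader API.
class IWindingRenderer
{
public:
    using Slot = std::uint64_t;
    static constexpr Slot InvalidSlot = ~0ULL;

    virtual ~IWindingRenderer() = default;

    // Attach a winding; the returned slot handle is the only thing
    // the owning node needs to remember.
    virtual Slot addWinding(const std::vector<RenderVertex>& vertices) = 0;

    // Overwrite the vertex data of a previously added winding of the same size.
    virtual void updateWinding(Slot slot, const std::vector<RenderVertex>& vertices) = 0;

    // Release the slot, e.g. when the face changes material or is deleted.
    virtual void removeWinding(Slot slot) = 0;
};
```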
greebo Posted February 25, 2022

Lighting Mode Approach

All EntityNodes register themselves in the RenderSystem on scene insertion. The same goes for the lights, which register themselves as RenderableLight objects, so the OpenGLRenderSystem knows about every IRenderEntity and every RendererLight.

The frontend phase uses the same onPreRender() call. All scene nodes are aware of their parent IRenderEntity (this had already been the case before), which enables them to attach IRenderableObjects to their IRenderEntity. This way every entity knows about the renderable objects it is a parent of. During back end rendering, the algorithm makes use of the IRenderEntity::foreachRenderableTouchingBounds() method to select those objects that intersect a light's volume.

IGeometryStore

Every IRenderableObject has its vertex data stored in the central IGeometryStore owned by the render system. It doesn't know where exactly that is, but it receives a Slot Handle that allows it to update or remove the geometry later. The IRenderableObject::getStorageLocation() method exposes the storage handle and enables the back end renderer to access the data per object.

The geometry store maintains two RAM buffers that are protected by glFenceSync, in preparation for moving all that data to a GPU buffer without running the risk of altering the data while it's still in use. The number 2 can be increased if needed (TDM is using 3 of them). Changes and updates to the geometry buffer are recorded during the front-end render pass and are propagated to the secondary buffer when switching between frames, to keep the amount of vertex data copied around reasonably low.

The back end currently processes all the IRenderableObjects one by one, issuing the same glDrawElementsBaseVertex call for every encountered object (so there's room for optimisation here, possibly by bunching the calls together and using glMultiDrawElementsBaseVertex).

Windings

Windings are special again, and not very optimised as of yet. BrushNodes don't do anything themselves; it's the Shader (in its role as WindingRenderer) that groups the windings per entity and clusters them into one large IRenderableObject per entity. Such an object is likely to intersect far too many lights in the scene, so there's room for a flexible space partitioning system here.

Geometry and Surfaces

The base implementations provide a convenient attachToEntity() method which takes care of the bureaucracy. The nodes just need to call it with the correct IRenderEntity* argument.

Backend

I tried to use the TDM renderer as a blueprint. There's a dedicated RenderSystem::renderLitScene() method which is called by the CamWnd when in lighting mode. The steps are (see render/backend/LightingModeRenderer.cpp, and the rough sketch below):

1. For every known light, check each known entity and intersect the objects.
2. Every intersecting object produces an interaction; objects are sorted by entity and material.
3. All collected objects are considered for the depth fill pass: only the suitable materials provide that pass.
4. Interaction pass: draw the objects per light and entity, using the correct GLSL program.
5. Blend pass: draw all other passes like blend stages or skyboxes.

The cubemap program needed to render skyboxes has been implemented in GLSL. It doesn't handle reflective stages yet, only regular cubemaps.
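Here is that rough sketch of the per-light flow. The types and methods below are stand-ins to show the control flow only, not the actual LightingModeRenderer code:

```cpp
#include <vector>

// Stand-ins for the real interfaces; this only illustrates the control flow.
struct AABB {};
class IRenderableObject { /* storage handle, material, bounds, ... */ };

class RendererLight
{
public:
    AABB lightAABB() const { return {}; }
};

class IRenderEntity
{
public:
    template <typename Functor>
    void foreachRenderableTouchingBounds(const AABB& bounds, Functor f) const
    {
        // would iterate the entity's attached objects intersecting 'bounds'
    }
};

struct Interaction
{
    const RendererLight* light;
    const IRenderEntity* entity;
    const IRenderableObject* object;
};

void renderLitScene(const std::vector<RendererLight*>& lights,
                    const std::vector<IRenderEntity*>& entities)
{
    std::vector<Interaction> interactions;

    // Steps 1+2: collect light/entity/object interactions
    for (const RendererLight* light : lights)
    {
        for (const IRenderEntity* entity : entities)
        {
            entity->foreachRenderableTouchingBounds(light->lightAABB(),
                [&](const IRenderableObject& object)
                {
                    interactions.push_back({ light, entity, &object });
                });
        }
    }

    // Step 3: depth fill pass over the collected objects (suitable materials only)
    // Step 4: interaction pass, grouped per light and entity, using the GLSL program
    // Step 5: blend pass for blend stages, skyboxes, etc.
}
```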
Results

Everything is still pretty rough and not optimised yet, but it's working. Particle rendering, skyboxes, blend stages and regular light interactions are showing up properly, so it's at least at the same feature level as before the changes, which is what I was aiming for in this branch.
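As a side note on the glFenceSync-protected buffers mentioned under IGeometryStore above, the synchronisation pattern boils down to something like this. This is a minimal sketch of the general technique (and assumes a GLEW-based setup), not DR's actual buffer code:

```cpp
#include <GL/glew.h> // assuming a GLEW-based GL setup
#include <array>
#include <cstddef>
#include <vector>

// Double-buffered vertex storage: while the GPU may still be reading frame N's
// buffer, the front end writes its updates into the other one.
struct FrameBuffer
{
    std::vector<float> data; // vertex data for this frame slot
    GLsync fence = nullptr;  // signalled once the GPU is done with this slot
};

class GeometryBuffers
{
public:
    std::vector<float>& beginFrame()
    {
        _current = (_current + 1) % _buffers.size();
        FrameBuffer& buf = _buffers[_current];

        // Wait until the GPU has finished reading this buffer before touching it.
        if (buf.fence != nullptr)
        {
            glClientWaitSync(buf.fence, GL_SYNC_FLUSH_COMMANDS_BIT,
                             static_cast<GLuint64>(1000000000)); // wait up to 1 second
            glDeleteSync(buf.fence);
            buf.fence = nullptr;
        }

        return buf.data; // safe to modify now
    }

    void endFrame()
    {
        // Insert a fence after this frame's draw calls have been submitted.
        _buffers[_current].fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
    }

private:
    std::array<FrameBuffer, 2> _buffers; // TDM reportedly uses 3
    std::size_t _current = 0;
};
```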
OrbWeaver Posted February 27, 2022

Overall these changes sound excellent. You have correctly (as far as I can tell) identified the major issues with the DR renderer and proposed sensible solutions that should improve performance considerably and leave room for future optimisations. In particular, trying to place as much as possible in a big chunk of contiguous RAM is exactly the sort of thing that GPUs should handle well.

Some general, high-level comments (since I probably haven't even fully understood the whole design yet, much less looked at the code).

Wireframe versus 3D

I always thought it was dumb that we had different methods to handle these: at most it should have been an enum/bool parameter. So it's good to see that you're getting rid of this distinction.

Unlit versus lit renders

As you correctly point out, these are different, particularly in terms of light intersections and entity-based render parameters (neither of which need to be handled in the unlit renderer), so it makes sense to separate them and not have a load of if/then statements in backend render methods which just slow things down.

However, if I'm understanding correctly, in the new implementation almost every aspect will be separate, including the backend data storage. Surely a lot of this is going to be the same in both cases — if a brush needs to submit a bunch of quads defined by their vertices, this operation would be the same regardless of whatever light intersection or GLSL setup calculations were performed first? Even if lighting mode needs extra operations to handle lighting-specific tasks, couldn't the actual low-level vertex sorting and submission code be shared? If double RAM buffers and glFenceSync improve performance in lit mode, wouldn't unlit mode also benefit from the same strategy?

I guess another way of looking at it is: could "unlit mode" actually be a form of lit mode where lighting intersections were skipped, submitted lights were ignored, and the shader was changed to return full RGB values for every fragment? Or does this introduce performance problems of its own?

Non-const shaders

I've never liked the fact that Shaders are global (non-threadsafe) modifiable state — it seems to me that a Shader should know how to render things but should not in itself track what is being rendered. Your changes did not introduce this problem and they don't make it any worse, so it's not a criticism of your design at all, but I wonder if there would be scope to move towards a setup whereby the Shaders themselves were const, and all of the state associating shaders with their rendered objects was held locally to the render operation (or maybe the window/view)? This might enable features like a scrollable grid of model previews in the Model Selector, which I've seen used very effectively in other editors. But perhaps that is a problem for the future rather than today.

Winding/Geometry/Surface

Nothing wrong with the backend having more knowledge about what is being rendered if it helps optimisation, but I'm a little unclear on the precise division of responsibilities between these various geometry types. A Winding is an arbitrary convex polygon which can be rendered with either GL_LINES or GL_POLYGON depending on whether this is a 2D or 3D view (I think), and most of these polygons are expected to be quads.
But Geometry can also contain quads, and is used by patches which also need to switch between wireframe and solid rendering, so I guess I'm not clear on where the boundary lies between a Winding and Geometry.

Surface, on the other hand, I think is used for models, but in this case the backend just delegates to the Model object for rendering, rather than collating the triangles itself? Is this because models can have a large variation in the number of vertices, and trying to allocate "slots" for them in a big buffer would be more trouble than it's worth? I've never had to write a memory allocator myself, so I can certainly understand the problems that might arise with fragmentation etc., but I wonder if these same problems won't rear their heads even with relatively simple Windings.

Render light by light

Perfect. This is exactly what we need to be able to implement things like shadows, fog lights etc. (if/when anybody wishes to work on this), so this is definitely a step in the right direction.

Overall, these seem like major improvements and the initial performance figures you quote are considerable, so I look forward to checking things out when it's ready.
greebo Posted February 28, 2022

First of all, thanks for taking the time to respond, this has been getting wordier than I anticipated.

7 hours ago, OrbWeaver said:

Wireframe versus 3D

I always thought it was dumb that we had different methods to handle these: at most it should have been an enum/bool parameter. So it's good to see that you're getting rid of this distinction.

Yes, the distinction is in the shaders now. There is still a possibility to distinguish the two, since the VolumeTest reference provides the fill() check, so some onPreRender() methods react to it and prepare different renderables. That's still necessary at this point, since some wireframe renderables call for a different appearance. This doesn't mean it can't get any simpler, though. The objects are still requesting a coloured line shader, like <0 0 1> for a blue one. In principle, now that the vertex colour is shipped along with the geometry data, the colour distinction in the shader itself is maybe not even necessary anymore. There could be a single line shader used to draw everything in the orthoview.
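For reference, the two alternatives being weighed here look roughly like this in legacy GL terms. This is purely illustrative (and assumes a GLEW-style setup and non-empty vertex arrays), not DR's actual shader pass code:

```cpp
#include <GL/glew.h> // assuming a GLEW-style GL setup
#include <vector>

struct ColouredVertex
{
    double position[3];
    double colour[3];
};

// Variant A: one line shader per colour, a single glColor call,
// vertices submitted without per-vertex colour data.
void drawLinesWithGlobalColour(const std::vector<ColouredVertex>& verts)
{
    glColor3d(0.0, 0.0, 1.0); // e.g. the <0 0 1> blue line shader
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_DOUBLE, sizeof(ColouredVertex), &verts[0].position);
    glDrawArrays(GL_LINES, 0, static_cast<GLsizei>(verts.size()));
    glDisableClientState(GL_VERTEX_ARRAY);
}

// Variant B: a single generic line shader, colour shipped per vertex,
// so differently coloured lines can share one draw call.
void drawLinesWithVertexColours(const std::vector<ColouredVertex>& verts)
{
    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_COLOR_ARRAY);
    glVertexPointer(3, GL_DOUBLE, sizeof(ColouredVertex), &verts[0].position);
    glColorPointer(3, GL_DOUBLE, sizeof(ColouredVertex), &verts[0].colour);
    glDrawArrays(GL_LINES, 0, static_cast<GLsizei>(verts.size()));
    glDisableClientState(GL_COLOR_ARRAY);
    glDisableClientState(GL_VERTEX_ARRAY);
}
```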
greebo Posted February 28, 2022

7 hours ago, OrbWeaver said:

Unlit versus lit renders

... However, if I'm understanding correctly, in the new implementation almost every aspect will be separate, including the backend data storage. Surely a lot of this is going to be the same in both cases — if a brush needs to submit a bunch of quads defined by their vertices, this operation would be the same regardless of whatever light intersection or GLSL setup calculations were performed first? Even if lighting mode needs extra operations to handle lighting-specific tasks, couldn't the actual low-level vertex sorting and submission code be shared? If double RAM buffers and glFenceSync improve performance in lit mode, wouldn't unlit mode also benefit from the same strategy? I guess another way of looking at it is: could "unlit mode" actually be a form of lit mode where lighting intersections were skipped, submitted lights were ignored, and the shader was changed to return full RGB values for every fragment? Or does this introduce performance problems of its own?

I suspect it's all about the draw calls. In fullbright mode DR now invokes far fewer GL calls than in lit render mode. There's not so much difference when it comes to the oriented model surfaces, these are really almost the same (and are using the same vertex storage too); it's the brushes and patches where the modes differ. Lit mode groups by entity, so to regain the advantage of submitting everything in one go, it would need to dissolve that grouping information, and that would have to happen every frame. I think this is going to be too taxing. Maybe once lit mode is more optimised, we can try to merge the two modes.

You've made a correct observation about the backend data storage though: the geometry and winding renderers are (at the moment) not sharing their vertex data between the two modes, so memory is duplicated and copied around often. That's not good.

The main reason for this duplication is the chronological order in which I adjusted the renderer. I was chewing through this starting with fullbright mode, first brushes, then patches, then models, and finally the visual aids like lines and points. After that I moved on to do the research on lit mode, and all of that reflects in the code. I admit that I took this approach on purpose: when starting out, I didn't have a full grasp of what was going to be necessary, I had to learn along the way (and aim for not getting burnt out half-way through). Now that the full picture is available, the thing can be further improved, and the storage is probably among the first things that need to be optimised.
greebo Posted February 28, 2022

7 hours ago, OrbWeaver said:

Non-const shaders

I've never liked the fact that Shaders are global (non-threadsafe) modifiable state — it seems to me that a Shader should know how to render things but should not in itself track what is being rendered. Your changes did not introduce this problem and they don't make it any worse, so it's not a criticism of your design at all, but I wonder if there would be scope to move towards a setup whereby the Shaders themselves were const, and all of the state associating shaders with their rendered objects was held locally to the render operation (or maybe the window/view)? This might enable features like a scrollable grid of model previews in the Model Selector, which I've seen used very effectively in other editors. But perhaps that is a problem for the future rather than today.

Yes, this is interesting. It's achievable, with some cost, of course. Right now, the Shaders themselves implement the interfaces IWindingRenderer, IGeometryRenderer and ISurfaceRenderer. A different authority could implement these interfaces, but it would need to map the objects to the Shaders somehow (likely by using a few std::maps). The renderer would then ask that authority for the information; this way the object-to-shader state could be kept outside the Shader. The fullbright backend renderer needs that info when processing the sorted shader passes: currently the passes ask their owning shader to draw its surfaces, and this would have to move elsewhere. The lighting mode renderer uses the objects as delivered by the render entities, which doesn't involve the Shader doing the housekeeping, so that renderer is already heading in this direction.
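A hypothetical "authority" of that kind could be as simple as the sketch below. None of these names exist in DR; it merely illustrates keeping the shader-to-object association outside the Shader, e.g. per render view:

```cpp
#include <algorithm>
#include <map>
#include <memory>
#include <vector>

class Shader;            // the (now const) shader
class IRenderableObject; // whatever the nodes attach

// Hypothetical per-view registry; not an existing DR class.
class RenderObjectRegistry
{
public:
    void attach(const std::shared_ptr<Shader>& shader, IRenderableObject* object)
    {
        _objectsByShader[shader].push_back(object);
    }

    void detach(const std::shared_ptr<Shader>& shader, IRenderableObject* object)
    {
        auto& objects = _objectsByShader[shader];
        objects.erase(std::remove(objects.begin(), objects.end(), object), objects.end());
    }

    // The backend pass asks the registry instead of the shader itself.
    const std::vector<IRenderableObject*>& objectsFor(const std::shared_ptr<Shader>& shader) const
    {
        static const std::vector<IRenderableObject*> empty;
        auto it = _objectsByShader.find(shader);
        return it != _objectsByShader.end() ? it->second : empty;
    }

private:
    std::map<std::shared_ptr<Shader>, std::vector<IRenderableObject*>> _objectsByShader;
};
```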
greebo Posted February 28, 2022

8 hours ago, OrbWeaver said:

Winding/Geometry/Surface

Nothing wrong with the backend having more knowledge about what is being rendered if it helps optimisation, but I'm a little unclear on the precise division of responsibilities between these various geometry types. A Winding is an arbitrary convex polygon which can be rendered with either GL_LINES or GL_POLYGON depending on whether this is a 2D or 3D view (I think), and most of these polygons are expected to be quads. But Geometry can also contain quads, and is used by patches which also need to switch between wireframe and solid rendering, so I guess I'm not clear on where the boundary lies between a Winding and Geometry.

It's the way they are internally stored to reduce draw calls, but they are indeed similar. I implemented the IWindingRenderer first, since that was the most painful spot, and I tailored it exactly for that purpose. The CompactWindingVertexBuffer template is specialised to the needs of fixed-size windings, and the buffer is designed to support fast insertions, updates and (deferred) deletions. I guess it's not very useful for the other geometry types, but I admit that I didn't even try to merge the two use cases. I tackled one field after the other, and it's possible that the CompactWindingVertexBuffer could now be replaced with some of the pieces I implemented for the lit render mode - there is another ContinuousBuffer<> template that might be suitable for the IWindingRenderer, for example. It's very well possible that the optimisation I made for brush windings was premature and that parts of it can be handled by the less specialised structures without sacrificing much performance.

8 hours ago, OrbWeaver said:

Surface, on the other hand, I think is used for models, but in this case the backend just delegates to the Model object for rendering, rather than collating the triangles itself? Is this because models can have a large variation in the number of vertices, and trying to allocate "slots" for them in a big buffer would be more trouble than it's worth? I've never had to write a memory allocator myself, so I can certainly understand the problems that might arise with fragmentation etc., but I wonder if these same problems won't rear their heads even with relatively simple Windings.

The model object is not involved in any rendering anymore; it just creates and registers the IRenderableSurface object. The SurfaceRenderer then copies the model vertices into the large GeometryStore - memory duplication again (the model node needs to keep the data around for model scaling). The size of the memory doesn't seem to be a problem, as the data is static and is not updated very often (except when scaling, but the number of vertices and indices stays the same). The thing that makes surfaces special is their orientation; they have to be rendered one after the other, separated by glMultMatrix() calls.

Speaking about writing the memory allocator: I was quite reluctant to write all that memory management code, but I saw no escape route for me. It must have been the billionth time this has been done on this planet. I'm definitely not claiming that I did a good job on any of it, but at least it doesn't show up in the profiler traces.
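For illustration, rendering one oriented surface after the other with the legacy matrix stack looks roughly like this. It is a simplified sketch (GLEW-style setup assumed, stand-in struct), not the actual SurfaceRenderer code:

```cpp
#include <GL/glew.h> // assuming a GLEW-style GL setup
#include <vector>

// Stand-in for an attached surface: a local-space mesh plus its object-to-world
// transform (e.g. derived from the parent entity's "rotation" spawnarg).
struct OrientedSurface
{
    GLdouble transform[16];            // column-major model matrix
    GLsizei indexCount = 0;            // number of indices for this surface
    const void* indexOffset = nullptr; // byte offset of its indices in the shared buffer
    GLint baseVertex = 0;              // where its vertices start in the shared buffer
};

void renderOrientedSurfaces(const std::vector<OrientedSurface>& surfaces)
{
    for (const OrientedSurface& surface : surfaces)
    {
        glPushMatrix();
        glMultMatrixd(surface.transform); // apply getSurfaceTransform()

        // One draw call per surface; windings and patches don't need this detour
        // since they are already stored in world coordinates.
        glDrawElementsBaseVertex(GL_TRIANGLES, surface.indexCount, GL_UNSIGNED_INT,
                                 const_cast<void*>(surface.indexOffset), surface.baseVertex);

        glPopMatrix();
    }
}
```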
OrbWeaver Posted February 28, 2022

16 hours ago, greebo said:

The objects are still requesting a coloured line shader, like <0 0 1> for a blue one. In principle, now that the vertex colour is shipped along with the geometry data, the colour distinction in the shader itself is maybe not even necessary anymore. There could be a single line shader used to draw everything in the orthoview.

That would be something worth profiling, for sure. I actually have no idea which is better for performance: setting a single glColor and then rendering all vertices without colours, or passing each colour per-vertex even if they are all the same colour. Perhaps it varies based on the GPU hardware.

16 hours ago, greebo said:

The main reason for this duplication is the chronological order in which I adjusted the renderer. I was chewing through this starting with fullbright mode, first brushes, then patches, then models, and finally the visual aids like lines and points. After that I moved on to do the research on lit mode, and all of that reflects in the code. I admit that I took this approach on purpose: when starting out, I didn't have a full grasp of what was going to be necessary, I had to learn along the way (and aim for not getting burnt out half-way through). Now that the full picture is available, the thing can be further improved, and the storage is probably among the first things that need to be optimised.

That's perfectly reasonable of course. I probably would have approached things the same way. Minimising divergent code paths is good for future maintainability, but it doesn't need to happen right away and can be implemented piecemeal if necessary (e.g. the Brush class still has separate methods for lit vs unlit rendering, but they can delegate parts of their functionality to a common private method).

15 hours ago, greebo said:

Yes, this is interesting. It's achievable, with some cost, of course. Right now, the Shaders themselves implement the interfaces IWindingRenderer, IGeometryRenderer and ISurfaceRenderer. A different authority could implement these interfaces, but it would need to map the objects to the Shaders somehow (likely by using a few std::maps). The renderer would then ask that authority for the information; this way the object-to-shader state could be kept outside the Shader.

Yes, that's what I would imagine to be the hurdle with const shaders — the mapping between Shader and objects has to happen somewhere, and if it isn't in the shader itself then some external map needs to be maintained, which might be a performance issue if relatively heavyweight structures like std::maps need to be modified thousands of times per frame.

15 hours ago, greebo said:

It's the way they are internally stored to reduce draw calls, but they are indeed similar. I implemented the IWindingRenderer first, since that was the most painful spot, and I tailored it exactly for that purpose. The CompactWindingVertexBuffer template is specialised to the needs of fixed-size windings, and the buffer is designed to support fast insertions, updates and (deferred) deletions. I guess it's not very useful for the other geometry types, but I admit that I didn't even try to merge the two use cases. I tackled one field after the other, and it's possible that the CompactWindingVertexBuffer could now be replaced with some of the pieces I implemented for the lit render mode - there is another ContinuousBuffer<> template that might be suitable for the IWindingRenderer, for example.
I would certainly give consideration to whether the windings and geometry could use the same implementation, because it does seem to me that their roles are more or less the same: a buffer of vertices in world space which can be tied together into various primitive types. This is something that VBOs will handle well — it should be possible to upload all the vertex data into a single buffer, then dispatch as many draw calls as desired, using whatever primitive types are required and referencing particular subsets of the vertices. This could make a huge difference to performance, because once the data is in the VBO you don't need to send it again until something changes (and even then you can map just a subset of the buffer and update that, rather than refreshing the whole thing).

15 hours ago, greebo said:

The model object is not involved in any rendering anymore; it just creates and registers the IRenderableSurface object. The SurfaceRenderer then copies the model vertices into the large GeometryStore - memory duplication again (the model node needs to keep the data around for model scaling). The size of the memory doesn't seem to be a problem, as the data is static and is not updated very often (except when scaling, but the number of vertices and indices stays the same). The thing that makes surfaces special is their orientation; they have to be rendered one after the other, separated by glMultMatrix() calls.

Ah, I didn't spot the difference in coordinate spaces. That is one fundamental difference between models and other geometry which might merit keeping a separate implementation. So I guess we might end up with a TransformedMeshRenderer for models and a WorldSpacePrimitiveRenderer for everything else, or some distinction like that.

15 hours ago, greebo said:

Speaking about writing the memory allocator: I was quite reluctant to write all that memory management code, but I saw no escape route for me. It must have been the billionth time this has been done on this planet. I'm definitely not claiming that I did a good job on any of it, but at least it doesn't show up in the profiler traces.

Unfortunately this is one of the times when manual memory management really is necessary: if we want to (eventually) put things in a VBO, the buffer has to be managed C-style with byte pointers, offsets and the like. I certainly don't envy you having to deal with it, but the work should be valuable because it will transition very neatly into the sort of operations needed for managing VBO memory.
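To illustrate the VBO workflow being suggested here (a generic sketch of the standard GL calls, not DR code): upload the vertex block once, draw whatever subsets and primitive types are needed, and patch only the byte range that actually changed.

```cpp
#include <GL/glew.h> // assuming a GLEW-style GL setup
#include <cstddef>
#include <vector>

struct Vertex { float position[3]; float texcoord[2]; };

// Upload the whole vertex block once.
GLuint createVertexBuffer(const std::vector<Vertex>& vertices)
{
    GLuint vbo = 0;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, vertices.size() * sizeof(Vertex),
                 vertices.data(), GL_DYNAMIC_DRAW);
    return vbo;
}

// Draw a subset of the buffer with whatever primitive type is required;
// no vertex data is transferred at draw time.
void drawSubset(GLuint vbo, GLenum primitiveType, GLint firstVertex, GLsizei count)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_TEXTURE_COORD_ARRAY);
    glVertexPointer(3, GL_FLOAT, sizeof(Vertex),
                    reinterpret_cast<const void*>(offsetof(Vertex, position)));
    glTexCoordPointer(2, GL_FLOAT, sizeof(Vertex),
                      reinterpret_cast<const void*>(offsetof(Vertex, texcoord)));
    glDrawArrays(primitiveType, firstVertex, count);
    glDisableClientState(GL_TEXTURE_COORD_ARRAY);
    glDisableClientState(GL_VERTEX_ARRAY);
}

// When a few windings change, update only their byte range.
void updateSubset(GLuint vbo, std::size_t firstVertex, const std::vector<Vertex>& newData)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferSubData(GL_ARRAY_BUFFER,
                    static_cast<GLintptr>(firstVertex * sizeof(Vertex)),
                    static_cast<GLsizeiptr>(newData.size() * sizeof(Vertex)),
                    newData.data());
}
```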
greebo Posted March 26, 2022

I managed to port some of the shadow mapping code over to DR; it can now support up to 6 shadow-casting lights. Of course, it'll never be as pretty as TDM, and nowhere near as fast, but it's a start.
thebigh Posted March 26, 2022

This looks impressive. It's really going to make designing my lighting much easier.
OrbWeaver Posted March 29, 2022

An amazing leap forward for the DR renderer. All of this manual synchronisation work makes me think that it would be really nice to have some of the common code split into a DLL which could be used from DR as well as the game engine, allowing both editor and game to behave the same without needing a whole bunch of duplicated code. But of course that introduces difficulties of its own, especially when the two projects are using entirely different source control systems.
greebo Posted March 29, 2022

It's not only the version control system; the data structures and coding paradigms are also like something from two different planets. I could only use the engine code as a rough blueprint, but it was immensely helpful for me, and I learned a lot about more modern openGL.

Speaking about sharing code, what would be really nice would be a plugin containing the DMAP algorithm. But from what I remember when trying to port it over to DarkRadiant years ago, that code is also tied to the decl system, the materials and even image loading. Maybe @stgatilov, having worked on dmap recently, might share some insight on whether this piece of code would be feasible to isolate and move to a DLL with a nice interface. Having leak detection and portalisation code available to DarkRadiant would be beneficial for renderer performance too. Right now, the DR scene is completely unportalised and slow as heck.