The Dark Mod Forums

revelator

Development Role
  • Posts

    1174
  • Joined

  • Last visited

  • Days Won

    21

Everything posted by revelator

  1. Well, I'll be damned, score one for the steroid stuff (guess Duke Nukem knows his stuff). The swelling is almost gone, still a bit itchy, and the fever is down to almost normal, all in one night. Guess I'm back; though I feel a bit worse for wear, it's nothing compared to the last few days.
  2. The grim reaper will not get me alive!!! ... Oh, it feels worse than it probably is, these kinds of irks usually do (looks like crap too, heh), but yeah, this was a complication I could have done without. I'm normally only allergic to pine needles, so it came as a bit of a surprise when I got this sick from exposure to a plant oil. Doc had his doubts that it's even the oil I'm allergic to; he suspects a curing ingredient in the leather I was applying it to, chromium, which got released when I was rubbing it in. The fever is probably the worst part, but I can bring it down to more reasonable levels with normal Kodimagnyl tablets. They're a bit hard on the stomach though, so I hope the new medicine works before I burn a hole through my intestines with stomach acid. And Steve, I'll follow up as much as I'm able to atm, no worries.
  3. Ah damn, yeah, I made most of the GLEW changes before committing. Sorry, I'm not quite myself with a high fever (40-ish). The changes mostly affect about 3 files from the vanilla source, although I have since split up parts of the original GLEW implementation, so it may be more now. Off the top of my head, look at the changes to qgl.h, qgl.cpp, win_glimp.cpp, RenderSystem_init.cpp and RenderSystem.h. The most critical change is in the function GLimp_Init. Compare it to the original, because you have to reorder a few things for GLEW to work correctly; the difference should be easy to spot. Aside from that, you need to add the GLEW include dir and the GLEW library to the build. The best place for MSVC is probably _Common.props, which I use to tell the compiler where things are. Seems I'm not getting out of bed anytime soon: normal medication has failed and I'm now on steroid-type medication to brute-force the inflammation down. Hopefully it works before this crap kills me.
  4. Thanks guys. Should be in one of the first commits, I think.
  5. Sorry for the late reply, but I've been bedridden with a nasty allergic reaction to olive oil of all things :S. I'm still not at 100%, with a high fever, and I look like something Frankenstein threw away, but yeah, I'll guide you as best I can when I'm up and about again.
  6. Foreign? Try Danish -> ja vores sprog er verdens sværeste (yes, our language is the hardest in the world).
  7. Heh, thanks peeps. Hope I'm not getting too old to learn things like that, I'm nearing my fifties, agh. If you need a hand with electronics, I majored in it, although it's been years since I last used my skills. I'm also one of the last graduates in Denmark who can repair and build tube amplifiers. Guess my teacher sucked as much at algebra as I ended up doing, but OK, my school years were back in the days when more advanced math had only just surfaced in normal school here in Denmark, so I guess he had just as much trouble explaining it as I had understanding it.
  8. It's not that bad, but yeah, for a first-timer it might sound like gibberish. Once you get cracking with some code it all starts to come together pretty fast though. The math is the bitch here: it takes years to learn, and I'm still not done despite being on the scene for around 20 years, hehe. Why oh why do I suck so much at algebra? I'm rather good at the rest of the math, but I could never wrap my head around algebra.
  9. GLEW is pretty easy to add as a replacement for the hardcoded API calls in vanilla. There's one 'but' though: the old gllog functions need to go, as they are incompatible with GLEW. They should go anyway, since they never worked to begin with and they are a leftover from idTech 3, which used OpenGL 1.1 functions a lot more than vanilla does. Besides that, you need to rearrange a few calls in win_glimp.cpp (two, to be exact, I think); otherwise it's easy going. And thanks for the heads-up about the shortcomings of macros. That one I did not know; it makes one wonder if a function would have been more appropriate.
  10. Also, I'm a bit stumped why id's devs created those macros with an __asm call for every single line, when you can just do __asm { tons of assembler keywords, another ton of assembler keywords, etc } and get away with a single __asm line Oo
  11. That should save you a lot of work. GLEW supports up to OpenGL 4.2, so that should do nicely. And you're welcome to contact me for help if something seems unclear. As for the SSE bug, I actually found that you can kill both the __asm calls before the macros and it still works; at least I haven't seen anything odd yet. I'm a bit surprised about this, as you might guess, but removing those two does not seem to trigger IntelliSense either, and it builds without warnings, hmm ?!?
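One likely reason for the per-line keyword: the preprocessor splices backslash-continued lines, so a macro body always expands to a single logical line. MSVC inline assembly normally ends each instruction at a newline, so inside a macro the next __asm keyword has to act as the separator instead. The portable sketch below only demonstrates the line splicing; RECORD_LINES, MacroLines and LinesSeenInsideMacro are made-up names for illustration, not anything from idlib.

```cpp
#include <cassert>

// A two-statement macro written across continued lines, in the same
// style as the KFLOAT_* macros. The backslashes are removed by the
// preprocessor, so both statements end up on one logical line.
#define RECORD_LINES(first, second) \
    first = __LINE__;               \
    second = __LINE__;

// Hypothetical helper type: the source line each statement reported.
struct MacroLines {
    int first;
    int second;
};

// Expands the macro once and returns the line numbers it recorded.
MacroLines LinesSeenInsideMacro() {
    MacroLines seen{0, 0};
    RECORD_LINES(seen.first, seen.second)
    return seen;
}
```

Both fields come back with the same line number, which is why a newline cannot be relied on to end an instruction inside a macro and each one needs its own keyword (or some other explicit separator).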
  12. Aye, it's a bit hard to wrap your head around. The reason I found out it was wrong is that I formatted my source, and what was previously a single line of code was reformatted and escaped into two lines: the KFLOATOPER macro call. Funnily enough it compiled, but I suddenly noticed a ton of warnings about non-standard syntax from Microsoft's IntelliSense. So I had a look at the function and noticed that, compared to the macro above it, the end bit was missing all the __asm keywords, while the macro before it, which has a similar ending, had them in two places. I think what happened is that the dev intended one __asm call on the end function in those macros, but put the second one in the wrong spot, in the macro above, which already had an __asm keyword defined. So it should actually look like this, I think.

      // operate on a constant and a float array
      #define KFLOAT_CA( ALUOP, DST, SRC, CONSTANT, COUNT ) \
          int pre,post; \
          __asm movss xmm0,CONSTANT \
          __asm shufps xmm0,xmm0,0 \
          KFLOATINITDS( DST, SRC, COUNT, pre, post ) \
          __asm and eax,15 \
          __asm jne lpNA \
          __asm jmp lpA \
          __asm align 16 \
          __asm lpA: \
          __asm prefetchnta [edx+ebx+64] \
          __asm movaps xmm1,xmm0 \
          __asm movaps xmm2,xmm0 \
          __asm ALUOP##ps xmm1,[edx+ebx] \
          __asm ALUOP##ps xmm2,[edx+ebx+16] \
          __asm movaps [edi+ebx],xmm1 \
          __asm movaps [edi+ebx+16],xmm2 \
          __asm add ebx,16*2 \
          __asm jl lpA \
          __asm jmp done \
          __asm align 16 \
          __asm lpNA: \
          __asm prefetchnta [edx+ebx+64] \
          __asm movaps xmm1,xmm0 \
          __asm movaps xmm2,xmm0 \
          __asm movups xmm3,[edx+ebx] \
          __asm movups xmm4,[edx+ebx+16] \
          __asm ALUOP##ps xmm1,xmm3 \
          __asm ALUOP##ps xmm2,xmm4 \
          __asm movaps [edi+ebx],xmm1 \
          __asm movaps [edi+ebx+16],xmm2 \
          __asm add ebx,16*2 \
          __asm jl lpNA \
          __asm done: \
          __asm mov edx,SRC \
          __asm mov edi,DST \
          __asm KFLOATOPER( KALUDSS1( ALUOP, [edi+ebx],xmm0,[edx+ebx] ), \
              KALUDSS4( ALUOP, [edi+ebx],xmm0,[edx+ebx] ), COUNT )

      // operate on two float arrays
      #define KFLOAT_AA( ALUOP, DST, SRC0, SRC1, COUNT ) \
          int pre,post; \
          KFLOATINITDSS( DST, SRC0, SRC1, COUNT, pre, post ) \
          __asm and eax,15 \
          __asm jne lpNA \
          __asm jmp lpA \
          __asm align 16 \
          __asm lpA: \
          __asm movaps xmm1,[edx+ebx] \
          __asm movaps xmm2,[edx+ebx+16] \
          __asm ALUOP##ps xmm1,[esi+ebx] \
          __asm ALUOP##ps xmm2,[esi+ebx+16] \
          __asm prefetchnta [edx+ebx+64] \
          __asm prefetchnta [esi+ebx+64] \
          __asm movaps [edi+ebx],xmm1 \
          __asm movaps [edi+ebx+16],xmm2 \
          __asm add ebx,16*2 \
          __asm jl lpA \
          __asm jmp done \
          __asm align 16 \
          __asm lpNA: \
          __asm movups xmm1,[edx+ebx] \
          __asm movups xmm2,[edx+ebx+16] \
          __asm movups xmm3,[esi+ebx] \
          __asm movups xmm4,[esi+ebx+16] \
          __asm prefetchnta [edx+ebx+64] \
          __asm prefetchnta [esi+ebx+64] \
          __asm ALUOP##ps xmm1,xmm3 \
          __asm ALUOP##ps xmm2,xmm4 \
          __asm movaps [edi+ebx],xmm1 \
          __asm movaps [edi+ebx+16],xmm2 \
          __asm add ebx,16*2 \
          __asm jl lpNA \
          __asm done: \
          __asm mov edx,SRC0 \
          __asm mov esi,SRC1 \
          __asm mov edi,DST \
          __asm KFLOATOPER( KALUDSS1( ALUOP, [edi+ebx],[edx+ebx],[esi+ebx] ), \
              KALUDSS4( ALUOP, [edi+ebx],[edx+ebx],[esi+ebx] ), COUNT )
  13. After some discussion I went over the SSE math in idlib, and to my horror I actually uncovered a bug which might be the cause of some of the problems we have had with floating point precision.

      // operate on two float arrays
      #define KFLOAT_AA( ALUOP, DST, SRC0, SRC1, COUNT ) \
          int pre,post; \
          KFLOATINITDSS( DST, SRC0, SRC1, COUNT, pre, post ) \
          __asm and eax,15 \
          __asm jne lpNA \
          __asm jmp lpA \
          __asm align 16 \
          __asm lpA: \
          __asm movaps xmm1,[edx+ebx] \
          __asm movaps xmm2,[edx+ebx+16] \
          __asm ALUOP##ps xmm1,[esi+ebx] \
          __asm ALUOP##ps xmm2,[esi+ebx+16] \
          __asm prefetchnta [edx+ebx+64] \
          __asm prefetchnta [esi+ebx+64] \
          __asm movaps [edi+ebx],xmm1 \
          __asm movaps [edi+ebx+16],xmm2 \
          __asm add ebx,16*2 \
          __asm jl lpA \
          __asm jmp done \
          __asm align 16 \
          __asm lpNA: \
          __asm movups xmm1,[edx+ebx] \
          __asm movups xmm2,[edx+ebx+16] \
          __asm movups xmm3,[esi+ebx] \
          __asm movups xmm4,[esi+ebx+16] \
          __asm prefetchnta [edx+ebx+64] \
          __asm prefetchnta [esi+ebx+64] \
          __asm ALUOP##ps xmm1,xmm3 \
          __asm ALUOP##ps xmm2,xmm4 \
          __asm movaps [edi+ebx],xmm1 \
          __asm movaps [edi+ebx+16],xmm2 \
          __asm add ebx,16*2 \
          __asm jl lpNA \
          __asm done: \
          __asm mov edx,SRC0 \
          __asm mov esi,SRC1 \
          __asm mov edi,DST \
          __asm KFLOATOPER( KALUDSS1( ALUOP, [edi+ebx],[edx+ebx],[esi+ebx] ), \
          __asm KALUDSS4( ALUOP, [edi+ebx],[edx+ebx],[esi+ebx] ), COUNT )

      The above is the fixed function. Here's what it looked like earlier:

      // operate on two float arrays
      #define KFLOAT_AA( ALUOP, DST, SRC0, SRC1, COUNT ) \
          int pre,post; \
          KFLOATINITDSS( DST, SRC0, SRC1, COUNT, pre, post ) \
          __asm and eax,15 \
          __asm jne lpNA \
          __asm jmp lpA \
          __asm align 16 \
          __asm lpA: \
          __asm movaps xmm1,[edx+ebx] \
          __asm movaps xmm2,[edx+ebx+16] \
          __asm ALUOP##ps xmm1,[esi+ebx] \
          __asm ALUOP##ps xmm2,[esi+ebx+16] \
          __asm prefetchnta [edx+ebx+64] \
          __asm prefetchnta [esi+ebx+64] \
          __asm movaps [edi+ebx],xmm1 \
          __asm movaps [edi+ebx+16],xmm2 \
          __asm add ebx,16*2 \
          __asm jl lpA \
          __asm jmp done \
          __asm align 16 \
          __asm lpNA: \
          __asm movups xmm1,[edx+ebx] \
          __asm movups xmm2,[edx+ebx+16] \
          __asm movups xmm3,[esi+ebx] \
          __asm movups xmm4,[esi+ebx+16] \
          __asm prefetchnta [edx+ebx+64] \
          __asm prefetchnta [esi+ebx+64] \
          __asm ALUOP##ps xmm1,xmm3 \
          __asm ALUOP##ps xmm2,xmm4 \
          __asm movaps [edi+ebx],xmm1 \
          __asm movaps [edi+ebx+16],xmm2 \
          __asm add ebx,16*2 \
          __asm jl lpNA \
          __asm done: \
          __asm mov edx,SRC0 \
          __asm mov esi,SRC1 \
          __asm mov edi,DST \
          KFLOATOPER( KALUDSS1( ALUOP, [edi+ebx],[edx+ebx],[esi+ebx] ), \
              KALUDSS4( ALUOP, [edi+ebx],[edx+ebx],[esi+ebx] ), COUNT )

      Notice the lack of the __asm keyword in the macro calls. There's a similar function above this macro:

      // operate on a constant and a float array
      #define KFLOAT_CA( ALUOP, DST, SRC, CONSTANT, COUNT ) \
          int pre,post; \
          __asm movss xmm0,CONSTANT \
          __asm shufps xmm0,xmm0,0 \
          KFLOATINITDS( DST, SRC, COUNT, pre, post ) \
          __asm and eax,15 \
          __asm jne lpNA \
          __asm jmp lpA \
          __asm align 16 \
          __asm lpA: \
          __asm prefetchnta [edx+ebx+64] \
          __asm movaps xmm1,xmm0 \
          __asm movaps xmm2,xmm0 \
          __asm ALUOP##ps xmm1,[edx+ebx] \
          __asm ALUOP##ps xmm2,[edx+ebx+16] \
          __asm movaps [edi+ebx],xmm1 \
          __asm movaps [edi+ebx+16],xmm2 \
          __asm add ebx,16*2 \
          __asm jl lpA \
          __asm jmp done \
          __asm align 16 \
          __asm lpNA: \
          __asm prefetchnta [edx+ebx+64] \
          __asm movaps xmm1,xmm0 \
          __asm movaps xmm2,xmm0 \
          __asm movups xmm3,[edx+ebx] \
          __asm movups xmm4,[edx+ebx+16] \
          __asm ALUOP##ps xmm1,xmm3 \
          __asm ALUOP##ps xmm2,xmm4 \
          __asm movaps [edi+ebx],xmm1 \
          __asm movaps [edi+ebx+16],xmm2 \
          __asm add ebx,16*2 \
          __asm jl lpNA \
          __asm done: \
          __asm mov edx,SRC \
          __asm mov edi,DST \
          __asm KFLOATOPER( KALUDSS1( ALUOP, [edi+ebx],xmm0,[edx+ebx] ), \
          __asm KALUDSS4( ALUOP, [edi+ebx],xmm0,[edx+ebx] ), COUNT )

      Notice the KFLOATOPER macro... whoops.
  14. SIMD and vectorization are pretty much the same thing -> storing multiple data in an array. There's some optimization going on behind the scenes too, of course, but the problem stands: it's off by 4 bytes, ergo vanilla breaks, because it's built with the x87 instructions in mind. Now the funny thing, when one thinks about it, is that idlib actually does use SSE, but it uses an in-house assembler-based version of it built against the x87 instructions. It's also a fact that the x87 instructions have a higher internal precision than SSE, but SSE has the advantage that it might offload that data to a GPU instead of keeping it solely on the CPU. Somewhere I sense an upcoming DirectX 12 as the reason why they want to force this: with AMD's Mantle gaining improvements every day, Microsoft wants to make sure that DirectX stays ahead of OpenGL, even if they have to lobotomize their own compiler. Funny thing though, I have not yet seen an OpenGL game make use of Mantle, but I guess better safe than sorry, heh. Some discussion elsewhere on the matter: http://stackoverflow.com/questions/3206101/extended-80-bit-double-floating-point-in-x87-not-sse2-we-dont-miss-it
  15. After reading about others ramming their heads against the same pitfall, I'm starting to wonder if it's really an AMD / Microsoft thing? The problem as of now is that Microsoft has changed the floating point math in all their newest compilers to SSE, which would normally be a good thing, or would it? See, the problem, and also the reason why SSE and later are faster than FPU math, is that SSE is actually less precise (less strict checking = more speed). In fact it's off by exactly 4 bytes; yes, someone actually tested it. Normally floating point math allows a certain skew, which is quite OK, but not if your application, which was built with FPU math in mind, suddenly finds that the values are off by the aforementioned 4 bytes. In the case of the Doom3 engine we are so screwed it's not even funny. You see, idlib actually uses its own floating point library built on the FPU or x87 standard. Guess what happens when the Microsoft compiler tries to optimize it with SSE math? Yup, boom, right in your face: the values the engine expected are now far off = broken engine. The reason I suspect more than foul play between AMD and Microsoft is that the SSE instruction set is actually Intel's. AMD had a similar optimization way, way back in the K6 days called 3DNow!, which is now pretty much extinct; instead they use Intel's SSE. So does Intel also want to chime in on the conspiracy? Sadly, Microsoft has pretty much stated that the x87 instructions will be dead from MSVC 2013 onward, never to return, despite actually being more precise. So, as they said: fix your code to work with SSE, and don't call us, we will not call you. Looks like someone will have to either rewrite the old idlib or port the one from rbdoom3. Well, at least the parts concerning floating point math, unless Microsoft left other surprises for us to find.
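As a rough illustration of why results can shift between the two paths: x87 code may keep intermediate values in registers with up to 80 bits of precision, while scalar SSE rounds every operation to the declared 32- or 64-bit width. The sketch below imitates that with plain C++, using a double accumulator as a stand-in for the wider intermediate. SumRoundedToFloat and SumInWiderType are made-up names for illustration and have nothing to do with idlib's own routines.

```cpp
#include <cassert>
#include <cmath>

// Adds `value` to a running total `count` times, rounding the total to a
// 32-bit float after every step. The volatile store forces that rounding
// even on builds that would otherwise keep the total in a wider register.
float SumRoundedToFloat(float value, int count) {
    volatile float total = 0.0f;
    for (int i = 0; i < count; ++i) {
        total = total + value;
    }
    return total;
}

// Same loop, but the running total lives in a wider type, standing in
// for an intermediate that is held at higher precision than its inputs.
double SumInWiderType(float value, int count) {
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        total += value;
    }
    return total;
}
```

Calling both with 0.1f and a million iterations gives two totals that drift visibly apart: the wide accumulator stays close to 100000, while the float-rounded one does not. Code that was tuned against one behaviour can therefore see different values once the compiler switches to the other.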
  16. Heh, I had a hunch Microsoft had some inner workings with AMD. It must have started early, back in the AMD K6-2 era, when AMD started to gain serious momentum over Intel. For one, I noticed that the most hated Microsoft OS of all time, ME, ran like a charm on all boxes with AMD hardware (coincidence???).
  17. Hehe, I swear I'm not a Microsoft conspirator, but yeah, that same snag was a real pain in my butt too. Sadly there's one area of Open Watcom in serious need of work: the C++ compiler and its standard library are not done, so Doom3 with Open Watcom will most likely not work yet. The C compiler is complete though and works like a charm, but we do need help on the C++ STL and compiler. P. Chapin is the lead dev (also the only dev) on the C++ compiler, but since he's alone, work has been rather slow. Unfortunately my C++ fu is not strong enough to help build a standard library from scratch :S, so if you know someone with a lot of C++ knowledge, we could use their help.
  18. While still working on Doom3, I also pitch in on several other projects where I found I could do some good. One of the projects I happened to chime in on was Open Watcom. Watcom was in its day considered the best compiler available, and the price tag certainly showed that. At one point the company was bought by a company named Sybase, and the compiler was updated for better support of Windows (by that time it was Win 3.11). Sadly, by the time of Windows 95, Microsoft caught up and started rolling out their MSVC compiler, which was a chore to work with compared to the aging Watcom compiler (which, btw, was used to build Windows 95 in the first place). Watcom's IDE, however, was rather hard to work with compared to MSVC's. Watcom still produced higher quality programs, but in the end MSVC won out due to its ease of use and Watcom slowly bled out. At some point Sybase threw in the towel and open-sourced the Watcom compiler, minus all the Microsoft-owned property, and a new project was born named Open Watcom, with the goal of creating a good open-source alternative to the Microsoft versions. For a few years everything worked out and the compiler was updated considerably; by using MinGW's Windows API it could once again create Windows programs, albeit with some difficulty. A developer then created a new Windows API built from scratch specifically for Watcom. Unfortunately it needed some housecleaning, which I helped with while also learning a bit more about the inner workings of a compiler. Sadly the original site is mostly dead now, but work is still ongoing and help is wanted and appreciated. So if anyone here wants to help with the project, here's an official invite. Open Watcom as of now also supports 64-bit OSes but still produces 32-bit executables due to the lack of a 64-bit API, so that's one thing that could use some work. Quake builds and works with it, both the DOS WinQuake and OpenGL versions.
It has some pretty impressive optimizations compared to pretty much any other compiler, like Clang. So if you have spare time and/or would like to help with development, visit https://github.com/open-watcom/open-watcom-v2 and become a member. Btw, Open Watcom supports other OSes as well, like Linux, OS/2 and Haiku, and it can still create programs for Windows 9x and DOS.
  19. Will be there, still trying to fight off New Year calamities.
  20. Merry Christmas all, hope you have a nice New Year.
  21. Just updated my revelation sources with the latest changes. Added the Edge of Chaos game files, but while they work with my engine now, they are rather buggy: skyboxes go crazy and shadows flicker. So consider them a work in progress. Besides that, merry Xmas.
  22. Btw, the check for suppression can be used for lightgems too. The depth copy does take its toll though; nothing alarming, but you will feel it. I suspect checking for suppression is quite expensive. Btw, I'm not quite sure R_DepthBufferImage is 100% correct, as the data array was normally just Mem_Alloc(width*height) (not quite sure how a static representation of that would look?). Width and height had a value of 1024, so rather big.
  23. OK, reverted, but also refined it a bit. New function ->

      static void RB_T_CopyDepthBuffer(const viewDef_t *viewDef) {
          // Wooh mama this is sooooooooooo sloooooooooow mblgrblrblr.
          bool depthCopied = false;

          for (viewEntity_t *viewEnt = viewDef->viewEntitys; viewEnt; viewEnt = viewEnt->next) {
              idRenderEntityLocal *ent = viewEnt->entityDef;

              // copy once update on change.
              if (depthCopied) {
                  continue;
              }

              // check if depth view is suppressed.
              if (!ent->parms.suppressSurfaceInViewID && (ent->parms.suppressSurfaceInViewID != viewDef->renderView.viewID)) {
                  globalImages->currentDepthImage->CopyDepthbuffer(
                      backEnd.viewDef->viewport.x1,
                      backEnd.viewDef->viewport.y1,
                      backEnd.viewDef->viewport.x2 - backEnd.viewDef->viewport.x1 + 1,
                      backEnd.viewDef->viewport.y2 - backEnd.viewDef->viewport.y1 + 1,
                      true);
                  // refresh status.
                  depthCopied = true;
              }
          }
      }

      It checks if the viewID is suppressed by game code and only does a full depth copy if not. This lets us control unruly things that might otherwise interfere or cause artifacts. And change RB_STD_FillDepthBuffer to this:

      /*
      =====================
      RB_STD_FillDepthBuffer

      If we are rendering a subview with a near clip plane, use a second texture
      to force the alpha test to fail when behind that clip plane
      =====================
      */
      static void RB_STD_FillDepthBuffer(drawSurf_t **drawSurfs, int numDrawSurfs) {
          // if we are just doing 2D rendering, no need to fill the depth buffer
          if (!backEnd.viewDef->viewEntitys) {
              return;
          }

          // Early copy off depth buffer.
          RB_T_CopyDepthBuffer(backEnd.viewDef);

          // enable the second texture for mirror plane clipping if needed
          if (backEnd.viewDef->numClipPlanes) {
              GL_SelectTexture(1);
              globalImages->alphaNotchImage->Bind();
              glDisableClientState(GL_TEXTURE_COORD_ARRAY);
              glEnable(GL_TEXTURE_GEN_S);
              glTexCoord2f(1.0f, 0.5f);
          }

          // the first texture will be used for alpha tested surfaces
          GL_SelectTexture(0);
          glEnableClientState(GL_TEXTURE_COORD_ARRAY);

          // decal surfaces may enable polygon offset
          glPolygonOffset(r_offsetFactor.GetFloat(), r_offsetUnits.GetFloat());

          GL_State(GLS_DEPTHFUNC_LESS);

          // Enable stencil test if we are going to be using it for shadows.
          // If we didn't do this, it would be legal behavior to get z fighting
          // from the ambient pass and the light passes.
          glEnable(GL_STENCIL_TEST);
          glStencilFunc(GL_ALWAYS, 1, 255);

          RB_RenderDrawSurfListWithFunction(drawSurfs, numDrawSurfs, RB_T_FillDepthBuffer);

          if (backEnd.viewDef->numClipPlanes) {
              GL_SelectTexture(1);
              globalImages->BindNull();
              glDisable(GL_TEXTURE_GEN_S);
              GL_SelectTexture(0);
          }
      }

      This way SSAO still works, and it seems to work OK with Dark Mod also.
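The "copy once, unless suppressed" idea can be sketched outside the engine with stand-in types. Everything here is hypothetical: EntityParms, IsVisibleInView and CountDepthCopies are illustration names, and the predicate assumes the convention that a suppressSurfaceInViewID of 0 means "never suppressed" and any other value hides the entity only in the view with that ID. The condition in the post above is written differently, so treat this as one possible reading, not the engine's behaviour.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the per-entity render parameters.
struct EntityParms {
    int suppressSurfaceInViewID; // assumed: 0 means "never suppressed"
};

// True when the entity would be drawn in the view with the given ID.
bool IsVisibleInView(const EntityParms &parms, int viewID) {
    return parms.suppressSurfaceInViewID == 0 ||
           parms.suppressSurfaceInViewID != viewID;
}

// Returns how many depth copies one view would trigger: at most one.
int CountDepthCopies(const std::vector<EntityParms> &entities, int viewID) {
    for (const EntityParms &parms : entities) {
        if (IsVisibleInView(parms, viewID)) {
            return 1; // copy once, then stop walking the list
        }
    }
    return 0; // every entity is suppressed in this view
}
```

Returning as soon as the first visible entity is found avoids walking the rest of the list after the copy, which is the part the `continue` in the posted loop still pays for.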
  24. Ah OK, I see what you mean now. Sorry, I'll revert moving it there.
  25. This is pretty much as early as you can get in the chain; try following the function call list to see what I mean. R_DepthBufferImage was created from the code in draw_exp.cpp for rendering to a depth image for shadow mapping. The border code renders the image borders as all black to avoid seams. draw_exp.cpp did not allow downsizing either (it might cause artifacts, I guess). @MirceaKitsune: the Linux port still needs work. Unfortunately I use Kubuntu, so I can't test whether what works there will also work on other distros, but if someone with openSUSE can fill in the blanks, I'll be happy to implement it.