currently working on my own software rasterizer using EditableImage. seems to be running at around 60 fps at 512x256 (or whatever I was testing with at that moment) with around 2000 triangles, i think? i’ll probably show some videos of it, but it seems pretty efficient already…
supports backface culling, frustum culling and clipping, has a z-buffer, and allows loading in obj files and textures of whatever quality. might add some kind of shader support but that won’t be nearly as efficient. also might make the syntax closer to what roblox offers with BaseParts, meshes and whatever else.
the bottleneck right now is actually the calculations required for projection; a low triangle count easily runs at 60 fps already. they currently use CFrames since that’s easier and might be more optimized. if i really feel like it i might make it use scalar math instead so it works well with native code generation.
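For reference, a scalar version of the projection step could look roughly like the sketch below. It’s written in Python only to show the arithmetic (it should port to Luau almost line for line). `project_point` and its parameters are made-up names, not anything from the engine above, and it assumes the vertex is already in camera space with the camera looking down -Z, like Roblox cameras do.

```python
import math

def project_point(x, y, z, fov_deg, width, height):
    """Project a camera-space point to pixel coordinates using scalar math only.

    Assumption: the camera looks down -Z, so visible points have z < 0.
    Returns None for points at or behind the camera plane (those need clipping).
    """
    if z >= 0:
        return None
    f = 1.0 / math.tan(math.radians(fov_deg) / 2.0)  # scale from the vertical FOV
    aspect = width / height
    depth = -z                                        # positive distance along the view axis
    ndc_x = (x * f / aspect) / depth                  # perspective divide
    ndc_y = (y * f) / depth
    # Map NDC [-1, 1] to pixels; flip Y so that +Y ends up at the top of the image.
    px = (ndc_x + 1.0) * 0.5 * width
    py = (1.0 - ndc_y) * 0.5 * height
    return px, py, depth

print(project_point(0, 0, -5, 70, 512, 256))
```

A point straight ahead of the camera lands in the middle of the image, and the returned depth is what you’d feed into the z-buffer.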
not working a lot on it right now but felt like sharing it since it’s like this 3d engine too
Yeah, the only downside to my engine is that it is pretty slow when trying to render a lot of triangles. This is because I don’t know how to optimize the math. However, I might use CFrames or other built-in libraries Roblox offers, like you described here.
I think around 30 fps, specifically because it had to iterate over nearly the whole screen to fill the pixel buffer with all the RGBA values
just a PixelBuffer array that contains all the RGBA values and is passed to WritePixels whenever I want to render it. I write to that directly when rasterizing things
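In case it helps anyone following along, indexing a flat RGBA buffer like that usually comes down to one multiply-add per pixel. A minimal sketch in Python, assuming row-major order and 4 values per pixel; `pixel_offset` and `put_pixel` are hypothetical helper names, and the indices are 0-based here (a 1-based Luau table would need +1).

```python
WIDTH, HEIGHT = 4, 2  # tiny example image

def pixel_offset(x, y, width):
    """Offset of the R channel of pixel (x, y) in a row-major RGBA buffer."""
    return (y * width + x) * 4

def put_pixel(buf, x, y, width, r, g, b, a=255):
    """Write one RGBA pixel straight into the flat buffer."""
    i = pixel_offset(x, y, width)
    buf[i], buf[i + 1], buf[i + 2], buf[i + 3] = r, g, b, a

buf = bytearray(WIDTH * HEIGHT * 4)  # all zeros, i.e. transparent black
put_pixel(buf, 1, 1, WIDTH, 10, 20, 30)
print(list(buf[pixel_offset(1, 1, WIDTH):pixel_offset(1, 1, WIDTH) + 4]))
```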
though after a while i realized i should change things around if i want to add some specific features like custom shader support, so i’m gonna have to change a lot of things, which might worsen performance. it runs well with native; it’ll probably be somewhere between 1x and 2x slower without it, but maybe not, since most of this is Vector3/CFrame math which is actually slower with native code generation.
Though with a raycaster engine you don’t need to clear the buffer each frame since you always end up filling it anyways, I can’t guarantee that so I do need to clear it
Yeah clearing does actually hit the FPS quite a bit. My raycaster does have overdraw for things like transparent walls and billboard objects, so the fps hits around 28-35 which is still pretty damn good
Seems like we hit the wall with rasterizers/raycasters. Even with multithreading, native code, and some optimization, Lua isn’t fast enough. This is still quite an achievement for anyone who is trying to do this, though. I’m happy with my engine since it basically runs like the N64 engine, and tbh micro optimizations don’t seem worth it. After all, this is a pretty strange idea running a 3D engine in a 3D engine, so it’s okay.
I managed to get an incredibly good framerate at 320x320, which is probably an overkill resolution for a software renderer, but a fully fledged 3D game or shader with a good framerate is definitely doable now. Just gotta avoid super high resolutions. A good limit I’d probably follow is a common 90s computer resolution like 320x200, or a PS1 game resolution of 320x240.
Yeah, I feel like I definitely could run something with a lot of triangles at like 30 fps, but I can’t optimize the engine since I don’t know better methods. Hats off to whoever does make a renderer/engine that is efficient and optimized. As for resolution, I wonder if there’s a way to compress or minify the table more. I am already using a 1D array, but maybe compressing and decompressing it somehow would improve it? My dream has always been, and always will be, running my own custom engine inside Roblox. Maybe something like Mario 64 (not exactly, because of copyright), which would be amazingly cool.
I think maybe you could also add a view distance and maybe occlusion culling?
Like maybe implement occlusion culling to stop rendering stuff the player cannot see, and a render distance similar to Minecraft’s, where objects beyond the render distance threshold will not be rendered regardless of whether the player can see them or not
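The render distance part is cheap to do. A rough sketch of the idea in Python, comparing squared distances so no square root is needed per object; `within_view_distance`, `visible_objects` and the `"position"` field are hypothetical names for illustration only.

```python
def within_view_distance(camera_pos, object_pos, max_distance):
    """True if the object is no farther than max_distance from the camera."""
    dx = object_pos[0] - camera_pos[0]
    dy = object_pos[1] - camera_pos[1]
    dz = object_pos[2] - camera_pos[2]
    # Compare squared lengths to avoid a sqrt per object.
    return dx * dx + dy * dy + dz * dz <= max_distance * max_distance

def visible_objects(camera_pos, objects, max_distance):
    """Keep only the objects inside the render distance."""
    return [o for o in objects
            if within_view_distance(camera_pos, o["position"], max_distance)]

scene = [
    {"name": "near_crate", "position": (0, 0, -10)},
    {"name": "far_tower", "position": (0, 0, -500)},
]
print([o["name"] for o in visible_objects((0, 0, 0), scene, 100)])
```

Running this per object (not per triangle) before any projection is what keeps it cheap.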
Unfortunately there’s not much you can do to improve how you use the 1D table, assuming you are doing barebones and simple math to index it. One thing you could do is ensure you never overdraw (no triangles being drawn on top of each other). You could make use of your Z-buffer to do this (assuming you haven’t tried this yet), or do some Quake levels of black magic
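To make the Z-buffer idea concrete, here is a minimal depth-test sketch in Python. The names (`make_depth_buffer`, `plot_if_closer`) are made up, and it assumes smaller z means closer to the camera. Note that the test itself still runs for every covered pixel; what it saves is the shading and the color write, and it pays off most when triangles are drawn roughly front to back.

```python
import math

def make_depth_buffer(width, height):
    """One depth value per pixel, initialised to 'infinitely far away'."""
    return [math.inf] * (width * height)

def plot_if_closer(depth, colors, x, y, width, z, color):
    """Write the pixel only if it is nearer than what is already stored."""
    i = y * width + x
    if z >= depth[i]:
        return False          # hidden behind an earlier fragment: skip the write
    depth[i] = z
    colors[i] = color
    return True

depth = make_depth_buffer(2, 2)
colors = [None] * 4
first = plot_if_closer(depth, colors, 1, 0, 2, 5.0, "red")     # empty pixel: drawn
second = plot_if_closer(depth, colors, 1, 0, 2, 9.0, "blue")   # farther: rejected
third = plot_if_closer(depth, colors, 1, 0, 2, 2.0, "green")   # nearer: overwrites
print(first, second, third, colors[1])
```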
And frustum culling would heavily improve performance too
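One common way to do it is to test a bounding sphere per object against the frustum planes. A sketch in Python under a few assumptions: planes are given as `(nx, ny, nz, d)` with unit normals pointing into the frustum, the camera looks down -Z, and only a near and far plane are shown (the four side planes would work the same way). `sphere_in_frustum` is a hypothetical name.

```python
def sphere_in_frustum(center, radius, planes):
    """Conservative test: False only if the sphere is fully outside some plane."""
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        signed_distance = nx * cx + ny * cy + nz * cz + d
        if signed_distance < -radius:
            return False      # completely on the outside of this plane
    return True               # inside or straddling every plane

# Camera-space example: near plane at z = -0.5, far plane at z = -100.
planes = [
    (0.0, 0.0, -1.0, -0.5),   # inside when z <= -0.5
    (0.0, 0.0, 1.0, 100.0),   # inside when z >= -100
]
print(sphere_in_frustum((0, 0, -10), 1, planes))
print(sphere_in_frustum((0, 0, 5), 1, planes))
```

Objects that fail the test can skip projection and rasterization entirely; objects that straddle a plane still need clipping.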
Frustum culling does seem promising, and I already do have an adjustable view distance. I’ll try implementing an algorithm and see how it goes. As for overdraw, I have to look into that more. Thank you for these suggestions!
Are you sure that the Parallel Lua was done properly? When I first downloaded the original engine to mess with it myself, you just called task.desynchronize() before the rendering process and then synchronized to write the pixels, which is not how it works. The parallelization is handled by Actor objects, and if you only have one Actor it’s basically still doing everything in serial. The more actors you have doing the calculations, the more the operation is split into parallel threads (up to a certain limit). The article about Parallel Lua states that, generally, the more actors the better. I tried getting your engine properly set up with Parallel Lua (which is very difficult) and couldn’t get past 40-ish FPS (while also sort of breaking the rendering process because I didn’t fully understand how it worked). The bottleneck wasn’t the rendering process itself but having to copy gigantic tables of information from each of the actors (which could possibly overlap) into a single frame and z-buffer before writing, which was also gobbling up my poor RAM.
A raycasting engine could be much simpler to implement in parallel for certain reasons, and I just want to make sure that you’re actually getting the best performance out of parallel lua.
This is a very cool experiment and I’d like to see how far it can be pushed
I see. I didn’t really have much knowledge about parallel Lua and followed instructions from someone else, however, I’ll research more about it and try to get it running smoother. I am interested, though, how did you split up the code of the engine with actors? If you could provide me with a sample, it would help me understand. Thank you for explaining this and giving this suggestion!
I can’t provide the rbxm file right now but I’d like to respond anyway, let me know if you still want it.
Basically I just had an actor with a LocalScript in it that would receive events from a central control script (the Engine3D module). That actor would be cloned a certain number of times and initialized with the scene and engine parameters. Then every time render was called in the Engine3D module, it would send a message to each of the actors to render. Each actor would process a specific set of triangles and then transfer the pixel information back to the central script to stitch together into one frame. But every method of actually doing that transfer turned out to be too slow and memory intensive, which basically counteracted the advantage of doing the rendering in parallel entirely.
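To show why the stitching step hurts, here is roughly what merging per-actor buffers involves, sketched in Python. This is not the code I actually had; `merge_actor_buffers` is a made-up name, and it assumes every actor returns a full-frame color list and depth list with smaller depth meaning closer.

```python
def merge_actor_buffers(color_buffers, depth_buffers):
    """Combine full-frame buffers from several actors, keeping the nearest pixel."""
    n = len(depth_buffers[0])
    out_color = list(color_buffers[0])
    out_depth = list(depth_buffers[0])
    for colors, depths in zip(color_buffers[1:], depth_buffers[1:]):
        for i in range(n):                 # touches every pixel of every actor
            if depths[i] < out_depth[i]:
                out_depth[i] = depths[i]
                out_color[i] = colors[i]
    return out_color, out_depth

INF = float("inf")
actor_a = (["a", None], [1.0, INF])        # drew only the first pixel
actor_b = (["b", "b"], [0.5, 3.0])         # drew both, and is nearer on the first
merged = merge_actor_buffers([actor_a[0], actor_b[0]], [actor_a[1], actor_b[1]])
print(merged)
```

The work (and the memory copied between scripts) scales with pixels times actors, on top of the rendering itself, so adding actors makes this step worse.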
The reason ray tracing could be done more easily is that each actor could process a certain set of pixels instead of a set of triangles, and then write those pixels to the EditableImage as soon as they are done, without having to communicate with another script to combine the rendered frame together and with no worry of two actors fighting over pixel values. Parallel Lua is pretty janky and difficult to work with, but the optimization it provides makes it worth it.
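Splitting by pixels can be as simple as giving each actor its own band of rows. A small sketch in Python; `split_rows` is a hypothetical helper, and the bands are half-open `(start, end)` ranges so no two workers ever own the same row.

```python
def split_rows(height, workers):
    """Divide rows 0..height-1 into contiguous, non-overlapping bands."""
    base, extra = divmod(height, workers)
    bands = []
    start = 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)   # spread the remainder over the first bands
        bands.append((start, start + size))     # half-open range [start, end)
        start += size
    return bands

print(split_rows(240, 4))
print(split_rows(10, 3))
```

Each actor would then loop over only its own rows and write that region directly, so there is nothing to merge afterwards.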
Also, one more thing: when values are sent to an actor, they are copied, meaning that if you send the triangle information to the actors every frame it becomes horrendously slow. However, actors can read from the DataModel in parallel, so I suggest storing vertex and triangle information in the workspace in value objects somehow, so the information doesn’t need to be copied over to the actors all the time.
This is very interesting. I would like to work on something like this to optimize the engine (a LOT of other things too), and from your description I understand now. The only thing stopping me from adding all of these features is lack of motivation. I wish I had infinite time and motivation to improve this engine using research, however, I barely work on it anymore. Thank you once again for this, and I hope to add it soon when I have time. Also, I do not need the rbxm file. Thank you, though.