Pretty sure Vector2s are slower than passing numbers? I’m not sure. I’m working on the next version and I’ll see if I can decrease this time.
Try avoiding constructors and use either buffers or raw numbers/tables.
I’ve seen it myself: ColorSequences, NumberSequences, Vectors and the like are a lot slower to access and construct.
When I was making my own project with editable images, I had to replace color/number sequences, vectors and other objects with buffers and raw numbers, which drastically improved performance (by 4x or more), although constructing Vector2s in WritePixels is inevitable.
He never misses, The absolute legend
This is really great! The ability to use text & custom fonts is going to save me so much time. One question though, could there be some sort of tutorial on how to add custom fonts? I read the docs, but I can’t understand what the instructions are trying to say.
The custom fonts are bitmap fonts (0 represents nothing, 1 represents a pixel) meaning they cannot be resized.
porting a font from online
As for making custom fonts, it’s pretty easy. Take a fontmap, let’s take this one:
Cut out each individual letter and save each one as a PNG in a folder.
Then, write a program that goes through each one of these PNGs. It should read the image data and generate a text output in the style of a Lua table. If a pixel is transparent, it should output “0”. If there is a pixel, it should output “1” (bitmap fonts). It should make a huge array, something like this:
You can add the array directly to the module (Window > Render > Font), OR, add it during runtime with the function provided.
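A minimal sketch of the text-generation step in plain Lua, in case it helps. It assumes the PNG has already been decoded by some other tool into a flat RGBA array (4 numbers per pixel, row by row). `glyphEntry` and its parameters are made-up names for illustration, not part of OSGL or of the converter mentioned in this thread.

```lua
-- Hypothetical helper: turns decoded RGBA pixel data into one font-table entry.
-- rgba: flat array {r, g, b, a, r, g, b, a, ...}, row by row (assumed layout).
local function glyphEntry(letter, rgba)
    local bits = {}
    for i = 4, #rgba, 4 do
        -- rgba[i] is the alpha channel: fully transparent -> 0, anything else -> 1
        bits[#bits + 1] = (rgba[i] == 0) and "0" or "1"
    end
    return string.format('["%s"] = {%s}', letter, table.concat(bits, ","))
end

-- Two pixels: one opaque red, one fully transparent
print(glyphEntry("a", {255, 0, 0, 255, 0, 0, 0, 0})) -- ["a"] = {1,0}
```

Letters such as `"` or `\` would need escaping before being written into the key; the sketch skips that.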
making your own font
A bitmap font is just a dictionary. It should contain each letter you want. Let’s say we wanted a 4x4 lowercase letter ‘a’; that could be:
["a"] = {1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,1}
Every 4 pixels form a single row, so the 16 values above make up the 4 rows of our letter “a”. OSGL handles the rest.
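To see how the flat array maps to rows, here is a small sketch in plain Lua that prints a glyph as text. `glyphRows` is a hypothetical helper for inspecting a font entry, not an OSGL function, and the width of 4 is taken from the example above.

```lua
-- The 4x4 "a" from the post, spaced so each group of 4 is one row
local a = {1,1,1,1, 0,0,0,1, 1,1,1,1, 1,1,1,1}

-- Hypothetical helper: splits a flat glyph array into rows of `width` pixels,
-- drawing 1 as "#" and 0 as "."
local function glyphRows(pixels, width)
    local rows = {}
    for start = 1, #pixels, width do
        local row = {}
        for offset = 0, width - 1 do
            row[#row + 1] = (pixels[start + offset] == 1) and "#" or "."
        end
        rows[#rows + 1] = table.concat(row)
    end
    return rows
end

for _, row in ipairs(glyphRows(a, 4)) do
    print(row)
end
-- ####
-- ...#
-- ####
-- ####
```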
What happens when you use this module after the nested for loop?
Making Vector2s is slower because it constructs an object, assigns a metatable and whatnot, while passing 2 numbers doesn’t do any of that.
This looks very interesting. I will try this out!
I see, I think I understand
Is there an already existing program that does this that I can use? If not, how can I make it? (or if possible, could you send me the one you used?) I don’t know how to write one myself
the problem is using an interpreted language to access & set pixels from an array (1024 * 1024 * 4) times per iteration
it doesn’t matter which graphics library people choose here, the speed of the language is ultimately what holds all of them back
even when optimizing the nested for loops to be more predictable to the cpu:
-- LargeCanvas is my internal library to allow larger resolutions with EditableImages
-- start timing the loop here with os.clock()
for Y = 0, 1024 - 1 do
    for X = 0, 1024 - 1 do
        local R, G, B = math.random(0, 1), math.random(0, 1), math.random(0, 1)
        LargeCanvas:PutPixel(X, Y, R, G, B)
    end
end
-- end timing the loop here ~> 0.39-0.4 seconds per iteration
-- in other words, 2.5 fps
-- i now realize that there is an overhead when determining
-- which EditableImage to put the pixel in; it's still faster than the
-- column-major loops, though
how about using a 1D loop to set the pixels?
local LargeCanvas = require(script.LargeCanvas):New(1024, 1024, Enum.ResamplerMode.Pixelated)
local size = 1024 * 1024 * 4 -- width * height * 4 channels (RGBA)
local screen = LargeCanvas.pixels[1]
LargeCanvas.gui.Parent = script.Parent
while true do
    -- start timing the loop here with os.clock()
    for i = 1, size, 4 do
        local r = math.random()
        local g = math.random()
        local b = math.random()
        screen[i] = r
        screen[i + 1] = g
        screen[i + 2] = b
        screen[i + 3] = 1
    end
    -- end timing the loop here ~> 0.19 seconds per iteration
    task.wait()
end
-- this takes around 0.19 seconds per iteration to put the pixels in the screen buffer
-- in other words, roughly 5.3 fps
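For anyone wondering how the 1D loop relates to the nested X/Y loops: both visit the same pixels, the 1D version just skips the per-pixel coordinate math. A small sketch of the mapping in plain Lua, assuming a row-major RGBA layout in a 1-based table (that matches the `for i = 1, size, 4` loop, but the actual layout inside LargeCanvas is an assumption here). `pixelIndex` is a hypothetical helper.

```lua
-- Hypothetical helper: index of the red channel of pixel (x, y), with x and y
-- starting at 0, in a flat 1-based RGBA array stored row by row (assumed layout).
local function pixelIndex(x, y, width)
    return (y * width + x) * 4 + 1
end

local WIDTH = 1024
print(pixelIndex(0, 0, WIDTH))       -- 1
print(pixelIndex(1, 0, WIDTH))       -- 5
print(pixelIndex(0, 1, WIDTH))       -- 4097
print(pixelIndex(1023, 1023, WIDTH)) -- 4194301 (alpha is at 4194304, the last slot)
```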
since the 1D loop is the fastest, let’s see if it’s faster in rust
use std::thread::sleep;
use std::time::{Duration, Instant};
use rand::{thread_rng, Rng};

const WIDTH: usize = 1024;
const HEIGHT: usize = 1024;
const SIZE: usize = WIDTH * HEIGHT * 4; // 4 bytes per pixel (RGBA)
const REFRESH_RATE: Duration = Duration::from_millis(16); // roughly 60fps

fn main() {
    let mut random = thread_rng();
    let mut buffer: Vec<u8> = vec![0; SIZE]; // create pixel buffer here
    loop {
        let benchmark = Instant::now(); // start timing the loop here
        for i in (0..SIZE).step_by(4) {
            // 0..=255 covers the full u8 range (0..255 would never produce 255)
            buffer[i] = random.gen_range(0..=255);
            buffer[i + 1] = random.gen_range(0..=255);
            buffer[i + 2] = random.gen_range(0..=255);
            buffer[i + 3] = random.gen_range(0..=255);
        }
        // prints milliseconds: about 10 ms (0.01 seconds), in other words, 100 fps
        println!("took {} ms", benchmark.elapsed().as_millis());
        sleep(REFRESH_RATE);
    }
}
this takes 0.01 seconds per iteration, or in other words, 100fps to put the pixels in the framebuffer
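The fps figures in these posts are just one divided by the time per iteration. A tiny sketch in plain Lua for reproducing them; `measure` and `toFps` are hypothetical helper names, and the timings fed in are the ones quoted above.

```lua
-- Hypothetical helper: runs fn once and returns the elapsed time in seconds
local function measure(fn)
    local start = os.clock()
    fn()
    return os.clock() - start
end

-- Hypothetical helper: frames per second for a given time per frame
local function toFps(secondsPerFrame)
    return 1 / secondsPerFrame
end

print(string.format("%.1f", toFps(0.4)))  -- 2.5   (nested loops)
print(string.format("%.1f", toFps(0.19))) -- 5.3   (1D loop)
print(string.format("%.1f", toFps(0.01))) -- 100.0 (Rust)
```

`measure` can wrap any of the loops above, e.g. `toFps(measure(function() ... end))`.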
in Luau you should be able to get at least 10 fps without needing native codegen; the problem is that math.random is slow to call compared to other functions. The same applies to the Random object Roblox has.
if you for example calculate the UV coordinate by dividing the current x,y by the image size and use that for the RGB you can get about 30 fps with native code generation and 15 fps without.
in this example I use a window resolution of 1024x1024 and this for the colors:
R = UV.X
G = UV.Y
B = UV.X * UV.Y
(Screenshots: with --!native, and without --!native)
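A rough sketch of that UV fill in plain Lua, writing into a flat RGBA array like the `screen` table used earlier in the thread. `fillUV` is a hypothetical name, and the 0 to 1 color range and row-major layout are assumptions carried over from the 1D loop example, not the exact code behind the numbers above.

```lua
-- Hypothetical helper: fills a flat RGBA array with R = u, G = v, B = u * v,
-- where u and v are the pixel coordinates divided by the image size.
local function fillUV(screen, width, height)
    local i = 1
    for y = 0, height - 1 do
        local v = y / height
        for x = 0, width - 1 do
            local u = x / width
            screen[i] = u         -- R = UV.X
            screen[i + 1] = v     -- G = UV.Y
            screen[i + 2] = u * v -- B = UV.X * UV.Y
            screen[i + 3] = 1     -- opaque alpha
            i = i + 4
        end
    end
    return screen
end

local screen = fillUV({}, 1024, 1024)
print(#screen) -- 4194304
```

Computing `v` once per row instead of once per pixel keeps the inner loop down to one division and one multiplication.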
I was able to get a 3D engine with some meshes of about 3000 vertices working in real-time, haven’t gotten to parallelize anything here and the code wasn’t really my proudest work but this is just for reference.
the overhead of accessing the array is still going to slow things down, even when using static values
local size = 1024 * 1024 * 4
local screen = table.create(size, 0)
while true do
    -- start timing the loop here with os.clock()
    local t = os.clock()
    for i = 1, size, 4 do
        screen[i] = 2
        screen[i + 1] = 5
        screen[i + 2] = 7
        -- alpha channel omitted for free performance boost
    end
    print(os.clock() - t)
    -- end timing the loop here ~> 0.03 to 0.04 seconds per iteration
    task.wait()
end
my point stands, this language is woefully slow for graphics done in software
comparing to other languages, yes it’s slower by a lot, but it is still fast enough for real-time graphics at decent resolutions!
Here’s a 3D fully textured raycaster engine I made around a year ago or so using pure luau running around 80 to 110 FPS at 240x240
(no multi-threading)
Note there is also a bit of overdraw happening with the floors/ceiling and wall rendering
not sure what that has to do with what I said, all I’m saying is that you can still make cool things with it. You aren’t even supposed to be using the CPU for this kind of stuff so it makes sense that it might be kind of slow. but being able to get 30 fps with a 1024x1024 window without parallelization means you should easily be able to get around 120 fps with parallelization too.
Version 1.2b is out!
This update contains a complete rewrite of the library, yielding MUCH better performance!
Changelog:
- Rewrite of API (more performant, and easier to understand)
- New docs & API site
- Removed “Fonts and text”
- Removed the “argument buffer”
What really makes this better than the previous version of OSGL?
v1.2b is much faster than v1.1b. In certain contexts, OSGL surpasses CanvasDraw by over 7 FPS.
More on CanvasDraw & OSGL
As mentioned, in certain contexts, OSGL is faster than CanvasDraw. If you render nothing to the screen, CanvasDraw will be faster. The time CanvasDraw takes to draw is about the same as the time OSGL spends yielding, which leads to OSGL having higher FPS on average (again, only in some cases: the latency of CanvasDraw varies, and sometimes it’s faster!)
The exact tests that were used to measure this
The code that was used previously in this thread was reused for measurement, each time yielding a different number of frames. The difference between 1, 2 and 3 task.waits will only affect OSGL; the difference in CD is purely random. Out of these 3 results, “1” is currently being used in OSGL. Here are our results:
1 task.wait
Test 1: Wait every 200 * 1024 pixels
OSGL: 22 FPS
CD: 21 FPS
Test 2: Wait after every Render call
OSGL: 8.9 FPS
CD: 9 FPS
Test 3: No waits at all
OSGL: 6 FPS
CD: 5.1 FPS
2 task.waits
Test 1: Wait every 200 * 1024 pixels
OSGL: 25 FPS
CD: 24 FPS
Test 2: Wait after every Render call
OSGL: 11 FPS
CD: 11 FPS
Test 3: No waits at all
OSGL: 9 FPS
CD: 5.5 FPS
3 task.waits
Test 1: Wait every 200 * 1024 pixels
OSGL: 35 FPS
CD: 30 FPS
Test 2: Wait after every Render call
OSGL: 27 FPS
CD: 10.5 FPS
Test 3: No waits at all
OSGL: 25 FPS
CD: 5.5 FPS
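The benchmark code itself isn't shown in this post, so the following is only a guess at what "wait every 200 * 1024 pixels" could look like, written in plain Lua. `drawWithYields`, `putPixel` and `yield` are hypothetical names; in Roblox you would pass `task.wait` as the yield function.

```lua
-- Hypothetical sketch: draws totalPixels pixels and calls yield() once
-- after every pixelsPerYield pixels.
local function drawWithYields(totalPixels, pixelsPerYield, putPixel, yield)
    local sinceYield = 0
    for i = 1, totalPixels do
        putPixel(i)
        sinceYield = sinceYield + 1
        if sinceYield == pixelsPerYield then
            yield()
            sinceYield = 0
        end
    end
end

-- Count how often a 1024x1024 frame would yield with the Test 1 setting
local yields = 0
drawWithYields(1024 * 1024, 200 * 1024, function() end, function()
    yields = yields + 1
end)
print(yields) -- 5
```

With these numbers a full frame yields 5 times, so the yield count per frame (rather than the drawing itself) can end up dominating the measured FPS.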
You can contribute to the project here!
This is the greatest update of all time.
Bug fix #1
- Fixed a bug where the converter executable wouldn’t render some pixels
The exe has been updated and can be found here!
at this point, I feel like creating an OS and putting osgl32.dll (this module) in the kernel folder, and someone has already made a Java interpreter.
Java (JBlox env) + OpenGl (OSGL module + tweaks) = minecraft?
edit : is OSGL like OpenGL? haven’t seen docs but ima bookmark for future use
OSGL is sort of like OpenGL. OSGL’s API is actually very similar to SFML, which is a C++ library.
how does the converter work?
is there a certain way to set it up? because it won’t open