OSGL - EditableImage graphics library

Pretty sure Vector2s are slower than passing numbers? I’m not sure. I’m working on the next version and I’ll see if I can decrease this time :+1:

3 Likes

Try avoiding constructors and use either buffers or raw numbers/tables.
I’ve seen it myself: ColorSequences, NumberSequences, Vectors and all that are a lot slower to access and construct.
When I was making my own project with editable images, I had to replace color/number sequences, vectors and other things with buffers and raw numbers, which drastically improved performance (by 4x or more), although constructing Vector2s in WritePixels is inevitable.

2 Likes

He never misses, The absolute legend :star_struck:

2 Likes

This is really great! The ability to use text & custom fonts is going to save me so much time. One question though, could there be some sort of tutorial on how to add custom fonts? I read the docs, but I can’t understand what the instructions are trying to say.

2 Likes

The custom fonts are bitmap fonts (0 represents nothing, 1 represents a pixel) meaning they cannot be resized.

porting a font from online

As for making custom fonts, it’s pretty easy. Take a fontmap, let’s take this one:


Cut out each individual letter and save each one as a PNG in a folder.
Then, write a program that goes through each of these PNGs. It should read the image data and generate text output in the style of a Lua table. If a pixel is transparent, it should output “0”. If there is a pixel, it should output “1” (bitmap fonts). It should make a huge array, something like this:

You can add the array directly to the module (Window > Render > Font), OR, add it during runtime with the function provided.
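The conversion step described above could be sketched roughly like this. It's only a sketch: `glyphEntry` and `alphaRows` are hypothetical names, and it assumes the PNG has already been decoded into rows of alpha values (0 = transparent) by whatever image library you use. It is not the converter OSGL ships.

```lua
-- Hypothetical sketch: turn decoded alpha values into a font entry string.
-- `alphaRows` is a table of rows, each row a table of alpha values
-- (0 = transparent). Decoding the PNG itself is out of scope here.
local function glyphEntry(letter, alphaRows)
	local bits = {}
	for _, row in ipairs(alphaRows) do
		for _, alpha in ipairs(row) do
			-- transparent -> "0", anything visible -> "1"
			bits[#bits + 1] = (alpha > 0) and "1" or "0"
		end
	end
	return string.format('["%s"] = {%s}', letter, table.concat(bits, ","))
end

print(glyphEntry("a", {
	{255, 255, 255, 255},
	{0, 0, 0, 255},
	{255, 255, 255, 255},
	{255, 255, 255, 255},
}))
-- prints: ["a"] = {1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,1}
```

Characters like `"` or `\` would need escaping before being used as keys; the sketch skips that.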

making your own font

All a bitmap font is, is a dict. It should contain each letter you want. Let’s say we wanted a 4x4 lowercase letter ‘a’; that could be:

["a"] = {1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,1}

Every 4 pixels would be a single line, making that construct:

image

as our letter “a”. OSGL handles the rest.
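To see how the flat array maps to rows, here is a small sketch that splits a glyph every `width` values and prints it as text. `glyphToRows` is a hypothetical helper for illustration, and passing the width as an argument is an assumption; the thread doesn't say how OSGL stores glyph dimensions.

```lua
-- Hypothetical helper: split a flat bitmap glyph into rows of text.
-- "#" marks a pixel (1), "." marks nothing (0).
local function glyphToRows(glyph, width)
	local rows = {}
	for start = 1, #glyph, width do
		local line = {}
		for i = start, math.min(start + width - 1, #glyph) do
			line[#line + 1] = (glyph[i] == 1) and "#" or "."
		end
		rows[#rows + 1] = table.concat(line)
	end
	return rows
end

local a = {1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,1}
for _, row in ipairs(glyphToRows(a, 4)) do
	print(row)
end
-- prints:
-- ####
-- ...#
-- ####
-- ####
```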

2 Likes

What happens when you use this module after the nested for loop?


Making Vector2s is slower because it’s constructing an object, assigning a metatable and whatnot, while passing 2 numbers doesn’t do any of that.

3 Likes

This looks very interesting. I will try this out!

2 Likes

I see, I think I understand

Is there an already existing program that does this that I can use? If not, how can I make it? (or if possible, could you send me the one you used?) I don’t know how to write one myself

1 Like

the problem is using an interpreted language to access & set pixels from an array (1024 * 1024 * 4) times per iteration

it doesn’t matter which graphics library people choose here – the speed of the language is ultimately what holds all of them back

even when optimizing the nested for loops to be more predictable to the cpu:

-- LargeCanvas is my internal library to allow larger resolutions with EditableImages

-- start timing the loop here with os.clock()
for Y = 0, 1024-1 do
	for X = 0, 1024-1 do
		local R, G, B = math.random(0, 1), math.random(0, 1), math.random(0, 1)
		LargeCanvas:PutPixel(X,Y,R,G,B)
	end
end
-- end timing the loop here ~> 0.39-0.4 seconds per iteration
-- in other words, 2.5 fps

-- I now realize that there is an overhead when determining
-- which EditableImage to put the pixel in; it's still faster than the
-- column-major loops, though

how about using a 1D loop to set the pixels?

local LargeCanvas = require(script.LargeCanvas):New(1024,1024,Enum.ResamplerMode.Pixelated)
local size = 1024 * 1024 * 4
local screen = LargeCanvas.pixels[1]

LargeCanvas.gui.Parent = script.Parent

while true do
	-- start timing the loop here with os.clock()
	for i = 1, size, 4 do
		local r = math.random()
		local g = math.random()
		local b = math.random()
		screen[i] = r
		screen[i + 1] = g
		screen[i + 2] = b
		screen[i + 3] = 1
	end
	-- end timing the loop here ~> 0.19 seconds per iteration
	task.wait()
end
-- this takes around 0.19 seconds per iteration to put the pixels in the screen buffer
-- in other words, 5.2 fps
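For anyone wondering how the 1D index relates to the X,Y of the nested loops, here's a minimal sketch of the mapping. It assumes the flat array is row-major with 4 values (R, G, B, A) per pixel and 1-based indexing, which matches the loop above but is an assumption about LargeCanvas internals; `pixelIndex` and `pixelCoords` are made-up names.

```lua
-- Sketch: map a 0-based (x, y) to the 1-based index of its red channel
-- in a flat RGBA array, assuming row-major order and 4 values per pixel.
local function pixelIndex(x, y, width)
	return (y * width + x) * 4 + 1
end

-- And back again: from a flat index (any channel of the pixel) to (x, y).
local function pixelCoords(index, width)
	local pixel = math.floor((index - 1) / 4)
	return pixel % width, math.floor(pixel / width)
end

print(pixelIndex(0, 0, 1024)) -- 1
print(pixelIndex(1, 0, 1024)) -- 5
print(pixelIndex(0, 1, 1024)) -- 4097

local x, y = pixelCoords(4097, 1024)
print(x, y) -- x = 0, y = 1
```

The 1D loop skips this arithmetic entirely by just stepping the index by 4, which is part of why it's cheaper than calling a PutPixel(X, Y, ...) style function per pixel.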

since the 1D loop is the fastest, let’s see if it’s faster in rust

use std::thread::sleep;
use std::time::{Duration,Instant};
use rand::{thread_rng, Rng};
const WIDTH : usize = 1024;
const HEIGHT : usize = 1024;
const SIZE : usize = WIDTH * HEIGHT * 4;
const REFRESH_RATE : Duration = Duration::from_millis(16); // roughly 60fps
fn main() {
    let mut random = thread_rng();
    let mut buffer: Vec<u8> = vec![0; SIZE]; // create pixel buffer here
   
    loop {
        let benchmark = Instant::now(); // start timing the loop here
        for i in (0..SIZE).step_by(4) {
            // 0..=255 covers the full u8 range (0..255 would never produce 255)
            buffer[i] = random.gen_range(0..=255);
            buffer[i+1] = random.gen_range(0..=255);
            buffer[i+2] = random.gen_range(0..=255);
            buffer[i+3] = random.gen_range(0..=255);
        }
        println!("took {} ms", benchmark.elapsed().as_millis()); // ~10 ms (0.01 seconds), in other words, 100 fps
        sleep(REFRESH_RATE);
    }
}


this takes 0.01 seconds per iteration, or in other words, 100fps to put the pixels in the framebuffer

3 Likes

In Luau you should be able to get at least 10 fps without needing native codegen. The problem is that math.random is slow to call compared to other functions; the same applies to the Random object Roblox has.
If, for example, you calculate the UV coordinate by dividing the current x,y by the image size and use that for the RGB, you can get about 30 fps with native code generation and 15 fps without.

in this example I use a window resolution of 1024x1024 and this for the colors:
R = UV.X
G = UV.Y
B = UV.X * UV.Y

With --!native

Without --!native
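A rough sketch of the UV colouring described above, written against a plain flat RGBA table so it runs outside Roblox too. `fillUV` is a hypothetical name, the 0..1 colour range and opaque alpha are assumptions, and this is not the exact code used for the numbers quoted.

```lua
-- Sketch: fill a flat RGBA table with the UV gradient.
-- UV is the current x,y divided by the image size.
local function fillUV(width, height)
	local pixels = {}
	local i = 1
	for y = 0, height - 1 do
		local v = y / height
		for x = 0, width - 1 do
			local u = x / width
			pixels[i] = u         -- R = UV.X
			pixels[i + 1] = v     -- G = UV.Y
			pixels[i + 2] = u * v -- B = UV.X * UV.Y
			pixels[i + 3] = 1     -- opaque alpha (assumption)
			i = i + 4
		end
	end
	return pixels
end

local pixels = fillUV(4, 4)
print(#pixels) -- 64
```

There are no function calls inside the inner loop, only arithmetic and table writes, which is the point being made about math.random.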

1 Like

I was able to get a 3D engine with some meshes of about 3000 vertices working in real time. I haven’t gotten to parallelizing anything here, and the code wasn’t really my proudest work, but this is just for reference.

1 Like

the overhead of accessing the array is still going to slow things down, even when using static values

local size = 1024 * 1024 * 4
local screen = table.create(size,0)
while true do
	-- start timing the loop here with os.clock()
	local t = os.clock()
	for i=1,size,4 do

		screen[i] = 2
		screen[i + 1] = 5
		screen[i + 2] = 7
		-- alpha channel omitted for free performance boost
	end
	print(os.clock()-t)
	-- end timing the loop here ~> 0.03ish to 0.04 second per iteration
	task.wait()
end

my point stands, this language is woefully slow for graphics done in software

comparing to other languages, yes it’s slower by a lot, but it is still fast enough for real-time graphics at decent resolutions!

Here’s a 3D fully textured raycaster engine I made around a year ago or so using pure luau running around 80 to 110 FPS at 240x240
(no multi-threading)


Note there is also a bit of overdraw happening with the floors/ceiling and wall rendering

6 Likes

not sure what that has to do with what I said, all I’m saying is that you can still make cool things with it. You aren’t even supposed to be using the CPU for this kind of stuff so it makes sense that it might be kind of slow. but being able to get 30 fps with a 1024x1024 window without parallelization means you should easily be able to get around 120 fps with parallelization too.

1 Like

Version 1.2b is out!

This update contains a complete rewrite of the library, yielding MUCH better performance!

Changelog:

  • Rewrite of API (more performant, and easier to understand)
  • New docs & API site
  • Removed “Fonts and text”
  • Removed the “argument buffer”

What really makes this better than the previous version of OSGL?

v1.2b is much faster than v1.1b. In certain contexts, OSGL surpasses CanvasDraw FPS-Wise by over 7fps.

More on CanvasDraw & OSGL

As mentioned, in certain contexts, OSGL is faster than CanvasDraw. If you render nothing to the screen, CanvasDraw will be faster. The latency that CanvasDraw takes to draw is the same as OSGL’s, leading to OSGL having higher FPS on average (again, in some cases; the latency of CanvasDraw varies, and sometimes it’s faster!)

The exact tests that were used to measure this

The code that was used previously in this thread was reused for measurement, each time yielding a different number of frames. The difference between 1, 2 and 3 task.waits will only affect OSGL; the difference in CD is purely random. Of these 3 configurations, “1” is currently being used in OSGL. Here are our results:
1 task.wait

Test 1: Wait every 200 * 1024 pixels
OSGL: 22 FPS
CD: 21 FPS

Test 2: Wait after every Render call
OSGL: 8.9 FPS
CD: 9 FPS

Test 3: No waits at all
OSGL: 6 FPS
CD: 5.1 FPS

2 task.waits

Test 1: Wait every 200 * 1024 pixels
OSGL: 25 FPS
CD: 24 FPS

Test 2: Wait after every Render call
OSGL: 11 FPS
CD: 11 FPS

Test 3: No waits at all
OSGL: 9 FPS
CD: 5.5 FPS

3 task.waits

Test 1: Wait every 200 * 1024 pixels
OSGL: 35 FPS
CD: 30 FPS

Test 2: Wait after every Render call
OSGL: 27 FPS
CD: 10.5 FPS

Test 3: No waits at all
OSGL: 25 FPS
CD: 5.5 FPS

You can contribute to the project here!

2 Likes

This is the greatest update of all time.

1 Like

Bug fix #1

  • Fixed a bug where the converter executable wouldn’t render some pixels

The exe has been updated and can be found here!

at this point, I feel like creating an OS and putting osgl32.dll (this module) in the kernel folder, and someone made a Java interpreter.

Java (JBlox env) + OpenGl (OSGL module + tweaks) = minecraft?

edit : is OSGL like OpenGL? haven’t seen docs but ima bookmark for future use

1 Like

OSGL is sort of like OpenGL. OSGL’s API is actually very similar to SFML, which is a C++ library.

1 Like

How does the converter work?
Is there a certain way to set it up? Because it won’t open.