Maybe @banunos could explain exactly how it works, but the underlying process is relatively simple.
First, you need a server that’s going to run the actual bot and the Roblox client. From the game, send web requests to the server to do things such as move forward or fetch the screen. More specifically, you could send POST requests to the server with the direction in the request body, and GET requests whenever you want the current screen.
The server then responds by moving the player forward or, when the client asks for the current screen, by taking a screenshot, turning every pixel color into a table, and sending that back to the client as JSON.
Everything on the server is pretty easy to script, as it’s just receive request > send keyboard input, or receive request > take screenshot and encode it as JSON.
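To make that flow concrete, here is a rough Python sketch of such a server using only the standard library. It is not how the original bot was written, just one way it could look. The names make_handler, start_server, press_key and grab_pixels are made up for illustration; press_key and grab_pixels are callbacks you would supply yourself (for example, wrapping whatever keyboard-input and screenshot tools you use).

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(press_key, grab_pixels):
    """Build a request handler around two caller-supplied callbacks.

    press_key(direction) and grab_pixels() are assumptions: plug in your
    own keyboard-input and screenshot code here.
    """

    class BotHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # The request body carries the direction, e.g. "forward".
            length = int(self.headers.get("Content-Length", 0))
            direction = self.rfile.read(length).decode("utf-8").strip()
            press_key(direction)
            self._reply({"ok": True, "direction": direction})

        def do_GET(self):
            # Reply with the current screen as rows of [r, g, b] values.
            self._reply({"pixels": grab_pixels()})

        def _reply(self, payload):
            body = json.dumps(payload).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the console quiet

    return BotHandler


def start_server(press_key, grab_pixels, port=0):
    """Start the server on a background thread; port 0 picks a free port."""
    handler = make_handler(press_key, grab_pixels)
    server = HTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Passing the two actions in as callbacks keeps the HTTP part separate from the machine-specific part, so you can try the request handling with dummy functions before wiring in real input and screenshots.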
You can learn more about HTTP requests on Roblox in the Dev wiki.
I can’t go into the topic of actually setting up a server as that’s a big topic and depends on what you want to do.
As he said, the server turns every pixel color into a table. While I’m not entirely sure how this works, one method may be making a grid of “pixels” (Frames) in a ScreenGui and then using the data sent from the server to adjust the colors of those in-game “pixels”. Just an idea. Not sure if that is entirely what is going on here.
If you have a web server up, you’ll have to do a GET request from Roblox every now and then to fetch the pixel data from it. What you could do is figure out how many pixels your screenshot breaks into and create the same number of frames in game. Then take the pixels you get from your web server and number them from 1 up to however many pixels you need. Each number corresponds to a frame in Roblox that you can manipulate.
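The numbering idea could look something like this on the server side, in Python. The function names are made up, and it assumes the pixels arrive as a list of rows, with frames numbered left to right, top to bottom, starting at 1.

```python
def frame_number(row, col, width):
    """Number of the frame at (row, col), counting from 1, row by row."""
    return row * width + col + 1


def frame_position(number, width):
    """Inverse of frame_number: returns (row, col) for a frame number."""
    return divmod(number - 1, width)


def number_pixels(pixels):
    """Turn rows of colors into a {frame number: color} mapping."""
    width = len(pixels[0])
    return {
        frame_number(r, c, width): color
        for r, row in enumerate(pixels)
        for c, color in enumerate(row)
    }
```

For a screenshot 4 pixels wide, the pixel in the second row, third column gets number 7, and frame_position(7, 4) gives back (1, 2) using zero-based row and column.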
I recommend looking at the docs of the Python image processing library PIL (now maintained as Pillow). More specifically, look at the .getdata() method, which does exactly what you want.