I was having problems with my game where the main script would get held up for a variety of one-in-a-million reasons, so I am making a system that checks whether the script has stopped running because of things that happen from time to time but that I can’t really replicate. I put all my original code in a function, and then added some code below it to check whether the function is still running. If it isn’t, the function is called again to restart it.
As the function runs, it changes the variable returnVar. The code below then compares returnVar to what returnVar used to be (stored in refVar). If returnVar is still the same as refVar after 10 seconds (meaning the code has not progressed to where it changes the variable, and is therefore laggy or bugged out), it restarts the game.
The Code
local function run()
    --all my code goes here, including things like below:
    returnVar = 0
    --some code
    while true do
        returnVar = returnVar + 1
        --some more code
        returnVar = returnVar + 1
        --continues like that
    end
end
run()
refVar = 0
count2 = 0
while true do
    while returnVar == refVar do
        print(returnVar)
        print(refVar)
        count2 = count2 + 1
        print(count2)
        wait(1)
        -- restart once returnVar has been unchanged for 10 seconds
        if count2 >= 10 then
            print("Restarted run()")
            run()
            count2 = 0
            refVar = 0
        end
    end
    count2 = 0
    refVar = returnVar
end
The thing I didn’t think about beforehand is that the code that checks the variables only runs after run() has finished, but run() contains an infinite loop, so it never finishes. The code at the bottom is intended to run while run() is going. Do I need a separate script to make them run at the same time, and if so, how do I do that so that the data is still available to both scripts? (Remote events? Module scripts?)
Looks like you need to multithread! Roblox can do this with coroutines, which allow code to be run side by side.
You could also use the spawn() function, but I don’t recommend it because it has a small built-in wait that may cause problems down the road.
This isn’t a great idea in the first place and you shouldn’t do it, but you already know that, so whatever lol.
Coroutines won’t work. run() would need to cooperate and stop itself when told. Since you’re assuming it’s buggy anyways, it won’t do that. There is no way in native Lua to force-stop a thread from the outside.
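To illustrate the cooperation point, here is a minimal plain-Lua sketch (nothing Roblox-specific; the names cooperativeWorker, log, and resumes are made up for this example). The supervising loop only gets control back because the worker calls coroutine.yield() after each step.

```lua
-- Plain Lua sketch: a coroutine only pauses where it calls coroutine.yield().
local log = {}

-- A well-behaved worker: yields after every step so the caller regains control.
local function cooperativeWorker()
    for step = 1, 3 do
        table.insert(log, "step " .. step)
        coroutine.yield()
    end
end

local worker = coroutine.create(cooperativeWorker)

-- The "supervisor" loop: it runs again only when the worker yields or finishes.
local resumes = 0
while coroutine.status(worker) ~= "dead" do
    coroutine.resume(worker)
    resumes = resumes + 1
end

print(#log, resumes)
```

If the worker instead got stuck in a loop that never yields, the first coroutine.resume call would never return, so the supervisor would have no chance to notice the problem or stop it.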
-- Runner (Script with run() call)
    -- Ping (BindableEvent)
    -- Watchdog (Script that watches the runner)
i.e.
The run() function should call BindableEvent:Fire on the Ping event every loop (or however often).
The watchdog should listen for that call and reset a timer or something.
So:
Runner script:
print("runner: started")
local ping = script.Ping
local function run()
    while true do
        print("runner: pinging watchdog")
        ping:Fire() -- signal that we're still alive
        -- do some work
        wait(1)
        -- and, randomly, get stuck in an infinite loop somewhere
        if math.random() < 0.5 then
            warn("runner: encountered bug. infinite loop...")
            while wait() do end
        end
    end
end
run()
Watchdog script:
-- longest time to wait for a ping before restarting
local interval = 2 -- seconds
local runner = script.Parent
local ping = runner.Ping
local gotPing = true
ping.Event:Connect(function()
    print("watchdog: got a ping")
    gotPing = true
end)
while true do
    gotPing = false
    wait(interval) -- hopefully the event is fired in this time
    if not gotPing then -- but if not...
        warn("watchdog: missed a ping, resetting runner...")
        runner.Disabled = true
        runner.Disabled = false
    end
end
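The "reset a timer" variant mentioned above could look roughly like the sketch below. It is plain Lua with the clock passed in as a number so the logic can be checked without waiting; newWatchdog, ping, and isStalled are hypothetical names, not part of any Roblox API. In a real script the timestamps would presumably come from something like os.clock().

```lua
-- Hypothetical helper: remembers when the last ping arrived.
local function newWatchdog(timeout, now)
    local watchdog = { timeout = timeout, lastPing = now }

    -- Called whenever the runner signals it is alive (e.g. from the Ping handler).
    function watchdog.ping(currentTime)
        watchdog.lastPing = currentTime
    end

    -- True when more than `timeout` seconds have passed since the last ping.
    function watchdog.isStalled(currentTime)
        return currentTime - watchdog.lastPing > watchdog.timeout
    end

    return watchdog
end

-- Simulated timeline in seconds: created at t=0, one ping at t=1.
local dog = newWatchdog(2, 0)
dog.ping(1)
local stalledAt2 = dog.isStalled(2) -- 1 second since the last ping
local stalledAt4 = dog.isStalled(4) -- 3 seconds since the last ping
print(stalledAt2, stalledAt4)
```

Storing a timestamp rather than a boolean means the watchdog can poll as often as it likes without a ping that lands just after a check being miscounted as a miss.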
Ok, I think I get how this is supposed to work. I never knew that you could do events like that. (I was wondering if there was some way to have a server-to-server remote event but had never heard of one. I’m pretty new to this, only been doing it for like 1-2 months.) Thank you for the help! I will try it out later today.
I did not go over your code, but usually when people reach this point, it’s worth ditching the script and remaking it in a cleaner way. You don’t want to poll to see if your script is broken. You want to fix those one-in-a-million issues that break it to begin with. I just can’t advise polling in good faith as a programmer…
If you’re having trouble isolating the part of your code that’s breaking, such as the case where it happens rarely in live servers but seemingly never in your studio tests, then you could use the Developer console in those servers to see what went wrong, or, you could use Google Analytics (it’s super simple!) for error tracking.
Have you heard of pcalls? They are very useful for this. Wrap the parts of your code that have potential to error inside of these pcalls, and instead of the error stopping your script, it will attempt to continue. Make sure to print the error message that comes with an unsuccessful pcall so that you can debug easier.
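As a rough illustration of that pattern, here is a small plain-Lua sketch. riskyStep is a made-up stand-in for whatever part of your code occasionally errors; pcall returns false plus the error message instead of stopping the script.

```lua
-- Hypothetical stand-in for a step that occasionally errors.
local function riskyStep(shouldFail)
    if shouldFail then
        error("something went wrong in riskyStep")
    end
    return "done"
end

local results = {}
for _, shouldFail in ipairs({ false, true, false }) do
    -- pcall returns true plus the results, or false plus the error message.
    local ok, result = pcall(riskyStep, shouldFail)
    if ok then
        table.insert(results, result)
    else
        -- Print the message so the failure can still be debugged later.
        print("riskyStep failed: " .. tostring(result))
        table.insert(results, "skipped")
    end
end

print(table.concat(results, ", "))
```

The loop keeps going after the failed call, which is the behavior you want from the main game loop. Note that pcall only catches errors; it does nothing for code that hangs or loops forever without erroring.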
This is the fourth time I have rewritten this script to clean up bugs, and honestly it is very rare that anything bad happens, but yesterday someone seemed to leave my game at just the right time to mess up a teleport function (I can’t confirm that this is what happened, and I can’t replicate it), and that stopped the whole game. I just want to protect myself against these rare times when it does stop, and if this game goes anywhere I’m sure I will want to rewrite this script a few more times as I add things.
I did have it in a pcall to start, but then I took it out to do the pinging system. Some of the problems don’t seem to be outright errors, just cases where a loop runs when it shouldn’t, or something like that. I had this problem a lot with my previous version, not so much with this version.
How do I get the dev console in game? Does it have the output and stuff in it? I have never even heard of it, but it seems useful.
Just for the record, running a “watchdog” to check if an application is running is not that crazy.
Every operating system does it—try force-killing the “Desktop Window Manager” process in task manager. Windows will kill it, then immediately restart it because it detected that it died.
Plenty of operating systems do that. Lots of software does it. Obviously just “remove all the bugs” is a better option, but that takes time and money.
Just be aware that doing this can cause things like memory leaks or, if you’re using datastores, could cause data to be deleted or corrupted or half-finished.
Right now, the only thing I use datastores for is saving hacker data for my anti-exploit system, so this doesn’t apply to that. It is something good for me to keep in mind though, because I am going to use datastores in the future. Thank you.
For all these reasons and more, I advocate not using the watchdog method wherever possible, as it’s indicative of underlying impurities in the code. With the modularity and transparency we have on Roblox, there’s very little necessity to use these kinds of tactics. It would be much more befitting of, as you mentioned, an enormously complex structure such as an operating system, which must communicate directly with a device’s hardware and has exponentially more potential points of failure. Windows 10 has about 50 million lines of code, to illustrate that complexity difference. In many cases, watchdog timers are hardware circuits themselves.
I can say with confidence that in the vast majority of cases on Roblox, it’s safer and smarter to evaluate your short program and see where it’s going wrong, rather than bust out the defibrillators when there seems to be an unsolvable bug, because at that point you can no longer predict the system’s reliability and it should be replaced with something that you understand.