Custom anti-exploit appearing too sensitive, even though significant leeway is provided?

I’ve created a custom anti-exploit system that automatically kicks/bans and logs apparent exploiters in my game, Chaos at the Bistro.

I noticed some exploits targeting remoteEvents, which I have sanity checked, but I decided a blunt-force anti-exploit would probably be a good idea.

For testing purposes, I disabled real banning and kicks and only made the anti-exploit log bans so I could see how it might work in the live game.

This is the result:
[screenshot: anti-exploit ban log output]

A thousand or more players were falsely flagged as exploiting, specifically for… high remoteEvent rates?

While testing, the highest rate I could achieve while playing legitimately was 1.6/s for the Swing event.

The question:
Is it normal for players to accidentally fire remotes faster than expected due to lag, or some undocumented remote queueing behavior? Should I only detect extreme cases to be considered exploits?

This is a bit of a strange topic to ask about on the DevForum, but I truly believe the anti-exploit isn’t detecting rates incorrectly, and the client side shouldn’t be firing these events this weirdly.

That is pretty weird; I’ve never encountered something like this myself. I’m guessing you have a check to verify that players are actually able to swing, so higher rates don’t impact other players. It might be worth raising the limit to a rate at which you are sure it’s an exploiter, even though I too would have thought that 7.8 would have been sufficient.

But I wonder, are you certain you calculate the rate correctly?

local Event = ListenConfig.Event
local Counts = {} -- per-player fire counts for the current window
local Refresh = tick() -- start time of the current window

ListenConnections[Event] = function(player)
	-- Once the window has elapsed, evaluate every player's rate and reset
	if tick() - Refresh > Config.RateTimer then
		Refresh = tick()

		for plr, count in pairs(Counts) do
			local Rate = count / Config.RateTimer
			print(Rate)

			if Rate >= ListenConfig.RateLimit then
				-- Marks user here
			end
		end

		Counts = {} -- start a fresh window
	end

	-- Count this fire against the player
	Counts[player] = (Counts[player] or 0) + 1
end

Here’s the code snippet that calculates the event rate. I believe it counts the events correctly, and it reliably resets the count every 2s (the Config.RateTimer value). It only ever connects once at a time.

It looks like it should be fine; no race conditions (that would increase the value abnormally) should be possible, and the table seems to be clearing out properly, unless:

-- Status of Counts and point reached by the CPU (assume Counts[player] == 2):
Counts[player] = (Counts[player] -- let's say the CPU is at this stage of the instruction

-- Now assume that the following is being called simultaneously from another thread:
Counts = {}

-- If the interrupted instruction now continues, it ends up as follows (remember, Counts[player] == 2):
Counts[player] = (2 or 0) + 1 -- it had already fetched the value before the table was replaced

Now, I’m not saying this is a likely source of it, but it is something you wouldn’t be able to find by testing yourself, as you can’t fire the event quickly enough to hit that timing.

But the servers are not that large, and for someone to hit this exact timing more than once is just not likely. It is, however, the only somewhat plausible theory I can come up with. I don’t know how Roblox deals with these kinds of shared resources, or whether it automatically synchronizes them to prevent race conditions.

Huh, I never would’ve thought something like this could happen… Learnt something new today.

This seems like a pretty rare case, for an event to fire while the CPU is in the middle of processing a single line. It would explain a majority of the issue, though, and it seems plausible for the fire counts to stack high enough to trigger the anti-exploit.

I might make some basic switch so the counter alternates between two tables to prevent overlapping writes/reads. I’m about to publish an update which makes the detection even more lenient, so hopefully I can get some more data.
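
Roughly something like this is what I have in mind (purely a sketch of the alternating-table idea; all the names here are placeholders):

local Buffers = { {}, {} } -- one table being written to, one being read/cleared
local ActiveIndex = 1

local function recordFire(player)
	local active = Buffers[ActiveIndex]
	active[player] = (active[player] or 0) + 1
end

local function swapAndRead()
	-- Flip which buffer receives new counts, then read the old one
	local oldIndex = ActiveIndex
	ActiveIndex = (ActiveIndex % 2) + 1
	Buffers[ActiveIndex] = {} -- clear the buffer we are switching into
	return Buffers[oldIndex] -- counts accumulated during the last window
end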

Thank you for the help!

No worries! Again, I don’t really think this is something that will happen here; I’ve never had an issue like this even when doing similar things. It is the only explanation I have, but I don’t think it’s a good one, since I don’t think the volume of event calls is high enough.


This could be the result of network jitter or packet loss. In either case, even if the player is firing events at a reasonable rate, they may be arriving at the server all bunched up. This also cannot really be tested in Studio since Studio’s configurable artificial latency is constant and does not drop packets.

You could combat this somewhat by throttling event processing rather than measuring incoming event rate. As events come in, put them into a queue and process them at some fixed rate and drop requests if the queue is full. The queue’s maximum length can be set quite high without too much consequence (namely, high enough to handle network instability without dropping requests).

If someone is spamming events, their requests will eventually just start getting ignored. The only consequence for legitimate players may be occasional latency.
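
Something along these lines could work (just a rough sketch; remote, handleSwing, and the numbers are placeholders you would adapt):

local MAX_QUEUE = 20   -- generous headroom for bursts caused by network instability
local PROCESS_RATE = 2 -- process at most this many requests per second, per player

local queues = {}      -- per-player request queues

remote.OnServerEvent:Connect(function(player, ...)
	local queue = queues[player]
	if not queue then
		queue = {}
		queues[player] = queue
	end

	-- Queue is full: drop the request instead of punishing the player
	if #queue >= MAX_QUEUE then
		return
	end

	table.insert(queue, table.pack(...))
end)

-- Drain each player's queue at a fixed rate
task.spawn(function()
	while true do
		task.wait(1 / PROCESS_RATE)
		for player, queue in pairs(queues) do
			local request = table.remove(queue, 1)
			if request then
				handleSwing(player, table.unpack(request, 1, request.n))
			end
		end
	end
end)

-- (You would also want to remove queues[player] when the player leaves.)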

You have no idea what you’re talking about.
Lua threads do not run in parallel, so the scenario you described is not possible. Lua threads are not running simultaneously on different CPU threads, and the opcode instructions are executed one after another.
If you want to achieve simultaneous code execution, read about Roblox’s parallel Luau implementation.

Answering the OP’s question: delay checks on the server are not reliable, and should only prevent the event from firing instead of banning the player.
Imagine a scenario in which the player experiences temporary 5-second internet connection issue, and during this time period one of your remote events is fired 10 times.
Sadly, we cannot customize the network settings in Roblox, so after the player’s connection goes back to normal, 10 event calls will be registered by the server immediately, without any delay.

This causes your sanity check to flag the player.
Naturally, you should force a cooldown in your server-sided script but not ban the player!
I also don’t understand why you prevent the function connected to your remote event from firing globally, for every player.
You only allow this event to be fired each 2 seconds, so if someone fires it, no other client will be allowed to fire it during the next 2 seconds?
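
A per-player cooldown could look roughly like this (a sketch; swingRemote, handleSwing, and the cooldown value are placeholders):

local COOLDOWN = 0.5 -- seconds between accepted fires, per player
local lastFire = {}

swingRemote.OnServerEvent:Connect(function(player, ...)
	local now = os.clock()
	local last = lastFire[player]

	-- Too soon: silently ignore the call instead of flagging or banning
	if last and now - last < COOLDOWN then
		return
	end

	lastFire[player] = now
	handleSwing(player, ...)
end)

-- (Clear lastFire[player] in Players.PlayerRemoving to avoid holding stale entries.)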

Alright, so how do you keep your remote events safe?
The best option is to frequently change the way you fire important remote events/functions: add one extra argument to each call and check whether the value is correct on the server, change the event name, etc. This little change breaks all the exploit scripts which are using this remote.
If you do it every day, the exploiters will have to keep reverse-engineering your local scripts every time you update.
This is a simple and pretty effective strategy.
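
As a rough illustration of the extra-argument idea (swingRemote, handleSwing, targetPosition, and the value are placeholders you would change with each update):

-- LocalScript (client): include the extra value with every fire
local SECRET = 41257
swingRemote:FireServer(SECRET, targetPosition)

-- Script (server): reject any call that doesn't carry the current value
local SECRET = 41257
swingRemote.OnServerEvent:Connect(function(player, key, targetPosition)
	if key ~= SECRET then
		return -- stale or forged call; ignore it (or flag the player for review)
	end
	handleSwing(player, targetPosition)
end)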

I hope it helps. :smile:


While I’m sure you mean the best, this is just not a great way to start a reply. I clearly stated multiple times that I don’t think this is the issue, but that it is the only explanation I could come up with at the time. Being confrontational for no reason just isn’t productive, and it creates a hostile environment. Instead, I’d recommend starting with “Actually, I believe [blank]” or “Actually, this would only be true if it were running in parallel, which in this case it is not.”.

As for what you said regarding events: I agree that network delays are the most likely cause of this, but I don’t think constantly changing how things are done is sustainable. I think it’s better to secure it as well as possible on the server side with sanity checks etc. If you have the time to change your events often, though, go for it.


I believe answering with fake info is more inappropriate than firmly pointing out someone is wrong.

Yes, you stated multiple times that you don’t think this is the issue, when you should have stated that you don’t think what you’re saying is true in the first place.
“I don’t think this is the issue” in this context implies that what you’re saying about threads is true but might not be the actual problem the OP is experiencing.
Sorry, I didn’t mean to make you feel offended, I just wanted to indicate what you said is not true.

Sanity checks are necessary of course, and it seems like the OP has implemented them already, that’s why I wanted to suggest an additional security measure which allows for accurate detection.
Changing the remote events is sustainable as long as you’re not working in a big team of scripters.
It’s not very time-consuming either, changing the order of arguments or adding a new one for 1 remote shouldn’t take more than a minute.

I understand the miscommunication, and I agree that providing an “answer” with incorrect information isn’t helpful. Firmly pointing out that someone is wrong is perfectly fine, and while I understand you didn’t mean to offend me (and no offence was taken), I think that stating “that is incorrect” is better than saying “you don’t know what you are talking about”. I do, however, want to make it clear that I appreciate the correction :slight_smile:

Which is somewhat what I meant. I was unsure how the interrupts were handled in Studio, but I offered this suggestion as a possible cause of the problem anyway, as it is an issue one might face in multiple languages. As you pointed out, it was not correct in this context.

And yes, any and all tips are good, but changing the order of arguments like that is something I think might be an easy source of errors, as it is very easy to forget something during a mundane task like that. I think it is an excellent idea if you are facing serious exploit issues, as it will temporarily stall the exploiters and might provide an opportunity to flag players as potential cheaters, since regular players will never send events in the wrong order.


The queuing sounds like a good idea, actually. I already implemented various cooldowns for when events are expected and for when they shouldn’t fire; I’m just afraid that they’ll affect regular players… I guess at a certain point it’s necessary, as even lag can be considered unfair.

Adding a “key” argument to the remotes seems like a good idea. What made me decide to add an anti-exploit was when someone told me there was a script for my game on Vermillion… It was extremely basic, but it exploited a key weakness in one of my remotes which I hadn’t realized was still unfixed. The issue is fixed now and the exploit no longer works as well, but I know a key argument would probably help, even if only temporarily.

However, I do agree with @ifkpop on this point:

As a solo developer on this game, it would be unreliable to change this key every time I make an update, and it would be even more inconvenient to publish a change daily, as it’s then significantly harder for me to work on major updates to release later to the live game.

I’ll probably force longer cooldowns between events as a solution. Thank you everyone for the help.
