Rock-Paper-Scissors AI

1waffle1 · June 12, 2016, 8:06pm

I made a bot that reads your mind. It records your sequence of plays and predicts your next move based on similar past sequences. If you try to play randomly then it will win significantly more frequently than you because you are bad at being random. If you feed it an actual random sequence of moves (as a test to be sure it isn’t cheating) then it will not win more frequently.

https://www.roblox.com/games/432431172/Rock-Paper-Scissors

RPS.rbxl (18.7 KB)

Playing tons of games with it, it starts to beat me almost 2:1.

Crykee · June 12, 2016, 8:33pm

Can we have a competition to see who can create the biggest gap between wins and loses, with wins being dominant?

Basically meaning you have to predict the bot’s moves assuming the only thing it does is predict your moves.

IntellectualBeing · June 12, 2016, 8:34pm

I’d love to see a leader-stats display. Simply submit your score by hitting a button that records your win/draw/loss ratio.

sloomingbla · June 12, 2016, 8:38pm

It would have to be some sort of limit - something like wins out of 100 to keep things fair, considering it gets better over time.

Crykee · June 12, 2016, 8:40pm

Sounds good. 101 tries got me here haha.

1waffle1 · June 12, 2016, 8:41pm

Well seeing as you can calculate what its next move is going to be based on what your past moves has been, you can beat it 100% of the time. The better challenge would be to make it better at predicting.

FearMeIAmLag · June 12, 2016, 8:54pm

If you learn to beat it every time it will adapt your new pattern and will beat you every time until you change.

1waffle1 · June 12, 2016, 8:56pm

The winning pattern is that you do the opposite of the expected pattern. The bot learns the expected pattern from repetition. If you just simulate the bot and play against what it is going to do then you will always win.

J_J · June 12, 2016, 9:36pm

But by not doing the expected pattern, you’re setting a new expected pattern.

1waffle1 · June 12, 2016, 9:45pm

It finds repetition, not patterns.

AlgyLacey · June 12, 2016, 11:50pm

I got a 2 draws and wins in a row.
I’ll post my best ?/100 score later.

Great game!

1waffle1 · June 13, 2016, 3:47am

This is the extent of the mind-reading logic:

local strings={}
local signal=game.ReplicatedStorage.Signal
signal.OnServerInvoke=function(player,move)
	local s=strings[player.UserId]or''
	local a,b,c,p=0,0,0,''
	for i=1,#s do
		local A,B,C=a,b,c
		local v=3^i
		for _ in s:gmatch("0"..p)do a=a+v end
		for _ in s:gmatch("1"..p)do b=b+v end
		for _ in s:gmatch("2"..p)do c=c+v end
		if a==A and b==B and c==C then break end
		p=p..s:sub(i,i)
	end
	local response
	if a>b and a>c then response=1
	elseif b>a and b>c then response=2
	elseif c>a and c>b then response=0
	elseif a>b and a==c then response=0
	elseif b>c and b==a then response=1
	elseif c>a and c==b then response=2
	else response=math.random(3)-1 end
	strings[player.UserId]=move..s
	if(move==0 and response==1)or(move==1 and response==2)or(move==2 and response==0)then return false end
	if move~=response then return true end
end

It records your plays. rock/paper/scissors are encoded as 0/1/2 and the string of everything you have done is searched for copies of the current last few moves. It searches through every length and predicts that your next move is going to be the same next move that you made then. Longer matches are weighted higher, so if you played the exact same sequence of 10 moves previously then it is probably going to predict that your next move is going to be the 11th move in that previous sequence.

1waffle1 · June 13, 2016, 4:00am

I added more information to the place.
I know it isn’t cheating but when I start to see this

it really goes to show how hard it is to be random.

1waffle1 · June 13, 2016, 5:32pm

I’ve made it slightly more accurate by reducing the value of older repetitions, so when it finds something at point “x” in the string, it divides the value of that repetition (3^length) by x.

sauce RPS.rbxl (21.6 KB)

also now you can choose rock/paper/scissors by pressing 1/2/3 on your keyboard instead of clicking on them.
key mashing lets you press buttons a lot faster

DataBrain · June 14, 2016, 9:16pm

This is quite ingenious. Unless people are using complex mental algorithms or random number generators, it seems only logical that they would recall patterns that seem random to them, but are actually just repeated sequences of past patterns in memory. Give an AI time to analyze these patterns, well, the numbers speak for themselves.

I wonder if there is any way to improve the ability for the AI to predict old patterns beyond simple sequences of choices.

FearMeIAmLag · June 14, 2016, 9:47pm

This seems completely unbalanced:

1waffle1 · June 14, 2016, 11:58pm

Repetition is a good generalization. If you choose rock a ton more than you choose scissors then it is going to choose paper a lot more frequently. If you have a tendency of choosing paper every time after you choose rock, it will choose scissors every game after you choose rock. If you have a habit of some other longer sequence then it will follow that one instead if it sees that you’re blatantly repeating yourself. What it wouldn’t see would be if you were actively avoiding repetitive behavior. That would cause you to beat it.

DataBrain · June 15, 2016, 12:00am

Now that you’ve released your AI’s algorithm, I want to write my own AI which knows what data has been recorded by your AI, and will specifically attempt to beat your AI and its patterns of trying to guess my patterns, and then countering that guess.

ConvexHero · June 15, 2016, 12:37am

You could model it as a group of estimators for certain lengths, and estimate the mean/variance of the predictions of each. From here you could generate a random number and select a viable estimator at random based on some function of the mean and variance. It probably makes sense to stop after a suitable window size as I get the feeling the variance of very long sequences would be quite high. If this is the case, you can just keep track of all subsequences of length N.

There is probably a handful of tuning parameters and bias-variance tradeoffs. For example, having some estimator based on the specific subsequence seen vs all sequences of that length.

There might also be some additional estimators based on assumptions that some patterns are more/less likely to occur if they previously resulted in success/failure.

Sharksie · June 15, 2016, 12:46am

Stole databrain’s idea

RPS.rbxl (19.2 KB)