Self-Driving Neural Network Cop Cars

I’ve been working on neural networks for quite some time now (5-6 months full time, ~55 hrs a week).

Tonight I’ve finally reached a stage where I’m satisfied with their performance. I’m going to show you a quick demo of the finished product, then go over how I got there.

Video Demo (viewing from the cop’s perspective): Neural Network Cop Car's Perspective (Vehicle Simulator Roblox Demo) - YouTube

Here is a list of my progress to show how I went about doing this. Skip to the end for some finishing remarks & questions.




6 months ago I started my progress with neural networks & genetic algorithms: https://twitter.com/ScriptOnRoblox/status/1007459441073983488

I started with some more simple stuff so I could understand what I was doing before moving on to the meatier stuff: https://twitter.com/ScriptOnRoblox/status/1007464071145218049

Soon™ I learned about tweaking inputs: https://twitter.com/ScriptOnRoblox/status/1008928955083051008

There’s a lot that happened between here and moving on to more realistic cars, but here’s a progression:

Stage 1: Basic line following. I manually try to trigger brakes and some other things. AI handles driving along the route.

Stage 2: Neural net seems to start understanding that its velocity is important… drifting begins!

Ok so there’s actually quite a long time period that’s happened between stage 1-2-3. I spent a LOT of time on different mutation algorithms & networks. I probably spent another month or two of working every day before I had the cars doing some of this more advanced stuff, and even at that I’m still triggering the e-brake manually.

Stage 3: Combined with a pathfinding GPS on the VehicleSim map, I was able to record a demo!

Stage 4: Cars can now control handbrakes! Very fun to watch them learn how to nail corners:

Stage 5: Realized I could train a much more effective AI if I used ReLU (only positive outputs; negative values are treated as zero).

Things quickly began to ramp up at this point. Learning was faster and I could now train for some more complicated tasks, like flight! The airplane shown uses airfoil & flies fairly realistically.

Stage 6: Realized I needed a more effective mutation algorithm. Made some of my own momentum-based gene mutation modifiers that essentially made training more consistent (and much quicker).

Decided to go back and see if I could get more impressive results with the new algorithm:

I was right. Here’s the new algorithm learning to fly its very first plane!

And here it is mastering the plane + a cool visualizer to see what the AI sees.

Stage 7: Realized once again that I was doing it all wrong. Changed a bunch of stuff within the algorithm. Results were excellent so I added some more variables (like rays to avoid hitting obvious objects).

Stage 8: Implementation! Combined everything I’ve learned, training data + new GPS algorithms and I started to have cop cars!

Stage 8.5: After some gameplay hours I realized I had a lot of areas to improve and worked on things like collisions, AI cars getting stuck on things, and awkward driving situations (sometimes they just didn’t wanna throttle lmao).

Stage 9: Realized cops aren’t very effective if they just chase you. If I’m going to make this fun I need a challenge! I spent a few days working on different prediction algorithms for where a potential collision could occur between the AI car & chased car. Fed that into the GPS router & things started heating up fast.
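The post does not say which prediction algorithms were tried. As a minimal sketch of one common approach, assuming flat 2D positions, a target that keeps a constant velocity, and a chaser with a constant top speed, the earliest possible meeting point can be found by solving a quadratic for the intercept time. The function name `intercept_point` and its parameters are hypothetical and not taken from the game.

```python
import math

def intercept_point(chaser_pos, chaser_speed, target_pos, target_vel):
    """Earliest point where the chaser can meet the target, or None.

    Assumes the target keeps a constant velocity and the chaser drives
    in a straight line at a constant speed (both are simplifications).
    """
    rx = target_pos[0] - chaser_pos[0]
    ry = target_pos[1] - chaser_pos[1]
    vx, vy = target_vel
    c = rx * rx + ry * ry
    if c == 0.0:
        return tuple(target_pos)  # already at the target
    # Solve |r + v*t| = s*t, i.e. (v.v - s^2) t^2 + 2 (r.v) t + r.r = 0
    a = vx * vx + vy * vy - chaser_speed ** 2
    b = 2.0 * (rx * vx + ry * vy)
    if abs(a) < 1e-9:
        # Equal speeds: the equation is linear in t
        if abs(b) < 1e-9:
            return None
        candidates = [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return None  # the target cannot be caught
        root = math.sqrt(disc)
        candidates = [(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)]
    times = [t for t in candidates if t > 0.0]
    if not times:
        return None
    t = min(times)
    return (target_pos[0] + vx * t, target_pos[1] + vy * t)
```

The returned point could then be handed to a router as the destination instead of the target's current position, which is the general idea described above.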

Cool car chases are now possible!

Remapped the entire node system on Vehicle Sim because I realized the current one didn’t support real-time chases very well (would awkwardly send the car to the nearest curb, THEN pursue the player…)

Yay more consistent chases!

At this point the cops could find you on the map, route to you, drive/drift their way as fast as they can to your location, & attack you.

It still didn’t feel challenging. Cops at close quarters would work pretty well but I needed them to find you faster and collide with you more accurately. After spending a week reworking some of the GPS algorithm internals I finally came up with something that felt challenging, which is where we are now :smiley: !


Can I play it? Not yet. I still have to add a system to trigger the cops. So far the current plan is to have them casually patrol around town & only get angry when players interact with them. Ideally you’ll be chased down by November!

What is a neural network? To put it simply, a computer that randomly guesses what the solution to a problem could be (or searches for specific data, patterns, similarities, etc…). It guesses thousands (sometimes even billions!) of times, and after each set of guesses (say, 100 guesses) it’ll try to see why its correct answers were right & why its wrong answers were wrong by essentially moving around if statements & the thresholds that trigger them internally.

In my case this is done through forward propagation with inputs & outputs. Say there are 10 inputs: they represent things like how fast the car is going, how far it is from the center line it should be following, where the next turn is, etc… Outputs represent things like SteerLeft, SteerRight, Throttle, etc… Internally it has a list of “nodes” in multiple layers that have different numbers attached to them called bias and weights. I don’t wanna go too into detail with bias & weights, but think of bias as adding a number to an input, and weights as a gradient between true → false. This gradient behavior is what makes neural networks so powerful. They can recognize a set of similar inputs & patterns. To do this by hand would be impossible in many cases.
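To make the forward propagation idea concrete, here is a minimal sketch in Python (the project itself is written in Lua). The layer sizes, the random starting weights, and the names `make_layer` and `forward` are assumptions for illustration. ReLU is applied on every layer, in line with the "ReLu without softmax on the last layer" reply further down.

```python
import random

def relu(x):
    # Negative sums become 0, positive sums pass through unchanged
    return x if x > 0.0 else 0.0

def make_layer(n_in, n_out, rng):
    """One layer: a row of weights and one bias per output node."""
    return {
        "weights": [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
                    for _ in range(n_out)],
        "biases": [rng.uniform(-1.0, 1.0) for _ in range(n_out)],
    }

def forward(layers, inputs):
    """Feed the inputs through every layer and return the final outputs."""
    values = list(inputs)
    for layer in layers:
        values = [
            relu(sum(w * v for w, v in zip(row, values)) + bias)
            for row, bias in zip(layer["weights"], layer["biases"])
        ]
    return values

rng = random.Random(0)
# 3 inputs -> 4 hidden nodes -> 3 outputs (sizes chosen arbitrarily)
network = [make_layer(3, 4, rng), make_layer(4, 3, rng)]
# Hypothetical inputs: speed, offset from the center line, next turn direction
inputs = [0.8, -0.2, 0.5]
steer_left, steer_right, throttle = forward(network, inputs)
```

Training then amounts to changing the numbers in `weights` and `biases` (by mutation in a genetic algorithm, or by gradient descent) until the outputs drive the car well.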

How did you learn neural networks? This video helped me establish the basics: But what is a neural network? | Chapter 1, Deep learning - YouTube. This second video alone pretty much explained how to do it for me: Gradient descent, how neural networks learn | Chapter 2, Deep learning - YouTube. I went back to it multiple times for reference while coding (I’ve probably seen the entire video 6 times total now).


It’s been a long journey learning something I knew nothing about, but now I’m ready to move on to my next neural network project. Feel free to ask me anything else. Thanks for reading!

182 Likes

Certainly very interesting.

I’m just flabbergasted that Studio didn’t crash, let alone that this is playable. I would certainly never attempt a remotely useful neural network implementation on an engine that doesn’t support 64-bit. I’d assume the data needed for a reliable experience would far exceed its current capabilities.

11 Likes

Time to refer to one of my first devforum posts. Oh where did I drop it…AH here (dusts link). Fits perfectly:


(needs a revamp)

Your creations are next level mah dude. GG

7 Likes

I’ve gotta ask, if you had to spend so much time on it anyway, why not just make conventional AI?

3 Likes

How are you training the network to navigate and collide more accurately? My first thought is an adversarial network where you also have another car that’s trying to avoid all of the police cars.

1 Like

Very cool stuff! Congrats on the great progress.

Would you mind expanding a little on how you went about implementing this on Roblox?
Are you using a NN library? Or did you implement your own tools?

11 Likes

Very cool. Keep up with the good work

6 Likes

I’m very impressed with this so far.

Are you going to have something similar to Need for Speed Most Wanted 2005’s police system? It looks like it so far!

1 Like

Low-key I could have. I’m not sure if it would have done as great in some scenarios, though. The nice thing about a NN is that I don’t have to think about anything but the path I give it once training is done. Drifting, turns, etc… are all handled by it and work fairly well at any speed. We also have the added benefit of now having a ‘perfect driver’ so we can put different cars in on the same route and use the same training data. Now we can see what a perfect driver is capable of doing with the car, and tune the car until it acts in a satisfactory way. (Ex: Lambo has too strong drifting for the AI to figure out, tune the lambo stats until it works right)

This one was actually pretty easy :smiley: I just have two rays. One on each side of the car that move forward as the car moves faster. They can help it avoid obvious things like walls and most of the time foliage (which it can run over and destroy anyway so it’s not the biggest issue). I tell the NN what % of the Ray made it and it does the rest. (Ex: I draw a ray 10 studs, it hits at 6 studs in so I feed it 0.6 as an input)
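The ray input described above can be sketched as follows, in Python rather than the project's Lua. The raycast itself is engine-specific and left out. The helper names, and the rule that lengthens the ray with speed (`base_length`, `seconds_ahead`), are assumptions, since the post only says the rays move forward as the car moves faster.

```python
def ray_input(ray_length, hit_distance=None):
    """Fraction of the ray that 'made it' before hitting something.

    Returns 1.0 when nothing was hit (hit_distance is None).
    """
    if ray_length <= 0:
        raise ValueError("ray_length must be positive")
    if hit_distance is None:
        return 1.0
    clamped = max(0.0, min(hit_distance, ray_length))
    return clamped / ray_length

def ray_length_for_speed(speed, base_length=10.0, seconds_ahead=0.5):
    # Hypothetical scaling rule: look further ahead at higher speed
    return base_length + speed * seconds_ahead

# The example from the post: a 10 stud ray that hits 6 studs in
left_ray = ray_input(10.0, 6.0)
```

Keeping the value between 0 and 1 means it sits on the same scale no matter how long the ray currently is.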

I just figured it out myself. Turns out the algorithm can be done in like < 100 lines of Lua code. It just took me a long time to figure out what those lines and variables had to be >_<

Actually yes :joy:

13 Likes

Congrats as well, @ScriptOn on your award for Best NeuralNetGuy2018.

15 Likes

I thought so!

You going to have radio chatter and stuff like that?

3 Likes

Thanks for sharing your whole experience in such detail. Also, thanks for referencing the tutorials, that channel is always great!

2 Likes

I am in love! Reading this made me realize how insane neural networks are and how powerful they can be! I will definitely be studying this from now on until I have a great understanding of it. Thanks for posting this.

4 Likes

Wow😯! That looks awesome. How many lines did you use? That just looks so unbelievable. That game you’re making is going to look awesome.

May I ask which activation function did you use?
I hear that ReLU or Leaky ReLU are the best nowadays, along with a softmax function in the last layer since ReLU only applies on the hidden layers. I’m still trying to figure out how to back propagate through it though.

If you do use ReLU and Softmax, may I know how you back propagate through it?
Thanks in advance, inspiring work btw!
MXKhronos

ReLu without softmax on the last layer.

3 Likes

You backprop through all activation functions the same way, just one big chain rule through all the derivatives to work out what change to a node’s weight would correct the error measured at the node output. Activation functions all have simple derivatives, and ReLU is simplest of all, it’s just piecewise defined as f(x)=0 for x<0 and f(x)=x elsewhere, so the derivative is just 0 for x<0 and 1 for x>=0. Leaky ReLU is not much more complicated, you just have some small slope for x<0, usually 0.01 so that negative values aren’t completely zeroed out, just heavily discounted. SmoothReLu is an exponential curve asymptotic to ReLU: ln(e^x+1) and its derivative is the logistic sigmoid function also commonly used as an activation function.
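The activation functions and derivatives named above fit in a few lines. This is a Python sketch for illustration only, using the same convention as the post for the ReLU derivative at x = 0. The function names are my own, and `softplus` is the SmoothReLU curve ln(e^x + 1), written in a form that avoids overflow for large inputs.

```python
import math

def relu(x):
    return x if x > 0.0 else 0.0

def relu_grad(x):
    # 0 for x < 0 and 1 for x >= 0, as described above
    return 1.0 if x >= 0.0 else 0.0

def leaky_relu(x, slope=0.01):
    return x if x >= 0.0 else slope * x

def leaky_relu_grad(x, slope=0.01):
    return 1.0 if x >= 0.0 else slope

def softplus(x):
    # ln(e^x + 1), rearranged so math.exp never sees a large positive number
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    # Logistic sigmoid, which is the derivative of softplus
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

# Numerical check that the slope of softplus at 0.5 matches sigmoid(0.5)
h = 1e-5
numeric_slope = (softplus(0.5 + h) - softplus(0.5 - h)) / (2.0 * h)
```

During backprop, each node multiplies the error flowing back into it by the matching `*_grad` value for its pre-activation input, which is the chain rule step described above.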

Softmax is something else entirely, it’s used for converting the raw output of the final hidden layer to a normalized probability distribution, typically scores for some N number of classification categories that sum to 1.0. Backprop through its summations is too involved for a devforum post, but here is a good reference derivation: The Softmax function and its derivative - Eli Bendersky's website
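For reference, softmax itself is short. The Python sketch below subtracts the largest score before exponentiating, a standard trick to avoid overflow that does not change the result. It also includes the well-known Jacobian, where the derivative of output i with respect to score j is p_i * (δ_ij - p_j). The function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(scores):
    """Matrix J where J[i][j] is d(prob i) / d(score j)."""
    p = softmax(scores)
    n = len(p)
    return [
        [p[i] * ((1.0 if i == j else 0.0) - p[j]) for j in range(n)]
        for i in range(n)
    ]

probs = softmax([2.0, 1.0, 0.1])
```

Each row of the Jacobian sums to zero: raising one score can only move probability between classes, never change the total.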

You wouldn’t use softmax for car AI, it’s for classifying inputs into a discrete set of classes (e.g. to decide if a photo is a cat, tree, doge, etc.). You’d use a regression error function for steering a car, like squared error terms: how far off course you are, in various directions, squared so that your error is guaranteed to have a minimum you can gradient-descend your way towards (think of how a parabola is like a bowl: it has a bottom, which is where you want to end up, the minimum error). Of course, very complex multivariate loss functions can have local minima too, a major issue when solving things with NNs, since a local minimum that is not the global minimum is not the lowest-error solution. Part of working with NNs is using techniques to avoid these less-optimal solutions.
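The bowl picture can be shown with a single number. In this Python sketch (the names, learning rate, and step count are arbitrary assumptions) the error is the squared distance off course, its derivative is twice that distance, and repeatedly stepping against the derivative slides down to the bottom of the parabola.

```python
def squared_error(offset):
    # How far off course we are, squared
    return offset ** 2

def squared_error_grad(offset):
    # Derivative of offset^2 with respect to offset
    return 2.0 * offset

def descend(start, learning_rate=0.1, steps=50):
    """Plain gradient descent on the one-dimensional squared error."""
    offset = start
    for _ in range(steps):
        offset -= learning_rate * squared_error_grad(offset)
    return offset

final_offset = descend(3.0)
```

With these settings every step shrinks the offset by a factor of 0.8, so it approaches 0, the bottom of the bowl. A real network does the same thing over thousands of weights at once, which is where the local minima mentioned above come from.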

6 Likes

I love your explanation! Thank you for helping!

2 Likes

Thanks a lot for the information! I got it working and now I’m very excited to apply it.

Do you think you’d ever release a version of this? Maybe release a tutorial on making something similar?

3 Likes