Heads up: I have added QLearningNeuralNetwork, but I haven't completed the documentation yet (it will take a while). In the meantime, you can look at the functions' parameters to get an idea of what it does.
It is available in the Beta version only.
Edit: The documentation for Q-Learning Neural Network is now complete.
I don't plan on implementing this (I don't have a pay-to-win game), but would it be possible to use this for the following:
Find out when a player is most likely to be willing to buy a product based on where they are, what they just won, what goal they just completed, etc.
Use the AI to find out whether certain aspects of players, such as how often they chat, what system they are on, or how good they are at the game, could be used for the same purpose as above.
Find out which aspects of the game make people happy / more willing to spend Robux, by doing something as simple as prompting them (to measure happiness), or showing a purchase prompt and noting when they complete the purchase (to measure willingness to spend Robux), while storing data about where they are in the game and what they are doing at the time.
These are obviously… morally questionable things to use the AI for (although I could definitely see some front-page games using something like this), but in a purely theoretical sense, what is your opinion?
And less related to the previous three:
Train a bot to model a certain player, or, in a sword-fighting game, learn from the top 100 players (determined by kill-to-death ratio or something like that) to create a good sword-fighting bot.
Yes. All you need to do is train the model and extract the model parameters. Then you can interpret which factors in the model parameters lead to certain predictions (see the rough sketch at the end of this reply).
Yes to all except for the “what system they are on”. Same explanation as above.
Yes. Same explanation as above.
Yes. Code has been provided in this post, though that applies if you wish to use the “Release 1.0” version. The “Beta 1.12.0” version may have slight variations in the NeuralNetwork functions.
Also, regarding my opinion on your question, I'm very neutral about it. As long as they don't use AI to harm people, I don't really care.
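By the way, to make the “interpret the factors from the model parameters” part a bit more concrete, here is a rough sketch of what I mean. The feature names, the weight values, and the exact shape of the table returned by getModelParameters() are all assumptions here, so check the documentation for the version you're using:

```lua
-- Stand-in model so this snippet runs on its own; in an actual game this would be
-- the library's trained model object.
local model = {}

function model:getModelParameters()
	-- One weight per feature; these numbers are made up purely for illustration.
	return { { {0.8}, {1.5}, {0.2}, {-0.4} } }
end

-- Hypothetical feature order used when building the data set: each row of the
-- feature matrix would be one player snapshot, and the label would be 1 if the
-- player bought something shortly afterwards, 0 otherwise.
local featureNames = {"AreaId", "JustWonItem", "GoalsCompleted", "SessionTime"}

local ModelParameters = model:getModelParameters()
local weightMatrix = ModelParameters[1]

-- Features with larger absolute weights influence the "will buy" prediction more.
for featureIndex, featureName in ipairs(featureNames) do
	print(featureName, weightMatrix[featureIndex][1])
end
```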
What a coincidence! I'm also trying to create sword-fighting AIs but have had no luck. One thing I can definitely suggest, though, is looking into the self-play algorithm. What I learned from the self-play videos I've watched is that an AI that plays against an opponent that is too difficult, e.g. a hard-coded AI, will fail to learn successfully. The AI needs to play against an opponent of a similar skill level in order to learn successfully. And what better opponent than itself!
I would also suggest learning the basics of reinforcement learning if you haven't already. Once you've got the basics down, choose an algorithm that fits your problem and try to implement it.
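To illustrate the self-play idea, here's a rough scheduling skeleton. The runEpisode body, the snapshot interval, and the idea of keeping two model instances are all placeholders of mine; getModelParameters()/setModelParameters() are the library's BaseModel functions:

```lua
local SNAPSHOT_EVERY = 50 -- refresh the opponent every N episodes (made-up number)

local function runEpisode(learnerModel, opponentModel)
	-- Placeholder: play one sword fight where the learner updates its weights
	-- and the frozen opponent only acts without learning.
end

local function trainWithSelfPlay(learnerModel, opponentModel, totalEpisodes)
	for episode = 1, totalEpisodes do
		runEpisode(learnerModel, opponentModel)

		if episode % SNAPSHOT_EVERY == 0 then
			-- Copy the learner's current weights into the opponent so both
			-- sides stay at a similar skill level.
			opponentModel:setModelParameters(learnerModel:getModelParameters())
		end
	end
end
```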
I guess trying the self-play method is the safest option here. I had the neural network fight the hard-coded sword-fighting bot, and from what I know, it would take an extremely long time for it to progress even one step.
AlphaZero for chess, developed by Google DeepMind, was trained for about 9 hours using 64 TPUs. Those 9 hours are literally equivalent to hundreds of hours of training on a general-purpose laptop like mine.
Edit: I realized that comparing chess to sword fighting is not realistic, since chess is much more complicated than sword fighting.
The main problem with my sword-fighting AIs is the reward calculation. How do you avoid sparse rewards while encouraging the AIs to walk towards each other? It sounds like a simple problem: just add a magnitude-based reward, right? Well, what if the enemy gets closer to the agent and not the other way around? This complicates things to a whole other level, because the agent can be rewarded for simply doing nothing when the enemy closes the distance on its own. I'm not exactly sure how to solve this, sadly. Maybe inverse reinforcement learning? A neural network trained to mimic human players? That would limit its exploration, but it could work. Where would you gather the data from, though?
I guess the solution might be to not add intuitive rewards. That might slow things down by a lot, but the neural network will gain a deeper understanding of sword fighting.
I would reward the neural network based on how much health it has times the negative of how much health its opponent has.
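In code, that reward would look something like this (health values are assumed to be normalized to 0-1, e.g. Humanoid.Health / Humanoid.MaxHealth; the numbers below are placeholders):

```lua
-- Reward = agent's health times the negative of the opponent's health, as described above.
-- Both values are assumed to be normalized to the 0-1 range.
local function calculateReward(agentHealth, opponentHealth)
	return agentHealth * (-opponentHealth)
end

print(calculateReward(1, 0.5))  -- agent at full health, opponent at half health: -0.5

-- The reward peaks at 0 when the opponent's health reaches zero. If this product
-- form turns out to learn poorly, a difference (agentHealth - opponentHealth) is
-- another common shaping choice.
```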
Here is my attempt using a genetic algorithm, which failed completely.
(I trained it for 12 hours straight, encoded the neural network into a string, woke up the next day, decoded the string back into the neural network, ran it, and got this result.) (Ignore the error message; that came from the hard-coded bot AI.)
Probably more successful than mine. Mine experienced single-action collapse, where it chose the same action regardless of the state, and even the idle punishment I set up apparently didn't stop that. I didn't implement DQN in the conventional way, though, so maybe that's what's wrong. The conventional way is really laggy, because it requires training after each action once the experience replay buffer has reached a certain number of experiences.
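For reference, the experience replay buffer itself is just a capped list of (state, action, reward, nextState) transitions that you sample random minibatches from once it's large enough. A bare-bones version looks something like this (the capacity and batch size are made-up numbers):

```lua
-- Bare-bones experience replay buffer: stores transitions up to a fixed capacity
-- (overwriting the oldest ones) and hands back random minibatches for training.
local ReplayBuffer = {}
ReplayBuffer.__index = ReplayBuffer

function ReplayBuffer.new(capacity)
	return setmetatable({capacity = capacity, transitions = {}, nextIndex = 1}, ReplayBuffer)
end

function ReplayBuffer:add(state, action, reward, nextState)
	self.transitions[self.nextIndex] = {state = state, action = action, reward = reward, nextState = nextState}
	self.nextIndex = (self.nextIndex % self.capacity) + 1
end

function ReplayBuffer:sample(batchSize)
	local batch = {}
	for i = 1, math.min(batchSize, #self.transitions) do
		batch[i] = self.transitions[math.random(#self.transitions)]
	end
	return batch
end

-- Usage sketch: only start training once the buffer holds enough experiences.
-- local buffer = ReplayBuffer.new(10000)
-- if #buffer.transitions >= 1000 then
--     local miniBatch = buffer:sample(32)
--     -- train the network on miniBatch here
-- end
```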
@Cffex and @ImNotKevPlayz, let me give you guys some tips. If you are planning to do long-term training, take advantage of BaseModel's getModelParameters() and setModelParameters() functions, which are inherited by the reinforcement learning neural networks.
The getModelParameters() function returns the tables of matrices that hold all the weight parameters used by the neural network models.
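For example, to carry training across sessions you could do something along these lines. The DataStore part is standard Roblox; the JSON step assumes the parameter matrices are plain nested arrays of numbers, so adjust the serialization to your own setup if they are not:

```lua
-- Sketch of saving / resuming long-term training with getModelParameters() and
-- setModelParameters().
local HttpService = game:GetService("HttpService")
local DataStoreService = game:GetService("DataStoreService")

local modelStore = DataStoreService:GetDataStore("SwordBotModel") -- example store name

local function saveModel(model, key)
	-- Turn the tables of weight matrices into a string and store it.
	local encoded = HttpService:JSONEncode(model:getModelParameters())
	modelStore:SetAsync(key, encoded)
end

local function loadModel(model, key)
	-- Load the string back and restore the weights into the model.
	local encoded = modelStore:GetAsync(key)
	if encoded then
		model:setModelParameters(HttpService:JSONDecode(encoded))
	end
end
```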