Machine Learning to stop bad (ToS Violating) clothing

Hello everyone!

The Project - Toni

As of today, I have started a project that targets clothing on Roblox that violates the specific clauses covering:

  • Content that depicts, strongly suggests or explicitly describes sexual acts
  • Nudity
  • Sexually suggestive assets or bundles

The way I am going about this is quite simple: I am using TensorFlow to build a model that can determine whether clothing is safe for Roblox or not, similar to what Roblox’s moderation team currently uses (or supposedly uses). However, I want to take this a bit further and also build up a large database of “illegal” clothing. When a player joins the game, their avatar is quickly checked, and if any asset contains this content, the player is banned from the game and a report can be submitted to Roblox.
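To make the join-time check a bit more concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not the code running on my server: `check_avatar`, `fake_classify`, the 0.9 threshold, and the asset IDs are all hypothetical, and the real classifier would be the TensorFlow model scoring each asset's image.

```python
from __future__ import annotations

from typing import Callable, Iterable


def check_avatar(
    asset_ids: Iterable[int],
    blocklist: set[int],
    classify: Callable[[int], float],
    threshold: float = 0.9,  # assumed cut-off, would need tuning
) -> list[int]:
    """Return the asset IDs on this avatar that should be treated as unsafe."""
    flagged = []
    for asset_id in asset_ids:
        if asset_id in blocklist:
            # Known-bad asset: no need to run the model again.
            flagged.append(asset_id)
        elif classify(asset_id) >= threshold:
            # Newly detected: remember it so the next lookup is instant.
            blocklist.add(asset_id)
            flagged.append(asset_id)
    return flagged


# Stand-in for the real model: pretends asset 222 scores 0.97 "unsafe".
def fake_classify(asset_id: int) -> float:
    return 0.97 if asset_id == 222 else 0.05


known_bad = {111}
print(check_avatar([111, 222, 333], known_bad, fake_classify))
print(sorted(known_bad))
```

The database lookup comes first so the model only runs on assets that have not been seen before, which is what would keep the check quick when a player joins.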

Where have I gotten?

Well, as of right now I haven’t gotten super far. I have made a rough design that you can test at my little server. It is nowhere near perfect or even finished, but if you try something very obvious against something that is very obviously fine, you can see that it does detect it with some small level of accuracy (there are only 200 samples so far).

I have now been able to make it so that in Roblox it’ll check every Graphic a user has and remove any that were flagged by the system. In the future these IDs will be logged somewhere.

Where will this lead?

In time (and I hope to update this), I will be adding the script you can put in your game that removes the bad items and sends a report to Discord for logging. It should also be a lot more accurate and even able to detect things beyond just t-shirts, such as other classic clothing.

Feedback

Right now I also want to know what I should do, whether there are any tips or tricks, and what everyone thinks. Would people use this or not? Give me all the feedback you have.

Prototype

Hi, after a bit of testing, a small prototype has been made that you can test in your games if you so choose. The code file is listed here:
Toni.rbxm (4.1 KB)

Example

Sounds good, but your program is not at all good at detecting suggestive imagery! I tried a few such images and it labelled them as “safe”. I do acknowledge that it is still in beta, so if you need someone to test samples, you can ask me.

As of right now, all it’s trained on is images like these. These types of t-shirts from Roblox would get flagged, but it doesn’t have nearly enough samples to be actually accurate, as I am still trying to gather a wide range of them.

Adding onto that, images must be formatted like Roblox’s PNGs, since in theory it would grab the image itself when trying to determine if an account is wearing a bypass item.
Example:
image

Ah, I see. Well, then you should also train it on those “strange neko games” or “anime girl games”. Those games are flooding the site and they are just awful.

Indeed. Right now I just want to target basic image detection across all the classic clothing types, and then I will be sure to expand. Because it’s the easiest, I want to tackle these shaded clothing items first and see what I learn along the way. But feel free to send them to me if you believe they are of note.

wow. i think this is great, doing what roblox doesn’t seem to be doing.

although ofc i would assume you need way way more training samples, as it detected an image of me playing deepwoken (hallowtide is goated) as unsafe content :sob::

seems you’ve already talked about this but just stating what i’ve seen from it.
best of luck, hopefully it ends up working much more accurately in the near future.

Roblox already uses AI for detecting bad content and tries to automate moderation as much as possible. But Roblox also has millions of daily active users, which means everything has to be perfect; otherwise false bans can occur, which will eventually be a bad look for the company.

When it comes to big numbers, stuff like this can falsely detect. I don’t get why people think they can do a better job than Roblox. Roblox has millions of active users, and stricter moderation means more possibilities for false detections.

If moderation were this strict on Roblox, there would be millions of false bans.
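To put rough numbers on that, here is a small back-of-the-envelope calculation in Python. Every figure in it is made up for illustration (they are not Roblox's real stats or this model's measured rates), and `expected_flags` is just a hypothetical helper name.

```python
def expected_flags(avatars_checked, violation_rate, false_positive_rate, recall):
    """Estimate how many flags are correct vs. wrong for a batch of avatars."""
    violating = avatars_checked * violation_rate
    clean = avatars_checked - violating
    true_flags = violating * recall            # violating avatars that get caught
    false_flags = clean * false_positive_rate  # clean avatars flagged by mistake
    return true_flags, false_flags


# Assumed inputs: 1,000,000 avatars checked, 0.1% actually violating,
# a 1% false positive rate, and 90% of real violations caught.
true_flags, false_flags = expected_flags(1_000_000, 0.001, 0.01, 0.9)
print(f"correct flags: {true_flags:,.0f}")
print(f"false flags:   {false_flags:,.0f}")
```

With those assumed inputs the false flags outnumber the correct ones by roughly ten to one (about 9,990 against 900), even though a 1% error rate sounds small. That is the problem with automatic bans when almost everyone being checked is innocent.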



And yes, I am aware that it won’t be nearly as good as Roblox’s, but the difference is that Roblox has to filter millions of variables. In contrast, I want to narrow down on a particular variety of things, though it’s going to take a huge dataset to make it feasible. I am aware false positives will exist, but I also want to test ideas with this and see what works and what doesn’t. Roblox’s algorithm is good, but it has to deal with an extensive range of things, whereas mine doesn’t.

As of right now it only has 200 images to train on, which is not many whatsoever; at a minimum you need 1,000 to get anywhere. So yes, in this case there would be an extraordinary amount of false positives.
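For anyone wondering how the false positive problem could be tracked as the dataset grows, here is a minimal Python sketch. It is an illustration only, not the code behind the server: `false_positive_rate` and the `held_out` results are hypothetical, and the idea is simply to keep some labelled images out of training and count how many safe ones get flagged.

```python
def false_positive_rate(results):
    """results: list of (is_actually_unsafe, was_flagged) boolean pairs."""
    false_pos = sum(1 for actual, flagged in results if flagged and not actual)
    true_neg = sum(1 for actual, flagged in results if not flagged and not actual)
    safe_total = false_pos + true_neg
    # Share of genuinely safe images that the model wrongly flagged.
    return false_pos / safe_total if safe_total else 0.0


# Made-up held-out results: 4 safe images (1 wrongly flagged), 2 unsafe images.
held_out = [
    (False, False), (False, True), (False, False), (False, False),
    (True, True), (True, False),
]
print(false_positive_rate(held_out))
```

With only 200 samples in total, any held-out set is going to be tiny, so a number like this would jump around a lot between runs until there is more data.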

Not to mention this is a learning experience for me as well as everyone else. Roblox doesn’t make their algorithm public for everyone to use, whereas once mine is at a somewhat useful stage I plan to let people use it as they please.

It is best not to try and shatter people’s ideas and dreams by saying someone’s already been there and done that. Not everything has to be new to be good or to learn from. If you have other complaints, feel free to share.

Truth sometimes hurts. If you can’t accept negative feedback, then that’s more of your issue. These types of ideas have been tried many times and all of them failed.

with all this being said i wish you good luck and i appreciate the contribution

It’s not to say the feedback isn’t good. It is true and I won’t try to deny it whatsoever; I’d just say work on the approach, since constructive is better than outright damnation. But I’m not personally hurt. If anything, I want to see if I can prove you wrong, even though I am quite new to this whole machine learning field, so who knows where I’ll get. I might even see if using something similar to the GPT models is possible, but who really knows? In the meantime, I’ll be adding more and more data as I go to see where I get and what I can change, fix up, and whatnot.

If anyone wishes to aid me in this process let me know.

Update: I just tweaked the model, which should reduce a few really stupid false positives.