How does auto moderation know when an audio/image is inappropriate?

(Sorry if this isn’t the proper place to ask)

Hello devs,

So I was just sitting here and asked myself: how does auto moderation know when an audio is inappropriate? Roblox has already stated that the only time an AI is used is when it has to do with audios, decals, and things like that.

image

Since there’s no real supervision from a human, how does a robot know what’s inappropriate and what’s not when uploading these things? What do you guys think?

2 Likes

I’m pretty sure each phonetic sound has roughly the same frequency pattern no matter who says it, so they can detect curse words pretty well in unmodified audio. That’s why most bypasses apply heavy overdrive two or more times and make the audio loud: it distorts the sound just enough for the bot to miss the words, but not people.

As for pictures, my guess is they detect inappropriate images from a database. I know less about the image side of things, so that one’s just speculation more than anything.
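To make the "database" guess a bit more concrete, here is a minimal sketch of one way a near-duplicate lookup could work: a tiny "average hash" compared by Hamming distance. This is only an illustration, not something Roblox has confirmed using. The function names, the 2x2 pixel grids, and the distance threshold are all made up for the example.

```python
# Illustrative sketch only; not Roblox's actual system.
# An image is represented as a 2D list of grayscale values (0-255).

def average_hash(pixels):
    """Return one bit per pixel: 1 if brighter than the image's mean, else 0."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)

def hamming_distance(hash_a, hash_b):
    """Count the positions where two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))

def looks_like_known_image(pixels, known_hashes, max_distance=1):
    """True if the image's hash is within max_distance bits of any known hash.

    max_distance is an assumed threshold chosen for this tiny example.
    """
    candidate = average_hash(pixels)
    return any(hamming_distance(candidate, known) <= max_distance
               for known in known_hashes)

# Hypothetical "database" holding the hash of one previously rejected image.
rejected = [[10, 200], [220, 30]]
known_hashes = [average_hash(rejected)]

# The same picture, slightly brightened, still matches.
brightened = [[20, 210], [230, 40]]
is_match = looks_like_known_image(brightened, known_hashes)

# A picture with the opposite light/dark layout does not.
inverted = [[200, 10], [30, 220]]
is_other_match = looks_like_known_image(inverted, known_hashes)
```

A lookup like this can only recognise images close to ones already in the database. It says nothing about a brand-new image, so it would at most be one part of a pre-screen.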

2 Likes

Note that it says “pre-screen.” The algorithm does not moderate assets by itself; it only flags potentially inappropriate assets for review by a human.

I believe the AI screens audio mainly for copyrighted tracks, since that isn’t particularly difficult to do. I’m not sure how capable it is of detecting profanity and such.

2 Likes

Comparing hashes and looking through databases, I’d guess. These processes aren’t publicised, and for good reason: bad actors would use the details to game the system and pass through assets that should otherwise not be on the platform.
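As a rough idea of what "comparing hashes" could mean in the simplest case, here is a minimal sketch that checks an upload's SHA-256 digest against a set of digests from previously rejected files. This is an assumption for illustration, not Roblox's actual pipeline. `KNOWN_BAD_DIGESTS` and `should_flag_for_review` are hypothetical names.

```python
import hashlib

# Hypothetical blocklist: SHA-256 digests of files reviewers already rejected.
KNOWN_BAD_DIGESTS = {
    hashlib.sha256(b"previously rejected asset bytes").hexdigest(),
}

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the raw file bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()

def should_flag_for_review(data: bytes) -> bool:
    """Flag an upload whose bytes exactly match a previously rejected file."""
    return sha256_hex(data) in KNOWN_BAD_DIGESTS

# A byte-identical re-upload is caught...
reupload_flagged = should_flag_for_review(b"previously rejected asset bytes")

# ...but changing even one byte produces a different digest.
modified_flagged = should_flag_for_review(b"previously rejected asset bytes!")
```

An exact hash like this only catches byte-for-byte re-uploads. Any re-encode or small edit changes the digest, so a real system would presumably need fuzzier matching on top of it.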

Above is correct though, all assets have final adjudication by a human. Automation only helps filter the queue more easily given the volume of assets uploaded to the site on a regular basis.

1 Like

There is. Any real moderation action (game deletion, account bans, etc.) is ONLY done by a human. It’s in the ToS.

Right, which is why I said

Read and check for context.

1 Like

Right, but if the asset is inappropriate (whatever it is: image, audio, decal), banning and account action will only be taken by human moderators.

Yes, but those particular assets will first be seen by an AI before a human. This is where my question comes in: how would the AI (which is NOT a human) know when said asset is inappropriate? Of course, it would afterwards bring the asset to a human’s attention for further account action.

1 Like