Issue Type: Other
Impact: High
Frequency: Constantly
Date First Experienced:
Date Last Experienced:
Reproduction Steps:
Join any game and try to send some messages in Turkish. Google Translate helps if you don’t speak the language.
Examples:
Son zamanlarda çok fazla türk oyuncu gördüm
roblox görünüşe göre türkçe’den nefret ediyor
Expected Behavior:
I’d expect Turkish people to be able to converse at least somewhat normally.
Actual Behavior:
Some of the words in my examples clearly look like an attempt to bypass the filter in English, so it’s probably a very hard problem to fix. But still, if you watch Turkish players, they can only converse in extremely… uh… broken words. Hopefully something could be done for them. Many messages just get filtered entirely even if there’s only one questionable-looking word.
Can confirm but with a different language, Polish. The filter makes it impossible for users over 13 to talk to people with U13 accounts in Polish regardless of the language they select. The filter constantly tags simple English words if you have Polish selected in your settings to the point of me needing to constantly toggle between the two.
I used to work at HQ, so I am aware of this. Roblox has been working on internationalization for years now. The filter is a very large impediment to global adoption as-is. I don’t have any good suggestions for how to fix it, but I’m sure there are a lot of highly intelligent people at Roblox working on it. They’re probably already aware of these issues, but documenting them makes them more actionable.
Accounts under the age of 13 can barely talk in any language (except maybe Spanish). I tried typing Thai letters in Roblox on an alt under 13 and they got tagged.
At the same time, in many languages (for example Polish) there are lots of swear words that don’t get filtered at all. I believe the filter would work best if it could somehow detect what language the user is talking in and prioritize filtering based on that.
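The language-aware idea above could be sketched roughly like this. It is only a toy: the stop-word sets, the block lists, and the function names are all made up for illustration, the "blocked" words are harmless stand-ins, and nothing here reflects how Roblox's filter actually works.

```python
import re

# Hypothetical data: real stop-word lists and block lists would be far larger.
STOPWORDS = {
    "en": {"the", "is", "and", "you", "to"},
    "pl": {"nie", "jest", "się", "to", "na"},
    "tr": {"ve", "bir", "çok", "bu", "için"},
}
BLOCKLISTS = {
    "en": {"badword"},    # harmless stand-in
    "pl": {"brzydkie"},   # harmless stand-in (means "ugly")
    "tr": {"kötüsöz"},    # harmless stand-in
}


def tokenize(message):
    """Lowercase the message and split it into word tokens."""
    return re.findall(r"\w+", message.lower())


def guess_language(message, default="en"):
    """Pick the language whose stop words overlap the message the most."""
    words = set(tokenize(message))
    scores = {lang: len(words & stops) for lang, stops in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


def filter_message(message):
    """Return the guessed language and the words blocked for that language."""
    lang = guess_language(message)
    blocked = BLOCKLISTS[lang]
    return lang, [word for word in tokenize(message) if word in blocked]


print(guess_language("Son zamanlarda çok fazla türk oyuncu gördüm"))
print(filter_message("to jest brzydkie"))
```

The point of the design is that a message is only checked against the list for the language it appears to be written in, instead of every word being compared against English. Short messages with no stop words fall back to the default, which is one obvious weakness of this approach.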
The biggest reason the filter does not work well for other languages is an issue known as the Scunthorpe problem. Basically, the filter sees a string of letters inside a harmless word, treats it as a swear word, and blocks the whole thing, since it does not care about context. It is named after the fact that word filters would not let you set “Scunthorpe” as your town name (Scunthorpe is a town in England). More info can be found on Wikipedia under “Scunthorpe problem”.
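To make the Scunthorpe problem described above concrete, here is a minimal sketch comparing a naive substring filter with a whole-word filter. The block list uses mild stand-in words and both function names are made up for this example; it is not a description of any real filter.

```python
import re

# Mild stand-in words; a real block list would be much longer.
BLOCKLIST = {"ass", "hell"}


def naive_filter(message):
    """Flag the message if any blocked string appears anywhere in it."""
    text = message.lower()
    return any(bad in text for bad in BLOCKLIST)


def word_filter(message):
    """Flag the message only if a whole word matches a blocked word."""
    words = re.findall(r"\w+", message.lower())
    return any(word in BLOCKLIST for word in words)


print(naive_filter("hello classmates"))  # flagged: "hell" and "ass" are substrings
print(word_filter("hello classmates"))   # not flagged: no whole word matches
print(word_filter("what the hell"))      # flagged: "hell" is a whole word
```

Matching whole words avoids the substring false positives, but it would not help when a complete, harmless word in one language happens to be spelled like a blocked word in another.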
Roblox’s filter is specifically designed to minimize the impact of this problem. But I’m not really sure that’s what’s causing the issues here; it’s way too prevalent and often impacts messages that don’t even contain anything remotely inappropriate looking.
This applies to every language, sometimes even English. It gets a lot worse if you try talking in a different language than the one chosen in your settings.
This issue could be solved with RNNs built for NLP. These networks can semantically classify and even translate sentences. I’ve seen datasets that have classifications of toxic tweets and such. If those don’t fit, Roblox can make their own dataset, because they have millions of players chatting at the same time, which is effectively unlimited data. This also means the network can be retrained on incorrect classifications reported by users and improved over time. Furthermore, Roblox could use this to find suspicious players by classifying messages and analysing patterns and frequency.
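The feedback loop described above (classify, let users report mistakes, retrain) can be illustrated without a neural network. The sketch below swaps the RNN for a tiny bag-of-words naive Bayes classifier so that it runs with only the standard library; the class name, labels, and training sentences are all invented for the example.

```python
import math
import re
from collections import Counter


class TinyToxicityClassifier:
    """Bag-of-words naive Bayes; a simple stand-in for an RNN classifier."""

    def __init__(self):
        self.word_counts = {"ok": Counter(), "toxic": Counter()}
        self.doc_counts = Counter()

    @staticmethod
    def tokenize(text):
        return re.findall(r"\w+", text.lower())

    def learn(self, text, label):
        """Add one labelled message (initial data or a user-reported fix)."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(self.tokenize(text))

    def predict(self, text):
        """Return the label with the highest smoothed log-probability."""
        vocab = set(self.word_counts["ok"]) | set(self.word_counts["toxic"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            # Smoothed log prior for the label.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(counts.values())
            for word in self.tokenize(text):
                # Add-one smoothing, with one extra slot for unseen words.
                score += math.log((counts[word] + 1) / (total_words + len(vocab) + 1))
            scores[label] = score
        return max(scores, key=scores.get)


model = TinyToxicityClassifier()
model.learn("you are stupid", "toxic")
model.learn("stupid noob go away", "toxic")
model.learn("nice game", "ok")
model.learn("you are a good builder", "ok")

before = model.predict("stupid bug")   # misclassified as toxic on this toy data
model.learn("stupid bug", "ok")        # a user reports the false positive
after = model.predict("stupid bug")    # now classified as ok
print(before, after)
```

A real system would need far more data and a much stronger model, but the loop is the same: every reported misclassification becomes a new labelled example.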
Roblox uses a service called CommunitySift, which works as you describe, but they would rather have frustrated players than have something slip through and be followed by bad press. It could definitely stand some improvement though.
As a Russian speaker, I have noticed that the Cyrillic alphabet used by languages like Serbian, Bulgarian, Ukrainian and Russian is often filtered, whilst offensive words in these languages can get through the filter.
An example is: блать
In latinised form it is blyat, which is roughly the Russian equivalent of the f-word.
It is filtered when blyat is typed, but not when блать is.
But a lot of other words in Cyrillic languages get filtered, possibly just because they are written in Cyrillic? I don’t know.
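If it is true that the Latin spelling is caught while the Cyrillic spelling is not, one possible mitigation is to normalise both scripts to the same form before checking. This is only a sketch: the transliteration table is a simplified Russian-only mapping, the block list holds a single mild stand-in word, and the function names are made up. It is not how Roblox's filter is known to work.

```python
import re

# Simplified Russian-to-Latin mapping; real transliteration schemes differ.
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}

# Mild stand-in word ("fool"), stored once in Latin form.
BLOCKLIST = {"durak"}


def transliterate(text):
    """Lowercase the text and replace Cyrillic letters with Latin ones."""
    return "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in text.lower())


def is_blocked(message):
    """Check whole words against the block list after normalising the script."""
    words = re.findall(r"\w+", transliterate(message))
    return any(word in BLOCKLIST for word in words)


print(transliterate("дурак"))
print(is_blocked("ты дурак"), is_blocked("durak"), is_blocked("привет"))
```

With a single normalised form, a word only has to be listed once to be caught in either script, and ordinary Cyrillic text is no longer suspicious just for being Cyrillic.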
But to be honest, detecting a language, translating it to English and then checking for bypasses would be a far better method than filtering half of a language.
I am not familiar with the Google Translate API, but I am certain that it can easily be connected to Roblox.
Of course not all translations are accurate, but it would be a far better method than just blocking anything that looks like an English offensive word.
There are many cases where running something harmless in one language through a translator turns it into an extremely harsh slur, so automatic translation is definitely not a good idea.