Hello! I recently ran into the problem of detecting unrecognized characters and removing them from strings. However, the following issue occurs:
text = "�"
if text:find("�") then
	print("found") -- success
end
text = TextBox.Text -- where the symbol "�" exists
if text:find("�") then
	print("found") -- fails
end
Are there any string patterns that can detect those characters? They currently appear in emojis, so I may be able to take advantage of that?
That’s what the code snippet above tries to explain: string.find isn’t working in this case. I also tried it with the plain parameter (the one that ignores magic characters) set to true, which failed too.
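The problem is that even though � looks the same, it doesn’t always have the same value: the same symbol gets drawn for different unreadable bytes, so the bytes stored in the string can differ from the character you typed in the script.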
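To see what is actually stored, you can dump the bytes instead of looking at the rendered glyph. This is only a sketch: toHex is a hypothetical helper, and the "broken" string below is an assumed example (the first two bytes of a four-byte emoji), not the exact bytes from the TextBox in question.

```lua
-- Hypothetical helper: returns the bytes of a string as space-separated hex.
local function toHex(str)
	local parts = {}
	for i = 1, #str do
		parts[i] = string.format("%02X", string.byte(str, i))
	end
	return table.concat(parts, " ")
end

-- U+FFFD typed directly into a script is the valid three-byte sequence EF BF BD.
local replacement = "\239\191\189"
-- Assumed example of a broken emoji: only the first two of its four bytes.
local brokenHalf = "\240\159"

print(toHex(replacement)) -- EF BF BD
print(toHex(brokenHalf)) -- F0 9F

-- utf8.len returns nil plus the position of the first invalid byte
-- when the string is not valid UTF-8.
local len, badPos = utf8.len(brokenHalf)
print(len, badPos) -- nil 1
```

Both strings can show up as � on screen, but string.find compares bytes, so searching for one never matches the other.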
You could try checking the byte value of each character to see if it’s a standard one-byte UTF-8 character (below 128). It would look something like this:
local function CheckString(Str)
	local Result = ""
	-- Splitting on "" yields one-byte strings
	for i, v in ipairs(string.split(Str, "")) do
		-- Keep only single-byte (ASCII) characters
		if string.byte(v) < 128 then
			Result ..= v
		end
	end
	return Result
end
print(CheckString("abc�d"))
-- Prints abcd
(also written in devforum so it’s untested)
The only issue is that this limits you to one-byte characters, so it would also remove characters like À, È, Ì.
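If you want to keep valid multi-byte characters like À and only drop the broken bytes, one possible approach is to lean on utf8.len, which reports the position of the first invalid byte. This is an untested-in-Studio sketch; stripInvalidUtf8 is a hypothetical name, and it assumes the unrecognized characters really are invalid UTF-8 bytes.

```lua
-- Hypothetical helper: removes bytes that are not part of valid UTF-8,
-- keeping every valid character (including multi-byte ones).
local function stripInvalidUtf8(str)
	local pieces = {}
	local start = 1
	while start <= #str do
		-- utf8.len returns a count if the rest is valid,
		-- or nil plus the position of the first invalid byte.
		local len, badPos = utf8.len(str, start)
		if len then
			-- Everything from start onward is valid: keep it and stop.
			table.insert(pieces, string.sub(str, start))
			break
		end
		-- Keep the valid run before the bad byte, then skip that byte.
		table.insert(pieces, string.sub(str, start, badPos - 1))
		start = badPos + 1
	end
	return table.concat(pieces)
end

print(stripInvalidUtf8("abc\240\159d")) -- abcd
print(stripInvalidUtf8("\195\128\255\195\136")) -- ÀÈ
```

Note that a genuine U+FFFD character (bytes EF BF BD) is valid UTF-8, so this leaves it alone; it would need a separate gsub if you also want it gone.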
Well, go to the lookup table where you copied them from. If you are a Mac user, it was probably Western (Mac Roman). Nobody actually uses ASCII; we just use the first few characters of it. At least, that was what I discovered when I was researching this a while back.
Then look for them there. The first should be, at least for me,
It seems that the issue was happening with certain emojis (like the Italy emoji). The TextBox cursor was sitting in the middle of certain emojis, causing them to break in half when space was pressed (showing the unrecognized characters they are made of). The solution was to manually move the cursor one step forward (TextBox.CursorPosition += 1).
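For anyone hitting the same thing, a slightly more general version of that fix might be to push the cursor forward until it is no longer inside a character. This is a sketch under the assumption that CursorPosition counts bytes, with position 1 sitting before the first byte; snapToCharStart is a hypothetical helper, not an engine API.

```lua
-- Hypothetical helper: moves a 1-based byte position forward until it no
-- longer points at a UTF-8 continuation byte (10xxxxxx, values 128..191).
local function snapToCharStart(str, pos)
	while pos <= #str do
		local b = string.byte(str, pos)
		if b < 128 or b > 191 then
			break -- ASCII or a lead byte: a safe place for the cursor
		end
		pos = pos + 1
	end
	return pos
end

-- "a", one four-byte emoji code point, then "b"
local sample = "a\240\159\135\174b"
print(snapToCharStart(sample, 3)) -- 6 (jumps past the rest of the emoji)
print(snapToCharStart(sample, 2)) -- 2 (already at the start of a character)

-- Assumed usage in a LocalScript, where textBox is your TextBox:
-- textBox.CursorPosition = snapToCharStart(textBox.Text, textBox.CursorPosition)
```

This only guards against splitting the bytes of a single code point. A flag emoji is made of two code points, so the cursor can still land between them, but that leaves valid text rather than unrecognized characters.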