Context:
I’m creating a simple loop that iterates over each character of a string, checks whether it’s in a table of accepted characters, and prints any characters that are not accepted.
Here’s the relevant code snippet (note that there are no undefined variables and that this code worked fine for years until I started testing it against new characters):
-- Checks the character/substring against each acceptable character --
local iterator1 = 1;
local iterator2 = iterator1;
local substring
local message2 = "";
local i = 0;
while i < length do
    substring = string.sub(msg,iterator1,iterator2);
    iterator1 = iterator1+1;
    iterator2 = iterator1;
    local isAnAcceptedChar = false;
    -- Check the character/substring against each acceptable character --
    for i,v in pairs(Accepted) do
        if v == substring then
            isAnAcceptedChar = true;
        end
    end
    if not isAnAcceptedChar then
        numNonAcceptedChars = numNonAcceptedChars + 1;
        print("Rejected character: " .. substring);
    end
    i = i+1;
end
Additional context: The Accepted table contains all uppercase and lowercase letters of the standard 26-letter English alphabet and various punctuation characters. It also contains newly added characters from the Unicode extended Latin range (such as è).
The issue:
When I test with a sentence containing only ASCII characters, everything works normally. But when I try with one of the Unicode characters that I added to the accepted characters table, the issue arises. Here are some examples:
Test #1 - nothing weird happening here
User input: “Can I say this?”
Roblox output:
Original message: Can I say this?
-- No rejected characters
Test #2 - had an issue
User input: “Can I say è?”
Roblox output:
Original message: Can I say è?
Rejected character: �
I’m wondering how è became �. è is in the Accepted table, and Roblox printed it correctly in the original message, so the issue probably has something to do with the part where I take the substring. I’m not sure why this is happening.
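In case it helps, here is a minimal sketch of what I think is a reproduction. The names isAccepted and rejected are made up for this example, and I'm assuming the message is UTF-8 encoded, so è is written as the two byte escapes \195\168. Instead of printing each rejected one-character substring, it records its byte value:

```lua
-- Minimal reproduction sketch. Assumption: msg is UTF-8 encoded, so "è" is
-- stored as the two bytes 195 and 168 (written here as decimal escapes).
local msg = "say \195\168?"
local Accepted = { "s", "a", "y", " ", "?", "\195\168" }

-- Hypothetical helper: true if s exactly matches an entry in Accepted.
local function isAccepted(s)
    for _, v in ipairs(Accepted) do
        if v == s then
            return true
        end
    end
    return false
end

-- Walk the string one position at a time, as the original loop does.
local rejected = {}
for pos = 1, #msg do
    local substring = string.sub(msg, pos, pos)
    if not isAccepted(substring) then
        table.insert(rejected, string.byte(substring))
    end
end

print(#msg)                          -- 7 (length in bytes, not characters)
print(table.concat(rejected, ", "))  -- 195, 168
```

If that assumption holds, string.sub(msg, pos, pos) returns a single byte rather than the whole character, so it never equals the two-byte "è" entry in Accepted, and printing that lone byte would show up as �.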