Hi, I am wondering what this string pattern means? It looks very complicated.
"[^%s]+"
It looks complicated, but much less so once it’s broken down into more manageable pieces.
local quote = "Broken down into more manageable pieces."
for str in quote:gmatch("[^%s]+") do print(str) end
string.match finds the first match (occurrence) of a pattern and returns its captures, or the whole match when the pattern contains no captures.
string.gmatch returns an iterator that is called repeatedly and returns the captures from each successive match.
local str = "a b c"
print(str:match("%a"))
for capture in str:gmatch("%a") do print(capture) end
string.match returns only “a”, while string.gmatch returns all three letters, each in its own iteration (a, then b, then c).
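To make the difference concrete, here is a minimal sketch that collects the results instead of only printing them. The second half assumes a made-up input ("x=1, y=2") and hypothetical variable names, purely to show what "captures" means when the pattern contains parentheses.

```lua
local str = "a b c"

-- string.match stops at the first match.
local first = str:match("%a")

-- string.gmatch visits every match; collect them into a table.
local letters = {}
for letter in str:gmatch("%a") do
    table.insert(letters, letter)
end

print(first)                      --> a
print(table.concat(letters, ",")) --> a,b,c

-- With explicit captures (parentheses), each iteration returns the captures
-- instead of the whole match. The input string here is a made-up example.
local settings = {}
for key, value in ("x=1, y=2"):gmatch("(%a+)=(%d+)") do
    settings[key] = value
end
print(settings.x .. " " .. settings.y) --> 1 2
```

Note that the captured values are strings ("1" and "2"), not numbers.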
%s is the character class for whitespace (spaces, tabs, newlines and so on). The capitalised %S is its complement: it matches every character that is not whitespace.
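A quick way to see that the two classes complement each other is to count how many characters each one matches. This is only a sketch: the sample string and variable names are made up, and it relies on string.gsub returning the number of substitutions as its second result.

```lua
-- Hypothetical sample containing a tab, a newline and a space.
local sample = "a\tb\nc d"

-- gsub returns the new string and the number of substitutions made.
local _, whitespaceCount = sample:gsub("%s", "")
local _, otherCount = sample:gsub("%S", "")

print(whitespaceCount) --> 3
print(otherCount)      --> 4
print(#sample)         --> 7
```

Every character falls into exactly one of the two classes, so the two counts add up to the length of the string.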
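^ is called a magic character. It has two meanings:
1. At the very start of a pattern, it anchors the match to the beginning of the string.
2. As the first character inside a set, as in [^%s], it complements the set, so the set matches every character that is not listed.
$ is the polar opposite of the first meaning: at the very end of a pattern, it anchors the match to the end of the string.
print(quote:match("^Broken")) --> Broken (without the whitespace that follows)
print(quote:match("pieces.$")) --> pieces. (the unescaped . matches any character; use %. for a literal period)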
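Here is a small sketch of how the anchors and the complementing ^ behave, using the same quote as before. The local variable names are only for illustration.

```lua
local quote = "Broken down into more manageable pieces."

local unanchored = quote:match("down")   -- found anywhere in the string
local anchored = quote:match("^down")    -- the string does not start with "down"
local notB = quote:match("[^B]")         -- first character that is not "B"
local lastDot = quote:match("%.$")       -- a literal period at the end of the string

print(unanchored) --> down
print(anchored)   --> nil
print(notB)       --> r
print(lastDot)    --> .
```

The same ^ symbol does two unrelated jobs here: outside the brackets it pins the match to the start, inside the brackets it inverts the set.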
%s matches whitespace. Inside a set, ^%s complements it, so [^%s] matches any character that is not whitespace (the same as %S).
+ is a modifier that matches one or more occurrences of the preceding character class, as many as possible. For example, using class %a, match at least one upper or lower case letter and keep going until a character of a different class is reached:
print(quote:match("%a+")) --> Broken (stops at the whitespace)
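To see what the + changes, it may help to compare the same class with and without it. This is a sketch with hypothetical variable names; the word count at the end is just a convenient way to show + working together with gmatch.

```lua
local quote = "Broken down into more manageable pieces."

local single = quote:match("%a")      -- exactly one letter
local run = quote:match("%a+")        -- as many letters in a row as possible
local none = ("123"):match("%a+")     -- + needs at least one match, so this fails

-- Count the runs of letters (words) in the quote.
local wordCount = 0
for _ in quote:gmatch("%a+") do
    wordCount = wordCount + 1
end

print(single)    --> B
print(run)       --> Broken
print(none)      --> nil
print(wordCount) --> 6
```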
[ ] Finally, square brackets define a set, which combines several character classes (or single characters) into one class.
local str = "One.! Two."
print(str:match("%a+")) --> One
print(str:match("%a%p+")) --> e.!
print(str:match("[%a%p]+")) --> One.!
print(str:match("%a+%p")) --> One.
In line 3 we used no set. Luau scans the string from the start for the first place where one upper or lower case letter is followed by one or more punctuation characters.
In line 4 we created a set. The interpretation is different: find the first run of one or more characters, each of which is either an upper/lower case letter or a punctuation character.
In line 5 a set is not necessary because + is applied to only one character class, %a. The pattern matches one or more upper or lower case letters followed by a single punctuation character.
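Sets pair nicely with gmatch. As a rough sketch (the comma-separated string and the variable names are made up for illustration), a complemented set splits a string on a delimiter, and the [%a%p] set from above picks out each word together with its punctuation:

```lua
-- Split on commas: [^,]+ is "one or more characters that are not a comma".
local csv = "red,green,blue"
local fields = {}
for field in csv:gmatch("[^,]+") do
    fields[#fields + 1] = field
end
print(table.concat(fields, " | ")) --> red | green | blue

-- Letters and punctuation combined into one set.
local str = "One.! Two."
local chunks = {}
for chunk in str:gmatch("[%a%p]+") do
    chunks[#chunks + 1] = chunk
end
print(table.concat(chunks, " | ")) --> One.! | Two.
```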
The original quote can be broken into substrings another way:
local quote = "Broken down into more manageable pieces."
for str in quote:gmatch("%S+") do print(str) end
Interpretation: return every run of one or more consecutive characters that are not whitespace.
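If you want to convince yourself that "[^%s]+" and "%S+" really do the same thing, a small check like the following should work. The split helper is a hypothetical name written for this example, not a built-in function.

```lua
-- Hypothetical helper: collect every match of a pattern into a table.
local function split(text, pattern)
    local parts = {}
    for part in text:gmatch(pattern) do
        parts[#parts + 1] = part
    end
    return parts
end

local quote = "Broken down into more manageable pieces."
local withSet = split(quote, "[^%s]+")
local withClass = split(quote, "%S+")

print(table.concat(withSet, "|"))   --> Broken|down|into|more|manageable|pieces.
print(table.concat(withClass, "|")) --> Broken|down|into|more|manageable|pieces.
```

Unlike "%a+", both patterns keep the trailing period attached to "pieces." because a period is not whitespace.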
The best way to really understand string patterns is to practice and experiment.
Edits: All edits are formatting changes.
Thanks! This helped me so much!