Help with string patterns

I am learning about string patterns (in particular applied to string.gsub), but I still have a doubt which I can’t figure out: what is the - modifier (example: %l-) utilized for? Could you also provide a few examples where it would be useful to use this modifier?
Thank you in advance.

https://www.lua.org/pil/20.2.html

The dash matches the same things as the asterisk (a- can match a, aaaaaaa, or an empty string), but it prefers the shortest possible match, whereas the asterisk prefers the longest.

It may be helpful to learn RegEx before Lua patterns, because there are far more tutorials on regular expressions than on patterns. The basic idea of both is the same and you’ll gain more knowledge. In most regex flavors, the counterpart of Lua's - is the lazy quantifier *?.

The - modifier in string patterns matches zero or more occurrences of the preceding character class, always preferring the shortest possible match. The class can repeat any number of times or be absent entirely, but - takes as few repetitions as the rest of the pattern allows.

Here’s an example:

local text = "Hellooo, how are you?"
local modifiedText = text:gsub("%l+", "x")
print(modifiedText)

In this example, the pattern "%l+" matches one or more consecutive lowercase letters; the - modifier is not used here. Each matched run is replaced with a single “x”, so the output is “Hx, x x x?”.

Here’s another example:

local text = "abc12def34ghi"
local modifiedText = text:gsub("%d-", "")
print(modifiedText)

In this example, the “%d-” pattern matches zero or more digit characters (0-9), but the hyphen - makes it take the shortest possible match. With nothing after it in the pattern, the shortest match is always the empty string, so gsub only replaces empty matches with an empty string and the output is the unchanged “abc12def34ghi”. To actually remove the digits, use “%d+” (or “%d”), which gives “abcdefghi”.
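A minimal sketch comparing the two patterns on the same input, to make the gsub behaviour easy to check. The variable names lazy and greedy are only illustrative.

```lua
local text = "abc12def34ghi"

-- "%d-" prefers the shortest match, which here is always the empty string,
-- so no digit is ever consumed and nothing is removed.
local lazy = text:gsub("%d-", "")

-- "%d+" needs at least one digit and takes the longest run it can.
local greedy = text:gsub("%d+", "")

print(lazy)    -- abc12def34ghi
print(greedy)  -- abcdefghi
```

Assigning the result of gsub to a single local keeps only the first return value (the new string) and drops the replacement count.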

Thank you very much, the answer was very helpful!
I still have a doubt though, what is the difference between the - modifier and the * modifier? Because they seem to be achieving the same result.

Also slightly unrelated: is there any use for the - modifier in string.gmatch? I played around with it and it seems useless.

The - modifier and the * modifier may appear to achieve similar results in some cases, but they have different behaviors.

The - modifier in Lua patterns matches zero or more repetitions of the preceding character class, but it takes the shortest possible sequence (it is "lazy"). For example, %l- matches zero or more lowercase letters, using as few as the rest of the pattern allows. It does not complement a class; that is done with ^ inside a set, such as [^%l], or with an uppercase class such as %L.

On the other hand, the * modifier also matches zero or more repetitions of the preceding character class, but it always takes the longest possible sequence (it is "greedy"). The two differ only when something follows them in the pattern, or when it matters how much text a single match consumes.
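The difference is easiest to see when a delimiter follows the modifier. The following is a small sketch; the sample string and variable names are made up for illustration.

```lua
local s = "<b>one</b> and <b>two</b>"

-- Greedy: ".*" runs as far as it can, so the match ends at the LAST "</b>".
local greedy = s:match("<b>(.*)</b>")

-- Lazy: ".-" stops as soon as the rest of the pattern can match,
-- so the match ends at the FIRST "</b>".
local lazy = s:match("<b>(.-)</b>")

print(greedy)  -- one</b> and <b>two
print(lazy)    -- one
```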

Here’s an example:

local str = "Hello, world!"
local matches = string.gmatch(str, "%l-")
for match in matches do
    print(match)
end
-- Output: only empty lines, one empty match per position in the string

In this example, string.gmatch with %l- never consumes a letter: at every position the shortest possible match is the empty string, so the loop prints only empty lines.

%l* matches zero or more lowercase letters, taking as many as possible:

local str = "Hello, world!"
local matches = string.gmatch(str, "%l*")
for match in matches do
    print(match)
end
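-- Output: the runs "ello" and "world", plus empty lines at positions where
-- no lowercase letter starts (for example at "H" and at the end of the string).
-- The exact number of empty lines depends on the Lua version.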

In this example, string.gmatch with %l* matches sequences of zero or more lowercase letters. Each run of lowercase letters is returned as a single match (“ello”, “world”); where no lowercase letter is present, the match is an empty string. The final match is an empty string.

As for the usage of the - modifier in string.gmatch: on its own, as in %l-, it is indeed useless, because it only produces empty matches. It becomes useful when something follows it in the pattern, typically a closing delimiter. For instance, "%[(.-)%]" captures the contents of each bracketed item separately, while "%[(.*)%]" would run from the first [ to the last ]. The modifier works the same way in string.gsub, string.match, and string.find.
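A short sketch of - inside string.gmatch, collecting bracketed items from a line. The input string and the tags table are hypothetical, chosen only to show the pattern at work.

```lua
local line = "[alpha] then [beta] and [gamma]"

-- ".-" grows one character at a time until the following "%]" can match,
-- so each bracket pair yields its own capture.
local tags = {}
for tag in line:gmatch("%[(.-)%]") do
    tags[#tags + 1] = tag
end

print(table.concat(tags, ","))  -- alpha,beta,gamma
```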

Thank you very much, you made this topic very clear to me!
