Edit: Lua does not use regex! Removed that from the thread to avoid confusion.
I’m no good with Lua patterns and hadn’t a clue where to even start (other than the String Patterns Reference, which explains a lot but not everything), so I did my research on the topic (see title) and got pleasant results, just without any explanation of how to arrive at them.
I found two sources, a StackOverflow thread and a GitHub Gist, that gave me an answer to my question, so I now have the patterns [^\r\n]+ and ([^\n]*)\n? to use with gmatch. This will allow me to get the content per line of a multiline string.
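For context, this is roughly how I'm using it. It's a minimal sketch with the first pattern; the sample text and the variable names are made up for illustration.

```lua
-- Hypothetical sample text; any multiline string works here.
local text = "first line\nsecond line\nthird line"

-- Collect every run of characters that contains no \r and no \n.
local lines = {}
for line in string.gmatch(text, "[^\r\n]+") do
    lines[#lines + 1] = line
end

-- Print each line with its index.
for i, line in ipairs(lines) do
    print(i, line)
end
```

With this input the loop produces three entries, one per line.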
Both of these patterns achieve exactly what I want, but I don’t understand what they are doing and therefore can’t make a decision on which to use. While hypothetically I shouldn’t care, since they do the same thing and work, I make it a habit to know what my code is doing. I also feel as though it does in fact matter which pattern I use.
To try and better understand these patterns, I used @Halalaluyafail3’s String Pattern Analyzer Plugin, which converts a string pattern into English. It works well, provided you actually understand character classes and such for complex patterns.
For reference, these are the explanations the plugin gave back:
Pattern: [^\r\n]+
Analysis: A character set which will match anything but one of the following 1+ times, as many times as possible, giving back when necessary (greedy)
The character ‘\r’
The character ‘\n’
Source: lua - Split a string by \n or \r using string.gmatch() - Stack Overflow
Pattern: ([^\n]*)\n?
Analysis: Capture 1
A character set which will match anything but one of the following 0+ times, as many times as possible, giving back when necessary (greedy)
The character ‘\n’
Matches the character ‘\n’ 0-1 times, as many times as possible, giving back when necessary (greedy)
Source: Split string by line in Lua · GitHub
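To see whether the two really behave the same, I put together a small comparison. This is only a sketch: the helper name collect and the test string (Windows-style line endings with a blank line in the middle) are my own assumptions, not something from either source.

```lua
-- Hypothetical helper: run gmatch and gather every result into a table.
local function collect(text, pattern)
    local out = {}
    for line in string.gmatch(text, pattern) do
        out[#out + 1] = line
    end
    return out
end

-- "one", an empty line, then "three", separated by \r\n.
local text = "one\r\n\r\nthree"

local a = collect(text, "[^\r\n]+")    -- skips the blank line, strips \r
local b = collect(text, "([^\n]*)\n?") -- keeps the blank line, leaves \r in place

-- Print the length of each entry, since \r is invisible when printed.
for i, line in ipairs(a) do print("a", i, #line) end
for i, line in ipairs(b) do print("b", i, #line) end
```

From what I can tell, the first pattern gives two entries ("one" and "three") because + needs at least one character, so the blank line disappears. The second keeps the blank line, but its first two entries still end with \r because only \n is excluded. The second pattern can also match an empty string at the very end, and whether gmatch reports that as an extra trailing entry seems to depend on the Lua version, so I wouldn't rely on the count.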
Further searching has just pushed me into constant dead ends, ranging from understanding (non-)greedy matching, to why \r\n is the specific expression for a new line rather than either of those characters individually, to what the difference between a newline and a carriage return is. No idea.
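The one thing I can verify is that they are two separate characters. Here is a small sketch, assuming ASCII byte values and a made-up Windows-style string, that prints their codes and collapses \r\n into a plain \n before splitting:

```lua
-- \r (carriage return) and \n (line feed) are distinct bytes.
local cr, lf = string.byte("\r"), string.byte("\n")
print(cr, lf)

-- Hypothetical text with Windows-style \r\n line endings.
local windows_text = "one\r\ntwo\r\nthree"

-- gsub returns the new string and a replacement count;
-- the parentheses keep only the string.
local normalized = (string.gsub(windows_text, "\r\n", "\n"))
print(#windows_text, #normalized)
```

If the text is normalized like this first, a pattern that only excludes \n should no longer leave stray \r characters at the end of each line.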
I would be grateful if someone could enlighten me on this topic. To summarise what I’m asking:
- What is the difference between these patterns and why might it be relevant? What are these patterns doing that I should be aware of?
- Are these good methods to use at all or should I opt to use something else? What are the benefits of the other methods over feeding these patterns through gmatch?
- Are there any resources that dumb down Lua patterns that may be helpful for novices? I can easily go read a documentation site, and I have already considered it and have a few sources, but I'm wondering if anything really sets a starting point for investigating.