Minify & Beautify Plugin

stravant · December 29, 2014, 4:22am

I finished an upgrade to my old minification code, which has needed one for a while now: it hadn’t even been uploaded to the official site’s plugins page.

Link to the Plugin

Release notes:
[ul]
[li]No more odd bits of brokenness - There shouldn’t be any odd bits of brokenness anymore, other than a few leftover bugs that I haven’t ironed out. My old minifier didn’t really take a fully comprehensive approach to minification, so it incorrectly handled some really odd edge cases. The new plugin should correctly handle any situation, because it fully understands and handles all of Lua’s scoping rules, down to oddities like what happens when you use the same variable name multiple times in a variable / argument list.[/li]
[li]Almost optimal variable naming - The old minifier pretty much just looped over the variables in the script and gave them names one after another, incrementing the variable name to use each time. The new minifier uses a non-trivial algorithm to figure out a reasonably minimal set of variable names, making heavy use of local variable shadowing wherever possible to give as many variables as possible one-character names. (It does quite well: even when minifying itself, a 3000-line script with 600+ variables, it manages to use only one-character variable names.) Aside: Does anyone know if there are any papers on how one would go about finding the truly optimal variable naming? My naming is very good in most cases, but I can find some contrived “bad” cases where it results in a very inefficient set of names compared to the optimal one. I’m interested in whether finding the optimal naming is actually a solved problem, and if it is, whether the algorithm is reasonable to use in practice.[/li]
[li]Additional simplification passes - The new code base is set up so that I can add additional simplification passes that actually modify the AST rather than simply shuffling around whitespace and changing variable names. For instance, converting “if a then foo() end” into “_=a and foo()”.[/li]
[li]Beautifier - The new plugin includes a somewhat functional beautifier in addition to the minifier. This beautifier tries to format the code in the standard Lua style, adding correct indentation to the code. The beautifier also respects existing whitespace / comments, leaving them as they were before. It also includes a nice option for reverse engineering minified code, which renames variables into nice unique annotated strings that you can easily find & replace in your text editor of choice as you inspect the un-minified script.[/li]
[/ul]
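
To make the simplification-pass example concrete, here is a small sketch (not taken from the plugin) that compares the two forms. The names `withIf`, `withAnd`, and `countCalls` are made up for this illustration. Note that the short form stores its result in a global named `_`, so it is presumably only a safe rewrite when nothing else relies on that global.

```lua
-- Illustration only: both functions call foo() exactly when `a` is truthy.
local function withIf(a, foo)
	if a then foo() end
end

local function withAnd(a, foo)
	-- Short-circuit `and` skips foo() when `a` is false or nil.
	-- Side effect: the result is stored in the global `_`.
	_ = a and foo()
end

-- Hypothetical helper: counts how often `variant` ends up calling foo.
local function countCalls(variant)
	local calls = 0
	local function foo() calls = calls + 1 end
	-- In Lua, 0 and "" are truthy; only false and nil are falsy.
	for _, value in ipairs({ true, false, 0, "" }) do
		variant(value, foo)
	end
	return calls
end

print(countCalls(withIf), countCalls(withAnd))       -- 3  3
print(#"if a then foo() end", #"_=a and foo()")      -- 19  13
```

The rewritten statement is six characters shorter, which is the whole point of a pass like this in a minifier.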

Overview of the variable renaming algorithm for anyone interested (You can find the code in the “MinifyVariables_2” function in the LuaSyntaxToolset module under the plugin):

First, the minifier runs through the AST, finding all variable references and variable declarations, as well as scoping information, in order to build up a tree of all of the scopes / variables / variable references.

Next, the minifier assigns each variable a “used names” set, which is initially empty. These are the names that the variable cannot have, as they have been used elsewhere in a way that would collide with that variable’s usage.

Next, the minifier sorts the variables so that the least used variables come first. That is, the variables with the fewest references to them are given the shortest variable names first. But wait, that seems a bit counterintuitive? Shouldn’t the variables with the most usages be the ones that have a high priority for getting short variable names? That’s the way it seems at first, but if you actually think about it: variables with few usages tend to have very short lifetimes, and that means there is a good chance that not very many other variables overlap with those lifetimes. If not very many other variables overlap with those lifetimes, then there’s a good chance that we can end up reusing these short variable names for those other variables. I did some testing and I’m convinced that renaming the less used variables first is a better solution for all “practical” scripts. It is obviously possible to construct a case where this algorithm performs badly, but these cases are not remotely similar to any “real” code as far as I can see. TL;DR: By renaming infrequently used variables first, you tend not to actually “use up” many short variable names, since you’ll be able to “shadow over” / “shadow under” those infrequently used variables again with the same short variable names later.
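
As a rough sketch of that ordering step (the record layout with `name` and `references` fields and the function `sortLeastUsedFirst` are assumptions for illustration, not the plugin’s actual data structures):

```lua
-- Hypothetical variable records; the real plugin builds these from the AST.
local variables = {
	{ name = "player", references = 12 },
	{ name = "index", references = 2 },
	{ name = "config", references = 5 },
}

-- Least-referenced variables come first, so they get first pick of short names.
local function sortLeastUsedFirst(list)
	table.sort(list, function(x, y)
		return x.references < y.references
	end)
	return list
end

sortLeastUsedFirst(variables)
for _, variable in ipairs(variables) do
	print(variable.name, variable.references)
end
-- Prints index (2), then config (5), then player (12).
```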

Next, for each variable in the script, in that sorted order, the script chooses the shortest available variable name for that variable. It does this by finding the shortest variable name that is not already in the “used names” set for that variable.
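
A minimal sketch of what “shortest name not in the used names set” could look like. The candidate order (lowercase, uppercase, underscore, then two-character names) and the helpers `nthName` and `pickName` are assumptions made for this example; the plugin may enumerate names differently.

```lua
-- Characters allowed at the start of a Lua identifier, and in the rest of it.
local FIRST = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_"
local REST = FIRST .. "0123456789"

-- Reserved words must be skipped (relevant from two characters on: do, if, in, or).
local KEYWORDS = {}
for word in ("and break do else elseif end false for function if in local nil not or repeat return then true until while"):gmatch("%a+") do
	KEYWORDS[word] = true
end

-- Returns the i-th candidate identifier (1-based), shortest candidates first.
local function nthName(i)
	local n = i - 1
	local length, count = 1, #FIRST
	-- Skip past all names shorter than the one we want.
	while n >= count do
		n = n - count
		length = length + 1
		count = count * #REST
	end
	-- Decode n: trailing characters use base #REST, the first uses base #FIRST.
	local chars = {}
	for position = length, 2, -1 do
		local digit = n % #REST
		chars[position] = REST:sub(digit + 1, digit + 1)
		n = math.floor(n / #REST)
	end
	chars[1] = FIRST:sub(n + 1, n + 1)
	return table.concat(chars, "", 1, length)
end

-- Picks the shortest candidate that is neither a keyword nor in `usedNames`.
local function pickName(usedNames)
	local i = 1
	while true do
		local name = nthName(i)
		if not KEYWORDS[name] and not usedNames[name] then
			return name
		end
		i = i + 1
	end
end

print(pickName({}))                      -- a
print(pickName({ a = true, b = true }))  -- c
```

Scanning linearly from the first candidate is simple rather than fast; it is good enough here because, as described above, most variables end up with one-character names.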

Once it has done the renaming, it has to go through every other variable in the script that has not been renamed yet and update that variable’s “used names” set, adding this variable’s name if there is a collision between that variable and this one. Where exactly does a “collision” occur? This variable collides with other variables in the following circumstances (taken from a comment in the plugin source code). Note that “depth” in this context means how many scopes deep the variable is nested; think of it as “how tabbed in” the variable is:
[ol]
[li] At the same depth, that overlap in usage-lifetime with this one. E.g.: “local a = 5 print(a) local b = 3 print(b)”. Here a and b do not overlap in usage lifetime, since by the time “b” comes into scope, “a” is no longer needed. They can safely be renamed to the same thing.[/li]
[li] At a deeper level, which have a reference to this variable in their lifetimes.[/li]
[li] At a shallower level, which are referenced during this variable’s lifetime[/li]
[/ol]
Make sense? Probably not at first. Take some time to really think about what these cases mean and you should be able to understand why those are the collision cases.
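
If it helps, here is a hand-written illustration of those cases (my reading of the rules above, not output from the plugin). Each pair shows an original function and a renamed version; `run` is a made-up helper that records what gets logged so the two can be compared.

```lua
-- Demo helper: runs `fn` with a logger and returns everything it logged.
local function run(fn)
	local logged = {}
	fn(function(value) logged[#logged + 1] = value end)
	return table.concat(logged, ",")
end

-- Same depth, lifetimes do not overlap: both locals can share a name.
local function sameDepth(log)
	local first = 5
	log(first)
	local second = 3
	log(second)
end
local function sameDepthRenamed(b)
	local a = 5
	b(a)
	local a = 3 -- reuses `a`: the earlier `a` is never read again
	b(a)
end

-- Deeper variable, outer one is NOT referenced during its lifetime: no collision.
local function nestedSafe(log)
	local count = 1
	do
		local tmp = 10
		log(tmp)
	end
	log(count)
end
local function nestedSafeRenamed(b)
	local a = 1
	do
		local a = 10 -- shadows the outer `a`, which is not read inside this block
		b(a)
	end
	b(a)
end

-- Deeper variable, outer one IS referenced during its lifetime: collision.
local function nestedCollide(log)
	local total = 1
	do
		local step = 10
		log(total + step)
	end
end
local function nestedCollideBroken(b)
	local a = 1
	do
		local a = 10
		b(a + a) -- wrong: the outer `a` is hidden, so this logs 20 instead of 11
	end
end

print(run(sameDepth), run(sameDepthRenamed))         -- 5,3   5,3
print(run(nestedSafe), run(nestedSafeRenamed))       -- 10,1  10,1
print(run(nestedCollide), run(nestedCollideBroken))  -- 11    20
```

In every pair the logger parameter keeps its own name (`b`), because it is referenced during the lifetime of the other locals and therefore collides with them.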

sleitnick · December 29, 2014, 4:55am

Very cool!

Quick bug report:
I took a random script and ran the minifier on it, but it errored out, saying: “395:52: Unexpected symbol”

That line of code is this:

return outBounce(t * 2 - d, 0, c, d) * 0.5 + c * .5 + b

So I’m assuming it has something to do with the ‘.5’ there. Not sure why the author of this code decided to sometimes have the 0 and sometimes not.

Edit: Right before the error, it printed: <Symbol `.`> at: 395:52

stravant · December 29, 2014, 5:12am

Thanks, fixed.
Let me know of any other bugs that come up. I’ll try to get them fixed quickly.

Usering · December 29, 2014, 5:45am

When minifying:

21:44:17.056 - Plugin_197760456.MinifyPlugin.LuaSyntaxToolset:458: 142:12: `=` expected.
21:44:17.056 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 458 - upvalue expect
21:44:17.057 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 957 - upvalue exprstat
21:44:17.057 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 1272 - upvalue statement
21:44:17.058 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 1284 - upvalue block
21:44:17.058 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 986 - upvalue ifstat
21:44:17.059 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 1254 - upvalue statement
21:44:17.060 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 1284 - upvalue block
21:44:17.060 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 612 - upvalue blockbody
21:44:17.061 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 1107 - upvalue forstat
21:44:17.061 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 1260 - upvalue statement
21:44:17.062 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 1284 - upvalue block

When beautifying:

<String `"Model"`> at: 70:28
21:46:58.860 - Plugin_197760456.MinifyPlugin.LuaSyntaxToolset:528: 70:28: Unexpected symbol
21:46:58.861 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 528 - upvalue prefixexpr
21:46:58.862 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 730 - upvalue primaryexpr
21:46:58.862 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 932 - upvalue exprstat
21:46:58.863 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 1272 - upvalue statement
21:46:58.864 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 1284 - upvalue block
21:46:58.864 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 612 - upvalue blockbody
21:46:58.865 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 649 - upvalue funcdecl
21:46:58.866 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 873 - upvalue simpleexpr
21:46:58.866 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 898 - upvalue subexpr
21:46:58.867 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 927 - upvalue expr
21:46:58.868 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 680 - upvalue functionargs

stravant · December 29, 2014, 6:00am

Please pastebin the code you’re trying to minify / beautify if you find a problem. PM me the link if it isn’t something that you want to publicly share.

MrgamesNwatch · December 29, 2014, 8:22am

Very cool, minifying worked on all but one of the scripts in my game.
I narrowed it down to a single function in a module.

output:

Tokens[-3] = `.`
Tokens[-2] = `Print`
Tokens[-1] = `(`
Tokens[0] = `...`
Tokens[1] = `)`
Tokens[2] = `if`
Tokens[3] = `module`
03:04:50.673 - Plugin_197760456.MinifyPlugin.LuaSyntaxToolset:458: 21:23: `)` expected.
03:04:50.675 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 458 - upvalue expect
03:04:50.676 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 648 - upvalue funcdecl
03:04:50.677 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 1264 - upvalue statement
03:04:50.678 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 1284 - local block
03:04:50.680 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 1315 - global CreateLuaParser
03:04:50.684 - Script 'Plugin_197760456.MinifyPlugin.LuaSyntaxToolset', Line 3168 - field Minify
03:04:50.686 - Script 'Plugin_197760456.MinifyPlugin', Line 131
03:04:50.687 - Stack End

Code:

local module = {}

module.Enabled = true

function module.Print(...)
	if module.Enabled then
		print(...)
	end
end

return module

Beautifying minified scripts seems to work too, at least on the one script I tried it out on.

stravant · December 29, 2014, 8:43am

Fixed.
I forgot to hook up the code to handle variable argument functions.

appropriations · December 29, 2014, 2:07pm

That’s cool.

I installed it even though I probably won’t have any real use for it. Might use it on LocalScripts to obfuscate though.

Just a suggestion: to make it even harder to read when you minify (for obfuscation purposes), replace the spaces that you have at the end of a statement/line with a semicolon. Or at least make it an option.
Also could you add an option that adds semi-colons when you beautify? I’m so used to them, the code looks weird without them.

Elmuowo · December 29, 2014, 2:40pm

The minifier seems to minify the variable “script” like the following, leading into malfunction

Without minify:

script.Parent:Destroy()

With minify:

j.Parent:Destroy()

stravant · December 29, 2014, 9:30pm

“The minifier seems to minify the variable “script” like the following, leading into malfunction”

That should only happen if you have a local variable called “script”, or if you overwrite the variable “script”. In that case the minifier thinks that you created the variable “script”, and hence that it can safely rename it. That’s one of the caveats of using the “rename globals” option, as described under it.

Do you have an example of it doing otherwise?