Lua performance profiling API

Sharksie · September 14, 2016, 3:30am

I didn’t realize everything was on the same stack. Choice 1 would work but I’d want to name the scopes to make them easier to read, and at that point it’s just choice 2.

That’s fine. The only case where this would be a problem is when you want to yield before closing, which isn’t something that you should do anyway…

B_rcode · September 14, 2016, 4:24am

If it’s using stacks, then what’s the point of choice 2? (if I understood correctly)

spotco · September 14, 2016, 5:24am

Verification that the close is for the open that you expect it to be (otherwise will exception)

Sharksie · September 14, 2016, 5:57am

Will there be any options for extra data? I want to be able to associate a segment with specific data and read the data when I mouse over it. That would be really convenient for debugging. Without extra data I can only tell how long a set of code is taking, but with data I can tell how long a set of code is taking under specific circumstances.

einsteinK · September 14, 2016, 12:34pm

The first choice seems obvious, as it’s similar to Lua’s if-statements and all the things.
The third choice restricts funtionality a lot, while we could make that function with choice 1/2.

Although as a Lua-tionist the first choice seems the best, choice 2 seems better.
The need to end the right token will solve 70% of the issues, being devs not using it properly.
(if they use the token argument for profileend() otherwise they’re still (semi-)screwed)
It also allows implementing choice 1, although implementing choice 2 with choice 1 is also possible.

I was gonna slightly prefer choice 2 over choice 1, but then I (re)read that small note under choice 2:

(basically if choice 2 is implemented, you can act as if choice 1 is implemented)

Choice 2 for the win!

Questions related to details of the implementation

What if we do yield? Will it error, similar to yielding in a non-__call metamethod?
What if the code errors during a profiling?
Not that big of an issue if the profiling is started/stopped in the same thread.
If it isn’t, then it seems like a big issue…
What about multiple (sub)threads? (a bit related to the previous question)
Some simple stuff like calling pcall already creates another thread.
with the “can’t yield” limitation, I actually don’t expect much problems
(well, lots of problems if a profiling won’t get ended because of an error)

Also, for people that don’t like adding/removing print/debug statements a lot:
It wouldn’t be too difficult to write a plugin that can “convert” a script to a debug version.
The converted version could be parented to it, or switched with the original 'n stuff.
(with the right kind of converting, scripts could be converted as normal>debug>normal>…)
Converting would be replacing comments with certain instructions:

--@profile("Test")
while a() do b() end
-- Results in the loop being wrapped
debug.profilebegin("Test")
while a() do b() end debug.profileend()
-- Could be done for functions and all other "scope" statements

--@profile("Test")
func()
-- Results in the function being wrapped
debug.profilebegin("Test") func() debug.profileend()

Just need to edit a fixed version of Stravant’s LuaToolset stuff, which is currently used for minifying etc

spotco · September 14, 2016, 5:41pm

@Sharksie
All you can pass into the profilebegin is a string (that will group with all the other profilebegins of that string name). From the sound of it you should be able to implement what you were talking about using strings you dynamically create.

@einsteinK

Yield, Wait or Error will immediately close all active annotations (if your code isn’t consistent, we need to make the annotation stack consistent at the very least).

So, it’s pretty easy to write something like:

debug.profilebegin(“test1”)
debug.profilebegin(“test2”)
wait()
debug.profileend()
debug.profileend()

(The annotation stack state after the wait is empty, so both of those profileends are invalid).

pcall
behaves as a normal method call

einsteinK · September 14, 2016, 5:45pm

What about coroutine.yield() counting as yielding?

debug.profilebegin("test")
print("result:",coroutine.resume(coroutine.create(function()
	coroutine.yield("first yield")
end))
debug.profileend()

(and the very rare case that coroutine.yield() actually yields until the next tick)

0xBAADF00D · September 14, 2016, 6:41pm

The version that I saw of this was closing all open annotations when the internal Script::resume() (not the actual name) call finished, so coroutine.yield() will work the same as any other yield.

buildthomas · September 14, 2016, 6:52pm

You guys can be smart about this and combine Choice 1 and 2, if debug.profileend() ends by default the section that is on top of the stack (most recently started) if no section token is given (behaves as Choice 1). If a section token is given, it will behave as Choice 2.

Xsitsu · September 14, 2016, 6:59pm

Been testing some things with coroutine.yield(). I’m not super familiar with how it works with the internal thread scheduler however.

Was running the tests with this code:

local function DoStuff()
	print()
	print()
	
	local cor = coroutine.create(function()
		debug.profilebegin("test")
		
			print("STARTING PART 1")
			debug.profilebegin("part 1")
				local n = 0
				for i = 1, 5000 do
					n = n + i * i
				end
			debug.profileend()
			
			coroutine.yield("first yield")
			
			print("STARTING PART 2")
			debug.profilebegin("part 2")
				local n = 0
				for i = 1, 5000 do
					n = n + i * i
				end
			debug.profileend()
		
		debug.profileend()
	end)
	
	print("first")
	coroutine.resume(cor)
	--wait()
	print("second")
	coroutine.resume(cor)
	print("end")
end

while true do wait()
	DoStuff()
end

All of those multiplies are just there to make it actually take time.

With the wait between the resume() calls at the bottom commented out, I consistently get the following results and output:

first
STARTING PART 1
second
STARTING PART 2
end

But with that wait uncommented, I get results and output like this:

first
STARTING PART 1
second
STARTING PART 2
11:47:44.237 - debug.profileend() - No active profile annotation for context.
end

As long as you don’t coroutine.yield() across a step your profiling will be able to work.

einsteinK · September 14, 2016, 8:15pm

That’s exactly what choice 2 will be, he said that in the first post already.

buildthomas · September 15, 2016, 5:27pm

Either oops, or the post has been edited in the mean time (OP has 4 edits). Anyway, that’s cool.

spotco · September 16, 2016, 11:10pm

This is coming within a few weeks. We’ve decided on choice 1 (if you have any sort of error, it’ll print out the stacktrace and line number. this should be much more helpful for debugging than having to manually match with tokens).

Stay tuned for documentation and tutorials (both written and video form).

This is a really useful tool we use internally to measure performance and reason about task scheduler behaviour. I hope once we get all that out you’ll find it the same

einsteinK · September 17, 2016, 11:57am

We can recreate choice 2 using choice 1, so it isn’t really a problem.
(only choice 3 would limit the possibilities offered by any other choice)

What kind of errors? The default?
Or stuff where you call debug.profileend() while there isn’t any profiling going on?
(etc you yielded, so some profiling you started earlier has been stopped)

I’m just wondering how it’ll handle this:

debug.profilebegin("A")
local function idk()
	--debug.profilebegin("B")
	debug.profileend() -- ending B --senpai, notice this!
end
idk()
debug.profileend() -- ending A <-- ERROR

That would error when ending A, although the actual culprit is the B stuff.
This is why the token system would be useful.
A good alternative than manually matching with tokens:
Having profileend() have an optional parameter ‘label’ that may be nil.
E.g. profilebegin("A") profileend(x) will error if x and x ~= "A"
(don’t need to pass it an argument, but when you do, the labels should match)
While the token system might be annoying to match, the label system should be much easier.
(I mean, if you start profiling “multest”, you have to choose a spot to end the profiling for “multest”)
(Of course we can all make this ourselves, so eh, not a real need for it)

Den_S · September 17, 2016, 12:13pm

I’d still go for choice 1, it’s so much easier to use it and you do not have to keep track of annoying tokens all the time.

I regard this as acceptable too:

debug.profilebegin('epicmath')
-- magic
debug.profileend('epicmath')

It’d avoid the need to store and pass tokens, yet forces some “non-invasive” clarity to your profileend-call.

Disadvantage: renaming the sections is a pain (which is why I still prefer option 1).

einsteinK · September 17, 2016, 12:18pm

I don’t see renaming sections happening that much.
Also debug.profileend() would work, acting as choice 1.

buildthomas · September 17, 2016, 12:24pm

Yes but the point of the string in profileend would then be

Den_S · September 17, 2016, 12:24pm

Yea, that’s true. It’s either way not that bad if you have to rename sections (so still acceptable and I would probably use it myself for clarity). It’s basically the same pain you’d have if you would use option 2 and wanted to rename the token variable.
The calls should be somewhat “nearby” (in LOC) to each other anyway.

Basically making the closing string optional? That’d make the API so vague in my opinion.

einsteinK · September 17, 2016, 12:24pm

Gotta stay consistent

Den_S · September 17, 2016, 12:26pm

Well, that’s pretty important in my opinion. It’s like another tabs vs spaces!