Time complexity of string.sub(str, i, i)?

What is the time-complexity of string.sub when singled around a specific character? I’d assume that it would start at the specified index, and only continue through to the third argument (making it effectively constant in the case of just one character). However, I can’t be sure, as I’m unable to find documentation regarding this anywhere. If string.sub has more overhead than expected, is there any more efficient means of retrieving a single character from a string?

In asking, I’m not so much looking for an ‘I think’ answer, so much as I am an ‘I know because’ answer. So if anyone could help clear this up for me, it would be much appreciated. Thanks!

1 Like

The performance depends almost not at all on how long it takes to index (constant time) or copy the since character (also constant time), and more on how the memory gets allocated in order to store the result, which you can’t control. What application do you have where you are concerned about the performance?

3 Likes

Agreed and the memory allocation part is faster than usual anyways:

@OP The garbage collector might choke if you split a 10kb string into characters but sure the time complexity is constant

2 Likes

If I wish to iterate over a string using string.sub, the time complexity of that function, is something that I would very surely be better off knowing. As, if the operation depends on the size of the string, or the placement of the characters therein, it could mark the difference between lots of time, and very little time. You claim that it is constant–which I assume it to be, but you do not explain why you know that it is constant, and that explanation is what I’m looking for.

1 Like

I forbid you from knowing or caring about the mathematical answer to that. Test your application and if isn’t fast enough, then we can think of some ways to optimize it.

2 Likes

I understand your heart is in the right place in saying this, but I think you’re misinterpreting my inquiry. I’m a scripter of over a decade, and I just want to better understand this function that is used very frequently, so that I know with confidence how to design my algorithms more intelligently.

1 Like

Indexing an array, which is probably how strings are stored in this implementation, is constant time. Copying characters is linear time. Allocating memory to store the result is both difficult to place a bound on and also not deterministic. As such you can assume sub() is linear time, O(n), for all practical purposes except for extreme cases where the garbage collector performance becomes more important.

2 Likes

I would assume this to be true, so long as string.sub is indeed indexing the data as it would an array. However, depending on the implementation of strings and of this function, the process may not be identical to that of arrays, hence why I ask for details regarding string.sub as an algorithm. I’m not looking for ‘probably’, as assuming is what I have been doing this far on my own.

I did a lot of googling, and I couldn’t find the answer anywhere. So, I created this thread, in hopes that someone knowledgeable on the innerworkings of Luau would be able to explain, and perhaps point me in the direction of some sources. I seek an answer that suggests credible observations regarding the implementation of string.sub, and its time complexity. Hopefully that can be understood.

1 Like

doubt you will be able to reduce the runtime for this function thought, and even if you did (you wont) it would be some really useless micro optimization that would probably have zero effect.
(if this is a just for fun project then nvm what I said above :stuck_out_tongue: )

This could help v

2 Likes

So, I ran some time-stamping. I created two random strings, each comprised of randomly generated uppercase characters. The first string had a length of 100, whereas the second string had a length of 1,000,000. Using string.sub, I indexed them both in looping reverse order 10,000,000 times. There was negligable difference between the two, suggesting that the operation is at least mostly independent of string-length.

Timstamping code
task.wait(10)

local BuildRandomString = function(Length)
	local StringMake = {}
	for Index = 1, Length do
		table.insert(StringMake, string.char(math.random(65, 90)))
	end
	print(#StringMake)
	return table.concat(StringMake)
end

local Timestamp = function(StringLength, Iterations)
	local RandomString = BuildRandomString(StringLength)
	local Iteration = 0
	local Index = 1
	local Timestamp = os.clock()
	while true do
		Index -= 1
		if Index == 0 then
			Index = StringLength
		end
		local Char = string.sub(RandomString, Index, Index)
		Iteration += 1
		if Iteration == Iterations then
			break
		end
	end
	return os.clock() - Timestamp
end

print(Timestamp(10^2, 10^7) * 10^2)

task.wait(10)

print(Timestamp(10^6, 10^7) * 10^2)

From this, I feel a decent level of confidence in asserting that the operation is, for practical purposes, constant. If anyone has further insight to add, please do so.

3 Likes

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.