Deleting Cloned Items On Large Scale

Hey!
I have been working on a build for my project. Unfortunately, I’m a relatively sloppy builder, and I abuse the duplication (Ctrl D) function all the time. As a result, I often end up making many unnecessary duplications that I later have no recollection of, and when I say “many”, I truly mean many. Sometimes, I’ll find that I’ve made up to 4-5 copies of the exact same instance without my knowledge.

This is all fine and dandy in small projects, but it becomes a problem when my builds become more massive—a problem I’ve largely ignored until this point in time. Now, I have a large build with potentially many duplicated clones, and I want to get rid of them.

Solving this problem seemed to be most plausible through scripting. Initially, I wrote the following code and pasted it into the command bar:

local saved = {}
local start = tick()
local items = 0
for index, object in pairs(workspace:GetDescendants()) do
	if object:IsA("BasePart") then
		local isDuplicate = false
		for index2, object2 in pairs(saved) do
			if object.CFrame == object2.CFrame and object.Size == object2.Size then
				isDuplicate = true
				break -- stop at the first match so the part is destroyed and counted only once
			end
		end
		if isDuplicate then
			object:Destroy()
			items = items + 1
		else
			table.insert(saved, object) -- only parts that were kept go into the saved table
		end
	end
end
local delta = tick() - start
print("Number of items: ", items)
print("Time elapsed: ", delta)

Essentially, I loop through each instance in the workspace, check if it’s a BasePart, and if it is, loop through a table of saved instances to see if it matches up with any of them in terms of CFrame and size (implying a cloned duplicate). If it does, I delete the object. If it does not, I add it to the saved table. I also added some tracker variables for my own curiosity.

The obvious problem is that this is an O(n^2) algorithm, which is extremely inefficient for the task at hand. I think I have about 50,000 parts in my workspace, so clearly this solution will not work. As expected, when I tried to run the code, Studio stalled and crashed. There is also a possibility that my code is wrong, so if you notice anything, let me know.

Do you guys have any recommendations on what I should do? Is there a better search algorithm for my unsorted saved table?

Actually, one potential idea I have is to use a custom data structure for the saved table, instead of slapping each instance to the end of the table each time. This data structure should be one that allows for quick element adding and searching, so I’m thinking maybe something like a hash table? I would like to hear some other ideas though, before I try something daunting like that. I’m not even sure it would work.
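A rough sketch of the hash table idea, for what it's worth. Lua tables are already hash tables, so the saved table can be keyed by a string built from each part's CFrame and size instead of being scanned linearly. The names `makeKey` and `removeDuplicateParts` are made up for this sketch, and it assumes clones have exactly identical CFrame and Size values, which is the same assumption the `==` checks in the original loop make.

```lua
-- Build a string key from a part's CFrame and Size so that parts with
-- identical values map to the same table entry. (Hypothetical helper.)
local function makeKey(part)
	-- GetComponents returns 12 numbers: position followed by the rotation matrix
	local values = { part.CFrame:GetComponents() }
	local size = part.Size
	values[#values + 1] = size.X
	values[#values + 1] = size.Y
	values[#values + 1] = size.Z
	return table.concat(values, ",")
end

-- Destroy every BasePart under `root` whose key has already been seen.
-- Each part needs one table lookup, so the pass is roughly O(n) overall.
local function removeDuplicateParts(root)
	local seen = {}
	local removed = 0
	for _, object in ipairs(root:GetDescendants()) do
		if object:IsA("BasePart") then
			local key = makeKey(object)
			if seen[key] then
				object:Destroy()
				removed = removed + 1
			else
				seen[key] = true
			end
		end
	end
	return removed
end
```

In the command bar this would be called as `print(removeDuplicateParts(workspace))`. The first part found at a given position is the one that is kept.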

I forget the exact way of doing this, but there was a post about finding duplicated items that occupy the same space.

Getting some ideas from this post here also:

You could make a bool that says whether the item is in fact a duplicate, computed inside a pcall function. It would go through the part and check whether its properties match those of the other part.

Something like this:

-- object and object2 are the two parts being compared
-- pcall returns a success flag first, then the function's return value
local ok, cloned = pcall(function()
    -- cloned is true only if every checked property matches
    return object2.Name == object.Name
        and object2.CFrame == object.CFrame
        and object2.Size == object.Size
        and object2.Material == object.Material
        -- A LOT OF MORE CODE HERE BUT THAT MAKES YOUR SCRIPT MESSY
end)

As the note inside the function says, this would indeed get messy if it was done like that. So I was thinking of still using the pcall function, but having it read the properties itself from a table of the possible properties of that part, or type of part. Then, if all of the properties match, cloned would be set to true, and if it's true the part gets destroyed.

Which, now that I’m saying it, I think I should make a plugin/module for getting the properties of parts.
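A minimal sketch of that property-table idea, under a couple of assumptions: the `PROPERTIES` list is just an illustrative, non-exhaustive set of BasePart properties, and `isClone` is a made-up name. The pcall is there because reading a property that doesn't exist on an instance throws an error.

```lua
-- Property names to compare. An assumed, non-exhaustive list; extend as needed.
local PROPERTIES = { "Name", "CFrame", "Size", "Material", "Color", "Transparency" }

-- Returns true only if every listed property is equal on both parts.
local function isClone(a, b)
	for _, property in ipairs(PROPERTIES) do
		-- pcall guards against a property that cannot be read on this instance
		local ok, same = pcall(function()
			return a[property] == b[property]
		end)
		if not ok or not same then
			return false
		end
	end
	return true
end
```

With this, the nested if statements collapse into a single `if isClone(object, object2) then object:Destroy() end`.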

You should also include a wait() somewhere in your loop.

Preferably as the very first line inside the loop.

Thought about doing that but my assumption was that the entire process would then take way too long. In the worst case scenario, the wait will run 50000^2 times, and given that wait() should hypothetically run for 29 milliseconds, this would imply that the wait functions alone will take over 70 million seconds, which is certainly not ideal.

The wait should only run 50k times; I'm not sure where the square comes from. If you place it at the entry to the outer loop, it shouldn't create such ludicrous results. It should amount to roughly an hour in the worst case, and probably ~20 minutes in the best case.
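For reference, the two estimates can be checked with a few lines of arithmetic. This takes the 50,000 part count and the 29 millisecond figure from the posts above as given, and it only counts time spent in wait(), not the comparisons themselves.

```lua
local PART_COUNT = 50000    -- approximate part count quoted in the thread
local WAIT_SECONDS = 0.029  -- assumed duration of one wait(), as quoted in the thread

-- wait() inside the inner loop: up to PART_COUNT^2 yields
local innerLoopSeconds = PART_COUNT * PART_COUNT * WAIT_SECONDS
-- wait() at the top of the outer loop: PART_COUNT yields
local outerLoopSeconds = PART_COUNT * WAIT_SECONDS

print(string.format("inner loop: %.0f seconds", innerLoopSeconds))
print(string.format("outer loop: %.0f seconds (about %.0f minutes)",
	outerLoopSeconds, outerLoopSeconds / 60))
```

That works out to about 72.5 million seconds for the inner-loop placement and about 1,450 seconds (roughly 24 minutes) for the outer-loop placement, which lines up with the ~20 minute best case.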

Good catch. I was thinking about putting the wait function in the second loop, not the first. I’ll try it out when I get home, though it may still run a lot longer as there are added processes along with the wait function. I’ll let you know how it goes. Thanks!

Seems to be working. Thanks again!