Help needed in brainstorming a good pathfinding system

After messing around with Blockcast and Raycast, I finally got something that looks pretty good. I can mess around with the tolerances and number of Raycasts to fill any holes that I come across, but I think this is cohesive enough to start thinking about the next steps in the process.


@CompilerError the one thing that I don’t understand yet is how I’m going to store all of this data and bypass the part creation process, which would take hours on a huge map. I’d rather the system scanned the map every time the game started instead of having it scan each map in a private Studio session and pasting the resulting data into module scripts; the former feels much more flexible and would make it faster to develop maps later on. I don’t understand what the data would look like when stored, or how different nodes would be connected to one another. I also don’t get how this system can work without any part creation, seeing as it relies on :GetPartsInPart(), which requires a reference part as one of its parameters. Could I speed this process up by having each child node run its scans in a separate thread? I feel like that would get out of hand and may be impossible.

Another thing is this:

I still run into this pattern when going underground. If I want to define “surface nodes” or whatever as “any node that doesn’t have another one directly on top of it,” there’s no easy way to do that as is, and if I want caves, then I can’t define it as “any node that doesn’t have another one anywhere above it.” The problem is, I can’t Raycast to fill these spots as they’re completely buried underground, so what’s the best way of detecting them? Is any of this even necessary?


You only need a handful of parts; the green parts are just visual. Ideally you’d have one part for each division size in the octree, so parts of size 512x512x512, 256x256x256, 128x128x128, 64x64x64, 32x32x32, 16x16x16, 8x8x8, 4x4x4, and 2x2x2. You can reuse that set of parts as many times as you like; the green blocks in my example are only there for visualization. Once the scan is complete, the scanner parts can simply be removed, or parented to nil for later.
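
Something like this is all you’d need for the scanner set (a rough sketch; the folder, params, and function names are just placeholders):

-- Rough sketch of the reusable-scanner idea; folder/param/function names are placeholders.
local scannerFolder = Instance.new("Folder")
scannerFolder.Name = "ScannerParts"
scannerFolder.Parent = workspace

local scanners = {}
for _, size in { 512, 256, 128, 64, 32, 16, 8, 4, 2 } do
	local part = Instance.new("Part")
	part.Size = Vector3.one * size
	part.Anchored = true
	part.CanCollide = false
	part.CanQuery = false
	part.CanTouch = false
	part.Transparency = 1
	part.Parent = scannerFolder
	scanners[size] = part
end

local scanParams = OverlapParams.new()
scanParams.FilterType = Enum.RaycastFilterType.Exclude
scanParams.FilterDescendantsInstances = { scannerFolder }

-- move the right-sized scanner into a cell and query it
local function cellIsOccupied(position: Vector3, size: number): boolean
	local scanner = scanners[size]
	scanner.Position = position
	return #workspace:GetPartsInPart(scanner, scanParams) > 0
end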

I still run into this pattern when going underground. If I want to define “surface nodes” or whatever as “any node that doesn’t have another one directly on top of it,” there’s no easy way to do that as is, and if I want caves, then I can’t define it as “any node that doesn’t have another one anywhere above it.” The problem is, I can’t Raycast to fill these spots as they’re completely buried underground, so what’s the best way of detecting them? Is any of this even necessary?

This is where a grid data structure comes in handy. If you keep track of the “cells” that are occupied by parts or terrain, then you can just rescan your grid after the initial octree scan is complete, checking however many cells you want above each cell, say 3 cells (6 studs in my example). If any of those three cells are occupied, then you know it isn’t a surface; in that case, if the third cell is occupied you can remove the bottom two cells immediately. This process leaves you with only cells that don’t have another cell above them within 3 cells, which should work totally fine in a cave system.
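
As a rough sketch (assuming 2-stud cells and an occupied[x][y][z] grid keyed by world coordinates), that check could look something like this:

-- Rough sketch of the "look N cells up" pass, assuming 2-stud cells and an
-- occupied[x][y][z] grid keyed by world coordinates (all names assumed).
local CELL_SIZE = 2
local CHECK_CELLS_ABOVE = 3

local function isSurfaceCell(occupied, x: number, y: number, z: number): boolean
	for i = 1, CHECK_CELLS_ABOVE do
		local yAbove = y + i * CELL_SIZE
		if occupied[x] and occupied[x][yAbove] and occupied[x][yAbove][z] then
			return false -- something occupies a cell above this one, so it isn't a walkable surface
		end
	end
	return true
end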

Remember that the green parts are totally unnecessary and are only used for visualizing the end result of the scan. I think you can do this at runtime. You’ll just need some pretty decent optimization and probably multithreading: start in areas that actually need to be scanned (where NPCs are) and then asynchronously scan the rest of the map. Or optionally just scan as needed, if your scanner is fast enough.

I’m definitely aiming to manage doing it at runtime. I can hide anything around a minute long behind a loading screen; the problem I have right now (which will probably be drastically lessened by switching to pure data) is that it takes about as long to test this system as it does to render an image in Blender (half an hour minimum), even if I swap out all the node size adjustments in the script with pre-sized nodes in ServerStorage.

What I’ve been trying to do for all of today is test the accuracy of the terrain detection while messing around with coroutines (which I am still very rusty on). What ends up happening is that the subdivision process finishes for the root node before the whole detection process is complete, which is an issue because I’m trying to wait until it’s done before I thin the data down to just the surfaces (something that also requires a wait(), adding even more onto the time it takes to test this). I need a way to run all the calculations at the same time without crashing Studio, and I need a definite way to know when all the calculations are finished so the script knows when to move on to the next step in the process.
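
One thing I might try (just a sketch, names made up) is a pending-branch counter, so the script can tell when every spawned scan has actually finished:

-- Rough sketch of a pending-branch counter (names made up): every spawned scan
-- increments the counter, spawns its children, and only then decrements itself,
-- so the counter can't hit zero until the whole tree is done.
local pending = 0

local function scanNodeAsync(node)
	pending += 1
	task.spawn(function()
		-- ...collision checks for this node...
		-- ...call scanNodeAsync(child) here for each child that needs subdividing...
		pending -= 1
	end)
end

local function waitForScanToFinish()
	while pending > 0 do
		task.wait()
	end
	print("all scan branches finished")
end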

There’s also another problem:



These images are from when I was testing a minimum node size of 8x8x8, so the missing gaps in the terrain are way larger than they appear. Because the process took so long at a minimum node size of 2x2x2, I had no idea these gaps were possible. I can’t think of what could be causing them: not only am I using :ReadVoxels(), but I’ve also got a Blockcast and a Raycast running both ways on each of the XYZ axes for every node, as depicted in the (bad) diagram below. I see no reason why these spots are being missed when, from what I’ve seen of the detections elsewhere, they clearly shouldn’t be. I only go 90% of the way through so that I don’t catch a collision that sits entirely on the border and count it as a collision when it really isn’t.

I’m willing to post whatever code may be necessary to diagnose this problem. If there were a good way to do it that didn’t result in false positives, I would create a way to “patch” holes in the grid after scanning for data and thinning it out.

I’ve been attempting to implement this for the past two weeks, and having to wait so long for each test to complete has been the one thing holding me back. For instance, I found out, after converting the nodes from parts in physical space to Vector3s in a table and halving the scan area, that the number of raw nodes to cut down starts at over 44,000. I’ve had to add waits to the function that compares all the nodes’ heights with one another, because without them the script will time out, even though no parts are being created and it’s just raw data being looked at. Chunking it into groups of 1000 helps, but the function only works up until around the 25,000th node, after which it stops comparing nodes and assumes all the remaining ones are surface nodes. I posted the function below.

-- finds the surfaces of the octree
local function findSurfaces()
	print("activated")
	local sortList = nodes
	table.sort(sortList, function(a, b) return a.Y < b.Y end)
	local surfaceList = sortList
	for i, node in sortList do
		if i % 1000 == 0 then task.wait() end
		for j, neighbor in sortList do
			if node == neighbor or node.Y == neighbor.Y or node.X ~= neighbor.X or node.Z ~= neighbor.Z then continue end
			if node.Y + 2 == neighbor.Y or node.Y + 4 == neighbor.Y then print(i, j, neighbor.Y) table.remove(surfaceList, table.find(surfaceList, node)) break end
		end
	end
	print(#surfaceList)
	for _, node in surfaceList do
		task.wait()
		local surfaceNode = cell002:Clone()
		surfaceNode.Parent = octree
		surfaceNode.Position = node
	end
end

The problem with holes in the scan that I mentioned in the previous post was fixed with more raycasts at the larger node sizes.

I’m going to post the recursive function I use to scan the nodes and then subdivide, because with coroutines I still have the problem of one level of the recursion finishing before the lower levels. I have no way of telling other parts of the script when certain levels of recursion are done, meaning that if I run the surface-parsing function after the first level of recursion on the root node without a decently sized wait in between, there’s a good chance it will be looking at an incomplete list of nodes.

-- recursive function to check for collisions and divide node
local function checkAndDivide(node)
	-- debugging wait()
	task.wait()
	local colliding = false
	local overlaps = workspace:GetPartsInPart(node)
	local nodeRegion = Region3.new(node.Position - (node.Size * 0.5), node.Position + (node.Size * 0.5))
	local materials, occupancies = workspace.Terrain:ReadVoxels(nodeRegion, 4)
	-- checking parts collisions
	for _, part in overlaps do
		if part.Parent ~= octree then colliding = true end
	end
	--checking terrain collisions (>= 4x4x4)
	for _, x in materials do
		if _ == "Size" then continue end 
		for _, y in x do
			for _, z in y do
				if z ~= Enum.Material.Air then colliding = true end
			end
		end 
	end
	-- checking terrain collisions
	if colliding == false then
		if cubecast(node.Position, node.Size.Y) ~= nil then colliding = true end
		if centercast(node.Position, node.Size.Y) ~= nil then colliding = true end
		if node.Size.Y > 2 and perimeteranddiagonalcast(node.Position, node.Size.Y) ~= nil then colliding = true end
	end
	-- divisions
	if colliding and node.Size.Y > 2 then
		local quarter = node.Size.Y * 0.25
		local childPos = {
			node.Position + Vector3.new(quarter, quarter, quarter),
			node.Position + Vector3.new(quarter, quarter, -quarter),
			node.Position + Vector3.new(quarter, -quarter, quarter),
			node.Position + Vector3.new(quarter, -quarter, -quarter),
			node.Position + Vector3.new(-quarter, quarter, quarter),
			node.Position + Vector3.new(-quarter, quarter, -quarter),
			node.Position + Vector3.new(-quarter, -quarter, quarter),
			node.Position + Vector3.new(-quarter, -quarter, -quarter)
		}
		for i = 1, 8 do
			local childNode = findChildNode(node):Clone()
			childNode.Parent = octree
			childNode.Position = childPos[i]
			if node.Size.Y <= 16 then spawn(function() checkAndDivide(childNode) end) else checkAndDivide(childNode) end
		end
		node:Destroy()
	elseif colliding and node.Size.Y <= 2 then
		table.insert(nodes, node.Position)
		node:Destroy()
	elseif colliding == false then	
		node:Destroy()
	end
end

If there’s no way to tell when a certain level of recursion is done, then I can’t do what you’re suggesting with spawning in one cell per size node and moving them around. The best I can do is create and delete several iterations of each size node. I hope this all makes sense.

This way of pathfinding actually reminds me of a method of my own that I’ve wanted to try.
I once got stuck with pathfinding myself and had to brainstorm something that was fast and simple.

So I came up with the following solution.

Basically, you shoot a bunch of raycasts in the general direction of the target, with maybe slightly random rotations, or rotations guided by the normals of any obstacles the raycasts hit.

More raycasts may increase accuracy but at the cost of performance.

Now, this form of pathfinding might not solve complicated mazes, since it’s essentially just walking along and following walls until an opening is found.

BUT it IS pretty lightweight, or at least it should be!
The performance cost is very low afaik.

Now if you want to be able to navigate around more complicated obstacles, maybe where U-turns are necessary, you can adjust the algorithm to keep following or walking around the obstacle until it’s closer.

Hey and since you’re using this for zombies, maybe having non-perfect pathfinding is actually more realistic.

This method of pathfinding should also work in real-time without clunky tables or arrays of waypoints that you need to update when the path changes, so that’s out of the way as well.

This is a solution I came up with myself, but it’s similar to a flow field, I think?
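
Roughly sketched (the 16-stud ray length and names are arbitrary), the core of it might look something like this:

-- Rough sketch of the steering idea: head toward the target, and when the ray
-- hits something, slide along the obstacle's surface using the hit normal
-- until an opening shows up.
local function steerDirection(origin: Vector3, target: Vector3, params: RaycastParams): Vector3
	local toTarget = target - origin
	local direction = toTarget.Unit
	local result = workspace:Raycast(origin, direction * math.min(toTarget.Magnitude, 16), params)
	if result then
		local normal = result.Normal
		-- project the desired direction onto the wall's plane ("wall following")
		local slide = direction - normal * direction:Dot(normal)
		if slide.Magnitude > 1e-3 then
			return slide.Unit
		end
	end
	return direction
end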


This method reminds me of tutorials I’ve seen on Boids, mentioned earlier in this thread. I definitely agree that this is the best method for efficiency, but I want my game to be intense and not easily exploitable, so that requires smarter AI. I might incorporate this in some capacity for overflow enemies, but if I make it standard for every enemy, it will be very easy to take advantage of the poor AI and that would completely destroy the tension when it comes to fighting them.


I might use this pathfinding method myself for NPCs in one of my own projects because of how efficient it is.

My pathfinding solution is basically intended to be used as the default way for NPCs to navigate (if there are many) along with an error checking function.

If NPCs end up getting stuck too often or are unable to get out of a sticky situation, the pathfinding algorithm can be temporarily switched to an alternative one that’s perhaps a bit heavier, just long enough to get stuck NPCs back to chasing the player.

You can check if an NPC is taking too long or can’t find its way by recording the closest distance it has gotten to the target and whether that distance has shrunk within a set amount of time.
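
As a rough sketch (the 3-second window is arbitrary):

-- Rough sketch of the stuck check: remember the closest the NPC has ever been
-- to the target, and if that best distance hasn't improved within the window,
-- flag the NPC as stuck.
local STUCK_TIMEOUT = 3

local bestDistance = math.huge
local lastProgressTime = os.clock()

local function isProbablyStuck(npcPosition: Vector3, targetPosition: Vector3): boolean
	local distance = (targetPosition - npcPosition).Magnitude
	if distance < bestDistance then
		bestDistance = distance
		lastProgressTime = os.clock()
	end
	return os.clock() - lastProgressTime > STUCK_TIMEOUT
end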

My pathfinding concept was originally intended to be used for not-too-smart zombies and possibly sword fighters so they can keep following or strafing around the player while dodging obstacles along the way.

It’s not optimized for solving complex mazes and whatnot.
And to be fair, I think it’s quite difficult to find a pathfinding algorithm that is both fast AND provides flawless navigation.

But it gets the job done if you want a horde shooter or such where the enemies can walk around and avoid bumping into stuff.
With some alterations made to this simple and cost efficient design you could easily make NPCs jump or climb as well.

Just keep in mind that NPCs might walk in circles or take a very long time to navigate if they run into dead ends, since they’ll keep following the walls, but this can be solved with what I mentioned before.

Oh also, I REALLY recommend using Parallel Luau for your AI logic as it can DRASTICALLY speed up complicated game logic.

I, for instance, have a 16-core CPU which greatly benefits from multithreading, but even with just a 4-core CPU, Parallel Luau does absolute wonders.
You can practically do 4-8 times the amount of work.

Distributing AI logic over multiple CPU cores is heavily recommended.
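
A bare-bones sketch of what that can look like (this assumes the Script lives under an Actor; the positions and attribute names are placeholders):

-- Minimal Parallel Luau sketch. Assumes this Script is a descendant of an Actor;
-- spatial queries like Raycast are safe in the parallel phase, while writes to
-- instances have to wait for task.synchronize().
local RunService = game:GetService("RunService")

RunService.Heartbeat:ConnectParallel(function()
	-- parallel phase: the expensive, read-only work (raycasts, path scoring, ...)
	local origin = Vector3.new(0, 100, 0) -- placeholder NPC position
	local result = workspace:Raycast(origin, Vector3.new(0, -200, 0))

	task.synchronize()
	-- serial phase: it's now safe to write results back to instances
	if result then
		script:SetAttribute("GroundY", result.Position.Y) -- placeholder output
	end
end)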


I’m still having issues with this. For the most part, I’ve figured out that once I have a grid together, I can assign each node “GridX,” “GridY,” and “GridZ” variables that show where each node is in the 3D grid and allow me to do neighbor checks for the pathfinding. I don’t have to worry about whether a node is walkable because all of the nodes left after the surface parsing function are, by definition, walkable. What I’m still having trouble with is all of the functions surrounding the generation of the grid.

For an area of about 256 studs cubed, it takes just over a minute for the process of discovering the nodes and then thinning them to just the surfaces to finish (it doesn’t even fully complete, as I’ll explain later), and during the process the frame rate dips several times into the single digits. The way I’m doing things doesn’t appear to be scalable to the size of map I want, considering I want to do this for chunks of 512 studs cubed several times over and it’s already responding this way to a much smaller load.

The two biggest roadblocks to fixing this at the moment:

  1. There is no way for the script to know when the octree collision detection process is complete, or (especially) when different levels of the detection process are complete. If I run a coroutine to do only certain levels of subdivisions (ex. in a 512 cube, the eight 256 cubes plus the sixty-four 128 cubes) simultaneously, the script will think that the entire process has been completed as soon as all eight subdivisions are created, and not when all scanning has completed like it’s supposed to. This makes it impossible for me to do what @CompilerError suggests regarding moving the node template around and scanning inside without ditching coroutines entirely, which would drastically increase the time it takes to complete the scan.
  2. The surface parsing function will stop analyzing nodes halfway through the list of nodes (for a 256 cube I get 44,100 nodes and it stops at node #25,574) and assume the whole process is complete. This is different from erroring, as the script doesn’t stop, it just skips ahead to the visualization process and returns this:



    It doesn’t delete any actual surface nodes, from what I’ve seen, but it also doesn’t delete nodes that shouldn’t be surface nodes, like the thousands of nodes underground in the third picture. I have to figure out what’s going on here so I can fix it, but I don’t know where to start because no errors are being returned. Am I just processing too many nodes? How would I even fix that?

I’ve included code for both in the post this is replying to. I hope I can get some pointers on where to go from here because these hangups are the only thing preventing me from breezing through creating the rest of the system.

Take a look at this example place I made a while back, it might help to show my thought process. I never added any logic to handle terrain but I think the logic is somewhat repeatable for terrain with some caveats. Here’s the place file:
OctreeSurfaceScanner.rbxl (408.6 KB)

Let me know what you think here. I’m a bit disconnected from where you are at this point; I’m not sure why or how your process takes over a minute to scan 256^3 studs when mine takes <2s to scan a full 512^3. Obviously you’re doing some extra processing for terrain, but I still don’t think that accounts for the difference.

If you’re wanting to use many threads and the threads can’t share the scanner parts then make the set of scanner parts per thread.

Just took a look at this. I understand what you’re doing now with regards to creating the parts in a script and placing them in the workspace to be moved around. I was just placing them in Server Storage and then attempting to take them out and move them around, to no success. I’ll definitely be borrowing that method for my own implementation, as well as the way you organize the grid data. My method involved one big list of Vector3s 44,100 entries long, and that might’ve been the cause of some problems.

I did have a few questions looking at the code. I don’t understand why you’re starting the recursion at octree_size / 2 (256) and why the scan doesn’t turn out right if I remove the division and run it at 512. There also appears to be an unused table called surfaces that I don’t know the purpose of. Did you mean to store the surface node data in there after the surface scan was complete? And could you explain how the if not [x][y + 2] or not [x][y + 2][z] conditional in the surface scan works? Wouldn’t it be looking for [x][y + 1]?

One long array is generally more efficient than a 3D array like I’ve chosen here, so I don’t think that would be a problem for you. Assuming you insert entries into the array based on where they are in the grid, you can do O(1) lookups to find them; a 1D, 2D, or 3D array all allow for that. Ideally you would continue with the 1D array and do O(1) lookups on it, which would be more efficient than my 3D array.
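
As a sketch of what I mean (cell size and names assumed, and this assumes the scanned region starts at the world origin):

-- Sketch of the flat-array layout. A 256-stud area with 2-stud cells gives 128
-- cells per axis.
local CELL_SIZE = 2
local CELLS_PER_AXIS = 128

-- preallocating keeps the table in its contiguous array part
local occupied = table.create(CELLS_PER_AXIS * CELLS_PER_AXIS * CELLS_PER_AXIS, false)

-- grid coordinates are 0-based; returns a 1-based array index
local function toIndex(gx: number, gy: number, gz: number): number
	return 1 + gx + gy * CELLS_PER_AXIS + gz * CELLS_PER_AXIS * CELLS_PER_AXIS
end

local function worldToGrid(position: Vector3): (number, number, number)
	return math.floor(position.X / CELL_SIZE), math.floor(position.Y / CELL_SIZE), math.floor(position.Z / CELL_SIZE)
end

-- O(1) occupancy lookup for a world position
local function isOccupied(position: Vector3): boolean
	return occupied[toIndex(worldToGrid(position))]
end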

I don’t understand why you’re starting the recursion at octree_size / 2 (256)

local octree_size = 512;
local octree_depth = 9;

local test_blocks = {}; -- one reusable scanner part per size

local last_size = octree_size * 2;

for i = 1, octree_depth do
	local size = last_size / 2;
	
	local p = Instance.new('Part');
	p.Size = Vector3.one * size;
	p.Position = Vector3.new(0, 10000, 0);
	p.Anchored = true;
	p.CanCollide = false;
	p.CanQuery = false;
	p.CanTouch = false;
	p.Transparency = 1;
	p.Parent = workspace;
	
	test_blocks[size] = p;
	
	last_size = size;
end

This does start at 512: it begins by multiplying the original octree_size variable by 2, then halves it on each loop iteration as it generates the parts, so the first part generated is 512.

and why the scan doesn’t turn out right if I remove the division and run it at 512.

I’m not sure what you’re trying to change here. The idea is that we’re generating a part at each of the nine different sizes: 512, 256, 128, 64, 32, 16, 8, 4, 2.

There also appears to be an unused table called surfaces that I don’t know the purpose of. Did you mean to store the surface node data in there after the surface scan was complete?

Yeah, that’s likely what the purpose of that was. I didn’t go as far with this example as I had originally planned so there’s some weird useless code in here, apologies.

explain how the if not [x][y + 2] or not [x][y + 2][z] conditional in the surface scan works? Wouldn’t it be looking for [x][y + 1] ?

It’s +2 because this is based entirely on world-space coordinates and the smallest cell size is 2x2x2 in this example, so it’s a bad, hard-coded check to see if there’s a cell above the one we’re currently looking at. Ideally, this would be something like + MINIMUM_CELL_SIZE.

I’m referring to the first call to local function recursive_scan(position, size, depth) after the first timer has started (recursive_scan(Vector3.new(0, 0, 0), octree_size / 2);). octree_size / 2 would be 256, and I couldn’t find anything in the function that reverts that to 512 for the first iteration. I even checked the size of the scan area at runtime and it appears to be 256. The weird thing is that if I remove the division by two and run it at 512, the frame rate dips significantly even after the scan is finished, and the scanner breaks in a strange way:

This makes sense, now that you mention it. The more I look at it, the more I like how clean it is compared to mine. A 3D array may be less efficient, but it would probably work better in my case since the number of nodes to search through is so high. Coding local MINIMUM_CELL_SIZE = octree_size / (2 ^ (octree_depth - 1)) or something like that would also be a great idea if you wanted to run a scan with larger or smaller nodes for larger or smaller humanoids.

Did you organize the data as [x][y][z] for readability’s sake? I think it would be faster to organize it as [x][z][y] since you’d only have to make one check in the conditional (if not [x][z][y + 1] then). Did you organize it into a dictionary format (with the x, y, and z values as keys for each dimension of the array) so that it would automatically sort itself and you wouldn’t have to use table.sort()?

Ah, I’ll have to take another look then. I’m sure it’s an easy fix, if you would like to correct it for your uses!

Did you organize the data as [x][y][z] for readability’s sake? I think it would be faster to organize it as [x][z][y] since you’d only have to make one check in the conditional (if not [x][z][y + 1] then ).

My brain works in x, y, z so that’s why I organized it that way. You’re correct, it would be more efficient, although I suspect by merely nanoseconds across that 512x512x512 scan. Definitely feel free to organize it however you like in your implementation.

Did you organize it into a dictionary format (with the x, y, and z values as keys for each dimension of the array) so that it would automatically sort itself and you wouldn’t have to use table.sort() ?

Yeah, I wanted free O(1) lookups. This comes at the cost of more memory use, although I would imagine it’s minuscule here. Ideally you’d never be sorting an array of this size; that might actually explain why yours is so much slower. A really big part of doing this well is going to be in the data structures you choose, so definitely give that some thought going forward.

The downside to a dictionary is in how the CPU does RAM queries on it. When you write to a dictionary, the data is broken up in memory and can’t be efficiently fetched for predictive operations (which modern CPUs are REALLY good at). If you iterate over an entire collection frequently, it’s better to have a one-dimensional array whose index-value pairs are sequential in memory; a modern CPU understands that it’s better to grab the entire array from RAM and keep it in its various caches. If you’ve ever profiled a loop running over a dictionary versus a loop running over a flat array, you’ll see just how significant the difference is.

I actually profiled this a while back, here were my results:

--!native

local list1 = {} -- sequential integer keys: stays in the contiguous array part
local list2 = {} -- sparse keys (i * 2) push most entries into the hash part

for i = 1, 100000 do
    local random = math.random()
    table.insert(list1, random)
    list2[i * 2] = random
end

game:GetService('RunService').Heartbeat:Connect(function()
    debug.profilebegin('list 1')
    for index, random in list1 do
        -- empty body: only the iteration cost is being measured
    end
    debug.profileend()

    debug.profilebegin('list 2')
    for index, random in list2 do
        -- empty body: only the iteration cost is being measured
    end
    debug.profileend()
end);


Notice in the list2 tests how there are those REALLY big chunks of work? That’s the CPU fetching more data in order to iterate over the dictionary, since its values are segmented in memory. list1, on the other hand, is just one solid block of work, since the CPU was able to grab the entire array and keep it in its cache.

I know this is a bit off topic, but I thought it was really cool. This is actually one of the many reasons that the gaming industry is starting to be more interested in Entity Component Systems.

I just implemented changes based on your implementation and it’s incredible how much smoother it runs. The changes also somehow managed to fix the surface-parsing problem. Definitely going to be taking notes on the lessons I learned here for future projects.

Running my implementation still takes a bit longer than yours, but I’m fairly confident that’s because of all the checks I’m performing to detect terrain. Those extra checks force me to add task.wait()s to the recursive scan at several points, because if I don’t, the script exhausts its allowed execution time. Running at 512, mine takes roughly 4 minutes to scan collisions and parse for surfaces, while your implementation takes roughly 6 seconds. At 256, mine finishes in 15 seconds and yours in 2. The vast majority of the time is spent scanning for collisions; parsing for surface nodes takes roughly the same amount of time in both implementations, depending on the size of the scanning area (a couple of seconds tops).

The one thing I can think of doing right now is to reduce the amount of time spent on checks during the collision phase. I’ve already made significant efforts in the newly written check() function to break out of loops and exit the function as soon as one collision is found, but I’m not sure how effective that is. What I might do is run a benchmark on each type of check I have and then try to optimize them for different node sizes so it checks as efficiently as possible, though I don’t know how much that will speed things up.
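
Something like this might be enough for that benchmark (a rough sketch; sampleNodes is a placeholder for whatever list of positions and sizes I feed it):

-- Rough benchmarking sketch using os.clock(); `sampleNodes` is a placeholder
-- for a list of { position, size } entries to test against.
local function benchmark(name: string, checkFn, sampleNodes)
	local start = os.clock()
	local hits = 0
	for _, sample in sampleNodes do
		if checkFn(sample.position, sample.size) then
			hits += 1
		end
	end
	print(string.format("%s: %.2f ms, %d hits", name, (os.clock() - start) * 1000, hits))
end

-- benchmark("cubecast", function(p, s) return cubecast(p, s) ~= nil end, sampleNodes)
-- benchmark("centercast", function(p, s) return centercast(p, s) ~= nil end, sampleNodes)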

Don’t ask me how I figured this out, but changing the initial depth value in the recursive_scan() function from 1 to 0 allows you to pass through the actual octree size without dividing by 2 or getting the weird spacing between nodes at the end.


How are you currently doing the terrain scans? Would you be against posting that code? I’m wondering if maybe we can’t squeeze out a bit more performance there.

Super cool to see it scanning properly though! Congrats!


I was able to significantly thin down the number of raycasts I was doing (before, there were several that were straight-up unnecessary), and now I’m able to remove all task.wait()s from the entire script if I’m scanning at 256; the whole process takes about 8.5 seconds. It still times out at 512 and above, so here is the check() function I use:

-- function to check whether a node is colliding or not
local function check(node, position, size)
	-- checking parts collisions
	local colliding = #workspace:GetPartsInPart(node, paramsOL) > 0
	if colliding == true then return colliding end
	-- checking terrain collisions (>= 4x4x4)
	if size >= 4 then
		local materials, occupancies = workspace.Terrain:ReadVoxels(Region3.new(position - (Vector3.one * size * 0.5), position + (Vector3.one * size * 0.5)), 4)
		for _, x in materials do
			if _ == "Size" then continue end
			for _, y in x do
				for _, z in y do
					if z ~= Enum.Material.Air then colliding = true return colliding end
				end
			end
		end
	end
	-- raycast checks: cubecast and centercast for every node, perimeteranddiagonalcast for nodes larger than 2
	colliding = cubecast(position, size) ~= nil or centercast(position, size) ~= nil or (size > 2 and perimeteranddiagonalcast(position, size) ~= nil)
	return colliding
end

The first test uses workspace:GetPartsInPart() to scan for parts. If nothing comes up, it moves to the next test, which uses workspace.Terrain:ReadVoxels() to scan for terrain in all nodes of size 4 or larger. If both of those find nothing, every node then runs two custom raycasting functions, cubecast() and centercast(), with nodes larger than 2 running a third function, perimeteranddiagonalcast(), if those also find nothing. All three custom functions are listed below.

cubecast():

-- completes blockcasts through the center
local function cubecast(position, size)
	local result
	local posL = CFrame.new(position + Vector3.new(size * 0.45, 0, 0))
	local posR = CFrame.new(position - Vector3.new(size * 0.45, 0, 0))
	local posU = CFrame.new(position + Vector3.new(0, size * 0.45, 0))
	local posD = CFrame.new(position - Vector3.new(0, size * 0.45, 0))
	local posF = CFrame.new(position + Vector3.new(0, 0, size * 0.45))
	local posB = CFrame.new(position - Vector3.new(0, 0, size * 0.45))

	local desL = CFrame.new(position - Vector3.new(size * 0.4, 0, 0))
	local desR = CFrame.new(position + Vector3.new(size * 0.4, 0, 0))
	local desU = CFrame.new(position - Vector3.new(0, size * 0.4, 0))
	local desD = CFrame.new(position + Vector3.new(0, size * 0.4, 0))
	local desF = CFrame.new(position - Vector3.new(0, 0, size * 0.4))
	local desB = CFrame.new(position + Vector3.new(0, 0, size * 0.4))

	local sizeX = Vector3.new(size * 0.1, size * 0.9, size * 0.9)
	local sizeY = Vector3.new(size * 0.9, size * 0.1, size * 0.9)
	local sizeZ = Vector3.new(size * 0.9, size * 0.9, size * 0.1)

	result = workspace:Blockcast(posL, sizeX, desL.Position - posL.Position, paramsRC) if result ~= nil then return result end
	result = workspace:Blockcast(posR, sizeX, desR.Position - posR.Position, paramsRC) if result ~= nil then return result end
	result = workspace:Blockcast(posU, sizeY, desU.Position - posU.Position, paramsRC) if result ~= nil then return result end
	result = workspace:Blockcast(posD, sizeY, desD.Position - posD.Position, paramsRC) if result ~= nil then return result end
	result = workspace:Blockcast(posF, sizeZ, desF.Position - posF.Position, paramsRC) if result ~= nil then return result end
	result = workspace:Blockcast(posB, sizeZ, desB.Position - posB.Position, paramsRC) if result ~= nil then return result end
	return result
end

centercast():

-- completes raycasts through the center
local function centercast(position, size)
	local result
	local posL = position + Vector3.new(size * 0.5, 0, 0)
	local posR = position - Vector3.new(size * 0.5, 0, 0)
	local posU = position + Vector3.new(0, size * 0.5, 0)
	local posD = position - Vector3.new(0, size * 0.5, 0)
	local posF = position + Vector3.new(0, 0, size * 0.5)
	local posB = position - Vector3.new(0, 0, size * 0.5)
	
	local desL = position - Vector3.new(size * 0.4, 0, 0)
	local desR = position + Vector3.new(size * 0.4, 0, 0)
	local desU = position - Vector3.new(0, size * 0.4, 0)
	local desD = position + Vector3.new(0, size * 0.4, 0)
	local desF = position - Vector3.new(0, 0, size * 0.4)
	local desB = position + Vector3.new(0, 0, size * 0.4)
	
	result = workspace:Raycast(posL, desL - posL, paramsRC) if result ~= nil then return result end
	result = workspace:Raycast(posR, desR - posR, paramsRC) if result ~= nil then return result end
	result = workspace:Raycast(posU, desU - posU, paramsRC) if result ~= nil then return result end
	result = workspace:Raycast(posD, desD - posD, paramsRC) if result ~= nil then return result end
	result = workspace:Raycast(posF, desF - posF, paramsRC) if result ~= nil then return result end
	result = workspace:Raycast(posB, desB - posB, paramsRC) if result ~= nil then return result end
	return result
end

perimeteranddiagonalcast():

-- completes raycasts across the edges and through the diagonals
local function perimeteranddiagonalcast(position, size)
	local result
	local corners = {
		position + Vector3.new(size * 0.5, size * 0.5, size * 0.5),
		position + Vector3.new(size * 0.5, size * 0.5, -size * 0.5),
		position + Vector3.new(size * 0.5, -size * 0.5, size * 0.5),
		position + Vector3.new(size * 0.5, -size * 0.5, -size * 0.5),
		position + Vector3.new(-size * 0.5, size * 0.5, size * 0.5),
		position + Vector3.new(-size * 0.5, size * 0.5, -size * 0.5),
		position + Vector3.new(-size * 0.5, -size * 0.5, size * 0.5),
		position + Vector3.new(-size * 0.5, -size * 0.5, -size * 0.5)
	}
	for i = 1, 8 do
		for j = 1, 8 do
			if i == j then continue end
			result = workspace:Raycast(corners[i], corners[j] - corners[i], paramsRC) if result ~= nil then return result end
		end
	end
	return result
end

The first two functions don’t go all the way through to the other side, to prevent false positives from hitting something that isn’t contained by the node but merely touches up against it. The first two raycast tests run for all node sizes because if they don’t, they leave the massive holes in the grid that I mentioned in an earlier post (they look small in the photos, but each node is size 16 or 32, I believe):

Admittedly, I was initially trying to make perimeteranddiagonalcast() into just perimetercast(), but I got confused trying to write out which connections constituted the edges of the cube, got lazy, and just decided to test all vertices against all other vertices.

Finally, here’s some data I gathered about running different combinations of functions (scanning 256):

(perimeteranddiagonalcast for <= 2x2x2)
cubecast + centercast: 8.19 sec, 32382 nodes
cubecast: 5.86 sec, 31580 nodes
centercast: 6.08 sec, 26357 nodes

neither (perimeteranddiagonalcast for everything): DNF, 36203 nodes (w/ task.wait())

I wrote a terrain scanner for mine; my terrain scan managed a 512x512x512 area in ~0.7s.

I took a much simpler approach and treated the terrain scan as a separate task entirely, rather than performing it as part of the octree. Since unfortunately there’s not really a good way to determine whether a region is empty (it’s filled with air when it’s empty), we might as well just scan the whole 512x512x512 region.

I took a bit of a different approach and only used ReadVoxels (no raycasting at all) for terrain. I’m quite certain there’s a good amount of performance still to be regained in how the surface scan is performed, but that’s a tricky optimization and right now the surface scan logic is very simple. I will say that my terrain scan is likely less accurate than yours, but I’m fairly happy with it.

If I disable the debug blocks I can do the full 512x512x512 scan (terrain and buildings) in ~1.3s, which is pretty reasonable for a high-definition scan like this.

Here’s the new version of my place file:
OctreeSurfaceScanner.rbxl (244.5 KB)

Looking at your version it certainly does run faster. I think there’s a bit of a problem when it comes to overfilling and then leaving the surface node function unaltered, though; the results don’t seem very accurate:

I don’t know how important this will become but if I’m going to ultimately be comparing different grid nodes’ heights against one another in a cost analysis for pathfinding, I would imagine that it’s going to throw some weird bugs my way. If I were standing in the part of the plains with the floating nodes and the enemy was standing on the part of the plains with nodes on the ground, it might calculate an “ideal” path in a totally roundabout way when it should just be an easy calculation that goes straight towards me.

I regenerated the terrain to be more mountainous and it’s especially bad with mountain peaks. It seems like it’s calculating a surface grid with a resolution of 4 but with nodes of size 2.

I do think at least some sort of raycasting is necessary for things to remain accurate. That said, doing one large terrain scan separate from the parts scan made me realize I could probably pull the terrain scan out of the main process and only run it at the beginning of the game, since I have no plans to make terrain destructible.

Something that got my attention is that both your implementation and mine count water as a “surface”, even though you can’t really walk on it. This is probably because both of us are only checking to make sure the terrain isn’t air.

I don’t know how that could affect pathfinding, but it’s something to be aware of.

The algorithm later needs to be able to deal with differences in height, so I would imagine it’s less of a problem than you might think. What you’re looking for is continuity more than accuracy; think of how Roblox’s navigation mesh looks around terrain. Fidelity is less important, I think, than continuity. Additionally, making this something that can be used in-game is also important. Ideally you won’t need to do many, if any, terrain scans once the game starts, but it is definitely nice to be able to do those scans in a reasonable amount of time.

For the water, that’s as easy as checking against Air and Water.
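
For reference, in the ReadVoxels loop that might look something like this (variable names assumed):

-- e.g. in the inner ReadVoxels loop, treat water like air so it never counts as
-- solid (`material` stands for whatever the inner loop variable is):
if material ~= Enum.Material.Air and material ~= Enum.Material.Water then
	colliding = true
end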

In all honesty, looking at the default navmesh, it seems like it’s interested in being both continuous AND accurate. I tested it on another map and got this:



Having the game run is definitely more important, though. Recording the number of casts for each node, it currently tops out at 68 casts for nodes bigger than 2 and 12 for everything else. I’m looking right now to see how much I can cut that down without losing fidelity. The issue is mainly the nodes that go through all of the casts and find nothing because there’s nothing to find or they’re buried underground. If there were a way to make the searching algorithm “smart” enough to not even bother with these empty/submerged nodes, it would probably cut down on the strain a lot.

Another method that I don’t know if it’s been brought up, but that I think could cut down on a lot of the process: rather than using :ReadVoxels() or continuously breaking cells down to look for terrain, blockcast downward from the top of the octree for each minimum-size node (treating it like scanning for points on a quadtree) and then snap the results to the grid using the octree structure. This would combine finding the nodes and parsing them down to just the surfaces into one step. I just don’t know how much more efficient it would be.
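
A rough sketch of what I mean, using plain raycasts to stand in for the blockcasts (names and region bounds are assumptions, and a single cast per column only finds the topmost surface, so caves would need follow-up casts below each hit):

-- Rough sketch of the top-down column scan: one downward cast per minimum-size
-- column, with each hit snapped to the cell grid.
local CELL_SIZE = 2
local REGION_SIZE = 512
local TOP_Y = 256 -- assumed top of the scan region

local surfaces = {}

for x = 0, REGION_SIZE - CELL_SIZE, CELL_SIZE do
	for z = 0, REGION_SIZE - CELL_SIZE, CELL_SIZE do
		local origin = Vector3.new(x + CELL_SIZE / 2, TOP_Y, z + CELL_SIZE / 2)
		local result = workspace:Raycast(origin, Vector3.new(0, -REGION_SIZE, 0))
		if result then
			-- snap the hit height to the cell grid
			local gridY = math.floor(result.Position.Y / CELL_SIZE) * CELL_SIZE
			table.insert(surfaces, Vector3.new(x, gridY, z))
		end
	end
end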