Is there any way to store a very large amount of data? Attempting a large JSON import

I’m currently trying to import a rather large JSON file. I’ve converted it into an array for a module script. The resulting file is 5.1 million short lines, and Roblox Studio takes about 3-5 minutes to load it when it’s pasted into a module script. Everything I do with it works properly as long as it’s saved locally, but publishing the game fails every time.

Is there some other way to store the data than as a module script?

I think it would be a good idea to still use module scripts for storing your large data, you just need to optimize your data. Roblox probably can’t handle that much data in one script, especially in one string based on the image you have shown.

What you could do is combine multiple lines into one line and use a separator character. For example:

Line1
Line2
Line3

Would be

Line1|Line2|Line3|…

This would significantly reduce the number of lines in your code while keeping all the data.
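As a rough illustration of the separator idea, here is a minimal Python sketch. The function names `combine_lines` and `split_lines` are made up for this example, and it assumes the separator character never appears inside the data itself.

```python
def combine_lines(lines, group_size, separator="|"):
    """Join every `group_size` consecutive lines into one separator-delimited string."""
    combined = []
    for start in range(0, len(lines), group_size):
        combined.append(separator.join(lines[start:start + group_size]))
    return combined


def split_lines(combined, separator="|"):
    """Reverse combine_lines: recover the original lines in order."""
    lines = []
    for entry in combined:
        lines.extend(entry.split(separator))
    return lines


original = ["Line1", "Line2", "Line3", "Line4", "Line5"]
packed = combine_lines(original, 3)
print(packed)                          # ['Line1|Line2|Line3', 'Line4|Line5']
print(split_lines(packed) == original)  # True
```

If the data can contain `|`, pick a separator that cannot occur in it, otherwise splitting will return more pieces than were joined.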

You can also separate the data into multiple module scripts. I’m only guessing here, but Roblox can probably handle data spread across many scripts better than a huge amount of data in one script. For example, you could put about 1000 lines of data in each module script, name the module scripts numerically, and require them in order.
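A small Python sketch of that chunking step, under the assumption that the data is already a list of strings. The helper name `chunk_into_modules` and the `DATA_` prefix are only illustrative choices.

```python
def chunk_into_modules(entries, per_module, prefix="DATA_"):
    """Group entries into numerically named chunks, one chunk per module script."""
    modules = {}
    starts = range(0, len(entries), per_module)
    for index, start in enumerate(starts, start=1):
        modules[prefix + str(index)] = entries[start:start + per_module]
    return modules


entries = [str(n) for n in range(2500)]
modules = chunk_into_modules(entries, 1000)
for name, items in modules.items():
    print(name, len(items))
```

With 2500 entries and 1000 per module this yields three chunks, the last one holding the remaining 500 entries.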

You can even combine both techniques which may make it much more efficient to store your data in Roblox.

However, doing this manually would take forever, so you would likely need an external script for it. You can write a script that reads through your JSON file and applies the techniques above. Fortunately, I wrote a Python 3 script that does just that:

Python Script
# Denny9876
# Sorry if the code is a mess. I only recently got into Python.

import sys # Used to terminate program early on invalid inputs.

def Main():
	ReadFile = None # Data file to be read.
	CurrentFile = 1 # Used to name module scripts in numerical order.
	Items = [] # List containing dictionaries of all module script data.
	NewData = [] # List containing data of a module script.
	FileName = "" # Name of data file to be read.
	CurrentNewLine = "" # Contains combined lines of data.
	Separator = '|' # Separates data in combined line. Can change.
	CurrentNumLine = 0 # Keeps count of the current actual line.
	CurrentNumArrayItem = 0 # Keeps count of the number of lines in the module script table.
	NumLines = 0 # Max number of actual lines before combining into one line.
	NumArrayItems = 0 # Max number of items in module script table before making new module script.

	# Get data file name to read and copy from.
	FileName = input("Enter file name: ")
	# Terminate program if data file not found.
	try:
		ReadFile = open(FileName, "r")
	except FileNotFoundError:
		print("File not found. Terminating program.")
		sys.exit()
	
	
	# Get number of actual lines to combine into one line.
	NumLines = GetInteger("Enter number of lines to combine: ")
	# Terminate program if input is less than or equal to -1.
	if NumLines <= -1:
		print("Aborted program.")
		sys.exit()
	
	# Get number of lines a module script can hold.
	NumArrayItems = GetInteger("Enter max number of items for each table: ")
	# Terminate program if input is less than or equal to -1.
	if NumArrayItems <= -1:
		print("Terminated program.")
		sys.exit()

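	# Reads and copies data.
	for line in ReadFile:
		# Flush the combined line once it holds NumLines lines.
		# (">=" rather than ">" so that exactly NumLines lines are combined, not NumLines + 1.)
		if CurrentNewLine and CurrentNumLine >= NumLines:
			NewData.append('	"' + CurrentNewLine[0:len(CurrentNewLine) - 1] + "\",\n")
			CurrentNewLine = ""
			CurrentNumLine = 0
			CurrentNumArrayItem += 1
		# Start a new module script once the table holds NumArrayItems items.
		if NewData and CurrentNumArrayItem >= NumArrayItems:
			Items.append({"Name" : "DATA_" + str(CurrentFile), "Data" : NewData})
			NewData = []
			CurrentFile += 1
			CurrentNumArrayItem = 0
		# Strip the trailing newline and add the line to the current combined line.
		CurrentNewLine += line.rstrip() + Separator
		CurrentNumLine += 1
	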
	# Flush any leftover data first, so that small files (less than one full module) are still written.
	if CurrentNewLine != "":
		NewData.append('	"' + CurrentNewLine[0:len(CurrentNewLine) - 1] + "\",\n")
	if NewData:
		Items.append({"Name" : "DATA_" + str(CurrentFile), "Data" : NewData})
	# Send copied data into CreateRbxmxFile() function.
	if Items:
		CreateRbxmxFile("DATA", Items) # By default, created file is named "DATA.rbxmx". Can change name here.
	
	ReadFile.close()

	print("Divided file: " + FileName)

def CreateRbxmxFile(name = "FILE", sources = []):
	File = open(name + ".rbxmx", "w") # Create new RBXMX file.

	# Write formatted data into file.
	File.write("<roblox version=\"4\">\n")
	for source in sources:
		File.write("	<Item class=\"ModuleScript\">\n")
		File.write("		<Properties>\n")
		File.write("			<string name=\"Name\">"+ source["Name"] + "</string>")
		File.write("<ProtectedString name=\"Source\"><![CDATA[local Data = {\n")
		File.writelines(source["Data"])
		File.write("}\n\n")
		File.write("return Data]]></ProtectedString>\n")
		File.write("		</Properties>\n")
		File.write("	</Item>\n")
	File.write("</roblox>\n")

	File.close()

# Validates input to only accept non-negative integers, or -1 to abort.
def GetInteger(msg):
	Input = None

	try:
		Input = input(msg)
		Input = int(Input)
	except ValueError:
		pass

	while(type(Input) != int or (Input < 0 and Input != -1)):
		try:
			Input = input("Invalid input. " + msg)
			Input = int(Input)
		except ValueError:
			pass
	
	return Input

Main() # Run main function. (it's a C++ habit)

The script asks for the data file’s name (including extension), how many lines should be combined into one line (30 recommended), and how many combined lines should be stored in each module script (1000 recommended). After you enter the inputs, a “DATA.rbxmx” file is created, which you can import into Roblox. Note that this may import many module scripts, so it is recommended to import them into an empty folder. Also note that I do not know exactly how your JSON data is formatted, so you may need to make minor edits to the script: it works best when every line contains just the raw data itself.
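If the input is still a JSON array rather than one raw value per line, a preprocessing step along these lines might help. This is only a sketch: it assumes the file holds a single top-level JSON array, and the function names `json_array_to_lines` and `escape_for_lua_string` are hypothetical. The escaping matters because the script above writes each combined line inside double quotes in the module source, so a raw `"` or `\` in the data would break the generated Luau.

```python
import json


def json_array_to_lines(json_text):
    """Turn a top-level JSON array into one compact JSON string per element."""
    values = json.loads(json_text)
    if not isinstance(values, list):
        raise ValueError("expected a top-level JSON array")
    # Compact separators keep each line as short as possible.
    return [json.dumps(value, separators=(",", ":")) for value in values]


def escape_for_lua_string(text):
    """Escape backslashes and double quotes so text can sit inside "..." in Luau."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


sample = '[{"id": 1, "name": "brick"}, [1, 2, 3]]'
for line in json_array_to_lines(sample):
    print(escape_for_lua_string(line))
```

Each resulting line is still valid JSON once Luau has read the string, so it could be decoded on the Roblox side with `HttpService:JSONDecode`. The separator character also must not occur inside any of these lines.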

And finally, to access the data, just loop through and require each module script. Here’s a basic Luau script to give you an idea:

Luau Script
local Data = script.Parent.Data -- Folder containing module scripts.

local DataCount = 0 -- How many (original) lines of data there are.

for i = 1, (#Data:GetChildren()) do -- Loop through modules in numerical order.
	local ModuleData = require(Data["DATA_"..i]) -- Get table of data in each module script.
	
	for a, line in ipairs(ModuleData) do -- Get each line of data in table, in array order.
		if (a % 500 == 0) then -- Add small delay every 500 lines. Increase number to speed up getting data, decrease to reduce lag.
			task.wait()
		end
		DataCount += #string.split(line, '|') -- string.split(line, '|') returns table of all data in a line.
	end
end

print(DataCount)

I have not tested publishing, however importing and loading the data only took a few seconds on my laptop on a five-million-line generated file, so I assume this would work.

Hope this helped! If there are still any problems or questions, just leave a reply.

Also just curious, what is this data used for?

So I reduced the number of lines dramatically. But no matter how much I split it up, the amount of data I was trying to use still took Roblox over 3 minutes to combine.

The data is in a file format called LDraw, which is the unofficial file format for 3D-modeled “building bricks”. I used Python to remove any data from the files that wasn’t related to sub-file primitives.

Originally I was just messing around placing individual points for each 3d model, because why not. I cut the data down to only the sub-file primitives because you can derive attachment points from them.
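For readers unfamiliar with the format, that kind of trimming could look roughly like the Python sketch below. It assumes “sub-file primitives” refers to LDraw line type 1 (sub-file reference) lines, which consist of the type, a colour, a position, a 3x3 matrix and a file name. The function name `keep_subfile_references` and the sample content are invented for illustration, not the actual script used here.

```python
def keep_subfile_references(ldraw_text):
    """Keep only LDraw type-1 lines: 1 <colour> x y z a b c d e f g h i <file>."""
    kept = []
    for raw_line in ldraw_text.splitlines():
        fields = raw_line.split()
        # A type-1 line has at least 15 whitespace-separated fields.
        if len(fields) >= 15 and fields[0] == "1":
            kept.append(" ".join(fields))
    return kept


# Illustrative sample only; not taken from a real part file.
sample = """0 Example part
1 16 0 0 0 1 0 0 0 1 0 0 0 1 stud.dat
3 16 0 0 0 1 0 0 0 1 0
1 16 0 4 0 1 0 0 0 1 0 0 0 1 4-4cyli.dat
"""

for line in keep_subfile_references(sample):
    print(line)
```

Comments (type 0) and geometry lines (types 2 to 5) are dropped, which is where most of the size reduction would come from.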

Examples of finding the attachment points: [image]

Example of when I was recreating the geometry: [image]

I apologize for the late reply. I’ve looked at LDraw and it appears to be Lego building software. I’m assuming you’re trying to recreate LDraw models into Roblox by reading the file data, right?

When you say it takes Roblox several minutes to combine, do you mean creating the model? Also is the problem still publishing, or just the time it takes to load the model now?

I again apologize if I’m not entirely understanding the purpose or current problem. Right now I’m assuming your problem is the loading time of your LDraw model into a Roblox server. However if that’s not it please let me know.

And if you don’t mind sharing, what’s the LDraw model you used called (if it’s public)? It would allow me to do my own testing if needed.

LDraw is completely public. You can download the complete library from their website. I was attempting to import every LDraw part. I’ve been learning more and more about the format in order to cut it down as far as possible. At the moment I’m trying to separate out any part that isn’t actually unique, such as one that is just a print of another part, in order to lower the amount of space it takes up. Everything about the file format can be found on their website.

Creating the model itself takes less than a second; the slow part is compiling the thousands of files. I attempted to create a plugin that imports the trimmed-down files I made into Roblox, but a plugin cannot paste more than 200k lines into a script or even a string value.
