Player (That's not in-game) Information Search

My idea is to design a search engine in the game, where anyone could search another player’s Username, and info like DisplayName, Username, AccountAge, etc. would show up. But for it to be useful, I want it to search for info on a player that also isn’t in the game.

The interface would look like this:
[image]

And this is how the workspace is organized:
[image]

So far all I could do was to get the image, Username and the DisplayName, but for it to look like an actual thing I would also need things like AccountAge, Location the player is playing at (Country), and also access some databases that would give me important in-game stuff like CIN and Date of Enrollment.

This is the SearchMain script. I had no idea how to write some of the lines, so those are commented out:

local UserService = game:GetService("UserService")
local RS = game:GetService("ReplicatedStorage")

local Players = game:GetService("Players")
local textBox = script.Parent.Input
local Profile = script.Parent.PROFILE
local ERROR = script.Parent.ERROR
local ERRsound = workspace.PC.FrelingMac.Screen.Error
local back = script.Parent.Back

local function onFocusLost(enterPressed, _inputObject)
	if enterPressed then
		
		-- Look up the UserId for the typed username.
		-- IDsearch is now a local returned from pcall instead of an accidental global.
		local success, IDsearch = pcall(function()
			return Players:GetUserIdFromNameAsync(textBox.Text)
		end)
		
		if success then
			
			-- Use a different name so the outer "success" is not shadowed
			local infoSuccess, result = pcall(function()
				return UserService:GetUserInfosByUserIdsAsync({ IDsearch })
			end)

			if infoSuccess then
				for _, userInfo in ipairs(result) do
					
					back.Visible = true
					
			--		local joinTime = os.time() - (userInfo.AccountAge*86400)
			--		local joinDate = os.date("!*t", joinTime)
					local img = Players:GetUserThumbnailAsync(userInfo["Id"], Enum.ThumbnailType.HeadShot, Enum.ThumbnailSize.Size420x420)
					
					--RS.TransData:FireServer(IDsearch)
					
					Profile.Visible = true
					
					Profile.Ime.Text = userInfo["DisplayName"] -- Name
					Profile.Username.Text = userInfo["Username"] -- Username
			--		Profile.Dayage.Text = tostring(userInfo.AccountAge) --Dayage
			--		Profile.DOJ.Text = joinDate.day .. "/" .. joinDate.month .. "/" .. joinDate.year -- Date of join
					Profile.Headshot.Image = img
					
				end
			else
				ERROR.Text = "ERROR\nCan't get data"
				ERROR.Visible = true
				ERRsound:Play()
				wait(1.5)
				ERROR.Visible = false
				ERROR.Text = "ERROR\nGeneral Error"
			end
		else
			ERROR.Text = "ERROR\nUser not in Registry"
			ERROR.Visible = true
			ERRsound:Play()
			wait(1.5)
			ERROR.Visible = false
			ERROR.Text = "ERROR\nGeneral Error"
		end
	else
		textBox.Text = ""
	end
end

textBox.FocusLost:Connect(onFocusLost)

back.MouseButton1Click:Connect(function ()
	Profile.Visible = false
	back.Visible = false
end)

An important thing to mention is that I also want the access to the database in which there are stored variables like Citizenship, and a “CIN” number.

The script in the ServerScriptService looks like this (Please forgive the written code, I know it’s horrid) :

local DataStoreService = game:GetService("DataStoreService")
local RunService = game:GetService("RunService")
local RS = game:GetService("ReplicatedStorage")
local playerData = DataStoreService:GetDataStore("PlayerData")
local Players = game:GetService("Players")

local function onPlayerJoin(player)
	local storage = Instance.new("Folder")
	storage.Name = "ValueData"
	storage.Parent = player

	local citizenship = Instance.new("BoolValue")
	citizenship.Name = "Citizenship"
	citizenship.Parent = storage

	local CIN = Instance.new("NumberValue")
	CIN.Name = "CIN"
	CIN.Parent = storage

	local playerUserId = "Player_" .. player.UserId
	-- GetAsync can fail, so wrap it in pcall
	local success, data = pcall(function()
		return playerData:GetAsync(playerUserId)
	end)

	if success and type(data) == "table" then
		-- Both values are stored together in one table under one key
		CIN.Value = data.CIN
		citizenship.Value = data.Citizenship
	else
		-- Generate the five random digits per player, not once per server,
		-- otherwise every new player on the server gets the same CIN
		CIN.Value = math.random(0, 99999) * 1000 + 129
		citizenship.Value = false
	end
end

local function onPlayerExit(player)
	local success, err = pcall(function()
		local playerUserId = "Player_" .. player.UserId
		-- Save both values in one table; two SetAsync calls on the same key
		-- would overwrite each other and only the last value would survive
		playerData:SetAsync(playerUserId, {
			CIN = player.ValueData.CIN.Value,
			Citizenship = player.ValueData.Citizenship.Value,
		})
	end)

	if not success then
		warn("Error saving data!", err)
	end
end

Players.PlayerAdded:Connect(onPlayerJoin)
Players.PlayerRemoving:Connect(onPlayerExit)

And the end part that’s bugging me, basically sending data back to the SearchMain script:

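local function TransferData(IDsearch)
	-- The data was saved under "Player_" .. UserId, so look it up with the same key
	-- (the username is not the key)
	local success, Info = pcall(function()
		return playerData:GetAsync("Player_" .. IDsearch)
	end)
	if success and Info then
		--Here it's supposed to receive the event request and send the data back to the SearchMain
		return Info
	end
end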

The problem I’ve been having is that I can’t get the data when it is requested and send it back. If anyone knows how, I would be delighted.

I do need some more training with databases and connecting scripts with them.

So to wrap up,
I’ve scrolled through many Documentation posts, but I’m just not sure if I’m doing it right.
Also, do I need to use HttpService for this? Web scripting is something I have no idea about, same with APIs. If anyone has any ideas or suggestions, feel free to reply.


Getting the account age is simple: the Player instance has an AccountAge property.
But I’m not sure how (and why…) you would get information such as location.

For the in-game information such as “CIN” and “Date of Enrollment”, you’d save this data to a DataStore linked to the player’s UserId and load it whenever you need it. The player doesn’t need to be in game for you to access their datastore data.
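A minimal sketch of that idea, assuming the "PlayerData" store and the "Player_" key prefix from the question's script, and a RemoteFunction named GetPlayerInfo in ReplicatedStorage (that name is hypothetical, you would create it yourself). The lookup helper takes the store as a parameter, so it can be tried outside Roblox with a stand-in table; the Roblox wiring only runs when `game` exists.

```lua
-- Pure helper: works with any object exposing GetAsync(self, key),
-- so it can be exercised with a stand-in store as well as a real DataStore.
local function loadProfile(store, userId)
	local ok, data = pcall(function()
		return store:GetAsync("Player_" .. tostring(userId))
	end)
	if not ok then
		return nil, "datastore error"
	end
	if data == nil then
		return nil, "no saved data"
	end
	return data
end

-- Roblox wiring (server Script). Skipped when `game` does not exist.
if game then
	local DataStoreService = game:GetService("DataStoreService")
	local ReplicatedStorage = game:GetService("ReplicatedStorage")
	local playerData = DataStoreService:GetDataStore("PlayerData")

	-- Hypothetical RemoteFunction, created in ReplicatedStorage beforehand
	local getPlayerInfo = ReplicatedStorage:WaitForChild("GetPlayerInfo")
	getPlayerInfo.OnServerInvoke = function(_player, userId)
		if type(userId) ~= "number" then
			return nil -- never trust what the client sends
		end
		-- Parentheses drop the error message so only the data goes back
		return (loadProfile(playerData, userId))
	end
end
```

On the client, SearchMain could then call something like `ReplicatedStorage.GetPlayerInfo:InvokeServer(IDsearch)` and fill in the profile labels from the returned value. The searched player does not have to be in the server for this to work, only to have saved data.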

The thing is, I want it to be able to search the AccountAge of a player that is not in the game. And sadly player.AccountAge only works for players that are in the game.

About the DataStore, it would be great, but it would need a lot of changes to be done.

I’m 90% sure you cannot get the account age without calling the Roblox web API. However, HttpService cannot send requests to Roblox domains, so you’d need a proxy. Alternatively, you could log each player’s account age in a DataStore when they join and just access that later.
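A rough sketch of the logging idea, assuming a separate DataStore named "AccountCreated" (a made-up name). It stores an approximate creation timestamp instead of the raw age, so the value does not go stale; the age can be recomputed whenever someone searches. This only covers players who have joined the game at least once.

```lua
local SECONDS_PER_DAY = 86400

-- Convert an AccountAge (in days) into an approximate creation timestamp.
local function creationTimeFromAge(accountAgeDays, now)
	return now - accountAgeDays * SECONDS_PER_DAY
end

-- Recompute the age in days later, even when the player is offline.
local function ageFromCreationTime(createdAt, now)
	return math.floor((now - createdAt) / SECONDS_PER_DAY)
end

-- Roblox wiring (server Script). Skipped when `game` does not exist.
if game then
	local Players = game:GetService("Players")
	local DataStoreService = game:GetService("DataStoreService")
	local ageStore = DataStoreService:GetDataStore("AccountCreated") -- hypothetical store name

	Players.PlayerAdded:Connect(function(player)
		local createdAt = creationTimeFromAge(player.AccountAge, os.time())
		local ok, err = pcall(function()
			ageStore:SetAsync("Player_" .. player.UserId, createdAt)
		end)
		if not ok then
			warn("Could not save creation time:", err)
		end
	end)
end
```

Because AccountAge is counted in whole days, the stored timestamp is only accurate to within a day.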

Be careful with proxies: whoever runs one can get your token, log in to your account, and take your Robux, limiteds, and items.

Well, I’ve read that, yes, but proxies are paid, and since I’m not earning more from the game than I would be paying, that wouldn’t really be a good option. The DataStore might be better, but I’m not really experienced in that area. If you could give me some tips on how to start, I would appreciate it.

WARNING: Proxies are dangerous. They could take your cookies/tokens and steal content from your account.


I know the risk, and I am not using a Proxy


There are Google services where you can host Python or PHP scripts to make your own proxy, but the free versions are limited, so it isn’t that good. As for DataStores, the only real pointers I can give are to look at AlvinBlox’s videos or the API documentation.

I will look into that, thank you.

I’ll try to explain how information retrieval (IR) systems work.

I am currently also working on implementing this in Roblox, because Roblox has no such feature built in, nor many of the core tools to look up data efficiently, especially from DataStores. You can’t build custom search queries yet, but that is fortunately further down the Roblox roadmap.

Search engines usually use something known as FTS, “Full-Text Search”, which is a complicated but heavily optimized IR system. I strongly advise reading all about it in other sources and papers.

Typically very primitive search engines use an ILIKE search, which in SQL looks like this:

SELECT ... WHERE some_column ILIKE '%some_string%'

This will match keys that contain the search term and return a queryset containing all those results.

This method slows down quickly if your database contains millions of rows, since every key has to be checked on each query (O(n) complexity).
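For illustration, here is roughly what that “contains” search looks like in plain Lua over an in-memory list. The function name and the sample data are made up; the point is only to show the linear scan.

```lua
-- Case-insensitive "contains" search over a list of names.
-- Every name is checked on every query, hence O(n).
local function containsSearch(names, term)
	local needle = string.lower(term)
	local matches = {}
	for _, name in ipairs(names) do
		-- plain = true (4th argument) turns off Lua pattern characters in the needle
		if string.find(string.lower(name), needle, 1, true) then
			table.insert(matches, name)
		end
	end
	return matches
end

local names = { "Builderman", "ROBLOX", "bob_builder", "Shedletsky" }
local found = containsSearch(names, "BUILD")
-- found now holds "Builderman" and "bob_builder"
```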

This sadly isn’t possible out of the box in Roblox, because you can’t retrieve multiple DataStore keys with one request. These search functions would constantly have to poll new data from the DataStore, and Roblox just doesn’t have the tools for that. You’d have to spam GetAsync over and over, which would quickly exhaust the DataStore request budget. Even something like ListKeysAsync, which uses prefix search, is currently broken (I’ve sent Roblox a bug report about it since its pagination doesn’t function as intended, and colbert has already sent a report mentioning the documentation flaws).

For now, you just have to store your data in scripts or get it from external sources. If you do decide to store it within scripts, I’d advise looking into how to compress that data. Storing large datasets of strings will eat up a lot of memory though.

If you absolutely have to do this, I’d not use any DataStore requests (again, due to what I explained earlier). Instead, write your data into a ModuleScript or, more realistically, store it in an external database, where you can apply the filter querysets as API parameters. If you stick to a pure Roblox implementation, generate a lookup string in which each key is stored together with its index. You can then apply pattern matching to that string, collect the indexes where matches were found, and use those indexes to return the actual data from the ModuleScript.
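One possible reading of the lookup-string idea, as a sketch only. It uses Lua string patterns rather than true regex, and the `index:name` line layout is my own assumption, not something Roblox prescribes.

```lua
-- Build one long string with a line per key: "<index>:<lowercased name>\n"
local function buildLookup(names)
	local parts = {}
	for index, name in ipairs(names) do
		parts[#parts + 1] = index .. ":" .. string.lower(name)
	end
	return table.concat(parts, "\n") .. "\n"
end

-- Scan the lookup string once and collect the indexes of lines containing the term.
local function searchLookup(lookup, term)
	-- Escape punctuation so the term is matched literally
	local escaped = string.gsub(string.lower(term), "%p", "%%%0")
	-- Capture the index, then lazily skip within the line until the term appears
	local pattern = "(%d+):[^\n]-" .. escaped .. "[^\n]*"
	local indexes = {}
	for index in string.gmatch(lookup, pattern) do
		indexes[#indexes + 1] = tonumber(index)
	end
	return indexes
end

local names = { "Builderman", "ROBLOX", "bob_builder" }
local lookup = buildLookup(names)
local hits = searchLookup(lookup, "build")
-- hits holds the positions of the matching names in the original table
```

The returned indexes point back into the original table, so the full record for each match can be read from the ModuleScript without touching a DataStore.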

How do more complicated IR systems work?

Primitive FTS implementations usually work like this:

  1. You tokenize your input sentence: "Big brown fox!" -> {"Big", "brown", "fox!"}

  2. Remove common stop words such as: in, and, or, to… They occur in almost every sentence and would make the lookup slower.

  3. Remove special characters and lowercase everything.

  4. Apply a stemmer to your tokens for text normalization. It removes inflexional endings from words and reduces them to their root form. I’ve recreated that algorithm in Roblox: StringStem - Porter Stemmer Algorithm for IR systems (Community Resources, DevForum). The stemmer is a very important part of this process and you can read all about it in the documentation.

  5. You go through your ‘big-data’ and create a key for each word with its values being which document IDs they are found in. i.e: ["home"] = {38, 24, 50, 235 ...}

  6. You can use the same process with the input search string and look up which words are found in which documents and apply a set intersection on them:
    "Big brown fox!" -> ( "big" (23, 47, 35 ...), "brown" (59, 23, 45), "fox" (40, 23) ) results in document 23 being the one with highest matches found and is returned. This way you can also rank them and return other perhaps less relevant results (i’ll get to relevancy soon).

  7. Additionally, you can perform an OR search instead of an AND search by taking the union of the sets instead of their intersection. This returns more results, since you’re only checking whether at least one of the tokens exists in each document. That might not be good, since with millions of documents you’d likely return tens of thousands.

  8. Term frequency. When you generate the keys in step 5, it’s good to store the frequency of those terms in each document for ranking. This way, when we perform our search, documents with a higher term frequency rank highest even if they share the same intersection count.

  9. Inverse document frequency. Our scoring system is quite good now, but it still has some shortcomings. Just because a document contains a lot of matching words does not necessarily mean it’s the relevant data we are looking for. This is also why we remove stop words: documents that merely contain a lot of common words should not be returned just because those words inflate the relevancy score. We can take log10 of the total number of documents divided by the number of documents that contain the term (its document frequency), so that rare terms weigh more than terms that occur everywhere.

  10. Optionally, you can generate something known as search vectors in step 5, which are lexemes that look like this: 'a':1,6,10 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'on':5 'rat':12 'sat':4 Basically, the numbers 1, 6, 10 after 'a' refer to the positions where that word was found in the document.

  11. Weights. These positions can further be expanded with weights like A, B, C or D. which are typically used to reflect document structure: 'a':1A 'cat':5 'fat':2B,4C
    Weights are typically marked as such: {D-weight = 0.1, C-weight = 0.2, B-weight = 0.4, A-weight = 1.0}.
    Your ranking functions can use these to prioritize which documents to retrieve. So this form of search would search for the terms only in those specific areas / structures of documents.
    This probably isn’t necessary unless you specifically have an abstract and a body section explicitly set per document but you can totally do this with specific words as well.

  12. TrigramSimilarity and TrigramDistance. This breaks words up into sets of at most 3 letters (trigrams), which the query uses to work out a similarity metric between the input sentence and each document. I have a diffing algorithm that doesn’t work quite like trigram search but can be used for this application; it uses a similarity metric of my own invention: StringDiff - String Similarity Metric Diffing Algorithm (Community Resources, DevForum). If you apply this similarity ratio to the returned results, you’ll get more relevant documents. I’d advise doing this in the actual search functions to handle possible typos in the terms. It is not an easy task by any means, since diffing can become quite slow on larger datasets.
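To make steps 1 to 9 a bit more concrete, here is a small sketch in plain Lua. It skips the stemmer (step 4) and the search vectors, the stop word list is a tiny made-up sample, and all function names are my own. Scoring is term frequency multiplied by log10(N / document frequency), summed over the query terms.

```lua
local STOP_WORDS = { ["in"] = true, ["and"] = true, ["or"] = true, ["to"] = true, ["the"] = true, ["a"] = true }

-- Steps 1-3: lowercase, strip special characters, split on whitespace, drop stop words.
local function tokenize(text)
	local tokens = {}
	local cleaned = string.gsub(string.lower(text), "[^%w%s]", "")
	for word in string.gmatch(cleaned, "%S+") do
		if not STOP_WORDS[word] then
			tokens[#tokens + 1] = word
		end
	end
	return tokens
end

-- Steps 5 and 8: index[term][docId] = how often the term occurs in that document.
local function buildIndex(documents)
	local index = {}
	for docId, text in ipairs(documents) do
		for _, term in ipairs(tokenize(text)) do
			index[term] = index[term] or {}
			index[term][docId] = (index[term][docId] or 0) + 1
		end
	end
	return index
end

-- Steps 6 and 9: AND search, ranked by tf * log10(N / df).
local function search(index, docCount, query)
	local terms = tokenize(query)
	local scores, hits = {}, {}
	for _, term in ipairs(terms) do
		local postings = index[term]
		if not postings then
			return {} -- AND search: one missing term means no results
		end
		local df = 0
		for _ in pairs(postings) do
			df = df + 1
		end
		local idf = math.log(docCount / df) / math.log(10)
		for docId, tf in pairs(postings) do
			scores[docId] = (scores[docId] or 0) + tf * idf
			hits[docId] = (hits[docId] or 0) + 1
		end
	end
	local results = {}
	for docId, count in pairs(hits) do
		if count == #terms then -- keep only documents containing every term
			results[#results + 1] = { id = docId, score = scores[docId] }
		end
	end
	table.sort(results, function(a, b)
		if a.score ~= b.score then
			return a.score > b.score
		end
		return a.id < b.id
	end)
	return results
end

local documents = { "The big brown fox", "A brown dog", "Big fox, big den" }
local index = buildIndex(documents)
local results = search(index, #documents, "Big fox!")
-- Documents 1 and 3 both contain "big" and "fox"; document 3 ranks first
-- because "big" occurs twice in it.
```

Switching to the OR search from step 7 would mean dropping the `count == #terms` filter and the early return.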

Again, I’d advise looking up actual papers on Full-Text Search, but the core tools like stemming and diffing already exist in Roblox thanks to my hard work.

Here’s a good reference video that explains this process better than I can:
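For step 12, a trigram similarity can be sketched in plain Lua as below. This is a simplified take on the idea (padding each word with spaces and comparing the sets of trigrams); it is not the StringDiff algorithm, and the function names are my own.

```lua
-- Collect the set of trigrams of every word in the text.
-- Each word is padded with two leading spaces and one trailing space,
-- so short words and word boundaries still produce trigrams.
local function trigrams(text)
	local set, count = {}, 0
	for word in string.gmatch(string.lower(text), "%w+") do
		local padded = "  " .. word .. " "
		for i = 1, #padded - 2 do
			local gram = string.sub(padded, i, i + 2)
			if not set[gram] then
				set[gram] = true
				count = count + 1
			end
		end
	end
	return set, count
end

-- Similarity = shared trigrams / all distinct trigrams, between 0 and 1.
local function similarity(a, b)
	local setA, countA = trigrams(a)
	local setB, countB = trigrams(b)
	local shared = 0
	for gram in pairs(setA) do
		if setB[gram] then
			shared = shared + 1
		end
	end
	local union = countA + countB - shared
	if union == 0 then
		return 0
	end
	return shared / union
end
```

A typo such as "robloz" still shares most of its trigrams with "roblox", so it scores much higher than an unrelated word, which is what makes this useful for ranking near matches.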
Django Search - YouTube


What? Sending a request to a proxy does not give it access to your cookies/tokens lol. I have absolutely no clue where you would draw that conclusion from.

A proxy just receives the IP address of the device the request was sent from and returns a value. There’s no way it can take your cookie unless you send your cookie in the HTTP request lol.
