[Update] December 5, 2023
Hi Creators,
As we announced at RDC 2023, Roblox is building a public Luau dataset for training AI models. Our goal is to ensure that existing and future AI tools understand Luau and Roblox development. AI tools that deeply understand Luau can help you to work more efficiently, thanks to a new wave of integrations, AI assistants, and tools.
To contribute your Luau code to the public Luau dataset, please visit: https://create.roblox.com/data-collection
Thanks to the community, we now have enough data to improve the performance of fine-tuned, code-focused large language models by 15%-20%. Our next goal is to grow the corpus by an order of magnitude so that it can contribute to models at their earliest training stages. Those stages require significantly more training data, and a larger corpus could substantially improve the models' ability to understand and generate Roblox Luau.
We are asking you to opt in so that anyone can build tools that boost your productivity. Please note that you will be able to pick which experiences you want to contribute, and you can adjust your selection or opt out at any time. Data is collected anonymously, and we will filter out any sensitive data, such as your API keys.
We want to say a BIG thank you to those who have contributed their scripts. Your input has already significantly enriched the public dataset and enhanced our models. In fact, we think our internal code-generation models now exceed the quality of other leading LLMs when generating Roblox scripts. We will share our first version of the public Luau dataset with StarCoder and HuggingFace soon.
For more information on everything discussed here, and to opt in, please visit our webpage and let us know if you have any questions.
Thank you!
FAQ
Share with Roblox
Why do we want scripts from our community for our AI models?
- Roblox is building its own AI models to make AI-powered creation tools. Today, we use code from a subset of free, publicly available Marketplace assets to train these models. Using additional data contributed by our community helps our AI offer you accurate and up-to-date suggestions.
How can I contribute to the Roblox AI models?
- To help improve Roblox tools with AI, you can opt in and select the experiences to share.
- You are in control: You can opt out at any time from your Account Settings. No data will be used for training models until one week after you opt in. After you opt out, we will not use your data for any new training beyond 30 days. However, we reserve the right to keep the existing models trained on your data active for up to 120 days.
- From and for the community: We respect your privacy. Data is collected anonymously, and we will filter out any sensitive data, such as your API keys.
- Owner-governed sharing: Group-owned experiences can only be shared by owners. For experiences that have multiple owners, sharing with Roblox or the wider public is only possible if all owners opt in.
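To make the opt-in and opt-out windows described above concrete, here is a minimal Python sketch of the date arithmetic. It is only an illustration, not how Roblox computes these dates: the function name `sharing_timeline` and the dictionary keys are hypothetical, "one week" is assumed to mean 7 calendar days, and both the 30-day and 120-day windows are assumed to start on the opt-out date.

```python
from datetime import date, timedelta
from typing import Dict, Optional

# Durations quoted from the FAQ answer above. How they are actually counted
# (calendar days, time zones, exact starting point) is an assumption.
OPT_IN_DELAY = timedelta(days=7)          # "one week after you opt in"
NEW_TRAINING_CUTOFF = timedelta(days=30)  # no new training "beyond 30 days"
MODEL_RETENTION = timedelta(days=120)     # models kept "up to 120 days"


def sharing_timeline(opt_in: date, opt_out: Optional[date] = None) -> Dict[str, date]:
    """Hypothetical helper: key dates implied by the stated policy."""
    timeline = {"earliest_training_use": opt_in + OPT_IN_DELAY}
    if opt_out is not None:
        if opt_out < opt_in:
            raise ValueError("opt_out cannot be earlier than opt_in")
        # Assumed: both windows are measured from the opt-out date.
        timeline["last_new_training_use"] = opt_out + NEW_TRAINING_CUTOFF
        timeline["latest_model_retirement"] = opt_out + MODEL_RETENTION
    return timeline
```

Under these assumptions, opting in on December 5, 2023 means the data could first be used for training on December 12, 2023. Opting out on January 10, 2024 would end new training use by February 9, 2024, and existing models could stay active until May 9, 2024 at the latest.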
Can I still use Roblox AI tools if I don’t opt in?
- Everyone can use our AI products like Code Assist. Those who share their data with our AI models will get access to more comprehensive and performant models that include community contributions.
If I opt in to share my data, does this only include scripts? What about other types of data, like my models or images?
- Opting in to share your data with Roblox and the Luau dataset currently only includes scripts. If we decide to add other data types in the future, you will have to opt in to sharing those types separately; we won’t include that data automatically.
Contribute to the Luau Dataset
What does contributing to the Luau Dataset mean?
- Roblox is building a public Luau Dataset for anyone training AI models.
- Our goal is to make Luau a first-class programming language and ensure that existing and future AI tools understand Luau.
- AI tool support for Luau: This means more accurate suggestions and more integrations when using third-party AI models, so creators like you can use the tool of your choice in your creation process.
How can I contribute to the public Luau dataset?
- Much like contributing to Roblox, privacy and compliance are our top priorities. Experiences won’t be shared unless all owners consent. However, because the dataset will be publicly available, you won’t be able to opt out of the Luau Dataset once 30 days have passed. Please only contribute code from experiences that you are comfortable sharing. Only a small fraction of your contributed code, less than 30%, will be compiled and aggregated into an open source-available dataset for AI model training.