Hi Creators,
We’re adding a dedicated playtest subagent to Assistant and Studio’s built-in MCP Server. When triggered, it spawns a test character and runs through gameplay scenarios in its own context, keeping Assistant’s context window clean and opening the door to multi-agent playtests down the road.
You can use it to verify things like:
- Players can pick up a new item from the ground and sell it to an NPC shopkeeper.
- Players can get in and out of vehicles.
- Players can pick up an axe, walk over, and chop down a tree.
This is a Studio Beta feature and the tool’s first version, so try it out and let us know what you think! We’ll look to your feedback for any gaps and things to improve. Known issues and limitations can be found below.
Key Features
- Dedicated Playtest Subagent: builds on previous playtest automation tools and lets Assistant and other agents delegate playtesting tasks to a dedicated subagent. We plan to expand this to support multiple agents testing simultaneously.
- Doesn’t Burn Your Tokens: the playtest subagent offloads compute to a separate model run by Roblox, so playtests don’t consume your own tokens.
- Keeps Your Agent’s Context Clean: each playtest is reduced to the instructions you send and the report you get back, keeping your agent’s context window uncluttered.
- Model Improves Over Time: the subagent runs on its own model, independent of your agent’s, so its playtesting quality can improve over time.
Get Started
- Enable the Beta: In Studio, go to File > Beta Features and check Playtest Agent.
- Trigger the Subagent: External agents connected through Studio MCP can trigger the subagent on their own. Assistant needs explicit guidance, so ask it to playtest your experience with a prompt such as “use the playtest subagent to make sure players can buy the apple from the shop.”
- Observe (or take a break!): Playtest agents run autonomously.
- Playtest Results: At the end of each playtest, the agent returns a Pass, Fail, Inconclusive, or Error result, along with a structured report of its actions and observations.
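If you want to post-process results in your own tooling, one way to model them is sketched below. The four verdict values come from this post; everything else (the `PlaytestReport` class, its field names, and the `needs_follow_up` helper) is a hypothetical illustration, not the actual report format the subagent returns.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    # The four result values named in this post.
    PASS = "Pass"
    FAIL = "Fail"
    INCONCLUSIVE = "Inconclusive"
    ERROR = "Error"


@dataclass
class PlaytestReport:
    # Hypothetical shape: the real structured report may use different fields.
    instructions: str
    verdict: Verdict
    actions: list[str] = field(default_factory=list)
    observations: list[str] = field(default_factory=list)


def needs_follow_up(report: PlaytestReport) -> bool:
    """Flag every result other than Pass for a closer look."""
    return report.verdict is not Verdict.PASS


report = PlaytestReport(
    instructions="Make sure players can buy the apple from the shop.",
    verdict=Verdict.INCONCLUSIVE,
    observations=["Shop UI opened, but the purchase button was never found."],
)
print(needs_follow_up(report))
```

Keep in mind that a Pass can still be a false pass, so a helper like this only narrows down where to look first.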
Known Issues and Limitations
We’re actively working to improve the reliability of playtest agent behavior. Some limitations are:
- False passes: The subagent can report a test as “Pass” when a human playtester would spot an issue. Don’t rely solely on the playtest subagent to validate your experience.
- Requires clear instructions: The subagent needs actionable instructions specific to your experience. Vague or missing instructions may result in unpredictable behavior or an “Error” result.
- Limited observability: Playtest agents perform and observe in-game actions only through Studio MCP tools, so a test may return “Pass” even if you or your agent would determine it failed based on other evidence.
- Daily usage cap: For now, playtest agents have their own daily usage cap.
- Turn limit: Tests abort after 50 turns. Each turn can contain multiple batched tool calls.
- Loop detection: If the subagent observes several identical tool calls in a row, it ends the test early to avoid getting stuck in a loop.
- No real-time reflexes: The subagent does not think and act in real time. Playtests that require fast reactions in dynamic environments, like vehicle steering or difficult combat, may fail.
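To make the turn limit and loop detection above more concrete, here is a minimal sketch of how such guards could work. It is not the subagent's actual implementation: the 50-turn limit is from this post, but `REPEAT_THRESHOLD = 3` is an assumption (the post only says "several"), and the function name, return values, and tool-call tuples are all hypothetical.

```python
MAX_TURNS = 50        # stated in this post
REPEAT_THRESHOLD = 3  # assumption: the post only says "several" identical calls


def run_guarded(turns):
    """Walk through turns, where each turn is a list of batched tool calls.

    Returns (reason, turns_completed), with reason being one of
    "finished", "turn_limit", or "loop_detected" (hypothetical labels).
    """
    last_call = None
    streak = 0
    completed = 0
    for turn in turns:
        if completed >= MAX_TURNS:
            return ("turn_limit", completed)
        for call in turn:
            # Count how many times the same call repeats back to back.
            if call == last_call:
                streak += 1
            else:
                last_call, streak = call, 1
            if streak >= REPEAT_THRESHOLD:
                return ("loop_detected", completed)
        completed += 1
    return ("finished", completed)


# Hypothetical tool calls, written as (tool_name, argument) tuples.
stuck = [[("move_to", "tree")], [("move_to", "tree")], [("move_to", "tree")]]
print(run_guarded(stuck))
```

Because a turn can batch several tool calls, short instructions that let the subagent combine steps leave more of the 50-turn budget for the rest of the test.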
What’s Next
- Multiple playtest agents operating simultaneously within a single Studio test instance.
- Improved handling for tests that require longer action sequences.
- Improved Studio MCP tool use to prevent cases in which the test aborts improperly due to observed repeat tool calls.
- Playtest agents that can be added directly by you instead of requiring another agent.
- Upgrading playtest agents to be able to test player-associated features such as leaderboards and persistent data stores.
The quality of this feature will continue to improve over time. We anticipate that after a period of broader use, learning and iteration, we’ll move to GA. We’ll keep you updated on improvements and new features as we iterate.
FAQs
How is it different from previous playtest automation?
- Uses the same tools but creates a dedicated subagent that allows for separate context windows to be preserved for Assistant and other agents.
- Can batch multiple playtests (e.g., read them from a file) to run consecutively through the subagent for known regression tests.
- A subagent also opens the door to a future in which multiple playtest agents can participate in the same shared playtest session and interact with each other.
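As an illustration of batching regression tests from a file, the sketch below parses a plain-text list of instructions and turns it into a single prompt for your agent. The file format (one instruction per line, `#` for comments) and the helper names are assumptions made for this example; the post does not prescribe a format, and how your agent hands the tests to the subagent is up to the agent.

```python
def load_playtests(text: str) -> list[str]:
    """Parse one playtest instruction per line, skipping blanks and '#' comments."""
    tests = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            tests.append(line)
    return tests


def build_prompt(tests: list[str]) -> str:
    """Build a prompt asking the agent to run the tests consecutively."""
    numbered = "\n".join(f"{i}. {t}" for i, t in enumerate(tests, start=1))
    return "Use the playtest subagent to run these tests one after another:\n" + numbered


# Hypothetical contents of a regression test file.
regression_file = """\
# Shop and vehicle regressions
Players can buy the apple from the shop.

Players can get in and out of vehicles.
"""

print(build_prompt(load_playtests(regression_file)))
```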
When does Assistant use the new subagent instead of orchestrating separate MCP tools?
- For now, we are testing out the quality of the subagent on its own. While external agents can choose whether to use it, Assistant requires clear prompt guidance in order to use it (e.g., “use the playtest subagent to…”).
When can I expect the issues and limitations to improve?
- We’re prioritizing features that enable playtest agents to perform a wider range of gameplay in the near term. For limitations related to the subagent model’s ability, we are working on a new playtesting model to raise test quality overall.
