[Studio Beta] Studio Assistant & MCP Playtest Agent


Hi Creators,

We’re adding a dedicated playtest subagent to Assistant and Studio’s built-in MCP Server. When triggered, it spawns a test character and runs through gameplay scenarios in its own context, keeping Assistant’s context window clean and opening the door to multi-agent playtests down the road.

You can use it to verify things like:

  • Players can pick up a new item from the ground and sell it to an NPC shopkeeper.
  • Players can get in and out of vehicles.
  • Players can pick up an axe, walk over, and chop down a tree.

This is a Studio Beta feature and the tool’s first version, so try it out and let us know what you think! We’ll look to your feedback for any gaps and things to improve. Known issues and limitations can be found below.

Key Features

  • Dedicated Playtest Subagent: builds on previous playtest automation tools and supports Assistant and other agents by contracting playtesting tasks to a dedicated subagent. We plan to expand this to support multiple agents testing simultaneously.
  • Doesn’t Burn Your Tokens: the playtest subagent offloads compute to a separate model run by Roblox, so your own tokens aren’t spent on every test.
  • Keeps Your Agent’s Context Clean: each playtest is reduced to the instructions you send and the report you get back, keeping your agent’s context window uncluttered.
  • Model Improves Over Time: uses an independent model that can improve over time.

Get Started

  1. Enable the Beta: In Studio, go to File > Beta Features and check Playtest Agent.
  2. Trigger the Subagent: External agents connected through Studio MCP can trigger the subagent on their own. Assistant needs an explicit request, such as “use the playtest subagent to make sure players can buy the apple from the shop.”
  3. Observe (or take a break!): Playtest agents run autonomously.
  4. Playtest Results: Playtest agents return Pass, Fail, Inconclusive, or Error results for each playtest at the end, as well as a structured report of their actions and observations.
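
To make the four outcomes concrete, here is a minimal Python sketch of how a script on your side might model a returned report and decide what to do next. The field names (`instructions`, `verdict`, `actions`, `observations`) and the `triage` helper are hypothetical; the post does not specify the actual report format, only that it contains a result plus a structured record of actions and observations.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    # The four result values named in the post.
    PASS = "Pass"
    FAIL = "Fail"
    INCONCLUSIVE = "Inconclusive"
    ERROR = "Error"


@dataclass
class PlaytestReport:
    """Hypothetical report shape; the real format is not documented here."""
    instructions: str
    verdict: Verdict
    actions: list[str] = field(default_factory=list)
    observations: list[str] = field(default_factory=list)


def triage(report: PlaytestReport) -> str:
    """Suggest a follow-up step for a report (an illustrative policy only)."""
    if report.verdict is Verdict.FAIL:
        return "investigate the failure"
    if report.verdict in (Verdict.INCONCLUSIVE, Verdict.ERROR):
        # Vague instructions are one listed cause of unpredictable results.
        return "rewrite instructions and rerun"
    # A Pass can be a false pass, so it still gets a human check.
    return "spot-check manually"
```

For example, `triage(PlaytestReport("buy the apple", Verdict.PASS))` returns `"spot-check manually"`, reflecting the advice not to rely solely on the subagent.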

Known Issues and Limitations

We’re actively working to improve the reliability of playtest agent behavior. Some limitations are:

  • False passes: The subagent can report a test as “Pass” when a human playtester would spot an issue. Don’t rely solely on the playtest subagent to validate your experience.
  • Requires clear instructions: The subagent needs actionable instructions specific to your experience. Vague or missing instructions may result in unpredictable behavior or an “Error” result.
  • Limited observability: Playtest agents perform in-game actions through Studio MCP tools, so they can miss evidence those tools don’t surface. A test may return “Pass” even if you or your agent would determine it failed based on other evidence.
  • Daily usage cap: For now, playtest agents have their own daily usage cap.
  • Turn limit: Tests abort after 50 turns. Each turn can contain multiple batched tool calls.
  • Loop detection: If the subagent observes several identical tool calls in a row, it ends the test early to avoid getting stuck in a loop.
  • No real-time reflexes: The subagent does not think and act in real time. Playtests that require fast reactions in dynamic environments, like vehicle steering or difficult combat, may fail.
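
The turn limit and loop detection described above can be pictured with a short Python sketch. This is not Roblox’s implementation. The 50-turn limit comes from the post, but the repeat threshold of 3 is an assumption (the post only says “several”), and representing a tool call as a tuple is purely for illustration.

```python
MAX_TURNS = 50         # stated in the post
REPEAT_THRESHOLD = 3   # assumption: the post only says "several"


def run_playtest(turns):
    """Walk through turns, where each turn is a list of batched tool calls.

    A tool call is modeled as any comparable value, e.g. ("move", "north").
    Returns a short status string describing how the run ended.
    """
    last_call = None
    repeats = 0
    for turn_number, batch in enumerate(turns, start=1):
        if turn_number > MAX_TURNS:
            return "aborted: turn limit"
        for call in batch:
            # Count consecutive identical calls, even across turn boundaries.
            if call == last_call:
                repeats += 1
            else:
                last_call, repeats = call, 1
            if repeats >= REPEAT_THRESHOLD:
                return "ended early: loop detected"
    return "completed"
```

Under these assumptions, a test that needs many small, similar steps is the one most likely to hit either limit, so it helps to write instructions that reach the goal in fewer, more distinct actions.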

What’s Next

  • Multiple playtest agents operating simultaneously within a single Studio test instance.
  • Improved handling for tests that require longer action sequences.
  • Improved Studio MCP tool use to prevent cases in which the test aborts improperly due to observed repeat tool calls.
  • Playtest agents that can be added directly by you instead of requiring another agent.
  • Upgrading playtest agents to be able to test player-associated features such as leaderboards and persistent data stores.

The quality of this feature will continue to improve over time. We anticipate that after a period of broader use, learning and iteration, we’ll move to GA. We’ll keep you updated on improvements and new features as we iterate.


FAQs


How is it different from previous playtest automation?

  • Uses the same tools but creates a dedicated subagent that allows for separate context windows to be preserved for Assistant and other agents.

  • Can batch multiple playtests (e.g., read them from a file) to run consecutively through the subagent for known regression tests.

  • A subagent also opens the door to a future in which multiple playtest agents can participate in the same shared playtest session and interact with each other.
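
As a rough illustration of batching regression tests from a file, the Python sketch below turns a plain-text list of instructions into prompts an agent could run one after another. The file format (one instruction per line, `#` for comments), the function names, and the file name `playtests.txt` are all assumptions; the post does not define a format, and how the prompts reach your agent is left out.

```python
from pathlib import Path


def parse_playtests(text: str) -> list[str]:
    """Return one instruction per non-empty, non-comment line."""
    instructions = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            instructions.append(stripped)
    return instructions


def load_playtests(path: str) -> list[str]:
    """Read instructions from a UTF-8 text file (hypothetical format)."""
    return parse_playtests(Path(path).read_text(encoding="utf-8"))


def build_prompts(instructions: list[str]) -> list[str]:
    """Wrap each instruction in the explicit phrasing suggested in the post."""
    return [f"Use the playtest subagent to {text}" for text in instructions]
```

A file containing the line `make sure players can buy the apple from the shop` would yield the prompt `Use the playtest subagent to make sure players can buy the apple from the shop`, for instance via `build_prompts(load_playtests("playtests.txt"))`.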

When does Assistant use the new subagent instead of orchestrating separate MCP tools?

  • For now, we are testing the quality of the subagent on its own. While external agents can choose whether to use it, Assistant requires clear prompt guidance in order to use it (e.g., “use the playtest subagent to…”).

When can I expect the issues and limitations to improve?

  • We’re prioritizing features that enable playtest agents to perform a wider range of gameplay in the near term. For limitations related to the subagent model’s ability, we are working on a new playtesting model to raise test quality overall.

99 Likes


This is an amazing new feature. Amazing way of being able to test and verify different things.

12 Likes

Excited to try this out, looks amazing!

11 Likes

Looks neat! Any other updates for Claude MCP though?
I also have noticed Assistant running 10x faster than before which is nice.

13 Likes

Such an amazing change, looking forward to more!

2 Likes

Thanks for asking! We’re constantly looking at what comes next and will have more to share soon.

6 Likes

Okay, I don’t really use AI, and I don’t know whether this already exists. However, one thing I would definitely use it for is the tedious tasks, such as importing and renaming X assets or copying loads of asset IDs from decals that I have made. Basically, the super boring and time-consuming tasks.

Sorry if this is poorly worded, but I am basically asking whether we will ever see Assistant gain read-only access to our own or our groups’ inventories and creations.

Also this does look interesting, I will try it out

6 Likes

I can see how that capability would help. While there isn’t anything to share on that right now, we’re thinking systematically about how to help save you time on things like this.

4 Likes

Is there a way to disable it entirely? The reason I ask is that it gets in the way of Codex’s own subagents. A few times, Codex’s subagents have tried running the Studio MCP subagent and it timed out, wasting time.

1 Like

I imagine you can use the cloud APIs alongside some locally running python script to accomplish this (and you could use AI to help accomplish the creation of this).

2 Likes

You can disable the ‘agent’ tool in the MCP tool section of Codex. The playtest agent specifically should only be available to those who opt in to the feature beta. To help us investigate, could you please share whether you’re seeing this for the playtest subagent or the search subagent?

4 Likes

Wouldn’t the best use of a test character be to test multiplayer? These sound like things a developer wouldn’t need an agent to help with.

5 Likes

Yes, we are excited to expand from this to provide multiplayer testing! For now, we hope that this first step can make it easier for those who use agents to help them self-correct their mistakes so you can save time overall on these kinds of tests.

2 Likes

I’m excited for when this lets me test multiplayer (co-op with the AI would be enough), but in its current state this is just gimmicky. The agent failed to use my UI to select a role, then it failed to walk (it kept claiming it couldn’t navigate, then gave up and teleported itself), then it failed to tilt its camera down to press a green button that was literally right in front of it. Embarrassing performance.
I’m a vibecoder, but I’m not so far gone that I need the machine to literally play my own game for me. I can see this being EXTREMELY useful if I could spin up 5+ agents to test full-lobby and emergent-mechanics scenarios, so I’m hopeful there will be improvements.

9 Likes

Thank you for sharing this feedback. We appreciate the opportunity to learn about how the beta playtest agent performs in more scenarios so we can improve it. This move to a subagent form will give us more options over time to address the limitations you’re encountering.

4 Likes

I’m not sure, but I don’t have the playtest subagent enabled in beta features so it’s most likely the search subagent. Maybe it was causing issues because it wasn’t fully implemented? The timeout issue happened yesterday a few times.

(Doesn’t time out anymore)

1 Like

Thank you, we’ll investigate to see if we can figure out the underlying cause and whether something was going on yesterday. Glad to hear it’s better now.

2 Likes

I’m wondering why these AI tools are constantly being released while things the community has been requesting for years don’t even get considered.

6 Likes

This is one of the updates that I really needed but never actually thought of. Give a raise to the person who came up with this.

1 Like