New Audio API [Beta]: Elevate Sound and Voice in Your Experiences

After working with the new audio API, I had a few feature ideas that would make it possible to create more dynamic and immersive environments.

Feature 1: Directional Sensitivity

It would be awesome to have a property that controls the directionality of sound for the AudioListener, similar to how different microphones capture sound. The same idea could also be added to AudioEmitter but in terms of direction of sound emittance.

While AudioInteractionGroup can sometimes provide a workaround, a built-in directional sensitivity feature would be more accurate, flexible, and easier for developers to implement dynamic environments.

Example

In a stage or presentation setting, a directional mic could prevent audience voices from being picked up, focusing only on the speaker in front of the mic. Unfortunately, listeners hear in all directions, so we need a manual system.

The first technique that comes to mind is to give each audience member’s voice AudioEmitter a different group than the stage mic AudioListener. This achieves the desired result but isn’t very flexible. Imagine we wanted anyone to be able to walk on stage and be picked up by the mic; they should be heard based on their direction and distance to the mic.

The current workaround would require tracking each player’s direction and position relative to the listener.
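As a sketch of what that tracking might look like today (everything here – `micPart`, `playerFaders` – is a hypothetical setup for illustration, not part of the current API):

```lua
-- Hedged sketch of the manual workaround: every frame, scale each performer's
-- AudioFader volume by how closely they sit in front of the mic listener.
-- `micPart` and the per-player `playerFaders` table are assumed to exist.
local RunService = game:GetService("RunService")

local function directionalGain(micPart: BasePart, sourcePosition: Vector3): number
	local toSource = (sourcePosition - micPart.Position).Unit
	-- Dot product is 1 directly in front of the mic, 0 at 90 degrees, negative
	-- behind; clamping to 0 silences sounds behind the mic (a crude cardioid).
	return math.max(micPart.CFrame.LookVector:Dot(toSource), 0)
end

RunService.Heartbeat:Connect(function()
	for player, fader in playerFaders do -- playerFaders: assumed {[Player]: AudioFader}
		local character = player.Character
		local head = character and character:FindFirstChild("Head")
		if head then
			fader.Volume = directionalGain(micPart, head.Position)
		end
	end
end)
```

A built-in directional property would replace this entire per-frame loop.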

Additionally, simulating accurate distance attenuation would require a mic listener for each player/input, because DistanceAttenuation is only a property of AudioEmitter. Having it on AudioListener as well would be useful. (Correction: see my follow-up post later in the thread.)

With a directional property, the AudioListener can be defined as most sensitive to hearing sounds in front of it. This will ignore audience voices while still being flexible enough to automatically allow anyone to walk on stage and have their voices heard accurately.

Visualization

Polar patterns are a great way to visualize what this property might allow for. Control over defining custom polar patterns would be amazing; at the very least, an enum would work for most use cases. Currently, both listeners and emitters are omni-directional.

In Roblox, a visualization feature would be necessary to see how the directional sensitivity is oriented relative to the parent.

Feature 2: Parent-Independent Alignment

AudioListener and AudioEmitter need to be parented to something with orientation. The “forward” direction for these audio instances is thus locked to the orientation of the parent. I propose more control over how they are oriented.

Example

Let’s say I make a top-down experience with the camera locked to some orientation. I want spatial audio, so I put an AudioListener in the character. While facing some direction, a sound is heard from the right. If my character does a 180-degree turn, that sound will now be heard from the left, which contradicts how my camera is oriented.

To work around this, I would have to make some part or attachment that is updated per frame to match my character’s position, but not its rotation. The AudioListener would be parented to this extra instance.
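A rough sketch of that workaround, assuming a client-side LocalScript and a character with a HumanoidRootPart:

```lua
-- Sketch of the workaround: a proxy part follows the character's position
-- every frame but keeps a fixed world orientation, and the AudioListener
-- is parented to the proxy instead of the character.
local Players = game:GetService("Players")
local RunService = game:GetService("RunService")

local proxy = Instance.new("Part")
proxy.Anchored = true
proxy.CanCollide = false
proxy.Transparency = 1
proxy.Parent = workspace

local listener = Instance.new("AudioListener")
listener.Parent = proxy

local FIXED_ORIENTATION = CFrame.Angles(0, 0, 0) -- locked to world axes

RunService.Heartbeat:Connect(function()
	local character = Players.LocalPlayer.Character
	local root = character and character:FindFirstChild("HumanoidRootPart")
	if root then
		-- Copy position only; orientation stays fixed to match the camera.
		proxy.CFrame = FIXED_ORIENTATION + root.Position
	end
end)
```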

Rough Idea

Instead, a property could be introduced, say “AlignmentMode”, inspired by a similar property on constraints. If the AlignmentMode is “Parent”, it will match the parent’s orientation, as it does currently. If the AlignmentMode is “Global”, the alignment could be locked to a specific global direction. There could also be modes for per-axis control.
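Purely hypothetical usage of the proposed property (neither `AlignmentMode` nor an `AudioAlignmentMode` enum exists in the engine today):

```lua
-- Hypothetical API sketch only: the AlignmentMode property and the
-- AudioAlignmentMode enum are the proposal, not existing features.
local listener = Instance.new("AudioListener")
listener.Parent = character.HumanoidRootPart -- `character` assumed to exist

-- "Parent": current behavior, forward follows the parent's orientation.
-- "Global": forward locked to a fixed world direction (top-down camera case).
listener.AlignmentMode = Enum.AudioAlignmentMode.Global
```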

If directional sensitivity is added for AudioEmitter, then this feature could also work for emitters.

Recap

  1. AudioListener needs a property for controlling sensitivity of hearing sound directionally.
  2. AudioEmitter needs a property for controlling sensitivity of emitting sound directionally.
  3. AudioListener needs a DistanceAttenuation property.
  4. AudioListener needs properties to control the alignment of hearing spatial emitters.
  5. AudioEmitter needs properties to control the alignment for emittance of sound if directional sensitivity is added for emitters.

Is the AudioAnalyzer client-side only? I’ve been playing with it and I only get an output on the client; the server always outputs 0 on both RMS and Peak:
My Setup:
[screenshot]
Client Script Output for RMS:
[screenshot]
Server Script Output for RMS:
[screenshot]


I wonder. Will the old sounds be deprecated?

Hey @monkeymario1000, we will not be removing Sound, SoundGroup, or any of the SoundEffect APIs. We do think there are some sound design use cases that just aren’t possible with those instances, but we don’t plan to delete them.

Will AudioEmitters ever work with volumetric audio?

@BubasGaming we’ve looked at porting this over, but unfortunately it’s a little tricky now that there can be arbitrarily many AudioListeners hearing an AudioEmitter from different perspectives.
Sounds also only support volumetric emission with Box, Cylinder, and Ball parts, and we’ve long had ambitions of making it work with arbitrary shapes (e.g. meshes).

The new API separates file playback from emission, so you might be able to accomplish something similar with multiple AudioEmitters spread around the surface of a part (e.g. offset by Attachments).

  1. AudioListener needs a property for controlling sensitivity of hearing sound directionally.
  2. AudioEmitter needs a property for controlling sensitivity of emitting sound directionally.
  3. AudioListener needs a DistanceAttenuation property.
  4. AudioListener needs properties to control the alignment of hearing spatial emitters.
  5. AudioEmitter needs properties to control the alignment for emittance of sound if directional sensitivity is added for emitters.

@jonbyte thanks for the feedback! We are discussing

Is the AudioAnalyzer client-side only? I’ve been playing with it and I only get an output on the client; the server always outputs 0 on both RMS and Peak

@jplib325 yeah, all audio processing is disabled on the server to conserve memory and CPU, which unfortunately means that AudioAnalyzer only works client-side. If you need metering information on the server, you could have each client compute volume levels for some subset of the audio (maybe their own voice or nearby sounds) and do something like what @tacheometrist shared above.

right now I’m using an UnreliableRemoteEvent to share this data with the server
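A hedged sketch of that pattern – the event name, analyzer location, and throttle rate are all assumptions:

```lua
-- Client-side sketch: sample the analyzer's RMS a few times a second and
-- push it to the server over an UnreliableRemoteEvent (dropped packets are
-- acceptable for metering data).
local RunService = game:GetService("RunService")
local ReplicatedStorage = game:GetService("ReplicatedStorage")

local meterEvent = ReplicatedStorage:WaitForChild("VoiceMeter") -- assumed UnreliableRemoteEvent
local analyzer = script.Parent:WaitForChild("Analyzer") -- assumed AudioAnalyzer

local accumulated = 0
RunService.Heartbeat:Connect(function(dt)
	accumulated += dt
	if accumulated >= 0.1 then -- throttle to roughly 10 updates per second
		accumulated = 0
		meterEvent:FireServer(analyzer.RmsLevel)
	end
end)
```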


cc: @ReallyLongArms

I was reading through these sentences and I would like to correct what I was trying to say. The main point of this part is really the addition of distance attenuation for the AudioListener. Let’s say each AudioEmitter for each player’s voice chat is set to some DistanceAttenuation curve, but we also want the microphone to pick up sounds in its own range. This is where DistanceAttenuation for AudioListener would shine.

To correct what I said, the manual approach would actually require multiple emitters (not listeners) for each player for the microphone to listen to. The DistanceAttenuation curve would be inverted for each: for example, if we wanted the microphone AudioListener to have a max range of 100 studs, then each extra player AudioEmitter’s DistanceAttenuation curve would be inverted such that at distance 100 the curve = 1, and at distance 0 the curve = 0. Finally, this curve would have to be updated based on the distance from the player to the microphone.
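One possible realization of that per-frame update, as a hedged sketch only (`micPart`, `extraEmitters`, and `MIC_RANGE` are illustrative assumptions, and there are other ways to shape the curve):

```lua
-- Sketch of the inverted-curve workaround: make the gain the mic hears
-- depend on the mic's own intended range rather than the emitter's.
local RunService = game:GetService("RunService")

local MIC_RANGE = 100 -- the range we wish the mic listener itself had, in studs

RunService.Heartbeat:Connect(function()
	for head, emitter in extraEmitters do -- assumed {[BasePart]: AudioEmitter}
		local distance = (head.Position - micPart.Position).Magnitude
		-- Pin a curve point at the mic's current distance so the mic hears
		-- full volume up close (in its range) and silence beyond MIC_RANGE.
		local gainAtMic = math.clamp(1 - distance / MIC_RANGE, 0, 1)
		emitter:SetDistanceAttenuation({
			[math.min(distance, MIC_RANGE)] = gainAtMic,
			[MIC_RANGE + 1] = 0,
		})
	end
end)
```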

Adding DistanceAttenuation for AudioListener may still require extra AudioEmitters, but it removes the need for the awkward inversion of the AudioEmitter curve. In other words, it allows prioritizing the range of the listener over the range of the emitter. I suppose an AudioEmitter and AudioListener could both have curves, but I don’t think that causes any issues.

Side Note

The DistanceAttenuation curve property is awesome, but for simple configuration I think it would also be cool if there were Distance and AttenuationMode properties. Modifying these would override the DistanceAttenuation curve, while editing the curve directly would in turn set AttenuationMode to “Custom”. AttenuationMode could be an Enum including Linear, Inverse, etc.; Distance would be a number.


What is the base instance for all the audio instances?


Can someone help?
I have not enabled any beta features, yet I still see the instances in play testing (I can’t insert them in editing mode through the interface, only with commands). Is it out of beta?

Screenshots

[screenshot: testing]

[screenshot: editing]

used the command Instance.new("AudioEmitter").Parent = game.Workspace
[screenshot]


Is there a way to detect if someone is talking?


So I’m trying to use the audio API for a Voice Chat Radio I was programming, and I used the sample code to recreate the old voice chat RollOff, but the ListenerPosition is the workspace’s CurrentCamera’s Position.

So two players could try to speak to each other normally while being far from the CurrentCamera Position, and they can’t hear each other. How could I get around this?


What is the base instance for all the audio instances?

@VeryLiquidCold currently all the audio instances inherit from Instance. We thought about making them share a base “AudioNode” class, or inherit from “AudioProducer”/“AudioEffect”/“AudioConsumer” base classes, but since our engine only supports single inheritance, this would close the door on some types like AudioPlayer ever inheriting from another meaningful base class, e.g. Playable.

You can check whether a particular instance is consuming/producing audio by checking :GetConnectedWires("Input")/:GetConnectedWires("Output"); combined with pcall, you could generalize this further.
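For example, a generalized check along those lines might look like this (sketch only; the helper names are made up):

```lua
-- pcall guards against instances that don't have the pin, or the
-- GetConnectedWires method, at all.
local function isConsumingAudio(instance: Instance): boolean
	local ok, wires = pcall(function()
		return instance:GetConnectedWires("Input")
	end)
	return ok and #wires > 0
end

local function isProducingAudio(instance: Instance): boolean
	local ok, wires = pcall(function()
		return instance:GetConnectedWires("Output")
	end)
	return ok and #wires > 0
end
```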

I have not enabled any beta features yet I still see the instances in play testing

@whatplayer5858 the studio beta only controls whether the instances are findable/insertable in the explorer; any audio instances that were already there or instantiated by scripts are still functional; we did this so that you can still publish experiences that use the new features

So I’m trying to use the audio API for a Voice Chat Radio I was programming, and I used the sample code to recreate the old voice chat RollOff, but the ListenerPosition is the workspace’s CurrentCamera’s Position.

Hey @BehzadGen, there’s an AudioEmitter:SetDistanceAttenuation API that makes implementing the equivalent rolloff much simpler; the announcement is here, and there is some sample code showing how this can be used to recreate the old voice rolloff without a per-frame polling loop.

So two players could try to speak to each other normally while being far from the CurrentCamera Position, and they can’t hear each other. How could I get around this?

You could reparent the AudioListener to your local player’s character’s head, or to a part/attachment/model that’s nearer – we spawn an AudioListener on the camera by default because we can more reliably count on it existing, but that default isn’t necessarily the best, feel free to change it for your needs!


2nd Edit
Don’t do the fix below; do what ReallyLongArms said below me instead – their solution was much better, worked, and got rid of the audio still being really faint.

Fixed the issue, leaving it up in case anyone else runs into the same issue:
You have to set DeviceOutput.Player to a Player (for me, the Player the DeviceOutput is linked with); otherwise it’s heard by everyone.


For some reason, all audio from AudioDeviceInputs is heard globally (meaning all players can hear each other, regardless of how far apart they are) and I’m not sure why. I’ve tried both methods from the code samples here AudioEmitter | Documentation - Roblox Creator Hub and audio spoken is still heard globally by all players.

Here’s my setup that builds the audio instances onto the player; am I missing something, or is this a bug?

-- Services
local Players = game:GetService("Players")

local function wireConnect(Source: Instance, Destination: Instance, Name: string, Parent: Model)

	-- Voice System
	local Wire = Instance.new("Wire", Parent)
	Wire.SourceInstance = Source
	Wire.TargetInstance = Destination
	Wire.Name = Name
end

Players.PlayerAdded:Connect(function(Player: Player)

	-- Voice System
	if Player:FindFirstChild("AudioDeviceInput") then
		local DeviceInput = Player:FindFirstChild("AudioDeviceInput")
		DeviceInput.Name = "DeviceInput"
		DeviceInput.Player = Player
	else
		local DeviceInput = Instance.new("AudioDeviceInput", Player)
		DeviceInput.Name = "DeviceInput"
		DeviceInput.Player = Player
	end

	Player.CharacterAdded:Connect(function(Character: Model)

		Character:MoveTo(workspace.Checkpoints[Player:GetAttribute("Checkpoint")].Base.Position)

		-- Voice System
		local Emitter : AudioEmitter = Instance.new("AudioEmitter", Character)
		Emitter.Name = "Emitter"

		local curve = {}
		curve[0] = 1
		curve[50] = 0.5
		curve[100] = 0 
		Emitter:SetDistanceAttenuation(curve)

		wireConnect(Player.DeviceInput, Emitter, "voiceFilterWire1", Character)

		local voiceFilterWire2 = Instance.new("Wire", Character)
		voiceFilterWire2.Name = "voiceFilterWire2"

		local Listener = Instance.new("AudioListener", Character)
		Listener.Name = "Listener"

		local DeviceOutput = Instance.new("AudioDeviceOutput", Listener)
		DeviceOutput.Name = "DeviceOutput"
		wireConnect(Listener, DeviceOutput, "OutputWire", DeviceOutput)

		local Analyzer = Instance.new("AudioAnalyzer", Character)
		Analyzer.Name = "Analyzer"
		wireConnect(Player.DeviceInput, Analyzer, "Wire", Analyzer)
	end)
end)

This might be happening because in CharacterAdded, both an emitter and a listener are created and parented to Character – this means each voice is both emitted from and heard at the Character’s 3D position; the listener will hear the emitted voice at full volume, since the two are situated on top of one another.

Each listener also wires what it heard to an AudioDeviceOutput which renders it to your speakers.

Instead of creating an AudioListener & AudioDeviceOutput per player, you could use a client-side script to create just one listener that hears things from your local perspective (e.g. workspace.CurrentCamera or Players.LocalPlayer.Character)
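A minimal LocalScript sketch of that suggestion (parenting to the character’s head is one choice among several; the camera works too):

```lua
-- One listener per client, parented to the local character's head so voices
-- are heard from this player's own perspective.
local Players = game:GetService("Players")

local function attachListener(character: Model)
	local head = character:WaitForChild("Head")

	local listener = Instance.new("AudioListener")
	listener.Parent = head

	local output = Instance.new("AudioDeviceOutput")
	output.Parent = listener

	-- Route everything the listener hears to this client's speakers.
	local wire = Instance.new("Wire")
	wire.SourceInstance = listener
	wire.TargetInstance = output
	wire.Parent = listener
end

local player = Players.LocalPlayer
if player.Character then
	attachListener(player.Character)
end
player.CharacterAdded:Connect(attachListener)
```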

Edit: Ah! I see you were able to fix it by using the AudioDeviceOutput.Player property – that also works, since that will restrict which players are able to receive from a particular output.


I have actually used this, and it is apparently not what I had in mind.
I was thinking that this could be used to play the same sound from all the cars in a multiple-car train, where the cars are connected together. (Copying the sound to all the cars would make it hard to change it later.)
However, I realized that what I wanted to do was practically impossible because an AudioEmitter can only send one sound source.
Personally, I would have liked a mechanism whereby, when a part containing the original sound source is connected to another part via a Wire, the sound is played back exactly the same way on the other parts (even if the pitch is changed in a script, the change is always duplicated).


@ReallyLongArms Hello!
I’ve been running into an interesting dilemma recently with PlaybackRegions, and I was wondering if I could get some help on it.
We’re running into an issue where playing two files at once using PlaybackRegions will sometimes not load properly one after another if there is a decent gap in between.

Are there any specific values or timeframes between when a playback-region sound is requested and when it’s culled? The window seems really short and can cause sounds to choke up during gameplay.
Is there a culling difference between PlaybackRegions and single playable files?
I remember that shorter files are loaded directly from memory.
When using playback regions and a single file with multiple variations in it, are there any changes to caching?


If it is culling with the playback regions / audio, it seems overly aggressive at the moment.


Hey @panzerv1 – do you have a video or screen recording of the phenomenon you’re observing? We can take a look

@NTL331 can you clarify what you mean by this?

However, I realized that what I wanted to do was practically impossible because AudioEmitter can only send one sound source.

It should be possible to wire multiple AudioPlayers to one AudioEmitter to emit all of them, or wire an AudioPlayer to multiple AudioEmitters to emit from multiple locations. For example, if you had 3 audio files A, B, and C, and wanted to emit them from 3 different 3d locations, you could do something like this
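As a hedged reconstruction of the kind of wiring being described (the original sample is not shown here, and the asset IDs and part names are placeholders):

```lua
-- For each file, create an AudioPlayer, an AudioEmitter at the desired 3D
-- location, and a Wire connecting the player's output to the emitter.
local function emitFrom(assetId: string, part: BasePart)
	local player = Instance.new("AudioPlayer")
	player.Asset = assetId
	player.Parent = part

	local emitter = Instance.new("AudioEmitter")
	emitter.Parent = part

	local wire = Instance.new("Wire")
	wire.SourceInstance = player
	wire.TargetInstance = emitter
	wire.Parent = part

	player:Play()
end

-- Placeholder asset IDs and parts for the three files A, B, and C.
emitFrom("rbxassetid://0000001", workspace.PartA)
emitFrom("rbxassetid://0000002", workspace.PartB)
emitFrom("rbxassetid://0000003", workspace.PartC)
```

The same pattern also works in reverse: several AudioPlayers wired to a single AudioEmitter will all be emitted from that one location.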


I sent a private message regarding the question


What does GetConnectedWires do? What is the pin argument in GetConnectedWires?

There’s no documentation for this; can someone help?


So what happened to Sounds?

Hey @VeryLiquidCold, adding a description to this now – sorry for the inconvenience

:GetConnectedWires returns an array of Wires that are connected to a specific pin – most of the audio APIs have one "Input" and/or one "Output" pin; AudioCompressor has an additional "Sidechain" pin.

In the future we expect to add things that have many more pins, so we made this a string for forwards compatibility.
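A small illustration of the pin argument (sketch only; it assumes a compressor exists somewhere under workspace):

```lua
-- Query each named pin separately; AudioCompressor is currently the one
-- instance with an extra "Sidechain" pin alongside "Input" and "Output".
local compressor = workspace:FindFirstChildWhichIsA("AudioCompressor", true)
if compressor then
	local inputs = compressor:GetConnectedWires("Input")
	local sidechain = compressor:GetConnectedWires("Sidechain")
	print(#inputs, "input wires;", #sidechain, "sidechain wires")
end
```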

So what happened to Sounds?

@Micamaster100 they still exist, and if they satisfy your use cases you can feel free to keep using them – but Sound is really doing multiple jobs under the hood (AudioPlayer + AudioDeviceOutput – and if it’s parented to a Part or Attachment, it also behaves like there’s an AudioEmitter and AudioListener).
The new API aims to let you mix & match for greater control.

Hiya @ReallyLongArms,
Assuming you are still dealing with issues regarding the audio system, I have a slight issue.

So I am using the audio system to extend the usage of voice chat, but the dilemma I am running into is that not all players can hear each other – even though the wires are there, everything is unmuted and has an audible volume. The user-ID access list is utilised to control who can hear whom, rather than destroying and recreating wires.

Here is a diagram with how it all functions:
[diagram]
Key
Input - AudioDeviceInput
Analyser - AudioAnalyzer
Output - AudioDeviceOutput
Fader - AudioFader (for volume control)

These are all created on the server and stored in ReplicatedStorage – they were previously stored (not created) in the player, but the issue still persisted.
Input and Output have their .Player values set to their corresponding player and are not muted.
Input access lists include the user IDs of all the users surrounding them.
Fader is set to a volume of 3.
Analyser is only used for a cosmetic part of a UI to display when a user is speaking (as they can be too far away to see the overhead icon).

We have observed some issues between players when their experience control UIs were different, but this was fixed by disabling and re-enabling their microphone – though this is no longer relevant now that everyone is on the new UI.

I have logs of all visible properties of all the audio related elements and I have observed no difference between them all.

An example of this would be:
(All players are in each other’s access lists and have valid connections)
Player1 can hear Player2 and Player3.
Player2 can only hear Player1 – they should also be able to hear Player3, but cannot.
Player3 can hear Player1 and Player2.

I can provide a place file in a private message on request, as well as any more explanation.
