1. Introduction
The Bluetooth® LE Audio Stream Management section covers:
- the Basic Audio Profile (BAP)
- the Audio Stream Control Service (ASCS)
- the Published Audio Capability Service (PACS)
- the Broadcast Audio Scan Service (BASS)
This profile and these services are used to configure and establish unicast and broadcast audio streams.
2. Basic Audio Profile (BAP)
The Basic Audio Profile defines roles and procedures to establish audio streams with remote devices. It's specified by the Bluetooth® SIG and the full specification is accessible on their website[1].
2.1. Roles
The BAP introduces 6 roles:
- Unicast client: Establishes a connection to a Unicast server, discovers its capabilities, and configures the audio stream. This role is typically used by a smartphone, a laptop, or a television.
- Unicast server: Advertises its role, exposes its capabilities, and accepts the configuration of the audio stream by the Unicast client. This role can be used by headphones, a speaker, some hearing aids, some earbuds, or even a microphone.
- Broadcast source: Establishes a BIS and advertises the stream information over extended advertising and periodic advertising.
- Broadcast sink: Scans advertising to find a broadcast source, then synchronizes to it to receive the audio stream.
- Scan delegator: Exposes its capabilities and waits for a broadcast assistant to send it commands and information about a broadcast source. The scan delegator also has the broadcast sink role, so that it can synchronize to the selected broadcast source.
- Broadcast assistant: Scans advertising to find broadcast sources and discovers the scan delegator capabilities. It sends the broadcast source information to the scan delegator.
2.2. Codec Configuration
To configure audio streams in Broadcast or Unicast mode, the BAP specifies the format of the "Codec Specific Configuration". It consists of multiple Length-Type-Value (LTV) structures that define stream parameters such as the Sampling Frequency or the Frame Duration.
Parameter | Size | Description |
---|---|---|
LENGTH | 1 byte | Length of the Type field + length of the Value field |
TYPE | 1 byte | Type of the data: Sampling Frequency (0x01), Frame Duration (0x02), Audio Channel Allocation (0x03), Octets Per Codec Frame (0x04), Codec Frame Block Per SDU (0x05) |
VALUE | LENGTH - 1 bytes | Value corresponding to the type. See Assigned Numbers section 6.12.5[2] |
The following table summarizes the types found in a Codec Configuration and their common values:
Name | Type | Mandatory | Values |
---|---|---|---|
Sampling Frequency | 0x01 | Yes | 0x01: 8 kHz, 0x03: 16 kHz, 0x05: 24 kHz, 0x06: 32 kHz, 0x07: 44.1 kHz, 0x08: 48 kHz |
Frame Duration | 0x02 | Yes | 0x00: 7.5 ms codec frames, 0x01: 10 ms codec frames |
Audio Channel Allocation | 0x03 | No (default 0) | 4-byte bitfield of Audio Location values. Bit 0: Front Left, Bit 1: Front Right. Value 0 = Mono Audio (no specified Audio Location) |
Octets Per Codec Frame | 0x04 | Yes | Number of bytes in a codec frame |
Codec Frame Block Per SDU | 0x05 | No (default 1) | Number of codec frames per SDU |
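To make the LTV layout concrete, the following minimal sketch decodes a Codec Specific Configuration using only the types and values listed in the tables above. The function names are hypothetical, and the little-endian interpretation of multi-byte values is an assumption based on the usual Bluetooth byte order.

```python
# Lookup tables copied from the table above; Assigned Numbers defines more values.
SAMPLING_FREQ_HZ = {0x01: 8000, 0x03: 16000, 0x05: 24000,
                    0x06: 32000, 0x07: 44100, 0x08: 48000}
FRAME_DURATION_US = {0x00: 7500, 0x01: 10000}


def split_ltv(data: bytes) -> dict:
    """Split a byte string into {type: value} following the LTV layout."""
    fields = {}
    offset = 0
    while offset < len(data):
        length = data[offset]  # counts the Type byte plus the Value bytes
        if length == 0 or offset + 1 + length > len(data):
            raise ValueError("malformed LTV structure")
        ltv_type = data[offset + 1]
        fields[ltv_type] = data[offset + 2:offset + 1 + length]
        offset += 1 + length
    return fields


def decode_codec_config(data: bytes) -> dict:
    """Interpret the five types listed above (multi-byte values assumed little-endian)."""
    fields = split_ltv(data)
    return {
        "sampling_frequency_hz": SAMPLING_FREQ_HZ[fields[0x01][0]],
        "frame_duration_us": FRAME_DURATION_US[fields[0x02][0]],
        # Optional types fall back to the defaults given in the table.
        "audio_channel_allocation": int.from_bytes(fields.get(0x03, bytes(4)), "little"),
        "octets_per_codec_frame": int.from_bytes(fields[0x04], "little"),
        "codec_frame_blocks_per_sdu": fields.get(0x05, b"\x01")[0],
    }


# 48 kHz, 10 ms frames, 100 octets per frame, optional types omitted.
decoded = decode_codec_config(bytes.fromhex("02010802020103046400"))
print(decoded)
```

Omitted optional types resolve to the defaults from the table (allocation 0, one codec frame block per SDU).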
The following table is an example of a codec specific configuration:
Setting | Length | Type | Value |
---|---|---|---|
Sampling Frequency = 48 kHz | 0x02 | 0x01 | 0x08 |
Frame Duration = 10 ms | 0x02 | 0x02 | 0x01 |
Audio Channel Allocation = Front Left + Front Right (stereo) | 0x05 | 0x03 | 0x03 0x00 0x00 0x00 |
Octets Per Codec Frame = 100 octets | 0x03 | 0x04 | 0x64 0x00 |
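The example above can be reproduced byte for byte with a few lines of code. This is a minimal sketch: `ltv` and `build_codec_config` are hypothetical helper names, not part of any stack API.

```python
def ltv(ltv_type: int, value: bytes) -> bytes:
    """Encode one Length-Type-Value structure (Length = Type byte + Value bytes)."""
    return bytes([1 + len(value), ltv_type]) + value


def build_codec_config(sampling_freq_code: int, frame_duration_code: int,
                       octets_per_frame: int, channel_allocation=None) -> bytes:
    """Concatenate the LTV structures of a Codec Specific Configuration."""
    out = ltv(0x01, bytes([sampling_freq_code]))
    out += ltv(0x02, bytes([frame_duration_code]))
    if channel_allocation is not None:
        # 4-byte bitfield, least significant byte first as in the example table.
        out += ltv(0x03, channel_allocation.to_bytes(4, "little"))
    out += ltv(0x04, octets_per_frame.to_bytes(2, "little"))
    return out


# 48 kHz (0x08), 10 ms (0x01), 100 octets, Front Left + Front Right (0x00000003)
config = build_codec_config(0x08, 0x01, 100, channel_allocation=0x00000003)
print(config.hex(" "))
# prints "02 01 08 02 02 01 05 03 03 00 00 00 03 04 64 00"
```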
2.3. Unicast Roles
The following diagram illustrates the interaction between the different unicast roles:
2.3.1. Unicast Topologies
The BAP specifies 16 different topologies for the Unicast Roles. They define ways of establishing audio streams between one Central device and one or two Peripheral devices. The following table can be found in the BAP Specification and lists the possible topologies:
Audio Configuration | Legend C------S | Num Servers | Sink ASEs | Source ASEs | Audio Channels per Sink ASE | Min Sink Audio Locations per Server | Audio Channels per Source ASE | Min Source Audio Locations per Server | CISes | Audio Streams |
---|---|---|---|---|---|---|---|---|---|---|
1 | --------> | 1 | 1 | | 1 | | | | 1 | 1 |
2 | <-------- | 1 | | 1 | | | 1 | | 1 | 1 |
3 | <-------> | 1 | 1 | 1 | 1 | | 1 | | 1 | 2 |
4 | ------->> | 1 | 1 | | 2 | 2 | | | 1 | 1 |
5 | <----->> | 1 | 1 | 1 | 2 | | 1 | | 1 | 2 |
6(i) | --------> --------> | 1 | 2 | | 1 | 2 | | | 2 | 2 |
6(ii) | --------> --------> | 2 | 2 | | 1 | 1 | | | 2 | 2 |
7(i) | --------> <-------- | 1 | 1 | 1 | 1 | | 1 | | 2 | 2 |
7(ii) | --------> <-------- | 2 | 1 | 1 | 1 | | 1 | | 2 | 2 |
8(i) | --------> <-------> | 1 | 2 | 1 | 1 | 2 | 1 | | 2 | 3 |
8(ii) | --------> <-------> | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 2 | 3 |
9(i) | <-------- <-------- | 1 | | 2 | | | 1 | 2 | 2 | 2 |
9(ii) | <-------- <-------- | 2 | | 2 | | | 1 | 1 | 2 | 2 |
10 | <<------- | 1 | | 1 | | | 2 | 2 | 1 | 1 |
11(i) | <-------> <-------> | 1 | 2 | 2 | 1 | 2 | 1 | 2 | 2 | 4 |
11(ii) | <-------> <-------> | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 2 | 4 |
The arrows in the "Legend" column illustrate the CIS(es) to establish between the devices. One arrow represents one CIS. The arrow tips indicate the direction of the CIS (unidirectional or bidirectional). When the tip of the arrow is doubled, the CIS transports 2 channels. Topologies with "(i)" define interactions between 1 Central and 1 Peripheral, whereas topologies with "(ii)" define interactions between 1 Central and 2 Peripherals (earbud use-case for example).
For example:
- Configuration 4 defines a unidirectional stereo stream on one CIS to 1 device, which could be used to transmit media to headphones.
- Configuration 11(ii) defines two bidirectional CISes to 2 peripheral devices, which could be used for a telephony use-case with earbuds, where the microphones of both earbuds are used.
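The legend rules described above (one arrow per CIS, tips giving the direction, doubled tips meaning 2 channels) can be expressed as a small parser. This is only an illustrative sketch: the function names are hypothetical, and it assumes the arrows are written with the plain ASCII characters used in the table.

```python
def parse_arrow(arrow: str) -> dict:
    """Describe one CIS from its legend arrow."""
    sink_channels = arrow.count(">")    # channels from Unicast Client to Unicast Server
    source_channels = arrow.count("<")  # channels from Unicast Server to Unicast Client
    if sink_channels == 0 and source_channels == 0:
        raise ValueError(f"not an arrow: {arrow!r}")
    return {
        "sink_channels": sink_channels,
        "source_channels": source_channels,
        "bidirectional": sink_channels > 0 and source_channels > 0,
    }


def summarize_legend(legend: str) -> dict:
    """Count CISes and audio streams for a legend such as '--------> <------->'."""
    cises = [parse_arrow(arrow) for arrow in legend.split()]
    # Each used direction of a CIS carries one audio stream.
    streams = sum((c["sink_channels"] > 0) + (c["source_channels"] > 0) for c in cises)
    return {"cises": len(cises), "audio_streams": streams}


print(summarize_legend("------->>"))            # configuration 4
print(summarize_legend("<-------> <------->"))  # configuration 11
```

The resulting counts can be compared with the "CISes" and "Audio Streams" columns of the topology table.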
2.3.2. Unicast Codec Configurations
The BAP specifies LC3 codec configurations that can be used to establish audio streams between devices in Unicast mode. The following tables can be found in the BAP Specification and list the possible configurations:
Low Latency:
Set Name | Sampling Frequency (kHz) | Frame Duration (ms) | SDU Interval (µs) | Framing | Maximum SDU Size (Octets) | RTN | Max Transport Latency (ms) | Presentation Delay (µs) |
---|---|---|---|---|---|---|---|---|
8_1_1 | 8 | 7.5 | 7500 | unframed | 26 (27.734 kbps) | 2 | 8 | 40000 |
8_2_1 | 8 | 10 | 10000 | unframed | 30 (24 kbps) | 2 | 10 | 40000 |
16_1_1 | 16 | 7.5 | 7500 | unframed | 30 (32 kbps) | 2 | 8 | 40000 |
16_2_1* | 16 | 10 | 10000 | unframed | 40 (32 kbps) | 2 | 10 | 40000 |
24_1_1 | 24 | 7.5 | 7500 | unframed | 45 (48 kbps) | 2 | 8 | 40000 |
24_2_1 | 24 | 10 | 10000 | unframed | 60 (48 kbps) | 2 | 10 | 40000 |
32_1_1 | 32 | 7.5 | 7500 | unframed | 60 (64 kbps) | 2 | 8 | 40000 |
32_2_1 | 32 | 10 | 10000 | unframed | 80 (64 kbps) | 2 | 10 | 40000 |
441_1_1 | 44.1 | 8.163 | 8163 | framed | 97 (95.06 kbps) | 5 | 24 | 40000 |
441_2_1 | 44.1 | 10.884 | 10884 | framed | 130 (95.55 kbps) | 5 | 31 | 40000 |
48_1_1 | 48 | 7.5 | 7500 | unframed | 75 (80 kbps) | 5 | 15 | 40000 |
48_2_1 | 48 | 10 | 10000 | unframed | 100 (80 kbps) | 5 | 20 | 40000 |
48_3_1 | 48 | 7.5 | 7500 | unframed | 90 (96 kbps) | 5 | 15 | 40000 |
48_4_1 | 48 | 10 | 10000 | unframed | 120 (96 kbps) | 5 | 20 | 40000 |
48_5_1 | 48 | 7.5 | 7500 | unframed | 117 (124.8 kbps) | 5 | 15 | 40000 |
48_6_1 | 48 | 10 | 10000 | unframed | 155 (124 kbps) | 5 | 20 | 40000 |
(*) Mandatory Configuration Unicast Client & Unicast Server
High reliability:
Set Name | Sampling Frequency (kHz) | Frame Duration (ms) | SDU Interval (µs) | Framing | Maximum SDU Size (Octets) | RTN | Max Transport Latency (ms) | Presentation Delay (µs) |
---|---|---|---|---|---|---|---|---|
8_1_2 | 8 | 7.5 | 7500 | unframed | 26 (27.734 kbps) | 13 | 75 | 40000 |
8_2_2 | 8 | 10 | 10000 | unframed | 30 (24 kbps) | 13 | 95 | 40000 |
16_1_2 | 16 | 7.5 | 7500 | unframed | 30 (32 kbps) | 13 | 75 | 40000 |
16_2_2 | 16 | 10 | 10000 | unframed | 40 (32 kbps) | 13 | 95 | 40000 |
24_1_2 | 24 | 7.5 | 7500 | unframed | 45 (48 kbps) | 13 | 75 | 40000 |
24_2_2 | 24 | 10 | 10000 | unframed | 60 (48 kbps) | 13 | 95 | 40000 |
32_1_2 | 32 | 7.5 | 7500 | unframed | 60 (64 kbps) | 13 | 75 | 40000 |
32_2_2 | 32 | 10 | 10000 | unframed | 80 (64 kbps) | 13 | 95 | 40000 |
441_1_2 | 44.1 | 8.163 | 8163 | framed | 97 (95.06 kbps) | 13 | 80 | 40000 |
441_2_2 | 44.1 | 10.884 | 10884 | framed | 130 (95.55 kbps) | 13 | 85 | 40000 |
48_1_2 | 48 | 7.5 | 7500 | unframed | 75 (80 kbps) | 13 | 75 | 40000 |
48_2_2 | 48 | 10 | 10000 | unframed | 100 (80 kbps) | 13 | 95 | 40000 |
48_3_2 | 48 | 7.5 | 7500 | unframed | 90 (96 kbps) | 13 | 75 | 40000 |
48_4_2 | 48 | 10 | 10000 | unframed | 120 (96 kbps) | 13 | 100 | 40000 |
48_5_2 | 48 | 7.5 | 7500 | unframed | 117 (124.8 kbps) | 13 | 75 | 40000 |
48_6_2 | 48 | 10 | 10000 | unframed | 155 (124 kbps) | 13 | 100 | 40000 |
The configurations are split into two tables: Low Latency and High Reliability. High Reliability configurations use a higher Retransmission Number (RTN) and a higher maximum transport latency to improve the chances that all audio packets are correctly transmitted, at the cost of higher latency due to increased buffering of the audio packets. Most of the configurations use the "unframed" mode, meaning that the spacing of the events on the air is a multiple of the actual time between two audio packets (SDU Interval).
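The bitrates shown in parentheses in the "Maximum SDU Size" column follow directly from the SDU size and the SDU interval. The short sketch below reproduces that arithmetic for a single channel; the function name is hypothetical.

```python
def lc3_bitrate_kbps(sdu_octets: int, sdu_interval_us: int) -> float:
    """Bitrate of one channel: bits per SDU divided by the SDU interval."""
    bits_per_sdu = sdu_octets * 8
    # bits per microsecond * 1000 = kilobits per second
    return bits_per_sdu * 1000 / sdu_interval_us


# (set name, maximum SDU size in octets, SDU interval in µs) taken from the tables above
for name, octets, interval_us in [
    ("8_1_1", 26, 7500),
    ("16_2_1", 40, 10000),
    ("441_1_1", 97, 8163),
    ("48_6_1", 155, 10000),
]:
    print(f"{name}: {lc3_bitrate_kbps(octets, interval_us):.3f} kbps")
```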
2.4. Broadcast Roles
The following diagram illustrates the simple broadcast use-case, where a device acting as Broadcast Sink interacts with a Broadcast Source.
When the device acting as Broadcast Sink is not able to select which Broadcast Source to synchronize to, due to limited display or input capacity for example, the device may use the Scan Delegator role to ask a remote Broadcast Assistant to scan and select the Broadcast Source.
2.4.1. Broadcast Advertisement
A Broadcast Source advertises to be visible to remote Broadcast Sinks. Its advertisement is composed of two different types of advertising:
- Extended Advertising: Sent partly on the advertising channels (37-38-39) and partly on connection channels, it makes the Broadcast Source visible to a scanning Broadcast Sink
- Periodic Advertising: Sent on connection channels, it contains large packets with information about the stream configuration
To synchronize to a Broadcast Source, a Broadcast Sink initiates a scanning procedure to find the extended advertising payload, which points to the periodic advertising payload, which itself points to the BIG.
2.4.1.1. BASE Structure
The Broadcast Audio Source Endpoint (BASE) structure is the data structure that describes a Broadcast Source. It is located in the periodic advertising data and consists of 3 levels with the following items:
Level | Parameter | Description |
---|---|---|
1 (Group) | Basic Audio Announcement Service UUID | UUID defined in Bluetooth Assigned Numbers: 0x1851 |
1 (Group) | Presentation Delay | Presentation Delay parameter. Refer to the LC3 codec and audio data path wiki page, section Audio Latency[3] |
1 (Group) | Num Subgroups | Number of subgroups (level 2) |
2 (Subgroup) | Num BIS | Number of BISes (level 3) |
2 (Subgroup) | Codec ID | Codec ID for the subgroups. 0x0000000006 for the LC3 codec |
2 (Subgroup) | Codec Specific Configuration | Series of LTV structures with common codec specific configuration for the subgroup |
2 (Subgroup) | Metadata | Series of LTV Metadata structures |
3 (BIS) | BIS Index | Unique index for the BIS |
3 (BIS) | Codec Specific Configuration | Series of LTV structures with codec specific configuration for the BIS |
Example of a stereo Broadcast Source with one BIS:
Example of a stereo Broadcast Source with 2 BISes:
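A two-BIS stereo BASE of this kind could be serialized roughly as follows. This is a minimal sketch, not a reference implementation: the helper names are hypothetical, and the length-prefix fields (one octet before each Codec Specific Configuration and before the Metadata) and the 3-octet Presentation Delay are assumptions taken from the BASE layout in the BAP specification, since the table above does not list field sizes.

```python
BASIC_AUDIO_ANNOUNCEMENT_UUID = 0x1851
# Codec ID 0x0000000006 transmitted least significant byte first (assumption).
LC3_CODEC_ID = bytes([0x06, 0x00, 0x00, 0x00, 0x00])


def ltv(ltv_type: int, value: bytes) -> bytes:
    """Encode one Length-Type-Value structure."""
    return bytes([1 + len(value), ltv_type]) + value


def build_base(presentation_delay_us: int, subgroups: list) -> bytes:
    """Serialize the three BASE levels: group, subgroups, BISes."""
    out = BASIC_AUDIO_ANNOUNCEMENT_UUID.to_bytes(2, "little")
    out += presentation_delay_us.to_bytes(3, "little")  # level 1 (group)
    out += bytes([len(subgroups)])
    for subgroup in subgroups:                           # level 2 (subgroup)
        out += bytes([len(subgroup["bises"])]) + LC3_CODEC_ID
        out += bytes([len(subgroup["codec_config"])]) + subgroup["codec_config"]
        out += bytes([len(subgroup["metadata"])]) + subgroup["metadata"]
        for bis_index, bis_config in subgroup["bises"]:  # level 3 (BIS)
            out += bytes([bis_index, len(bis_config)]) + bis_config
    return out


# Common settings at subgroup level: 48 kHz, 10 ms frames, 100 octets per frame.
common = ltv(0x01, b"\x08") + ltv(0x02, b"\x01") + ltv(0x04, (100).to_bytes(2, "little"))
base = build_base(40000, [{
    "codec_config": common,
    "metadata": b"",
    "bises": [
        (1, ltv(0x03, (0x00000001).to_bytes(4, "little"))),  # BIS 1: Front Left
        (2, ltv(0x03, (0x00000002).to_bytes(4, "little"))),  # BIS 2: Front Right
    ],
}])
print(len(base), base.hex(" "))
```

Placing the shared parameters at subgroup level and only the channel allocation at BIS level keeps the periodic advertising payload small.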
2.4.1.2. Broadcast Source Air Events
Over the air, a Broadcast Source emits 4 different types of packets:
- ADV_EXT_IND packets are sent on the 3 advertising channels (Extended Advertising)
- AUX_ADV_IND packets are sent on connection channels and contain the remaining data of the ADV_EXT_IND packets (Extended Advertising)
- AUX_SYNC_IND packets are sent on connection channels (Periodic Advertising)
- BIS ISO packets are sent on connection channels during BIG Events (BIG)
The following diagram illustrates a possible scheduling of those packets:
2.4.2. Broadcast Topologies
The BAP specifies 3 different topologies for the Broadcast Roles. They define how a Broadcast Source maps its audio channels onto one or two BISes. The following table can be found in the BAP Specification and lists the possible topologies:
Audio Configuration | Legend | Audio Channels per BIS | BISes | Audio Streams |
---|---|---|---|---|
12 | ---)))--> | 1 | 1 | 1 |
13 | ---)))--> ---)))--> | 1 | 2 | 2 |
14 | ---)))-->> | 2 | 1 | 1 |
2.4.3. Broadcast Codec Configurations
The BAP specifies LC3 codec configurations that can be used to establish audio streams between devices in Broadcast mode. The following tables can be found in the BAP Specification and list the possible configurations:
Low Latency:
Set Name | Sampling Frequency (kHz) | Frame Duration (ms) | SDU Interval (µs) | Framing | Maximum SDU Size (Octets) | RTN | Max Transport Latency (ms) | Presentation Delay (µs) |
---|---|---|---|---|---|---|---|---|
8_1_1 | 8 | 7.5 | 7500 | unframed | 26 (27.734 kbps) | 2 | 8 | 40000 |
8_2_1 | 8 | 10 | 10000 | unframed | 30 (24 kbps) | 2 | 10 | 40000 |
16_1_1 | 16 | 7.5 | 7500 | unframed | 30 (32 kbps) | 2 | 8 | 40000 |
16_2_1* | 16 | 10 | 10000 | unframed | 40 (32 kbps) | 2 | 10 | 40000 |
24_1_1 | 24 | 7.5 | 7500 | unframed | 45 (48 kbps) | 2 | 8 | 40000 |
24_2_1** | 24 | 10 | 10000 | unframed | 60 (48 kbps) | 2 | 10 | 40000 |
32_1_1 | 32 | 7.5 | 7500 | unframed | 60 (64 kbps) | 2 | 8 | 40000 |
32_2_1 | 32 | 10 | 10000 | unframed | 80 (64 kbps) | 2 | 10 | 40000 |
441_1_1 | 44.1 | 8.163 | 8163 | framed | 97 (95.06 kbps) | 4 | 24 | 40000 |
441_2_1 | 44.1 | 10.884 | 10884 | framed | 130 (95.55 kbps) | 4 | 31 | 40000 |
48_1_1 | 48 | 7.5 | 7500 | unframed | 75 (80 kbps) | 4 | 15 | 40000 |
48_2_1 | 48 | 10 | 10000 | unframed | 100 (80 kbps) | 4 | 20 | 40000 |
48_3_1 | 48 | 7.5 | 7500 | unframed | 90 (96 kbps) | 4 | 15 | 40000 |
48_4_1 | 48 | 10 | 10000 | unframed | 120 (96 kbps) | 4 | 20 | 40000 |
48_5_1 | 48 | 7.5 | 7500 | unframed | 117 (124.8 kbps) | 4 | 15 | 40000 |
48_6_1 | 48 | 10 | 10000 | unframed | 155 (124 kbps) | 4 | 20 | 40000 |
(*) Mandatory Configuration Broadcast Source & Broadcast Sink
(**) Mandatory Configuration Broadcast Sink
High reliability:
Set Name | Sampling Frequency (kHz) | Frame Duration (ms) | SDU Interval (µs) | Framing | Maximum SDU Size (Octets) | RTN | Max Transport Latency (ms) | Presentation Delay (µs) |
---|---|---|---|---|---|---|---|---|
8_1_2 | 8 | 7.5 | 7500 | unframed | 26 (27.734 kbps) | 4 | 45 | 40000 |
8_2_2 | 8 | 10 | 10000 | unframed | 30 (24 kbps) | 4 | 60 | 40000 |
16_1_2 | 16 | 7.5 | 7500 | unframed | 30 (32 kbps) | 4 | 45 | 40000 |
16_2_2 | 16 | 10 | 10000 | unframed | 40 (32 kbps) | 4 | 60 | 40000 |
24_1_2 | 24 | 7.5 | 7500 | unframed | 45 (48 kbps) | 4 | 45 | 40000 |
24_2_2 | 24 | 10 | 10000 | unframed | 60 (48 kbps) | 4 | 60 | 40000 |
32_1_2 | 32 | 7.5 | 7500 | unframed | 60 (64 kbps) | 4 | 45 | 40000 |
32_2_2 | 32 | 10 | 10000 | unframed | 80 (64 kbps) | 4 | 60 | 40000 |
441_1_2 | 44.1 | 8.163 | 8163 | framed | 97 (95.06 kbps) | 4 | 54 | 40000 |
441_2_2 | 44.1 | 10.884 | 10884 | framed | 130 (95.55 kbps) | 4 | 60 | 40000 |
48_1_2 | 48 | 7.5 | 7500 | unframed | 75 (80 kbps) | 4 | 50 | 40000 |
48_2_2 | 48 | 10 | 10000 | unframed | 100 (80 kbps) | 4 | 65 | 40000 |
48_3_2 | 48 | 7.5 | 7500 | unframed | 90 (96 kbps) | 4 | 50 | 40000 |
48_4_2 | 48 | 10 | 10000 | unframed | 120 (96 kbps) | 4 | 65 | 40000 |
48_5_2 | 48 | 7.5 | 7500 | unframed | 117 (124.8 kbps) | 4 | 50 | 40000 |
48_6_2 | 48 | 10 | 10000 | unframed | 155 (124 kbps) | 4 | 65 | 40000 |
Like the Unicast codec configurations, the Broadcast codec configurations are split into a Low Latency table and a High Reliability table.
3. Audio Stream Control Service (ASCS)
The ASCS is hosted on a Unicast Server to allow a remote Unicast Client to configure and establish audio streams between the two devices. It contains the current unicast state of the Unicast Server and its state machine. It's specified by the Bluetooth® SIG and the full specification is accessible on their website[4].
3.1. Audio Stream Endpoints (ASE)
Audio Stream Endpoints (ASEs) are hosted on a Unicast Server and are controlled by a remote Unicast Client to establish audio streams. They can be Sink or Source: a Sink ASE hosts an audio stream directed from the Unicast Client to the Unicast Server, and a Source ASE hosts an audio stream directed from the Unicast Server to the Unicast Client. An ASE can host multiple channels, concatenated on one CIS.
The following diagram illustrates a Unicast Server hosting 3 ASEs: 2 Sink ASEs and 1 Source ASE. The bidirectional CIS is established using the ASEs with IDs 1 and 3, and the unidirectional CIS uses the ASE with ID 2. The resulting configuration is configuration 8(i).
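The relation between ASEs and CISes in this example can be modeled with a small data structure. This is a sketch under assumptions: the class and function names are hypothetical, the CIS identifiers are arbitrary, and since the text does not say which of ASE 1 and ASE 3 is the Source ASE, ASE 3 is assumed to be the source.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Ase:
    ase_id: int
    direction: str  # "sink": client to server, "source": server to client
    cis_id: int     # CIS carrying this ASE (arbitrary identifier)


def summarize(ases: list) -> dict:
    """Count CISes, bidirectional CISes and audio streams for a set of ASEs."""
    directions_per_cis = defaultdict(set)
    for ase in ases:
        directions_per_cis[ase.cis_id].add(ase.direction)
    return {
        "cises": len(directions_per_cis),
        "bidirectional_cises": sum(
            1 for d in directions_per_cis.values() if d == {"sink", "source"}
        ),
        "audio_streams": len(ases),  # one audio stream per ASE in use
    }


server_ases = [
    Ase(ase_id=1, direction="sink", cis_id=0),
    Ase(ase_id=2, direction="sink", cis_id=1),
    Ase(ase_id=3, direction="source", cis_id=0),  # assumed to be the Source ASE
]
print(summarize(server_ases))
```

The result (2 CISes, one of them bidirectional, 3 audio streams) matches the row of configuration 8(i) in the topology table.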
3.2. ASE State Machine
The Source ASE State Machine is the following, as specified in the Bluetooth SIG ASCS Specification[4]:
The Sink ASE State Machine is the following, as specified in the Bluetooth SIG ASCS Specification[4]:
For more details about ASE states and state machine, refer to the ASCS Specification[4] section 3.
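As a rough illustration of how the two state machines differ, the sketch below encodes a simplified transition table. The state and operation names and the transitions are an assumption based on a reading of the ASCS specification, not a transcription of the diagrams; details such as caching after release are left out, so the specification remains the reference.

```python
# Simplified transitions shared by Sink and Source ASEs (assumption, see above).
COMMON_TRANSITIONS = {
    ("idle", "config_codec"): "codec_configured",
    ("codec_configured", "config_codec"): "codec_configured",
    ("qos_configured", "config_codec"): "codec_configured",
    ("codec_configured", "config_qos"): "qos_configured",
    ("qos_configured", "config_qos"): "qos_configured",
    ("qos_configured", "enable"): "enabling",
    ("enabling", "receiver_start_ready"): "streaming",
    ("enabling", "update_metadata"): "enabling",
    ("streaming", "update_metadata"): "streaming",
    ("releasing", "released"): "idle",
}
RELEASABLE = {"codec_configured", "qos_configured", "enabling", "streaming", "disabling"}


def next_state(state: str, operation: str, direction: str) -> str:
    """Return the next ASE state; direction is 'sink' or 'source'."""
    if operation == "release" and state in RELEASABLE:
        return "releasing"
    if operation == "disable" and state in ("enabling", "streaming"):
        # Only a Source ASE passes through the Disabling state.
        return "disabling" if direction == "source" else "qos_configured"
    if operation == "receiver_stop_ready" and state == "disabling" and direction == "source":
        return "qos_configured"
    try:
        return COMMON_TRANSITIONS[(state, operation)]
    except KeyError:
        raise ValueError(f"{operation!r} is not allowed in state {state!r}") from None


state = "idle"
for op in ["config_codec", "config_qos", "enable", "receiver_start_ready", "disable"]:
    state = next_state(state, op, "source")
    print(op, "->", state)
```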
4. Published Audio Capability Service (PACS)
The PACS allows a Unicast Server to expose its audio capabilities to remote Unicast Clients and Broadcast Assistants. It's specified by the Bluetooth® SIG and the full specification is accessible on their website[5].
The PACS contains information about 6 items:
- Sink Published Audio Capabilities
- Sink Audio Locations
- Source Published Audio Capabilities
- Source Audio Locations
- Available Audio Contexts
- Supported Audio Contexts
4.1. Published Audio Capability (PAC)
A PACS Server exposes one or multiple Published Audio Capability (PAC) records. These records indicate to a remote PACS Client which audio configurations are supported by the server.
Parameter | Size (octets) | Value |
---|---|---|
Codec_ID | 5 | Codec ID, 0x0000000006 for the LC3 Codec |
Codec_Specific_Capabilities_Length | 1 | Length, in octets, of the Codec_Specific_Capabilities value |
Codec_Specific_Capabilities | Varies | Sequence of LTVs (Length-Type-Value) of Codec Capabilities. See Assigned Numbers[2] |
Metadata_Length | 1 | Length, in octets, of the Metadata field |
Metadata | Varies | Sequence of LTVs (Length-Type-Value) of Metadata. See Assigned Numbers[2] |
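The record layout in the table above can be walked with a few slices. The following is a minimal sketch with a hypothetical function name; the capability bytes in the example are an arbitrary placeholder, not a meaningful capability set, and the Codec ID byte order is assumed to be least significant byte first.

```python
def parse_pac_record(data: bytes, offset: int = 0):
    """Parse one PAC record starting at offset; return (record, next_offset)."""
    try:
        codec_id = data[offset:offset + 5]
        caps_len = data[offset + 5]
        caps_end = offset + 6 + caps_len
        capabilities = data[offset + 6:caps_end]
        metadata_len = data[caps_end]
        end = caps_end + 1 + metadata_len
        metadata = data[caps_end + 1:end]
    except IndexError:
        raise ValueError("truncated PAC record") from None
    if end > len(data):
        raise ValueError("truncated PAC record")
    record = {
        "codec_id": codec_id,
        "codec_specific_capabilities": capabilities,  # sequence of LTVs, not decoded here
        "metadata": metadata,                         # sequence of LTVs, not decoded here
    }
    return record, end


lc3_codec_id = bytes([0x06, 0x00, 0x00, 0x00, 0x00])
placeholder_caps = bytes([0x03, 0x01, 0x80, 0x00])  # arbitrary LTV used as filler
raw = lc3_codec_id + bytes([len(placeholder_caps)]) + placeholder_caps + bytes([0x00])

record, next_offset = parse_pac_record(raw)
print(record, next_offset)
```

Returning the next offset lets a caller loop over several consecutive records in the same buffer.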
4.2. Audio Locations
A PACS Server exposes a bitfield of the Audio Location values it supports. The possible values can be found in the Assigned Numbers specification[2]. Common values for an Audio Location are:
- 0x00: Mono Audio without audio location. A mono speaker which is not part of a speaker system can use this value.
- 0x01: Front Left
- 0x02: Front Right
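Because the Audio Locations value is a bitfield, several locations can be set at once. The sketch below decodes it using only the Front Left and Front Right bits mentioned in this document (bit 0 and bit 1); the function name is hypothetical and any other bit is reported without interpretation.

```python
# Only the locations named in this document; Assigned Numbers defines many more.
AUDIO_LOCATIONS = {0x00000001: "Front Left", 0x00000002: "Front Right"}


def describe_audio_locations(bitfield: int) -> list:
    """Translate an Audio Locations bitfield into a list of names."""
    if bitfield == 0:
        return ["Mono Audio"]
    names = [name for bit, name in AUDIO_LOCATIONS.items() if bitfield & bit]
    known_mask = sum(AUDIO_LOCATIONS)  # sum of the dictionary keys
    unknown = bitfield & ~known_mask
    if unknown:
        names.append(f"other bits 0x{unknown:08X}")
    return names


print(describe_audio_locations(0x00000003))  # stereo device
print(describe_audio_locations(0x00000000))  # mono device
```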
5. Broadcast Audio Scan Service (BASS)
The Broadcast Audio Scan Service is hosted on a Scan Delegator collocated with a Broadcast Sink to communicate with a remote Broadcast Assistant and exchange information and commands about available Broadcast Sources. It's specified by the Bluetooth® SIG and the full specification is accessible on their website[6].
6. STM32Cube Firmware APIs
7. References