Audio Matrix & Dante
with PIXELHUE Switcher

Audio Matrix Function Introduction

In professional audio-visual (ProAV) applications, audio and video are the core information carriers and together form the signal transmission chain. In traditional workflows, video signals are processed and output by specialized video processors, while audio signals are handled independently by dedicated mixing consoles. The two are coordinated over baseband links or IP networks.

The industry is currently moving toward deep integration: video processing equipment is steadily expanding its audio capabilities, adding core functions such as basic audio routing and switching, gain control, and mixing. At the same time, algorithm optimization and hardware architecture upgrades are making these audio processing chains more capable, with support for advanced features such as multi-channel noise reduction, sound field simulation, and metadata synchronization.

This convergence significantly improves system integration and operational efficiency, and is moving ProAV toward intelligent, fully IP-based systems.

1. Principles of Embedded Audio Transmission

Although professional video processors increasingly integrate IP interfaces that support media transport protocols such as NDI, Dante, and SMPTE ST 2110, their core inputs and outputs still rely primarily on three types of physical interface: HDMI, DisplayPort (DP), and SDI. All three support embedded audio. When the source video stream carries embedded audio, the audio and video signals are transmitted together, in sync, over the video interface, with no need to deploy a separate audio transmission network. The number of audio channels and the maximum sampling rate supported by each interface are shown in the table below.

Table 1: Audio channels and maximum sampling rates overview

Video processing interfaces primarily allocate their bandwidth for video transmission, while audio data is transmitted during the idle intervals of the video signal. Taking HDMI as an example, its data transmission process is strictly divided into three phases:
• Control Period: Transmits horizontal/vertical synchronization signals (HSYNC/VSYNC) and control headers via TMDS channels to declare the type of the next cycle;
• Data Island Period: Transmits audio packets and auxiliary data during horizontal/vertical blanking intervals (H-Blank/V-Blank);
• Video Data Period: Transmits 24-bit pixel data encoded in RGB or YCbCr, occupying the core bandwidth resources.

Audio transmission occurs exclusively during the Data Island Period, where it utilizes the idle bandwidth of the blanking intervals to enable simultaneous audio and video transmission over the same cable without requiring additional dedicated bandwidth during active video transmission.
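To give a rough sense of how much room the blanking intervals offer, the sketch below computes the share of pixel clocks that fall outside active video. It assumes the standard CTA-861 1080p60 timing (2200 × 1125 total, 1920 × 1080 active), which is not taken from this document, and the function name is illustrative. The result is only an upper bound: part of each blanking interval is taken by control periods, preambles, and guard bands, so the capacity actually available to Data Island packets is lower.

```python
def blanking_stats(h_active, v_active, h_total, v_total, refresh_hz):
    """Return (pixel_clock_hz, blanking_clocks_per_second, blanking_fraction).

    Illustrative helper: counts every pixel clock outside active video as
    "blanking", without modelling HDMI control periods or guard bands.
    """
    total_per_frame = h_total * v_total
    active_per_frame = h_active * v_active
    blanking_per_frame = total_per_frame - active_per_frame
    pixel_clock_hz = total_per_frame * refresh_hz
    return (
        pixel_clock_hz,
        blanking_per_frame * refresh_hz,
        blanking_per_frame / total_per_frame,
    )


# Assumed timing: CTA-861 1080p60 (2200 x 1125 total, 1920 x 1080 active).
pixel_clock, blank_clocks, blank_fraction = blanking_stats(1920, 1080, 2200, 1125, 60)
print(f"Pixel clock: {pixel_clock / 1e6:.1f} MHz")              # 148.5 MHz
print(f"Blanking share of each frame: {blank_fraction:.1%}")   # 16.2%
```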

P1: Data transmission process

The Data Island Period is distributed within the vertical blanking (V-Blank) and horizontal blanking (H-Blank) intervals. This design prevents timing conflicts with the active video data period. However, embedded audio transmission depends on the idle bandwidth available during these blanking intervals: the more audio channels and the higher the sampling rate, the greater the demand on blanking-interval capacity. The relationship can be expressed as follows:

Required Blanking Interval Bandwidth = Number of Channels × Sampling Rate × Bit Depth

For example:
Transmitting 16 channels of audio at 192 kHz with 24-bit resolution requires approximately 73.7 Mbps of bandwidth.
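The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch: the function name is illustrative, the figure covers the raw PCM payload only (no packet headers or error correction), and the 24-bit depth in the second call is an assumption, since the document states only the channel count and sampling rate for that case.

```python
def embedded_audio_bandwidth_bps(channels, sample_rate_hz, bit_depth):
    """Raw PCM payload rate in bits per second (no packet overhead)."""
    return channels * sample_rate_hz * bit_depth


# The example from the text: 16 channels, 192 kHz, 24-bit.
required_bps = embedded_audio_bandwidth_bps(16, 192_000, 24)
print(f"{required_bps / 1e6:.1f} Mbps")  # 73.7 Mbps

# 8 channels at 48 kHz; the 24-bit depth here is an assumption.
eight_channel_bps = embedded_audio_bandwidth_bps(8, 48_000, 24)
print(f"{eight_channel_bps / 1e6:.1f} Mbps")  # 9.2 Mbps
```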

2. Introduction to Dante Audio Technology

In the field of audio transmission, in addition to conventional analog audio interfaces (such as TRS, RCA, XLR) and digital audio interfaces (such as S/PDIF, MADI, USB), IP network-based Dante interfaces are rapidly gaining popularity. Users can leverage Dante Controller software to automatically discover and manage the input/output matrix connections of all audio devices within a local area network, significantly reducing the complexity of building on-site audio systems.

P2: Topology map with Dante Technology

2.1 Definition of Dante

Dante (Digital Audio Network Through Ethernet) is a protocol for transmitting lossless digital audio over standard IP networks, introduced by Audinate in 2006. It enables the transmission of multi-channel, uncompressed audio signals via Ethernet (100Mbps/Gigabit) and supports bidirectional routing control. Its transmission mechanism can be summarized as follows:
• Protocol Architecture: Utilizes the UDP/IP protocol (Transport Layer), where audio data is encapsulated into IP packets and transmitted through standard network switches.
• Clock Synchronization: Employs IEEE 1588 PTP (Precision Time Protocol) to achieve sub-microsecond-level device synchronization, minimizing clock jitter.
• Zero-Configuration Networking: Uses the Zeroconf protocol for automatic IP address allocation and device discovery, eliminating the need for DHCP or DNS servers.
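The clock synchronization step can be illustrated with the standard IEEE 1588 delay request-response arithmetic. The sketch below is not Dante's internal implementation. It only shows how a follower device could estimate its clock offset from four timestamps, assuming the network delay is the same in both directions. The function name and the sample timestamps are hypothetical.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate follower clock offset and one-way path delay.

    t1: leader sends Sync            (leader clock)
    t2: follower receives Sync       (follower clock)
    t3: follower sends Delay_Req     (follower clock)
    t4: leader receives Delay_Req    (leader clock)
    Assumes a symmetric path delay in both directions.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2
    delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, delay


# Hypothetical timestamps in nanoseconds: the follower clock runs
# 100 ns ahead of the leader and the one-way path delay is 50 ns.
offset_ns, delay_ns = ptp_offset_and_delay(t1=0, t2=150, t3=200, t4=150)
print(offset_ns, delay_ns)  # 100.0 50.0
```

The follower then subtracts the estimated offset from its local clock, repeating the exchange periodically so that drift stays small.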

2.2 Core Features and Technical Advantages

2.2.1 Low Latency and High Audio Quality

• Latency as low as 34 μs (on a 100 Mbps network with 3 channels), while a Gigabit network supports up to 1024 bidirectional audio channels (at 48 kHz/24-bit).
• Supports 32-bit depth and sample rates ranging from 44.1 kHz to 192 kHz, enabling lossless transmission.

2.2.2 Network Compatibility and Redundancy

• Shares the network with existing IT infrastructure (e.g., TCP/IP devices, PCs), supporting simultaneous transmission of audio, control data, and video streams.
• Features dual-network redundancy; automatically switches to a backup network if the primary fails, ensuring broadcast-grade reliability.

2.2.3 Flexible Routing and Scalability

• Enables visual routing management via Dante Controller software, allowing drag-and-drop configuration of signal flows (unicast/multicast).
• System expansion only requires connecting new devices (e.g., microphones or processors) to network ports, without rewiring.

2.2.4 Cost and Operational Efficiency

• Replaces analog and multi-core cables, reducing physical wiring by up to 90% and lowering installation costs.
• Supports Power over Ethernet (PoE), simplifying power delivery for stage equipment (e.g., Apollo e2m headphone amplifiers).

2.3 Comparison with Traditional Analog Systems

Table 2: Comparison with Traditional Analog Systems

3. Introduction to Audio Matrix Usage

All input and output video interfaces of the PIXELHUE Q8 switcher support embedded audio input and output. Each interface supports up to 8 audio channels at a sampling rate of 48 kHz. The P20-DS switcher also supports embedded audio input and output, with 2 audio channels per interface at a sampling rate of 48 kHz. Future updates will extend support to 96 kHz.

To keep operation user-friendly and preserve familiar workflow conventions, the audio matrix adopts the same configuration logic as Dante Controller:
• The horizontal axis represents all input audio channels.
• The vertical axis represents all output audio channels.

For P20-DS:

Table 3: P20-DS's total audio channels

For Q8:

Table 4: Q8's total audio channels

P3: P20-DS's Audio Matrix

P4: Q8's Audio Matrix

Example based on the Q8's audio matrix:

When the 8 channels of an interface are collapsed, clicking the corresponding mapping point establishes a one-to-one correspondence between channels 1-8 of the input interface and channels 1-8 of the output interface.
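The collapsed one-to-one mapping can be modelled as a small routing table. The sketch below is a hypothetical illustration, not the PIXELHUE software or any real API: the class, method, and interface names are invented, and it assumes each output channel is fed by at most one input channel.

```python
class AudioMatrix:
    """Hypothetical model of an input-by-output audio routing grid."""

    def __init__(self):
        # Maps an output (interface, channel) to the input (interface, channel)
        # that feeds it. Assumption: one source per output channel.
        self.routes = {}

    def connect(self, src, dst):
        """Route a single input channel to a single output channel."""
        self.routes[dst] = src

    def connect_collapsed(self, in_interface, out_interface, channels=8):
        """Mimic a click on a collapsed crosspoint: channel N of the input
        interface feeds channel N of the output interface, for N = 1..channels."""
        for ch in range(1, channels + 1):
            self.connect((in_interface, ch), (out_interface, ch))

    def source_of(self, out_interface, channel):
        """Return the input feeding this output channel, or None if unrouted."""
        return self.routes.get((out_interface, channel))


matrix = AudioMatrix()
matrix.connect_collapsed("HDMI In 1", "HDMI Out 1")  # hypothetical interface names
print(matrix.source_of("HDMI Out 1", 3))  # ('HDMI In 1', 3)
```

Keying the table by output channel reflects the matrix layout described earlier: each row (output) looks up the column (input) it is connected to.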

P5: One-to-one correspondence

P6: One-to-one correspondence

All audio channels support mute and test tone functions.

P7: Mute and test tone