Multichannel Audio Digital Interface (MADI) standardized as AES10 by the Audio Engineering Society (AES) defines the data format and electrical characteristics of an interface that carries multiple channels of digital audio. The AES first documented the MADI standard in AES10-1991 and updated it in AES10-2003 and AES10-2008. The MADI standard includes a bit-level description and has features in common with the two-channel AES3 interface.
Use of coaxial and optical fiber media that support transmission of audio signals over 100 meters, up to 3000 meters over multi-mode and 40,000 meters over single-mode optical fiber
The original specification (AES10-1991) defined the MADI link as a 56-channel transport for linking large-format mixing consoles to digital multitrack recording devices. Large broadcast studios also adopted it for routing multi-channel audio throughout their facilities. The 2003 revision (AES10-2003) adds a 64-channel capability by removing varispeed operation and supports 96 kHz sampling frequency with reduced channel capacity.[a] The latest AES10-2008 standard includes minor clarifications and updates to correspond to the current AES3 standard.
Audio over Ethernet of various types is the primary alternative to MADI for transport of many channels of professional digital audio.
Transmission format
MADI links use a transmission format similar to Fiber Distributed Data Interface (FDDI) networking. Since MADI is most often transmitted on copper links via 75-ohm coaxial cables, it more closely compares to the FDDI specification for copper-based links, called CDDI. AES10-2003 recommends using BNC connectors with coaxial cables[1]: 7.1.4 and SC connectors[b] with optic fibers.[1]: 7.2.1 MADI over fibre can support a range of up to 2 km.
The basic data rate is 100 Mbit/s of data using 4B5B encoding to produce a 125 MHz physical baud rate. Unlike AES3, this clock is not synchronized to the audio sample rate, and the audio data payload is padded using JK sync symbols. Sync symbols may be inserted at any subframe boundary, and must occur at least once per frame.[c] Though the standard disassociates the transmission clock from the audio sample rate, and thus requires a separate word clock connection to maintain synchronization, some vendors do give the option of locking to parts of the transmission timing information for purposes of deriving a word clock.
The audio data is almost identical to the AES3 payload, though with more channels. Rather than letters, MADI assigns channel numbers from 0–63. Frame synchronization is provided by sync symbols outside the data itself, rather than an embedded preamble sequence, and the first four time slots of each sub-channel are encoded as normal data, used for sub-channel identification:[6]
Bit 0: Set to 1 to mark channel 0, the first channel in each frame
Bit 1: Set to 1 to indicate that this channel is active (contains interesting data)
Bit 2: notA/B channel marker, used to mark left (0) and right (1) channels. Generally, even channels are A and odd channels are B.
Bit 3: Set to 1 to mark the beginning of a 192-sample data block
Sampling frequency
The original AES10-1991 specification allowed 56 channels at sample rates from 32 to 48 kHz with an additional vari-speed range of ± 12.5%.[7][8] This leads to a total range of 28 to 54 kHz. At the highest frequency, this produces a total of 56 × 32 × 54 = 96768 kbit/s, leaving 3.232% of the channel for synchronization marks and transmit clock error.
The 2003 revision specifies different relations between sampling frequency and number of channels.[9]
32 kHz to 48 kHz ± 12.5%, 56 channels;
32 kHz to 48 kHz nominal, 64 channels;
64 kHz to 96 kHz ± 12.5%, 28 channels.
With a 48 kHz sampling frequency, 64 channels take 64 × 32 × 48000 = 98.304 Mbit/s. Adding the minimum 8 × 58 kbit/s of framing produces 98688 bit/s, leaving 1.312% free for timing variation and other overhead.
Both versions of the standard accommodate higher sampling frequencies (for example, 96 kHz or 192 kHz) by using two or more channels per audio sample on the link.
Notes
^Some implementations have extended this technique to support 192 kHz sampling frequency.[4][5]
^"ST1" is called out in AES10. This is a reference to an entry for the SC connector in a discontinued AES connector database.
^This requirement introduces a 0.45% minimum overhead.