Form of cryptanalysis which allows an attacker to distinguish encrypted data from random data
In cryptography, a distinguishing attack is any form of cryptanalysis on data encrypted by a cipher that allows an attacker to distinguish the encrypted data from random data.[1] Modern symmetric-key ciphers are specifically designed to be immune to such an attack.[2] In other words, modern encryption schemes are pseudorandom permutations and are designed to have ciphertext indistinguishability. If an algorithm is found that can distinguish the output from random faster than a brute force search, then that is considered a break of the cipher.
A similar concept is the known-key distinguishing attack, whereby an attacker knows the key and can find a structural property in the cipher, where the transformation from plaintext to ciphertext is not random.[3]
Overview
To prove that a cryptographic function is safe, it is often compared to a random oracle. If a function were a random oracle, then an attacker is not able to predict any of the output of the function. If a function is distinguishable from a random oracle, it has non-random properties. That is, there exists a relation between different outputs, or between input and output, which can be used by an attacker for example to find (a part of) the input.
Example
Let T be a sequence of random bits, generated by a random oracle and S be a sequence generated by a pseudo-random bit generator. Two parties use one encryption system to encrypt a message M of length n as the bitwise XOR of M and the next n bits of T or S respectively. The output of the encryption using T is truly random. Now, if the sequence S cannot be distinguished from T, the output of the encryption with S will appear random as well. If the sequence S is distinguishable, then the encryption of M with S may reveal information of M.
Two systems S and T are said to be indistinguishable if there exists no algorithm D, connected to either S or T, able to decide whether it is connected to S or T.
A distinguishing attack is given by such an algorithm D. It is broadly an attack in which the attacker is given a black box containing either an instance of the system under attack with an unknown key, or a random object in the domain that the system aims to emulate, then if the algorithm is able to tell whether the system or the random object is in the black box, one has an attack. For example, a distinguishing attack on a stream cipher such as RC4 might be one that determines whether a given stream of bytes is random or generated by RC4 with an unknown key.
Examples
Classic examples of distinguishing attack on a popular stream cipher was by Itsik Mantin and Adi Shamir who showed that the 2nd output byte of RC4 was heavily biased toward zero.[4] In another example, Souradyuti Paul and Bart Preneel of COSIC have shown that the XOR value of the 1st and 2nd outputs of RC4 is also non-uniform. Significantly, both the above theoretical biases can be demonstrable through computer simulation.[5]