Improving integral cryptanalysis

In my masters thesis (which is in Slovak) we tried to improve integral cryptography by using specific values in places, where in the classic approach it is acceptable to use any values.

Main contributions of my thesis were testing this idea on a toy cipher, which turned out to be successful and implementation of a search for this new type of integral distinguisher into Solvatore1. Solvatore is a tool, which can check a cipher for a classic integral cryptanalysis vulnerabilities using SAT solvers.

We also analyzed time and space complexity of an attack using integral cryptanalysis and formally proved some of some implied statements, which were used in the Solvatore paper1 .

Quick intro into integral cryptanalysis

Internal cryptanalysis is one of the basic attacks the ciphers are tested against. It is a Chosen Plaintext attack (CPA) (more about different types of attacks). An attacker chooses a set of plaintexts and can use an oracle to encrypt them using a secret key. Using the set of created ciphertexts the attacker tries to (in our case) uncover the secret encryption key.

When doing an integral cryptanalysis, the set of plaintexts is usually created using the following principle, also shown in the illustration:
Block ciphers have a set number of input bits used for plaintext. Each of these bits is set either to be active (A) or constant (C) 2. Constant bits are always the same in each plaintext, while active bits are permuted through all of its combined values, as can be seen in the illustration.

Set of plaintexts

Integral distinguisher is a given “set” of active and constant bits, which when encrypted and all summed together using XOR will always have a known value in some bit of the sum. For example, in the illustration we have all bits of second byte assigned as active, which results in known sum of their ciphertexts – all bits of second byte are 0.

Basic integral distinguisher

Please note, that encryption usually takes place in rounds and integral attack is usually effective only for several ones at the beginning. For example, there are integral distinguishers for AES for 5 rounds, which means that an attacker can mount an attack on 6 or 7 rounds, however the difficulty of the attack grows with each added round and AES has 10,12 or 14 rounds by specification. This means that within reason, we try to maximize the number of rounds on which we can find an integral distinguisher.

Integral distinguishers can be used to uncover encryption keys. For example, let’s say there is a distinguisher for 4 rounds of a cipher. In an attack scenario for 5 rounds of this cipher, an attacker would choose the plaintexts based on the 4 round distinguisher and use the encryption oracle to generate 5-round ciphertexts. Then they would try to brute-force the key used in the last encryption round. For each key, they would decrypt one round of the cipher for all ciphertexts and XOR-sum them together.

Basic attack

Using the distinguisher, attacker knows that this sum has to have specific values for certain bits. They try all possible values for the key used in the last round, decrypt only the last round and sum together the results. If the results match the expectations based on the distinguisher, the guessed key might be the right one.

Integral distinguishers, ANF and derivatives

We can also look at encryption as a set of boolean functions, which inputs are bits of plaintexts and the key, and its output is 0 or 1. In other words, first bit of ciphertext is computed as $f_1(x_1,x_2,x_3,x_4,k_1,k_2,k_3,k_4)$ , where $x_1$ to $x_4$ are bits of a plaintext and $k_1$ to $k_4$ are bits of a key.

Algebraic normal form (ANF) of a function consists of one or more variables ANDed together (monomials), which are then XORed together. For example, $f_1$ is a boolean function and $f_2$ is an equivalent boolean function in ANF:

(We use addition as XOR and multiplication as AND ) $$f_1(x_1,x_2,x_3,x_4) = (x_1 x_2 x_3)(x_4 + x_1) + x_1$$ $$f_2(x_1,x_2,x_3,x_4) = x_1 x_2 x_3 x_4 + x_1 x_2 x_3 + x_1$$

Derivative of such function with respect to active bits can tell us if there will be a useful integral distinguisher.

Let’s say, that we have 4-bit block cipher, with 4 bit plaintexts, 4 bit ciphertexts and 4 bit key. That encryption can be described by functions $f_1(x_1, x_2, x_3, x_4, k_1, k_2,k_3,k_4)$ to $f_4(x_1, x_2, x_3, x_4, k_1, k_2,k_3,k_4)$, where output of $f_1$ is first bit of the ciphertext ,output of $f_2$ is the second bit and so on.

Let’s assign first and second bits to be active, and third and fourth bits to be constant. For each boolean function $f_1$ to $f_4$ will be created its derivative with respect to $x_1$ and $x_2$. If any of these derivatives is a constant 0 (for the sake of example, derivative of $f_3$ is equal to constant 0), then we know, that there is a integral distinguisher with this assignment of bits as active and constant.

More precisely, if we created a set of plaintexts where first two bits are active, encrypted them all using an encryption oracle and then summed together ciphertexts using XOR, since derivative of $f_3$ was constant 0, third bit of ciphertexts sum will be certainly 0.

Blue bits

The central idea of the thesis is a question: Could assigning a specific value for a constant bit help us?

We created a new types of a bit apart of active and constant – blue bit 0 and blue bit 1, which are constant bits that have that specific value. We call them blue bits just so we can more easily distinguish them.

From the point of view of integral distinguisher as a derivative from the previous chapter, if some derivative of a function contained only monomials with constant bits, by assigning these to 0, we could make a derivative of a function be constant 0, even if it otherwise would not be. In other words, we could create an integral distinguisher, where there could be none before. This will be demonstrated with examples in part Blue Bits Examples.

Cipher RES

We created a simple toy cipher RES, just to demonstrate a potential impact of the new method. It has a short block length of 4 bits, just to allow for operations, which are exponential according to block length.

It is composed from 3 operations:

  1. XOR add a key
  2. Modular addition of a constant to the cipher state
  3. Cyclic shift left by a constant number of bits

RES description

For all of our experiments, we tried all possible values of the constants for modular addition and cyclic shift. Some of these turn out to be very susceptible to integral cryptanalysis, and in the thesis we provide formal proof as to why.

We tried analysis of this toy cipher using Solvatore. We also brute-forced all of the possibilities of setting the input bits as constant or active, which turned out to be identical. Results can be seen in the following table. Each value represents the maximum number of rounds, for which we found an integral distinguisher with specific RES configuration.

$c_{ls}$ \ $c_{ma}$0123456789101112131415
0$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$
1$\infty$474$\infty$474$\infty$474$\infty$474
2$\infty$353$\infty$353$\infty$353$\infty$353
3$\infty$474$\infty$474$\infty$474$\infty$474

Blue bits for RES

Blue bits did help us find integral distinguishers for more rounds of RES cipher. We brute-forced all possibilities of setting input bits as blue or active.

Using this method, we found better integral distinguishers for almost all configurations of RES cipher. This proves that blue bits can be used to generate better integral distinguishers.

In this table, for each cipher configuration, there is a maximal number of rounds for which the classical integral distinguisher was found, and by how many rounds it could be improved when using blue bits.

$c_{ls}$ \ $c_{ma}$0123456789101112131415
0$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$$\infty$
1$\infty$47+14+2$\infty$4+27+14$\infty$47+14+2$\infty$4+27+14
2$\infty$3+105+23+3$\infty$3+35+23+10$\infty$3+105+23+3$\infty$3+35+23+10
3$\infty$47+44+1$\infty$4+17+44$\infty$47+44+1$\infty$4+17+44

Blue bits examples

In previous part we could see that blue bits can improve search for integral distinguishers for more rounds. In this part we showcase in detail how could that be the case on specific examples.

Blue bit 0 use-case

Function $f_1$ for first ciphertext bit in RES(3,14,9) (RES with 3 rounds and constants $c_{ma}$=14 and $c_{ls}$=3) after differentiation in respect to first and second bits $x_1$ and $x_2$ in ANF looks like this:

$$f_1(x_1, x_2, x_3, x_4,k_1, k_2,k_3,k_4) = x_4 + x_3 k_1 + x_3 k_2 + x_3 k_3 + x_3 k_4 + x_4 k_3 $$

We can see, that each monomial in ANF contains $x_3$ or $x_4$, which are constant plaintext bits. If these were both designated as blue bit 0, the whole function would be equal to constant 0, and integral distinguisher would exist.

Blue bit 1 use-case

Function $f_2$, which represents second ciphertext bit in RES(2,7,5) (RES with 2 rounds and constants $c_{ma}$=7 and $c_{ls}$=5) after differentiation in respect to third and fourth bits $x_3$ and $x_4$ looks like this: $$f_2(x_1, x_2, x_3, x_4, k_1, k_2,k_3,k_4) = (k_1 + x_1 k_1) + (k_3 + x_1 k_3 )+ (k_1 k_2 + x_2 k_1 k_2) + (k_2 k_3 + x_2 k_2 k_3 )+ (x_1 k_2 k_4 + x_2 k_2 k_4) $$

This one is not in ANF just to more easily demonstrate, that if $x_1$ or $x_2$ were set as blue bits 1, each parenthesis-enclosed expression would XOR sum to 0, and whole function would be equal to constant 0.

In this case, there also would exist an integral distinguisher, whereas if we worked only with active and constant bits, we would find none.

Integrating blue bits into Solvatore

Computing intergral distinguishers by modeling full ANF, as was done in previous parts is computationaly infeasible for real-world ciphers and more than a handful rounds. Complexity of the functions rapidly increases with each round. That is why we try to integrate concept of blue bits into Solvatore 1.

Solvatore 1 is a tool, which transforms a question of whether a cipher has an integral distinguisher into a logical form, which is solvable by a SAT solver.

Without getting too much into details in this article, we created two ways of how such thing could be implemented.

Integration by creating a new state in Solvatore

First approach is to modify logical operations which Solvatore uses to take into account this new types of bit designation, and how it interacts with active, constant bits and with each other. This approach however would be ineffective, if key whitening was used.

We tested this approach on cipher Speck3, with a shortened block length to 16 bits and at most 5 or at least 27 blue bits. In both cases the results were the same as with the original version of Solvatore.

Integration by simplifying the first round

Another approach was to model ANF for the first round of a cipher in Python, substitute the values 0 and 1 in place of blue 0 and blue 1 and then model it in Solvatore logic as a new cipher. Other rounds are created as in the original cipher, and the internal logic of Solvatore is modified so the first simplified round and the others are bridged.

This approach was tested on RES and the results were the same as with the original version of Solvatore.

Integration by simplifying several rounds

The idea from the previous chapter can be expanded. We can model several encryption rounds at the beginning (not just the first one), and then use original cipher for the rest.

However, modeling full ANF of a cipher in Python is getting way more difficult with each additional round, so this approach has its limits as well.

From 28 RES configurations where we know of a better distinguisher, using this method we were able to recreate 20. However, for most of them we had to model most of the rounds in Python, which severely limits its computational efficiency.

Conclusion

We were able to demonstrate, that the idea of blue bits has validity, at least for a toy cipher RES. We modeled searching for such integral distinguishers in Solvatore using two methods, one of which also showed results on the toy cipher.

Other contributions in the thesis were formal proofs of some implicit claims in the Solvatore1 paper, and complexity estimations of an integral attack with or without blue bits.

In the future could this thesis be extended by speeding up the searching, so it would be practical for real-world ciphers with longer block. Different interesting idea could be the application of blue bits for some bits of the key.


  1. Zahra Eskandari, Andreas Brasen Kidmose, Stefan Kölbl, and Tyge Tiessen. Finding Integral Distinguishers with Ease. In Carlos Cid and Michael J. Jacobson Jr.,editors,Selected Areas in Cryptography – SAC 2018, pages 115–138, Cham, 2019.Springer International Publishing. ↩︎

  2. To be technically precise, in the example we assign whole bytes as either active or passive, but assigning each bit separately is possible as well ↩︎

  3. Ray Beaulieu, Douglas Shors, Jason Smith, Stefan Treatman-Clark, Bryan Weeks, and Louis Wingers. The simon and speck families of light-weight block ciphers. Cryptology ePrint Archive, Report 2013/404, 2013. https://eprint.iacr.org/2013/404↩︎