Re: [Cfrg] On the (non-)randomness of the S-box of Streebog and Kuznyechik

Leo Perrin <leo.perrin@inria.fr> Tue, 06 August 2019 15:04 UTC

Return-Path: <leo.perrin@inria.fr>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 3A3D112019B for <cfrg@ietfa.amsl.com>; Tue, 6 Aug 2019 08:04:18 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -6.899
X-Spam-Level:
X-Spam-Status: No, score=-6.899 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_HI=-5, SPF_HELO_NONE=0.001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id YlwLoiQ2mNSK for <cfrg@ietfa.amsl.com>; Tue, 6 Aug 2019 08:04:14 -0700 (PDT)
Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id BC6E9120189 for <cfrg@irtf.org>; Tue, 6 Aug 2019 08:04:13 -0700 (PDT)
X-IronPort-AV: E=Sophos;i="5.64,353,1559512800"; d="scan'208,217";a="315770426"
X-MGA-submission: MDHmWFnqjlA6qFNrfisz0emZEJnlskowwxr4d+iYjiy2hsAkDeTidbKKxQZOFWeUaEn4snAh6MibfpIKAJREP/ELfZ5vFF4l3HQkbL6YszUHG4MyA7v1JV9GhbrLNn4/l1AVWdDgg6WZvAdrR4jRRqvc2sqBvjoALDCOQUBTA6qczA==
Received: from zcs-store2.inria.fr ([128.93.142.29]) by mail3-relais-sop.national.inria.fr with ESMTP; 06 Aug 2019 17:04:11 +0200
Date: Tue, 06 Aug 2019 17:04:11 +0200
From: Leo Perrin <leo.perrin@inria.fr>
To: "Stanislav V. Smyshlyaev" <smyshsv@gmail.com>
Cc: Stephen Farrell <stephen.farrell@cs.tcd.ie>, cfrg <cfrg@irtf.org>, Dmitry Belyavsky <beldmit@gmail.com>
Message-ID: <1887277869.25916502.1565103851198.JavaMail.zimbra@inria.fr>
In-Reply-To: <CAMr0u6kGAPRoS70uqqOPJzv30tBR0pgMKLSrBO0eksWrB5Pi8w@mail.gmail.com>
References: <1327417226.25659372.1565019306532.JavaMail.zimbra@inria.fr> <CADqLbz+2dbvxdaGKp_3XMprp4XMxDK=B=1GKCLmxkjThX9kPYg@mail.gmail.com> <CAMr0u6kGAPRoS70uqqOPJzv30tBR0pgMKLSrBO0eksWrB5Pi8w@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="=_9ae79a7d-89a4-4de4-b797-5197e271ba22"
X-Originating-IP: [128.93.64.216]
X-Mailer: Zimbra 8.7.11_GA_3800 (ZimbraWebClient - FF68 (Linux)/8.7.11_GA_3800)
Thread-Topic: On the (non-)randomness of the S-box of Streebog and Kuznyechik
Thread-Index: d7twuV8vuN7GF6YGMGJOqUPyQeLNqQ==
Archived-At: <https://mailarchive.ietf.org/arch/msg/cfrg/5MpF0idlzEtS9a5a2oWz3bukD8E>
Subject: Re: [Cfrg] On the (non-)randomness of the S-box of Streebog and Kuznyechik
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <https://www.irtf.org/mailman/options/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cfrg/>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <https://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Tue, 06 Aug 2019 15:04:18 -0000

Hello everyone, 

There are two issues which should not be mistaken for one another. 

1. Do the specifics of the decomposition I found lead to an attack? 

As far as I know the answer is no, though I haven't looked at this very much since I found the TKlog decomposition about a year ago. I cannot speak for the whole secret key community but I know the same is true for many of us as we have been busy with the NIST lightweight standardization. It is not accurate to say that I have been looking at ways to exploit these properties for a year---or that anyone else has (as far as I know). 

I personnally really don't like the properties I found but it is fair to say that it is merely my intuition. I like to think it is an informed intuition as I have spent *a lot* of time looking at S-boxes, but the fact is that it is only an intuition. Talking about backdoors and such would only be speculations at that point, although I would definitely welcome investigations on this topic. 

2. Why did the designers provide wrong information to the ISO cryptographers? 

When publishing a cryptographic algorithm, it is expected that all its part are properly explained. For example, if we look at the NIST call for lightweight algorithms [1], it is explicitly stated that "The submitter *shall* explain the provenance of any constants or tables used in the algorithm." In other words, having unexplained constants or S-boxes is a deal breaker from the start, regardless of the other properties of the cipher. That was the situation in which the Russian algorithms were before 2018: they had an unexplained S-box. At that point, when asked for explanations by ISO, its authors "provided" the missing information in the form of [2]. 

What I claim, and what motivates my call for deprecation, is that this information is wrong: it describes an S-box generation process which is tremendously unlikely to produce anything that remotely resembles their S-box. Further, I disagree with Dmitry when he describes this as "controversial declarations about some details of the construction". It is factually wrong declarations, not controversial ones; and it is not about some details of the algorithms, it is about half of the round function; the only non-linear part of both algorithms in fact. 

The point of the short implementations is only to prove my claim above. The existence of a short implementation is not a problem in itself---the AES S-box has ~equally short implementations in golf languages. What matters is that it utterly disproves the explanations provided by the algorithm authors. 

Let me put it differently. Right now, the CFRG is looking for PAKE candidates. If a candidate relied on an elliptic curve claimed to be generated randomly but which actually turns out to have a very strong and new mathematical structure which cannot happen by chance, I think it would be kicked out without further analysis. 

Ultimately, cryptography is about trust. How can there be trust in an algorithm when all the evidence points towards their designers acting in bad faith? 

/Léo 

PS: To answer Jonathan proposal of verifiable randomness: Yes, as far as I am concerned, that could be a way forward. I am personnally not a fan of random S-boxes but our main conclusion in [3] is that the properties of a random S-box are basically always the same: they have no structure and have mediocre differential/linear properties with overwhelming probability that would probably be good enough here. 

[1] https://csrc.nist.gov/CSRC/media/Projects/Lightweight-Cryptography/documents/final-lwc-submission-requirements-august2018.pdf 
[2] [ https://cdn.virgilsecurity.com/assets/docs/memo-on-kuznyechik-s-box.pdf | https://cdn.virgilsecurity.com/assets/docs/memo-on-kuznyechik-s-box.pdf ] 
[3] https://eprint.iacr.org/2019/528 

> De: "Stanislav V. Smyshlyaev" <smyshsv@gmail.com>
> À: "Leo Perrin" <leo.perrin@inria.fr>, "Stephen Farrell"
> <stephen.farrell@cs.tcd.ie>
> Cc: "cfrg" <cfrg@irtf.org>, "Dmitry Belyavsky" <beldmit@gmail.com>
> Envoyé: Mardi 6 Août 2019 15:14:43
> Objet: Re: [Cfrg] On the (non-)randomness of the S-box of Streebog and
> Kuznyechik

> Dear Stephen and Leo,

> >> Does the structure you've discovered also exist in GOST R 34.11-94?
> Stephen, GOST R 34.11-94 is built using a completely different construction, so
> (Leo will correct me, if I am mistaken) the discovered properties are not
> related to GOST R 34.11-94. I’d also like to clarify that this GOST R 34.11-94
> is an old hash function, which has been deprecated for about 7 years in Russia.

>>> In general I would support CFRG deprecating any algorithms for which there is
>>> significant doubt. I have not myself verified that the structure Leo has found
> >> is real…
> First of all, I’d like to express my sincere appreciation to Leo for his deep
> study of Streebog/Kuznyechik S-Box properties. I am not a designer of those
> algorithms, but my company (software development: digital signature servers,
> VPN solutions etc.) uses them in those software products which are intended for
> usage with governmental applications inside Russia (they must be used for such
> applications). Hence, I am definitely interested in having as much research of
> those algorithms as possible: they are obviously worth being studied.
> Leo’s results about structures of those S-Boxes are very interesting and
> important for understanding the properties of the algorithms themselves and for
> optimization of their implementations (which is interesting for all software
> developers implementing them). I sincerely wish that Leo continues your
> research – Leo, please continue this study, you’re doing a great work.
> Leo has discovered a very interesting property of the S-Boxes, a very curious
> structure of representing them, but there is a certain contrast between high
> level of the Leo’s scientific part of work and the lack of ground (so far, at
> least) for conclusions and accusations about possible vulnerabilities in the
> algorithms. At the CFRG session in Montreal there was a discussion on this
> issue, it is important to continue the research when some new properties are
> found in any crypto, but we can’t “deprecate” something deployed (at least,
> there are many cases when they are used in Russian part of Internet) “just in
> case”, just because there are some new properties found (without any results of
> how they could lead to a vulnerability or to a backdoor, despite the fact that
> Leo has obviously been working on finding such for at least one year).
> If Leo had a result like Ferguson and Shumow in 2007 ( [
> https://rump2007.cr.yp.to/15-shumow.pdf |
> https://rump2007.cr.yp.to/15-shumow.pdf ] , when they showed that the knowledge
> of certain parameter – discrete logarithm in that case – allows to mount an
> attack on predicting Dual EC DRBG outputs with low complexity), then, of
> course, it would be definitely a reason for significant doubts in those
> algorithms (and, particularly, for an I-D about the found attacks) – in
> international and national organizations and technical committees. But, as Leo
> stresses himself, he does not have any (even preliminary) results of this kind.

> During the discussion at the CFRG session Leo claimed that he stressed in his
> papers explicitly, that he hasn’t found any methods for mounting the attacks or
> conditional results like “if the designers know a certain parameter X, then
> they can mount the following attack”.
> Leo, many thanks for that clarification of yours; you’ve been absolutely honest,
> I’ve found this in your papers:
> - “Kuznyechik seems to be in the second situation [although the S-Box and linear
> layer are defined over similar structures, a small function was added with the
> explicit purpose of breaking this similarity (case of the AES and the affine
> permutation used in its S-Box]”
> - “Open Problem 1. Is there a way to leverage the partition-preserving property
> of \uD835\uDF0B to mount an attack against Streebog?“

> Once more, Leo, your research is extremely interesting – I personally ask you to
> continue. But it will be correct to underline that currently it’s only an
> ongoing research, without any reasons for statements about
> backdoors/vulnerabilities.

> Best regards,
> Stanislav

> пн, 5 авг. 2019 г. в 22:07, Dmitry Belyavsky < [ mailto:beldmit@gmail.com |
> beldmit@gmail.com ] >:

>> Dear Leo, dear CFRG members,

>> Many thanks for your efforts in analysis of Russian GOST crypto algorithms.
>> I am not a cryptographer myself, but one of the implementors of Russian
>> cryptography.

>> On Mon, Aug 5, 2019 at 6:35 PM Leo Perrin < [ mailto:leo.perrin@inria.fr |
>> leo.perrin@inria.fr ] > wrote:

>>> Dear CFRG members,

>>> I would like to follow up on the short talk I gave several days ago. That talk
>>> was itself a follow-up to a previous e-mail [0] which presented a strange
>>> structure I found in pi, a component (S-box) which is shared by both Streebog
>>> (RFC 6986) and Kuznyechik (RFC 7801).
>>> At the time of writing this previous e-mail, I did not know that the designers
>>> of these algorithms were still claiming that they had generated their S-box
>>> randomly. However, they do! Here are links to relevant ISO documents that were
>>> leaked:
>>> - [ https://cdn.virgilsecurity.com/assets/docs/memo-on-kuznyechik-s-box.pdf |
>>> https://cdn.virgilsecurity.com/assets/docs/memo-on-kuznyechik-s-box.pdf ]
>>> - [
>>> https://cdn.virgilsecurity.com/assets/docs/meeting-report-for-the-discussion-on-kuznyechik-and-streebog.pdf
>>> |
>>> https://cdn.virgilsecurity.com/assets/docs/meeting-report-for-the-discussion-on-kuznyechik-and-streebog..pdf
>>> ]
>>> As we can see in the first one, the designers of pi did claim that they obtained
>>> this component by generating many random S-boxes and then choosing the best
>>> according to some (fair and usual) criteria. In fact, as we can see in the
>>> second, they insisted that they used this method even *after* the publication
>>> of my latest results [1].

>>> That is hardly possible. Let us see why.

>>> There are 2^{L+1}-1 bitstrings of length at most L. Thus, the total number of
>>> permutations with an implementation fitting in at most L bits in a given
>>> language (e.g. in C) is upper bounded by 2^{L+1}. The shortest implementation
>>> of pi in C we are aware of was found by ceilingcat [2] who improved a previous
>>> implementation by Xavier Bonnetain. That implementation is 139 ASCII characters
>>> long, i.e. 973 bits long. Thus, there are fewer than 2^{974} permutations with
>>> an implementation this simple. As there are 256! = 2^{1684} permutations
>>> operating on 8 bits, we easily conclude that the probability that a random
>>> permutation has a C implementation at most 973 bits long is smaller than
>>> 2^{974-1684} = 2^{-710}, which is beyond negligible. If we look at languages
>>> more compact than C (see [2]) then we can deduce even lower probabilities. In
>>> other words, the set of the permutations as simple as pi for a very broad
>>> definition of simplicity is extremely small, and thus the probability that we
>>> fall into it by chance is negligible. More detailed explanations are available
>>> at [3].

>>> In light of this argument, only two possibilities are left:
>>> 1.. the designers of pi really used a random process to generate their S-box,
>>> they were just lucky enough to find one of the very (very very very) few
>>> permutations with a short implementation, or
>>> 2. they used a TKlog deliberately but decided to hide it, the TKlog being the
>>> structure I identified in [1].
>>> The probability that the first one is true can be upper bounded using the
>>> discussion above: it is at most 2^{-710}. For comparison, there are about 10^80
>>> = 2^266 atoms in the universe. The only reasonable conclusion is then that we
>>> are in the second case: the use of the TKlog is a deliberate choice by its
>>> designers. Note that this reasoning is based purely on the simplicity of the
>>> TKlog structure. It does not take into account its very peculiar interactions
>>> with the finite field, nor does it consider the fact that the finite field used
>>> to define it is the same as the one used in the linear layer of Streebog. It is
>>> also unlikely that the differential and linear properties of pi can be
>>> encountered in a feasibly large set of random S-boxes [4].

>>> Because of these observations, I insist that the designers of these algorithms
>>> have provided cryptanalysts with misleading information. There cannot be any
>>> good reason for this, which is why I think these algorithms cannot be trusted,
>>> and thus that RFC 6986 and 7801 should be deprecated.

>>> Cheers,

>>> /Léo

>>> PS: Remember that the golfed implementations listed in [2] are only intended to
>>> be small. They are *not* intended to be secure! Their running time in CPU
>>> cycles is directly proportionnal to the discrete logarithm of their input,
>>> meaning that they are the opposite of constant time. An implementation using
>>> them would be trivially vulnerable to some side channel attacks.

>> You say that you have arguments that the designers may have said controversial
>> declarations about some details of the construction in the ISO, but this
>> doesn’t seem to correspond to attack.

>> Could you please clarify how does the existence of a compact code for generation
>> of the S-box imply a possibility of attack?

>> Currently, I don't see even a performance optimization as a consequence of the
>> compact representation of the S-box.
>> The most widespread open-source implementation ( [
>> https://github.com/gost-engine/engine/tree/openssl_1_1_0 |
>> https://github.com/gost-engine/engine/tree/openssl_1_1_0 ] )
>> of Streebog and Kuznyechik does not use the Pi transformations as a separate
>> step — they use a combined process.

>>> [0] [
>>> https://mailarchive.ietf.org/arch/msg/cfrg/4PmssKzCBsxTmLCieDgqD7Nynwg?fbclid=IwAR3JIGXt1_6aFwNOjDb-p7oC3ezHRuEEreznt-atyY7Clah2QZU_uU7FKS8
>>> |
>>> https://mailarchive.ietf.org/arch/msg/cfrg/4PmssKzCBsxTmLCieDgqD7Nynwg?fbclid=IwAR3JIGXt1_6aFwNOjDb-p7oC3ezHRuEEreznt-atyY7Clah2QZU_uU7FKS8
>>> ]
>>> [1] [ https://tosc.iacr.org/index.php/ToSC/article/view/7405 |
>>> https://tosc.iacr.org/index.php/ToSC/article/view/7405 ]
>>> [2] [
>>> https://codegolf.stackexchange.com/questions/186498/proving-that-a-russian-cryptographic-standard-is-too-structured
>>> |
>>> https://codegolf.stackexchange.com/questions/186498/proving-that-a-russian-cryptographic-standard-is-too-structured
>>> ]
>>> [3] [ https://who.rocq.inria.fr/Leo.Perrin/code-golf..html |
>>> https://who.rocq.inria.fr/Leo.Perrin/code-golf.html ]
>>> [4] [ https://eprint..iacr.org/2019/528 | https://eprint.iacr.org/2019/528 ]
>>> _______________________________________________
>>> Cfrg mailing list
>>> [ mailto:Cfrg@irtf.org | Cfrg@irtf.org ]
>>> [ https://www.irtf.org/mailman/listinfo/cfrg |
>>> https://www.irtf.org/mailman/listinfo/cfrg ]

>> --
>> SY, Dmitry Belyavsky
>> _______________________________________________
>> Cfrg mailing list
>> [ mailto:Cfrg@irtf.org | Cfrg@irtf.org ]
>> [ https://www.irtf.org/mailman/listinfo/cfrg |
>> https://www.irtf.org/mailman/listinfo/cfrg ]