Re: [Cfrg] Streaming AEAD

Hi James,

Thanks for introducing this topic; more inline:

On 02/18/2014 12:47 AM, Manger, James wrote:
> Authenticated Encryption with Additional Data (AEAD) is a great primitive for securing messages, but it has one serious limitation: you only learn that the plaintext is authentic after processing all of the ciphertext. A crypto library performing decryption either has to: 1) cache the whole message (of arbitrary size) before returning the authentic plaintext; or 2) return unauthenticated plaintext chunks, and indicate authenticity only later when (or if) the end of the message is reached.
> Neither of these choices are ideal.

In "Pipelineable On-Line Encryption" (to be presented at FSE in a couple 
of weeks http://fse2014.isg.rhul.ac.uk/index.php?p=accepted) we call the 
act of releasing (would-be) plaintext prior to the completion of the 
authentication check "decryption misuse".   You are absolutely right 
that many crypto APIs enable or require decryption misuse, including 
OpenSSL and PKCS#11 through their "decryptUpdate" functions.   A crypto 
module that does AEAD decryption of long messages at high data rates 
might be forced into decryption misuse if it has limited memory, 
or if the latency of plaintext buffering is unacceptable.   Decryption 
misuse is a problem that deserves to be addressed.

The simplest way to avoid decryption misuse is to never call the 
"decryptUpdate" function.   In some software environments, it is 
possible to re-write code so that this function is not called. This 
won't solve every problem, but it is worth mentioning.

Next up the scale in terms of difficulty, the length of the plaintext 
and associated data can be kept below some limit, to lower the latency 
and the memory requirements.   Doing so requires that there be some 
flexibility in how the plaintext is "packetized", that is, how the 
sequence of plaintext messages is mapped onto the AEAD's plaintext inputs.
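
As a minimal sketch of that packetization step (the function name
packetize and the max_chunk parameter are my own illustrative choices,
not part of any existing API), the sender could bound the size of each
AEAD plaintext input like this:

```python
from typing import List


def packetize(plaintext: bytes, max_chunk: int) -> List[bytes]:
    """Split plaintext into pieces of at most max_chunk bytes each.

    Hypothetical helper: each returned piece would become the plaintext
    input of one AEAD invocation, so the receiver never has to buffer
    more than max_chunk bytes before the authentication check.
    """
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    if not plaintext:
        # An empty message still yields one (empty) piece, so the
        # receiver has something to authenticate.
        return [b""]
    return [plaintext[i:i + max_chunk]
            for i in range(0, len(plaintext), max_chunk)]
```

The receiver's worst-case buffering is then max_chunk bytes plus one
tag, regardless of the total message length.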

>
> What is needed is a way to use an AEAD algorithm to secure a message in chunks. Streaming-AEAD would allow a recipient to know they have received an authentic prefix of a message. It would allow crypto APIs to support streaming modes while never returning unauthentic plaintext.
>
> A discussion on this topic has started on the IETF TLS list (in a thread unhelpfully titled "Comments on")
> http://www.ietf.org/mail-archive/web/tls/current/msg11282.html
>
> I think this would be a perfect topic for CFRG to address.
>
> Instead of sending C = AEAD(N, A, P);
> send C = AEAD(N1, A1, P1) || AEAD(N2, A2, P2) || ... || AEAD(Nm, Am, Pm).
> Presumably incrementing nonces can be used (eg N2 = N1 + 1).
> How do you flag the end of the message: a flag in the nonce; a flag in the AAD; empty plaintext; inside a modified AEAD primitive?
> How do you flag the start of the message?
> Should the byte offset of each chunk be put in that chunk's AAD?
> Is the real AAD fed to the first or last AEAD instantiation; or can parts be passed to any AEAD instantiation?

Here you are pointing out that a single associated data and plaintext 
value (A,P) could be fragmented into multiple values before AEAD 
processing is used.   This is surely true.   If the fragmentation was 
handled by the higher layer protocol (TLS, ESP, whatever) then this 
could be done with any AEAD algorithm.

>
> Should this be built from AEAD primitives? What about ones without nonces (eg draft-mcgrew-aead-aes-cbc-hmac-sha2)?
> Is it sufficient for a Streaming-AEAD primitive to distinguish 4 modes (full message; 1st chunk; intermediate chunk; last chunk) plus a stipulation to use incrementing nonces for successive chunks?
> Or should we aim for a Streaming-AEAD that can create extra intermediate authentication tags, without affecting the final tag?
>
>
> As a strawman, what if OCB [draft-irtf-cfrg-ocb] was tweaked so the internal 128-bit Nonce field was changed from:
> Nonce = num2str(TAGLEN mod 128,7) || zeros(120-bitlen(N)) || 1 || N
> to:
> Nonce = num2str((TAGLEN mod 128)/4,5) || num2str(FLAG,2) || zeros(120-bitlen(N)) || 1 || N
> where
> FLAG = 00 means the AEAD operation covers a full message,
> FLAG = 01 means the AEAD operation covers the first chunk of a message,
> FLAG = 10 means the AEAD operation covers a subsequent chunk of a message (incrementing N),
> FLAG = 11 means the AEAD operation covers the last chunk of a message (incrementing N).
>
> With 5 minutes analyses from a non-cryptanalyst that looks sufficient for a secure streaming-AEAD :)

Alternatively, the associated data could be formatted so that it 
contains the usual header, followed by a fragment number.   That 
technique would work with any of the AEAD algorithms.
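
A rough sketch of that formatting follows, under several assumptions
that are mine and not from any draft: the AEAD here is a toy
encrypt-then-MAC construction built from HMAC-SHA256 purely so the
example is self-contained (it is a stand-in, not a vetted algorithm);
the names toy_seal, toy_open, fragment_aad, seal_fragments and
open_fragments are hypothetical; and the final-fragment flag and the
incrementing nonce follow the questions in James's note rather than
anything standardized.

```python
import hashlib
import hmac
import struct
from typing import Iterable, Iterator, List

TAG_LEN = 16  # bytes of truncated HMAC-SHA256 used as the tag


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream: HMAC-SHA256 in counter mode (illustration only).
    blocks = [hmac.new(key, b"enc" + nonce + struct.pack(">Q", ctr),
                       hashlib.sha256).digest()
              for ctr in range((length + 31) // 32)]
    return b"".join(blocks)[:length]


def _tag(key: bytes, nonce: bytes, aad: bytes, ct: bytes) -> bytes:
    # The AAD length is included so that (aad, ct) is parsed unambiguously.
    msg = b"mac" + nonce + struct.pack(">Q", len(aad)) + aad + ct
    return hmac.new(key, msg, hashlib.sha256).digest()[:TAG_LEN]


def toy_seal(key: bytes, nonce: bytes, aad: bytes, pt: bytes) -> bytes:
    ct = bytes(p ^ k for p, k in zip(pt, _keystream(key, nonce, len(pt))))
    return ct + _tag(key, nonce, aad, ct)


def toy_open(key: bytes, nonce: bytes, aad: bytes, sealed: bytes) -> bytes:
    if len(sealed) < TAG_LEN:
        raise ValueError("authentication failed")
    ct, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    # Verify the tag before any plaintext is computed or released.
    if not hmac.compare_digest(tag, _tag(key, nonce, aad, ct)):
        raise ValueError("authentication failed")
    return bytes(c ^ k for c, k in zip(ct, _keystream(key, nonce, len(ct))))


def fragment_aad(header: bytes, index: int, final: bool) -> bytes:
    # Usual header, followed by a 64-bit fragment number and a final flag.
    return header + struct.pack(">QB", index, 1 if final else 0)


def fragment_nonce(base: int, index: int) -> bytes:
    return (base + index).to_bytes(12, "big")


def seal_fragments(key: bytes, base_nonce: int, header: bytes,
                   fragments: List[bytes]) -> List[bytes]:
    if not fragments:
        raise ValueError("need at least one fragment")
    last = len(fragments) - 1
    return [toy_seal(key, fragment_nonce(base_nonce, i),
                     fragment_aad(header, i, i == last), frag)
            for i, frag in enumerate(fragments)]


def open_fragments(key: bytes, base_nonce: int, header: bytes,
                   sealed: Iterable[bytes]) -> Iterator[bytes]:
    finished = False
    for i, item in enumerate(sealed):
        if finished:
            raise ValueError("data after final fragment")
        nonce = fragment_nonce(base_nonce, i)
        try:
            pt = toy_open(key, nonce, fragment_aad(header, i, False), item)
        except ValueError:
            # Not a valid intermediate fragment; try it as the final one.
            pt = toy_open(key, nonce, fragment_aad(header, i, True), item)
            finished = True
        yield pt  # released only after this fragment authenticated
    if not finished:
        raise ValueError("stream truncated before final fragment")
```

Because the fragment number is bound into the associated data,
reordered or dropped fragments fail authentication, and the final flag
lets the receiver detect truncation.   A real protocol would substitute
a standard AEAD for the toy one.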

Fragmenting the plaintexts before applying AEAD requires that each AEAD 
message be independently authenticated, and thus this technique has more 
data/encapsulation overhead.
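
To put a rough number on that overhead, here is a small calculation.
The function name, the 16-byte tag and the optional per-fragment
framing_len are illustrative assumptions, not values taken from any
particular protocol:

```python
def fragmentation_overhead(message_len: int, fragment_len: int,
                           tag_len: int = 16, framing_len: int = 0) -> int:
    """Bytes added when a message is split into independently
    authenticated fragments: one tag (plus any framing) per fragment."""
    if fragment_len <= 0:
        raise ValueError("fragment_len must be positive")
    # Ceiling division; an empty message still needs one fragment.
    fragments = max(1, -(-message_len // fragment_len))
    return fragments * (tag_len + framing_len)


# 1 MiB message in 16 KiB fragments: 64 tags of 16 bytes each.
print(fragmentation_overhead(1 << 20, 16 << 10))  # 1024
# The same message as a single AEAD invocation: one tag.
print(fragmentation_overhead(1 << 20, 1 << 20))   # 16
```

The overhead grows linearly in the number of fragments, so the
fragment size is a trade-off between expansion and receiver buffering.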

In the POE approach that I mentioned above, we describe a way to make an 
AEAD algorithm robust against decryption misuse, through partial 
non-malleability.   That is, if the attacker changes any part of a 
ciphertext, then the corresponding post-decryption plaintext will be 
indistinguishable from random.   If conventional AEAD is used with the 
decryptUpdate function, the attacker can cause the post-decryption 
plaintext output by that function to have any value.   In contrast, if 
POE is used, then the output of decryptUpdate will be indistinguishable 
from random, and not under the control of an attacker.
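
The conventional-AEAD half of that comparison can be illustrated with
a short sketch.   The assumptions are mine: the cipher is a toy
counter-mode keystream built from SHA-256 (standing in for the
encryption part of a CTR-based AEAD), decrypt_update is a hypothetical
stand-in for a "decryptUpdate"-style call that releases plaintext
before any tag is checked, and the attacker is assumed to know the
original plaintext.   The sketch does not implement POE.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy counter-mode keystream from SHA-256 (illustration only).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def decrypt_update(key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    """Hypothetical misuse-prone call: returns plaintext with no tag check."""
    return xor(ciphertext, keystream(key, nonce, len(ciphertext)))


key, nonce = b"k" * 32, b"n" * 12
original = b"PAY ALICE $0100"
wanted = b"PAY MALLORY $99"   # same length as the original

ciphertext = xor(original, keystream(key, nonce, len(original)))

# The attacker never sees the key: XORing a chosen difference into the
# ciphertext produces the same difference in the released plaintext.
forged = xor(ciphertext, xor(original, wanted))
released = decrypt_update(key, nonce, forged)
print(released == wanted)  # True
```

The final tag check would reject the forged ciphertext, but by then
the chosen plaintext has already been handed to the caller.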

This robustness against decryption misuse is a valuable property for an 
AEAD algorithm to have.   Nonetheless, it would be better if the 
protocol using AEAD actually performed some sort of plaintext 
fragmentation, if it can accommodate the overhead, because the security 
would be better.   Another advantage to using a plaintext fragmentation 
scheme is that the scheme could be used to hide the lengths and timings 
of the actual plaintext messages, thus preventing traffic analysis (like 
that done in Dyer et al., "Peek-a-Boo, I Still See You: Why Efficient 
Traffic Analysis Countermeasures Fail").

This line of thinking is part of the rationale behind AERO 
(draft-mcgrew-aero-01) with a wide PRP: since we need to avoid 
decryption misuse, we force the decrypter to maintain some state, and 
given that fact, we might as well get the best security possible.

>
> --
> James Manger
>
> -----Original Message-----
> From: TLS [mailto:tls-bounces@ietf.org] On Behalf Of Nikos Mavrogiannopoulos
> Sent: Monday, 17 February 2014 7:10 PM
> To: Adam Langley
> Cc: Niels Möller; tls@ietf.org
> Subject: Re: [TLS] Comments on
>
> On Fri, 2014-02-14 at 14:49 -0500, Adam Langley wrote:
>
>>> By streaming, I don't advocate you do the decryption in a pipe line;
>>> that's clearly a dangerous habit. Usecase is more like on one machine
>>> running
>>>    src-machine$ tar -cf - foo-dir | aead-encrypt | send
>> This use of streaming is fine by my criteria, assuming that
>> "aead-encrypt" is chunking the input into different blocks and
>> applying the AEAD to each block. This is perfectly fine with a
>> one-shot(*), AEAD API at the core.
> I'd pretty much agree with the chunked approach, but I now realize that
> it is more hard to get it right than the streaming one. In the chunked
> approach one would need to implement sequence numbers (could be implicit
> as additional data) and a termination block. So both approaches have
> quite some disadvantages. The streaming approach allows for misuse of
> the API which may cancel the benefits of AEAD, and the chunked approach
> requires the developer to create a safe protocol over AEAD.
>
> If the idea is for AEAD to be used by an average developer, it seems we
> need even a higher abstraction than that; even for such a simple
> use-cases.

It makes sense to consider higher-level abstractions, but hopefully the 
problem can be addressed in many cases by having the higher-layer 
protocol avoid long plaintexts.

In any event, what we are up against here is an important principle in 
secure protocol design: a receiver should not trust any data that gets 
sent to it until that data has been authenticated, and it should not be 
forced to use up its memory (e.g. to store post-decryption plaintext) on 
unauthenticated data.

thanks,

David

>
> regards,
> Nikos