Re: [Pearg] Tsvart early review of draft-irtf-pearg-numeric-ids-generation-02

Fernando Gont <> Thu, 14 January 2021 02:18 UTC

From: Fernando Gont <>
To: Michael Tuexen <>
Date: Wed, 13 Jan 2021 23:18:23 -0300

Hello, Michael,

Based on Shivan's last email it looks like I haven't responded to this 
last email of yours. So, just in case, please find my responses in-line...

>> On 7/8/20 09:22, Michael Tüxen via Datatracker wrote:
>>> Reviewer: Michael Tüxen Review result: Ready with Issues The
>>> document is well written, provides algorithms which could be used
>>> to address identified problems. One  could add some text covering
>>> TCP timestamps.
>> You mean e.g. to spell out which of the proposed algorithms one
>> might use for TCP timestamps?
> For example.

FWIW, this is currently noted in Table 1:

    |   TCP initial    |   Monotonically-increasing (5)  |   Hard (5)   |
    |    timestamps    |                                 |              |

and then Table 2:

    |  4  |  Uniqueness, monotonically increasing |     TCP ISN, TCP    |
    |     |     within context (hard failure)     |  initial timestamps |

Algorithms for this category are described later on in this document.

>>> Section 1 states: "Recent history indicates that when new
>>> protocols are standardized or new protocol implementations are
>>> produced, the security and privacy properties of the associated
>>> identifiers tend to be overlooked,..." How does this related to
>>> recent/current activities like SCTP/DTLS or QUIC?
>> SCTP (RFC4960) is similar to TCP, in this respect. OTOH, I have
>> only skimmed through the DTLS (RFC6347), and it seems that it
>> initially sets sequence numbers to 0. -- while these are meant to
>> be protected, I'm curious if they could have done with
>> monotonically increasing sequence numbers ala 6528, or with a
>> random origin.
> My point is that SCTP/DTLS or QUIC are transport layers which use
> encryption. Doesn't this affect the statement: the security and
> privacy properties of the associated identifiers tend to be
> overlooked
> At least for the work on QUIC this doesn't seem to apply. Or am I
> missing something?

This does apply to at least some of the QUIC IDs, such as connection-ids. 
Note that while connection-ids are not encrypted, this would still apply 
if they were -- e.g. consider an implementation that generated 
connection-ids from a global counter.

That is, in the case of IDs that leak information, the information is
leaked to legitimate communicating peers -- i.e., we are not assuming
sniffing, and hence the analysis remains unaffected in that respect.
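
To illustrate the point, here is a minimal Python sketch (not taken from
the I-D; every function name below is made up for illustration). It
assumes a hypothetical implementation that hands out IDs from one global
counter, and shows what a legitimate peer can infer from just the two IDs
it received itself, with no sniffing involved:

```python
import itertools
import secrets

# Hypothetical flawed generator: one counter shared by all peers.
_global_counter = itertools.count(1)


def next_id_global_counter() -> int:
    """Return the next ID from the shared (global) counter."""
    return next(_global_counter)


def next_id_random(bits: int = 64) -> int:
    """Return an unpredictable ID drawn from the OS CSPRNG."""
    return secrets.randbits(bits)


def ids_issued_between(first_seen: int, second_seen: int) -> int:
    """What a peer can infer from two counter-based IDs it was given."""
    return second_seen - first_seen - 1


# The observing peer gets an ID, three other peers get IDs, and then
# the observing peer gets a second ID.
observed_first = next_id_global_counter()
for _ in range(3):
    next_id_global_counter()  # IDs handed to other peers
observed_second = next_id_global_counter()

# The observer learns how many IDs were issued to others in between.
leaked = ids_issued_between(observed_first, observed_second)
```

With next_id_random() the difference between two observed IDs carries no
such information, whether or not the IDs are encrypted on the wire.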

However, the use of cryptographic techniques may affect the analysis for 
other IDs (e.g., injection attacks based on predictable sequence 
numbers). That is, from the point of view of e.g. blind data-injection 
attacks, you might be able to employ predictable sequence numbers safely 
if you employ cryptographic techniques.

So, an appropriate assessment still needs to be performed, and it should 
explain whether the use of cryptographic techniques mitigates these 
issues or not.
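
As a rough sketch of the injection case (again, not from the I-D; the
packet layout, the use of HMAC-SHA-256, and the names protect() and
accept() are all assumptions made for illustration): with an
authentication tag over each packet, an off-path attacker who guesses the
sequence number still cannot get a packet accepted without the key.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # size of an HMAC-SHA-256 tag, in bytes
SEQ_LEN = 8   # illustrative fixed-size sequence number field


def protect(key: bytes, seq: int, payload: bytes) -> bytes:
    """Build seq || payload || HMAC(key, seq || payload)."""
    header = seq.to_bytes(SEQ_LEN, "big")
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


def accept(key: bytes, expected_seq: int, packet: bytes) -> bool:
    """Accept only packets with a valid tag and the expected sequence number."""
    if len(packet) < SEQ_LEN + TAG_LEN:
        return False
    header = packet[:SEQ_LEN]
    payload = packet[SEQ_LEN:-TAG_LEN]
    tag = packet[-TAG_LEN:]
    good = hmac.new(key, header + payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, good):
        return False
    return int.from_bytes(header, "big") == expected_seq


session_key = secrets.token_bytes(32)

# Sequence numbers start at 0, i.e. they are fully predictable.
genuine = protect(session_key, 0, b"hello")

# The attacker knows seq == 0 but not the key, so it must guess one.
forged = protect(secrets.token_bytes(32), 0, b"evil")
```

Here accept(session_key, 0, genuine) succeeds while the forged packet is
rejected. Note that this says nothing about information leakage: that
still depends on how the IDs themselves are generated.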

Section 1 of the last rev has a paragraph that makes this clearer:

    We note that the use of cryptographic techniques may readily mitigate
    some of the issues arising from predictable transient numeric
    identifiers.  For example, cryptographic integrity and authentication
    can readily mitigate data injection attacks even in the presence of
    predictable transient numeric identifiers (such as "sequence
    numbers").  However, use of flawed algorithms (such as global
    counters) for generating transient numeric identifiers could still
    result in information leakages even when cryptographic techniques are

Please also check the new Section 8.6 ("Exploitation of Predictable 
Transient Numeric Identifiers for Injection Attacks").

And also this text in "Appendix A.  Algorithms and Techniques with Known 

    The following subsections discuss algorithms and techniques with
    known negative security and privacy implications.


       As discussed in Section 1, the use of cryptographic techniques
       might allow for the safe use of some of these algorithms and
       techniques.  However, this should be evaluated on a case by case

>>> The Algorithms in 7.1.1 and 7.1.2 can be substantially simplified
>>> when check_suitable_id() always returns true. Why are not the
>>> simplified algorithms shown?
>> Are you referring to the case where an implementation need not
>> check the generated ID against other existing I-Ds?
> Yes.
>> (FWIW, in order for check_suitable_id()to always return "true", it
>> means that there are essentially no requirements on the ID -- just
>> pick any random number... at which point, there's not really much
>> of an algorithm ( Section 7.1.1 and Section 7.1.2 cover "recovery"
>> strategies for the cases where the algorithm finds collisions.
> Understood. But the document says that
> in many (if not most) cases, the algorithm will not need to check the
> suitability of a selected identifier (i.e., check_suitable_id() would
> always be "true").
> So why not show the simplified version, if it is used in 'most
> cases'.

The algorithms from Section 7.1 are used in both of these cases:

1) Uniqueness, soft failure

2) Uniqueness, hard failure (but collisions can be definitely determined
on the local system)

The general algorithm applies to both cases, whereas the case where
"check_suitable_id() always returns 'true'" may only be used for #1.

Note that Section 7.2 refers to the algorithms in Section 7.1 without
repeating the algorithms. So including the "simplified versions" would 
probably add confusion.
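
To make the relationship concrete, here is a Python sketch of the idea (an
approximation written for this email, not the pseudocode from Section 7.1;
select_id() and its parameters are names I am making up here). It assumes
a random starting point followed by a linear search, and treats "no check
function supplied" as the case where any value is acceptable:

```python
import secrets


def select_id(min_id, max_id, check_suitable_id=None):
    """Pick an ID in [min_id, max_id]; return None if none is suitable.

    check_suitable_id is a caller-supplied predicate. Passing None models
    the case where check_suitable_id() would always return true.
    """
    id_range = max_id - min_id + 1
    next_id = min_id + secrets.randbelow(id_range)

    if check_suitable_id is None:
        # Simplified case: the first random pick is always accepted.
        return next_id

    # General case: walk the ID space once, wrapping around at max_id.
    for _ in range(id_range):
        if check_suitable_id(next_id):
            return next_id
        next_id = min_id if next_id == max_id else next_id + 1
    return None  # every ID in the range is unsuitable


# Example: a locally detectable collision set (hypothetical values).
in_use = {1, 2, 3, 5}
picked = select_id(1, 5, lambda candidate: candidate not in in_use)
```

The simplified version is just the early return, so it falls out of the
general algorithm rather than being a separate one.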

FWIW, I've removed the "(if not most)", since that depends on the 
identifier category they are being employed for.

>>> The algorithm in 7.3 (and later) uses a function F and it is
>>> stated that F must be a cryptographically-secure hash function.
>>> Couldn't you also use something like SipHash.
>> Siphash *has* been employed for generating IDs.. although it seems
>> that the 64-bits versions are considered too weak. see e.g.
> Is the commit supporting '*has* been employed' or 'are considered too
> weak'?

I meant that they have been employed, sorry -- I incorrectly 
conflated two different things.

Also, as you correctly note, the appropriate jargon to use is "PRF", and 
siphash could indeed be employed. I've modified the I-D accordingly.
Note that the specification of F() is now introduced when F() first 
appears in the document, and then reused from other sections.
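
For illustration only, here is one way F() could look in Python. This is
a sketch under assumptions, not the I-D's definition: HMAC-SHA-256 from
the standard library stands in for the PRF (SipHash or another keyed PRF
could take its place), and the field encoding, the 64-bit truncation, and
the next_id() helper are all choices made just for this example.

```python
import hashlib
import hmac
import secrets

# Secret key for the PRF; assumed to be generated once at startup.
SECRET_KEY = secrets.token_bytes(32)


def F(key: bytes, *fields: bytes) -> int:
    """Keyed PRF over the given fields, returned as a 64-bit integer."""
    # Length-prefix each field so that ("ab", "c") and ("a", "bc")
    # are encoded differently.
    msg = b"".join(len(f).to_bytes(2, "big") + f for f in fields)
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def next_id(key: bytes, local: bytes, remote: bytes,
            counter: int, id_bits: int = 32) -> int:
    """Context-specific offset plus a counter, reduced to id_bits bits."""
    offset = F(key, local, remote)
    return (offset + counter) % (1 << id_bits)


# Documentation addresses used as stand-in context values.
first = next_id(SECRET_KEY, b"192.0.2.1", b"198.51.100.7", 0)
second = next_id(SECRET_KEY, b"192.0.2.1", b"198.51.100.7", 1)
```

Within one context the IDs increase by one (modulo the ID space), while
the offset for a different context cannot be predicted without the key.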

>>> When reading the algorithm in 7.4.1, I had the impression that it
>>> should also work with id_inc > 1 (by just changing that
>>> parameter). If that is true, I guess you would need to change "if
>>> (next_id == max_id)" to "if (max_id - next_id < id_inc) {" and
>>> also "next_id = min_id;" to "next_id = min_id + (id_inc - (max_id
>>> - next_id + 1));".  If you don't want to be that generic, you 
>>> might want to remove id_inc and just use 1.
>> You raise a very good point. I'm curious whether it might make
>> sense to include the most generic version (as you suggest), and
>> then also show the simplified version when id_inc=1  ? Thoughts?
> Sounds good to me.

FWIW, we tweaked the code to be generic, noting that in the simplest 
version, an implementation could use constant increments of 1.
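
The revised pseudocode is not reproduced in this email, so as a sketch
only: the wrap-around conditions you suggested above translate to Python
roughly as follows. The function name advance() and the requirement that
id_inc not exceed the size of the ID space are my assumptions here.

```python
def advance(next_id: int, id_inc: int, min_id: int, max_id: int) -> int:
    """Advance next_id by id_inc, wrapping around within [min_id, max_id]."""
    id_range = max_id - min_id + 1
    if not 1 <= id_inc <= id_range:
        raise ValueError("id_inc must be between 1 and the size of the ID space")
    if not min_id <= next_id <= max_id:
        raise ValueError("next_id is outside [min_id, max_id]")

    if max_id - next_id < id_inc:
        # Not enough room left: carry the remainder over from min_id.
        return min_id + (id_inc - (max_id - next_id + 1))
    return next_id + id_inc


# With id_inc == 1 this reduces to the simple "wrap at max_id" check.
wrapped_by_one = advance(65535, 1, 1024, 65535)

# With id_inc == 3, starting two short of the end of the range.
wrapped_by_three = advance(65534, 3, 1024, 65535)
```

This is equivalent to min_id + (next_id - min_id + id_inc) % id_range,
which may be the easier form to reason about.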

Please do let me know if the above text (and current rev at: 
addresses your comments.

Thanks a lot for your help!

Fernando Gont
SI6 Networks
PGP Fingerprint: 6666 31C6 D484 63B2 8FB1 E3C4 AE25 0D55 1D4E 7492