Re: [tsvwg] Comments on draft-ietf-tsvwg-transport-encrypt-14

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Sat, 04 April 2020 07:13 UTC

To: "Black, David" <David.Black@dell.com>, Tom Herbert <tom@herbertland.com>
Cc: tsvwg <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/KrwcwyIrsv5XvWcRVrmDkMXNRG8>

See below on (2).

On 04/04/2020 01:35, Black, David wrote:
> Tom,
>
> Ok, 1 issue settled, 1 more to go ...
>
> [1]
>>> Suggested revised text:
>>>
>>>     The usefulness of this information would be enhanced if the exposed
>>>     information could be verified to match the protocol's actual behavior,
>>>     e.g., by observing whether the network traffic sent by the protocol
>>>     is consistent with the exposed information in that traffic.
>>>
>> But then that would be akin to making inferences about encrypted data, and
>> preventing such inferences in the network is precisely one of the
>> reasons the transport header is being encrypted by the end hosts.
> I think that response is overdone for several reasons:
>
> 1.  A decision to expose some transport header information is a decision to allow this sort of inference from observations (sorry, there's no free lunch available here).
>
> 2.  What is being inferred is protocol functional behavior, not the specific contents of the encrypted headers that the protocol is using as part of producing that behavior, so characterizing this as somehow reversing part of the encryption is a stretch.
>
> 3.  Beyond that, if one is determined to prevent all inference of protocol functional behavior, not exposing any transport protocol information does not suffice to frustrate all traffic analysis (e.g., set a few CE marks on packets inbound to a protocol receiver that supports ECN, sit back and watch what happens).  There's been extensive security work on traffic analysis countermeasures which is (IMHO) not germane to this draft.
> [2]
>>> Add this paragraph:
>>>
>>>     Network-layer optional headers explicitly indicate the information
>>>     that is exposed, whereas more effort may be required for a network
>>>     device to determine whether a packet contains a partially encrypted
>>>     transport header.  A particular concern is UDP-encapsulated protocols
>>>     because the UDP ports do not definitively indicate which protocol has
>>>     been encapsulated, even though some protocols are the predominant
>>>     usage of specific UDP destination ports (e.g., a packet sent to UDP
>>>     port 4500 is highly likely to contain UDP-encapsulated IKE [RFC3948]
>>>     or IKEv2 [RFC7296]).
>>>
>>> Would that work?
>>>
>> That's good, but I think there should be a reference to RFC7605 also.
> Ok, I think the factual statement suffices, but I won't object to a reference to RFC7605.
>
> Thanks, --David

The phrase "whereas more effort may be required" is speculation. It
could be that hardware in routers, etc. has been designed and optimised
for network-header extensions to allow metrics to be extracted, or it
could be that it hasn't, or that this is only true for some vendors or
some models of equipment. The same could be said about many things. I
don't think we should speculate here unless we can point to references
that have observed this.

I think we have already covered the ground for the second part of the
additional text. The current text on ports is:

    In some uses, a low-numbered (well-known) transport port number can
    identify the protocol.  However, port information alone is not
    sufficient to guarantee identification.  Applications can use
    arbitrary ports, multiple sessions can be multiplexed on a single
    port, and ports can be re-used by subsequent sessions.  UDP-based
    protocols often do not use well-known port numbers.  Some flows can
    be identified by observing signalling protocol data (e.g., [RFC3261],
    [I-D.ietf-rtcweb-overview]) or through the use of magic numbers
    placed in the first byte(s) of the datagram payload [RFC7983].
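
As an illustration of the last point in that text, here is a minimal
Python sketch of first-byte demultiplexing using the ranges listed in
RFC 7983. The function name and the table layout are hypothetical, and
the result is only a hint: a first byte that falls in one of these
ranges does not prove which protocol the datagram carries.

```python
from typing import Optional

# First-byte ranges listed in RFC 7983 for protocols sharing one UDP port.
# The table layout and names here are illustrative assumptions.
_RFC7983_RANGES = [
    (0, 3, "STUN"),
    (16, 19, "ZRTP"),
    (20, 63, "DTLS"),
    (64, 79, "TURN channel"),
    (128, 191, "RTP/RTCP"),
]


def classify_first_byte(payload: bytes) -> Optional[str]:
    """Return a protocol hint from the first payload byte, or None."""
    if not payload:
        return None
    first = payload[0]
    for low, high, name in _RFC7983_RANGES:
        if low <= first <= high:
            return name
    # Outside every listed range: no hint can be given.
    return None
```

A device using a check like this still has to assume that the flow
follows the RFC 7983 multiplexing scheme in the first place.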

I do agree with Tom that this could benefit from the addition of a reference to RFC7605!

I do not personally buy this: "A particular concern is UDP-encapsulated protocols".
To me, this has little to do with encryption, but we could say something more here
if there is a need. The same seems true of other protocols: they can also be
layered. We see many services over HTTP/TCP, and people shouldn't dissect
traffic based only on a TCP port.
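
To make the port 4500 example from the proposed paragraph concrete, here
is a minimal Python sketch of the payload check that RFC 3948 framing
implies: a single 0xFF byte is a NAT-keepalive, four leading zero bytes
(the Non-ESP Marker) indicate IKE, and anything else is treated as ESP
with a non-zero SPI. The function name is hypothetical, and the whole
check rests on the assumption that the sender really uses RFC 3948
framing, which the port number alone cannot confirm.

```python
def classify_udp_4500_payload(payload: bytes) -> str:
    """Guess the payload type, ASSUMING RFC 3948 framing is in use."""
    if payload == b"\xff":
        # One-byte NAT-keepalive packet.
        return "nat-keepalive"
    if len(payload) >= 4 and payload[:4] == b"\x00\x00\x00\x00":
        # Non-ESP Marker: four zero bytes precede the IKE header.
        return "ike"
    if len(payload) >= 4:
        # Otherwise the first four bytes would be a non-zero ESP SPI.
        return "esp"
    return "unknown"
```

Any other application that happens to send to UDP port 4500 would be
classified as "esp" or "ike" by this check, so the result is a guess.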

>> -----Original Message-----
>> From: Tom Herbert <tom@herbertland.com>
>> Sent: Friday, April 3, 2020 8:04 PM
>> To: Black, David
>> Cc: tsvwg
>> Subject: Re: [tsvwg] Comments on draft-ietf-tsvwg-transport-encrypt-14
>>
>> On Fri, Apr 3, 2020 at 1:06 PM Black, David <David.Black@dell.com> wrote:
>>> Hi Tom,
>>> [writing as draft shepherd]
>>>
>>> [1]
>>>> I don't understand this statement from the draft:
>>>>
>>>> "The value of this information would be enhanced if the exposed
>>>> information could be verified to match the internal state of the
>>>> transport by observing the transport behaviour."
>>>>
>>>> I assume this means that the network nodes would need to understand
>>>> the internal state of transport protocols. How would this work?
>>> Oops, that's not a good assumption - in 20/20 hindsight, the quoted
>>> text should focus on protocol behavior rather than internal state.
>>>
>>> Suggested revised text:
>>>
>>>     The usefulness of this information would be enhanced if the exposed
>>>     information could be verified to match the protocol's actual behavior,
>>>     e.g., by observing whether the network traffic sent by the protocol
>>>     is consistent with the exposed information in that traffic.
>>>
>> But then that would be akin to making inferences about encrypted data, and
>> preventing such inferences in the network is precisely one of the
>> reasons the transport header is being encrypted by the end hosts.
>>
>>> Beyond that, one could (partially) infer protocol state from observed
>>> traffic behavior, but I don't think it's important to say that.
>>>
>>> [2]
>>>> From the draft:
>>>>
>>>> "An endpoint/protocol could choose to expose transport header
>>>> information to optimise the benefit it gets from the network
>>>> [RFC8558]."
>>>>
>>>> There is also the possibility that the endpoints didn't expose
>>>> transport layer information, but the network incorrectly thinks it
>>>> did.  The network may simply misinterpret bits in packets as being
>>>> transport layer information when in fact the data can be something
>>>> completely different and unrelated. The canonical example of this is
>>>> QUIC or any transport encapsulated in UDP payload.
>>> That's actually an observation about transport-layer vs. network-layer
>>> information exposure that would be better addressed slightly earlier in
>>> the draft where both of those layers are discussed.
>>>
>>> After the first paragraph in Section 6.1 (Exposing Transport Information
>>> in Extension Headers) looks like a good place to make that observation.
>>>
>>> After this paragraph:
>>>
>>>     At the network-layer, packets can carry optional headers (similar to
>>>     Section 5) that may be used to explicitly expose transport header
>>>     information to the on-path devices operating at the network layer
>>>     (Section 3.1.3).  For example, an endpoint that sends an IPv6 Hop-by-
>>>     Hop option [RFC8200] can provide explicit transport layer information
>>>     that can be observed and used by network devices on the path.
>>>
>>> Add this paragraph:
>>>
>>>     Network-layer optional headers explicitly indicate the information
>>>     that is exposed, whereas more effort may be required for a network
>>>     device to determine whether a packet contains a partially encrypted
>>>     transport header.  A particular concern is UDP-encapsulated protocols
>>>     because the UDP ports do not definitively indicate which protocol has
>>>     been encapsulated, even though some protocols are the predominant
>>>     usage of specific UDP destination ports (e.g., a packet sent to UDP
>>>     port 4500 is highly likely to contain UDP-encapsulated IKE [RFC3948]
>>>     or IKEv2 [RFC7296]).
>>>
>>> Would that work?
>>>
>> That's good, but I think there should be a reference to RFC7605 also.
>>
>> Tom
>>> Thanks, --David
>>>
>>>> -----Original Message-----
>>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Tom Herbert
>>>> Sent: Friday, April 3, 2020 12:49 PM
>>>> To: tsvwg
>>>> Subject: [tsvwg] Comments on draft-ietf-tsvwg-transport-encrypt-14
>>>>
>>>> Hi, a few comments.
>>>>
>>>> I don't understand this statement from the draft:
>>>>
>>>> "The value of this information would be enhanced if the exposed
>>>> information could be verified to match the internal state of the
>>>> transport by observing the transport behaviour."
>>>>
>>>> I assume this means that the network nodes would need to understand
>>>> the internal state of transport protocols. How would this work?
>>>>
>>>> If the idea is that the exposed information is somehow verified with
>>>> the endpoints that securely provide the information then maybe I
>>>> understand that. But, if the idea is that intermediate nodes need to
>>>> autonomously deduce the internal state themselves, I think that is
>>>> problematic. Aside from all the known problems that stateful network
>>>> devices have caused, there seems to be a circular dependency here.
>>>> AFAIK the only way to deduce the internal state of a transport
>>>> connection in the network would be by inspecting the exposed transport
>>>> information, but this statement seems to be saying the exposed
>>>> information can't be trusted unless the internal state has been
>>>> deduced. Am I missing something?
>>>>
>>>> From the draft:
>>>>
>>>> "An endpoint/protocol could choose to expose transport header
>>>> information to optimise the benefit it gets from the network
>>>> [RFC8558]."
>>>>
>>>> There is also the possibility that the endpoints didn't expose
>>>> transport layer information, but the network incorrectly thinks it
>>>> did. The network may simply misinterpret bits in packets as being
>>>> transport layer information when in fact the data can be something
>>>> completely different and unrelated. The canonical example of this is
>>>> QUIC or any transport encapsulated in UDP payload. Per RFC7605,
>>>> transport port numbers only have meaning at end points. So for example
>>>> a network device may think a packet with UDP destination port 80 is
>>>> QUIC, when in fact it is something completely different, hence the
>>>> network device may misinterpret the payload as being QUIC. I suspect
>>>> it's unlikely that this situation will benefit the user, and more
>>>> likely the network would not treat such packets well. There's been
>>>> some work on this problem, like magic numbers in SPUD or analysis to
>>>> mitigate the effects of misinterpretation like done for QUIC spin bit,
>>>> but I don't believe anyone has proposed a general solution (other than
>>>> moving the necessary information into the network layer and have
>>>> intermediate nodes stop doing DPI).  I believe this problem should be
>>>> mentioned in the draft with a reference to RFC7605.
>>>>
>>>> Tom