Re: [Int-area] AD evaluation: draft-ietf-intarea-nat-reveal-analysis

<mohamed.boucadair@orange.com> Wed, 13 February 2013 16:12 UTC

From: mohamed.boucadair@orange.com
To: Brian Haberman <brian@innovationslab.net>
Date: Wed, 13 Feb 2013 17:12:06 +0100
Message-ID: <94C682931C08B048B7A8645303FDC9F36EAFB563F6@PUEXCB1B.nanterre.francetelecom.fr>
References: <51195E93.4090103@innovationslab.net> <94C682931C08B048B7A8645303FDC9F36EAEE11CD9@PUEXCB1B.nanterre.francetelecom.fr> <511BAB5B.8010702@innovationslab.net>
In-Reply-To: <511BAB5B.8010702@innovationslab.net>
Cc: "draft-ietf-intarea-nat-reveal-analysis@tools.ietf.org" <draft-ietf-intarea-nat-reveal-analysis@tools.ietf.org>, "int-area@ietf.org" <int-area@ietf.org>
Subject: Re: [Int-area] AD evaluation: draft-ietf-intarea-nat-reveal-analysis

Re-,

Please see inline.

Cheers,
Med

>-----Original Message-----
>From: Brian Haberman [mailto:brian@innovationslab.net]
>Sent: Wednesday, 13 February 2013 16:04
>To: BOUCADAIR Mohamed OLNC/OLN
>Cc: int-area@ietf.org;
>draft-ietf-intarea-nat-reveal-analysis@tools.ietf.org
>Subject: Re: [Int-area] AD evaluation:
>draft-ietf-intarea-nat-reveal-analysis
>
>On 2/12/13 5:34 AM, mohamed.boucadair@orange.com wrote:
>> Dear Brian,
>>
>> Many thanks for the detailed review.
>>
>> Please see inline.
>>
>> Cheers, Med
>>
>>> -----Original Message----- From: int-area-bounces@ietf.org
>>> [mailto:int-area-bounces@ietf.org] On Behalf Of Brian Haberman
>>> Sent: Monday, 11 February 2013 22:12 To: int-area@ietf.org;
>>> draft-ietf-intarea-nat-reveal-analysis@tools.ietf.org Subject:
>>> [Int-area] AD evaluation: draft-ietf-intarea-nat-reveal-analysis
>>>
>>> All, I have completed my AD evaluation for the above draft and
>>> have some feedback for the group.  I will focus on the substantive
>>> comments for the time being since some of them may result in
>>> re-written text in places.  I will follow up with the document
>>> authors on editorial nits and such at a later time.
>>>
>>> 1. It is obvious from the way certain sections of text are written
>>> that the original intent was to make a recommendation on which of
>>> the described approaches should be used to disambiguate between
>>> multiple hosts behind a NAT/CGN.  Given that the document is now
>>> simply a characterization of those mechanisms, I would suggest
>>> spending some time cleaning up the Abstract, Section 1.1, and
>>> Section 2 so that they focus on the task of describing the
>>> mechanisms, rather than mentioning abstract requirements for those
>>> mechanisms. There are concrete suggestions a little later in this
>>> note.
>>
>> Med: It is true there was a version of the document which included a
>> recommendation, but the initial intent of the document was to analyze
>> candidate solutions. I updated the text to make it explicit: I changed
>> the text in the sections you mentioned. In particular, the change in
>> the abstract is:
>>
>> OLD:
>>
>> This document analyzes a set of solution candidates to mitigate some
>> of the issues encountered when address sharing is used.  In
>> particular, this document focuses on means to reveal a host
>> identifier (HOST_ID) when a Carrier Grade NAT (CGN) or application
>> proxies are involved in the path.  This host identifier must be
>> unique to each host under the same shared IP address.
>>
>> NEW:
>>
>> This document is a collection of solutions to reveal a host
>> identifier (denoted as HOST_ID) when a Carrier Grade NAT (CGN) or
>> application proxies are involved in the path.  This host identifier
>> is used by a remote server to sort out the packets by sending host.
>> The host identifier must be unique to each host under the same
>> shared IP address.
>>
>> This document analyzes a set of solution candidates to reveal a host
>> identifier; no recommendation is sketched in the document.
>>
>
>The above text is a good addition to the document.
>
>>>
>>> 2. The mechanisms described in this draft fall into two broad
>>> categories, deployed and proposed. Those in the former category can
>>> be characterized based on actual usage scenarios, which would
>>> benefit the table shown in Figure 3. The latter should be described
>>> in terms of what they are proposed to do, but cannot be assessed
>>> against the same metrics as the other groups.
>>
>> Med: Figure 3 includes this note:
>>
>> (2)  This solution is widely deployed.
>>
>> Can you make explicit what change you want to be added? Thanks.
>>
>
>Isn't the IP-ID approach also used in some deployments?
>
>I would suggest re-ordering the table so that deployed approaches are
>collected together (and labeling them as deployed).  If the HTTP
>Forwarded header is the only deployed approach, I would simply add a
>note for the others stating whether they are a documented proposal or a
>theoretical construct.

Med: I added these two notes:

 (7)  The solution is a theoretical construct.
 (8)  The solution is a documented proposal.

And updated the table accordingly.


>
>>>
>>> 3. The "Requirements Language" section should be removed.  As an
>>> Informational document describing mechanisms, there is no need to
>>> leverage 2119 keywords.
>>
>> Med: Done.
>>
>
>Ok.
>
>>>
>>> 4. It would be useful if the third paragraph of Section 1.1 was
>>> expanded to discuss the risks in more detail.  In fact, it may be
>>> clearer to understand this draft if the problem statement came
>>> before the context (Section 2).
>>
>> Med: I changed the text as follows:
>>
>> OLD:
>>
>> The sole use of the IPv4 address is not sufficient to uniquely
>> distinguish a host.  As a mitigation, it is tempting to investigate
>> means which would help in disclosing an information to be used by
>> the remote server as a means to uniquely disambiguate packets of
>> hosts using the same IPv4 address.
>>
>> NEW:
>>
>> In particular, some servers use the source IPv4 address as an
>> identifier to treat some incoming connections differently.  Due to
>> the deployment of CGNs (e.g., NAT44 [RFC3022], NAT64 [RFC6146]),
>> that address will be shared.  In particular, when a server receives
>> packets from the same source address, because this address is
>> shared, the server does not know which host is the sending host
>> [RFC6269]. The sole use of the IPv4 address is not sufficient to
>> uniquely distinguish a host.  As a mitigation, it is tempting to
>> investigate means which would help in disclosing an information to be
>> used by the remote server as a means to uniquely disambiguate packets
>> of hosts using the same IPv4 address.
>>
>
>That sounds better.
>
>>>
>>> 5. Section 2
>>>
>>> * The Observation text should provide some brief examples of how
>>> and why special treatment is needed/provided.
>>
>> Med: I updated the text with an example:
>>
>> Policies relying on source IP address which are enforced by some
>> servers will be applied to all hosts sharing the same IP address. For
>> example, blacklisting the IP address of a spammer host will result in
>> all other hosts sharing that address having their access to the
>> requested service restricted.  [RFC6269] describes the issues in
>> detail.  Therefore, due to address sharing, servers need an extra
>> information than the source IP address to differentiate the sending
>> host.  We call HOST_ID this information.
>>
>
>Ok.
>
>>
>> Is it sufficient to
>>> identify the sending host? application? user?
>>
>> Med: I added this sentence:
>>
>> HOST_ID does not reveal the identity of a user, a subscriber or an
>> application.
>>
>
>That should suffice.
>
>>
>> It should also note that there may be
>>> issues with the fact that some IP addresses will be shared and
>>> others may not.  How does that impact the performance of these
>>> mechanisms?
>>
>> Med: The document assumes the address sharing function injects the
>> host identifier. BTW, there is already a performance criterion listed
>> in Figure 3.
>>
>
>I was thinking more generically than the performance criterion.  Suppose
>a server employs the IP-ID approach.  If several packets arrive with the
>same source IP address and the same value in the IP-ID field, there is
>no way to know if the IP-ID value was injected by a NAT/CGN box.  Or is
>your response saying that scenario is covered by the metrics used in
>Figure 3?  If so, which metric?  None of the descriptive text in Section
>5 talks about this type of issue.

Med: For the particular case of IP-ID, we assume the address sharing function does not assign the same ID under the same IP address during a given time interval. The following note:

     (1)  Requires mechanism to advertise NAT is participating in this
          scheme (e.g., DNS PTR record).

is there to make clear that another mechanism is needed to inform the server that the IP-ID field is carrying a host identifier.
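
For illustration only, the assumption above (an ID is not reused under the same shared address within a given time interval) could be sketched as follows. The class name HostIdAllocator, the hold_seconds parameter, and the example addresses are hypothetical and are not taken from the draft; this is a sketch of the assumption, not a description of any deployed address sharing function.

```python
import time


class HostIdAllocator:
    """Hypothetical allocator of 16-bit IP-ID values used as host identifiers.

    Assumption being illustrated: under one shared IP address, a value is
    never held by two hosts at once, and a released value is not handed to
    another host until hold_seconds have elapsed.
    """

    def __init__(self, hold_seconds=60.0, clock=time.monotonic):
        self.hold_seconds = hold_seconds
        self.clock = clock
        self.by_host = {}    # (shared_ip, internal_host) -> ip_id
        self.in_use = set()  # (shared_ip, ip_id) currently assigned
        self.released = {}   # (shared_ip, ip_id) -> time of release

    def assign(self, shared_ip, internal_host):
        key = (shared_ip, internal_host)
        if key in self.by_host:
            # A host keeps its identifier for as long as it is bound.
            return self.by_host[key]
        now = self.clock()
        for ip_id in range(1, 0x10000):
            slot = (shared_ip, ip_id)
            if slot in self.in_use:
                continue
            freed_at = self.released.get(slot)
            if freed_at is not None and now - freed_at < self.hold_seconds:
                # Still inside the hold interval: do not reuse yet.
                continue
            self.in_use.add(slot)
            self.by_host[key] = ip_id
            return ip_id
        raise RuntimeError("no free IP-ID value under " + shared_ip)

    def release(self, shared_ip, internal_host):
        ip_id = self.by_host.pop((shared_ip, internal_host))
        self.in_use.discard((shared_ip, ip_id))
        self.released[(shared_ip, ip_id)] = self.clock()
```

Note that this only covers uniqueness at the address sharing function; the server still needs the out-of-band hint from note (1) to know that the field carries a host identifier.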

>
>>>
>>> * I would like some text in the Objective text to explain why such
>>> sorting is needed.  This relates back to the Context description
>>> in Section 1.1.
>>
>> Med: The new text is:
>>
>> Policies relying on source IP address which are enforced by some
>> servers will be applied to all hosts sharing the same IP address. For
>> example, blacklisting the IP address of a spammer host will result in
>> all other hosts sharing that address having their access to the
>> requested service restricted.  [RFC6269] describes the issues in
>> detail.  Therefore, due to address sharing, servers need an extra
>> information than the source IP address to differentiate the sending
>> host.  We call HOST_ID this information.
>>
>
>Ok.
>
>>>
>>> * I don't think there needs to be a description of a Requirement in
>>> this document any more, so that text can be removed.
>>
>> Med: Done.
>>
>
>Ok.
>
>>>
>>> 6. Section 3.1 should be removed.  This is simply an analysis of
>>> the mechanisms, so there is no new work which needs requirements
>>> defined at this point.
>>
>> Med: Section 3.1 was added as a result of a review from privacy
>> people. I do think it is useful to maintain it. Perhaps, move the
>> text to the security considerations?
>>
>
>If anything, these are privacy considerations that may be impacted by
>these types of functions.  They can't be requirements at this point.
>Keeping the text is a good idea in that light, but don't call them
>Requirements.  Moving them to the Security Considerations section would
>work.

Med: Section 3.1 is now entitled "Privacy-related Considerations". The text has been cleaned up so that it no longer uses the term "requirement".

>
>>>
>>> 7. In Section 4.1.2, it would be good to describe any issues that
>>> the approach has with the original use of the Identification field
>>> for fragmentation reassembly.  If a middlebox changes the ID field,
>>> weird things can/will happen if those packets are fragmented
>>> somewhere.
>>
>> Med: We thought having a reference to
>> draft-ietf-intarea-ipv4-id-update (now RFC6864) is sufficient. The
>> impact of Middleboxes is already discussed in that document (see
>> section 5.3).
>>
>
>So maybe the way to clarify this is to re-word the text in 4.1.2.  How
>about:
>
>OLD:
>This usage is not compliant with what is recommended in
>    [I-D.ietf-intarea-ipv4-id-update].
>
>NEW:
>This usage is not consistent with the fragment reassembly use of the
>Identification field [RFC791] or the updated handling rules for the
>Identification field [I-D.ietf-intarea-ipv4-id-update].

Med: Works for me. I updated the text. Thanks.

>
>>>
>>> 8. I don't see a need for a forward reference in Section 4.2.2. I
>>> would suggest simply stating that the IP Option approach will
>>> support any/all transport protocols.
>>
>> Med: Done.
>>
>
>Ok.
>
>>
>>>
>>> 9. In Section 4.3.2...
>>>
>>> * I would like to see some description of what risk(s) may arise
>>> with a TCP option, even though they are apparently low
>>> probability.
>>
>> Med: The main risk we had in mind is session failure due to handling
>> an unknown TCP option. Are you suggesting this text should be
>> expanded?
>>
>> The risk related to handling a new TCP Option is low as measured in
>> [Options].
>>
>
>It would be good to mention at least one risk, like session failure, in
>the text to give the readers some clue as to the type of risks being
>considered.

Med: The text now reads:

"The risk to experience session failures due to handling a new TCP Option is low as measured...".

>
>>>
>>> * Additionally, the text contains "0,103%", which I assume should
>>> be "0.103%" (i.e., 1/10th of 1%).
>>
>> Med: Fixed. Thanks.
>>
>
>Ok.
>
>>>
>>> *The third bullet mentions that having several NATs in the path
>>> may cause issues for a TCP option.  Isn't this true for other
>>> approaches discussed in the document?  These should be identified
>>> as well.
>>
>> Med: There are some proposals (e.g., XFF, Forwarded-For) which allow
>> several host-ids to be prepended. This is already mentioned in the text:
>>
>> When several address sharing devices are crossed, XFF/Forwarded-For
>> header can convey the list of IP addresses (e.g., Figure 1).  The
>> origin HOST_ID can be exposed to the target server.
>>
>> For some proposals (e.g., IP Option), this point is not mentioned as
>> the analysis shows these proposals are non-starters.
>>
>> For the TCP option, the loss of the original host_id may not be a
>> problem as the target usage is between proxies of a CGN and server.
>> Only the information leaked in the last leg is likely to be useful.
>>
>
>Ok.  I can see how this is covered.
>
>>>
>>> 10. In Section 4.5.1, I would suggest adding some text that
>>> describes how to interpret Figure 2.
>>
>> Med: Done.
>>
>
>Ok.
>
>>>
>>> 11. Is Section 4.6 theoretical or is there a specific reference
>>> that can be added for this technique?
>>
>> Med: Added a ref to RFC6346.
>>
>
>Ok.
>
>>>
>>> 12. Section 4.7.2 should clearly state that HIP is an ideal
>>> solution for this identification problem, even though the document
>>> states there is a high cost for deployment. I would also like to
>>> see some description of why HIP does not work if "the address
>>> sharing function is required to act as a UDP/TCP-HIP relay".
>>
>> Med: The current text says:
>>
>> "If the address sharing function is required to act as a UDP/TCP-HIP
>> relay, this is not a viable option."
>>
>> This requires that ALL servers in the Internet be HIP-enabled. It is
>> obvious this is not a viable option for a deployable solution.
>>
>
>That is understood.  It is not clear why the "UDP/TCP-HIP relay" aspect
>is mentioned.  Is there something special about that deployment model
>that has additional issues (other than needing all servers to understand
>HIP)?

Med: That model is mentioned because it does not require the host to be HIP-enabled. I updated the text to make it more explicit:

"An alternative deployment model, which does not require the client to be HIP-enabled, is the address sharing function behave as a UDP/TCP-HIP relay. This model is also not viable as it assumes all servers are ported to be HIP-enabled."

>
>>>
>>> 13. Section 4.8.2
>>>
>>> * The text says that the ICMP approach is viable for TCP and UDP.
>>> Any reason why it may be an issue for other transport protocols
>>> (e.g., SCTP or RTP)?
>>
>> Med: The ICMP approach can work for any transport protocol making use
>> of a port number. We mentioned TCP and UDP as these are the widely
>> deployed transport protocols. I updated the text as follows:
>>
>> OLD:
>>
>> o  This ICMP proposal is valid for both UDP and TCP.  Address
>> sharing function may be configurable with the transport protocol
>> which is allowed to trigger those ICMP messages.
>>
>> NEW:
>>
>> o  This ICMP proposal is valid for any transport protocol that uses
>> a port number.  Address sharing function may be configurable with the
>> transport protocol which is allowed to trigger those ICMP messages.
>>
>
>That works.
>
>>
>>>
>>> * I would also like to see some text describing why the approach is
>>> not compatible with cascading NATs.
>>
>> Med: The main reason is that each NAT in the path will generate an
>> ICMP message. These messages will be translated by the downstream
>> NATs. The remote server will receive multiple ICMP messages and will
>> need to decide which host identifier to use.
>>
>
>The above text, or something similar, should be added to that bullet.

Med: Done.


>
>>>
>>> * The last bullet mentions FMC and Open WiFi with no context or
>>> references.  These should either have references or their mention
>>> should be removed since they don't add much to the description.
>>> The same goes for their mention in Section 4.9.2 (8th bullet).
>>
>> Med: I updated the text with a reference to a document where the
>> problem is described in detail:
>>
>> OLD:
>>
>> o  In some scenarios (e.g., Fixed-Mobile Convergence, Open WiFi,
>> etc.), HOST_ID should be interpreted by intermediate devices which
>> embed Policy Enforcement Points (PEP, [RFC2753]) responsible for
>> granting access to some services.  These PEPs need to inspect all
>> received packets in order to find the companion (traffic) messages to
>> be correlated with ICMP messages conveying HOST_IDs.  This induces
>> more complexity to these intermediate devices.
>>
>> NEW:
>>
>> o  In some scenarios (e.g., Section 3 of
>> [I-D.boucadair-pcp-nat-reveal]), HOST_ID should be interpreted by
>> intermediate devices which embed Policy Enforcement Points (PEP,
>> [RFC2753]) responsible for granting access to some services. These
>> PEPs need to inspect all received packets in order to find the
>> companion (traffic) messages to be correlated with ICMP messages
>> conveying HOST_IDs.  This induces more complexity to these
>> intermediate devices.
>>
>> I updated also the text in Section 4.9.2.
>>
>
>Ok.
>
>>>
>>> 14. In Section 4.9.2 (3rd bullet), is the solution to publish this
>>> info in DNS or is that just an example approach?  This should be
>>> clarified.
>>
>> Med: DNS is mentioned as an example. I updated the text as follows:
>>
>> OLD:
>>
>> o  A hint should be provided to the ultimate server (or intermediate
>> nodes) the address sharing function implements IDENT protocol. This
>> can be achieved by publishing this capability using DNS.
>>
>> NEW:
>>
>> o  A hint should be provided to the ultimate server (or intermediate
>> nodes) the address sharing function implements IDENT protocol.  A
>> solution example is to publish this capability using DNS; other
>> solutions can be envisaged.
>>
>
>Ok.
>
>>
>>>
>>> 15. Section 5
>>>
>>> * Shouldn't there be an additional metric that covers the
>>> impact/cost of needing client or middlebox code changes?
>>
>> Med: For almost all solutions, the host identifier is not injected by
>> the client. Host_id injection is done by an address sharing
>> function. The cost of the change in the address sharing will depend
>> on the capabilities supported by that device: a NAT device re-writing
>> packets can inject (in theory) L3/4 information without extra cost
>> but inspecting packets to inject application-related header would
>> require new features. We focused on the expected performance impact
>> rather than the expected induced cost.
>>
>
>Ok.
>
>>>
>>> * Where did the 100% success ratio for IP-ID come from?  There have
>>> been documented cases of OSes setting the Identification field to
>>> zero.  If that is true, the success ratio can't be 100% can it?
>>
>> Med: The IP-ID tweaking is implemented in the address sharing
>> function, not the host/OS. In theory, if the address sharing function
>> follows the rules for the IP-ID field, failure is unlikely.
>>
>
>Even in the case where packets are fragmented after the middlebox sets
>the IP-ID?

Med: If the middlebox follows the rules in RFC 6864 and the same ID is not reassigned to another host sharing the same IP address during a given time interval, why would downstream fragmentation be an issue?

>It seems that the success ratio ignores those types of
>errors.  Are those errors counted in the "Possible Perf Impact" metric?

Med: No.

>
>>>
>>> * Given the goal of this document to describe these identification
>>> mechanisms, I don't see the need for the last bulleted list.
>>
>> Med: The intent of that text is to provide a kind of conclusion. No
>> problem to remove it if you think so.
>
>I would prefer that type of discussion be done as prose, rather than a
>list.  I will not object if the authors want to leave it as a list.

Med: I removed the list so that the text is not misinterpreted as promoting a particular solution.

>
>I do have one other issue...
>
>The discussion in 4.4.1 inter-mixes two different HTTP headers.  The XFF
>header is now obsolete (RFC 6648).  It has been replaced by the
>Forwarded: header defined in the referenced draft.  Figure 1 uses the
>correct header name, but the supporting text references XFF in several
>places.  All uses of XFF should be replaced by Forwarded: to be
>consistent with the current specs.

Med: I cleaned up the text where it makes sense.
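
For illustration, a server could pull the chain of host identifiers out of a Forwarded: header roughly as follows when several address sharing devices are crossed. This is a simplified sketch: it assumes the comma-separated for= parameter syntax of the Forwarded header draft, the function name forwarded_for_chain is hypothetical, and a complete parser would need to handle quoted strings more carefully than a plain split does.

```python
def forwarded_for_chain(header_value):
    """Return the for= identifiers in header order (origin host first).

    Simplified sketch: elements are split on ',' and parameters on ';',
    which is enough for IP-address-style identifiers but is not a full
    implementation of the header grammar.
    """
    chain = []
    for element in header_value.split(","):
        for pair in element.split(";"):
            name, sep, value = pair.strip().partition("=")
            if sep and name.lower() == "for":
                # Drop optional surrounding double quotes (used for IPv6).
                chain.append(value.strip().strip('"'))
    return chain


# Two address sharing devices crossed: the first entry is the origin host.
hops = forwarded_for_chain('for=192.0.2.1, for="[2001:db8::17]";proto=http')
origin_host_id = hops[0] if hops else None
```

The first entry is only as trustworthy as the devices that inserted it, so a server would normally apply this only to traffic from address sharing functions it knows about.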

>
>Regards,
>Brian
>