Re: [Int-area] AD evaluation: draft-ietf-intarea-nat-reveal-analysis

<mohamed.boucadair@orange.com> Tue, 12 February 2013 10:34 UTC

Dear Brian,

Many thanks for the detailed review.

Please see inline.

Cheers,
Med

>-----Original Message-----
>From: int-area-bounces@ietf.org
>[mailto:int-area-bounces@ietf.org] On Behalf Of Brian Haberman
>Sent: Monday, 11 February 2013 22:12
>To: int-area@ietf.org;
>draft-ietf-intarea-nat-reveal-analysis@tools.ietf.org
>Subject: [Int-area] AD evaluation:
>draft-ietf-intarea-nat-reveal-analysis
>
>All,
>      I have completed my AD evaluation for the above draft and have
>some feedback for the group.  I will focus on the substantive comments
>for the time being since some of them may result in re-written text in
>places.  I will follow up with the document authors on editorial nits
>and such at a later time.
>
>1. It is obvious from the way certain sections of text are written that
>the original intent was to make a recommendation on which of the
>described approaches should be used to disambiguate between multiple
>hosts behind a NAT/CGN.  Given that the document is now simply a
>characterization of those mechanisms, I would suggest spending some time
>cleaning up the Abstract, Section 1.1, and Section 2 so that they focus
>on the task of describing the mechanisms, rather than mentioning
>abstract requirements for those mechanisms. There are concrete
>suggestions a little later in this note.

Med: It is true that one version of the document included a recommendation, but the initial intent of the document was to analyze candidate solutions. I updated the text in the sections you mentioned to make this explicit. In particular, the change in the abstract is:

OLD:

   This document analyzes a set of solution candidates to mitigate some
   of the issues encountered when address sharing is used.  In
   particular, this document focuses on means to reveal a host
   identifier (HOST_ID) when a Carrier Grade NAT (CGN) or application
   proxies are involved in the path.  This host identifier must be
   unique to each host under the same shared IP address.

NEW:

   This document is a collection of solutions to reveal a host
   identifier (denoted as HOST_ID) when a Carrier Grade NAT (CGN) or
   application proxies are involved in the path.  This host identifier
   is used by a remote server to sort out the packets by sending host.
   The host identifier must be unique to each host under the same shared
   IP address.

   This document analyzes a set of solution candidates to reveal a host
   identifier; no recommendation is sketched in the document.

>
>2. The mechanisms described in this draft fall into two broad
>categories, deployed and proposed. Those in the former category can be
>characterized based on actual usage scenarios, which would benefit the
>table shown in Figure 3. The latter should be described in terms of what
>they are proposed to do, but cannot be assessed against the same metrics
>as the other groups.

Med: Figure 3 includes this note:

 (2)  This solution is widely deployed.

Can you spell out what change you would like to see? Thanks.

>
>3. The "Requirements Language" section should be removed.  As an
>Informational document describing mechanisms, there is no need to
>leverage 2119 keywords.

Med: Done.

>
>4. It would be useful if the third paragraph of Section 1.1 was expanded
>to discuss the risks in more detail.  In fact, it may be clearer to
>understand this draft if the problem statement came before the context
>(Section 2).

Med: I changed the text as follows:

OLD:

   The sole use of the IPv4 address is not sufficient to uniquely
   distinguish a host.  As a mitigation, it is tempting to investigate
   means which would help in disclosing an information to be used by the
   remote server as a means to uniquely disambiguate packets of hosts
   using the same IPv4 address.

NEW:

   In particular, some servers use the source IPv4 address as an
   identifier to treat some incoming connections differently.  Due to
   the deployment of CGNs (e.g., NAT44 [RFC3022], NAT64 [RFC6146]), that
   address will be shared.  In particular, when a server receives
   packets from the same source address, because this address is shared,
   the server does not know which host is the sending host [RFC6269].
   The sole use of the IPv4 address is not sufficient to uniquely
   distinguish a host.  As a mitigation, it is tempting to investigate
   means which would help in disclosing an information to be used by the
   remote server as a means to uniquely disambiguate packets of hosts
   using the same IPv4 address.

>
>5. Section 2
>
>* The Observation text should provide some brief examples of how and why
>special treatment is needed/provided.

Med: I updated the text with an example:

   Policies relying on source IP address which are enforced by some
   servers will be applied to all hosts sharing the same IP address.
   For example, blacklisting the IP address of a spammer host will
   result in all other hosts sharing that address having their access to
   the requested service restricted.  [RFC6269] describes the issues in
   detail.  Therefore, due to address sharing, servers need an extra
   information than the source IP address to differentiate the sending
   host.  We call HOST_ID this information.
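
Illustration (not part of the draft): the blacklisting example above can be sketched in a few lines of Python. The `Blocklist` class, its method names, and the string HOST_ID values are hypothetical and only model the idea; the draft does not define any such API. With a policy keyed on the source address alone, blocking one host blocks every host behind the shared address. Keying on the pair (address, HOST_ID) confines the policy to the offending host.

```python
from typing import Optional, Set, Tuple

# Hypothetical policy key: (source IP, HOST_ID or None when HOST_ID is ignored).
Key = Tuple[str, Optional[str]]


class Blocklist:
    """Toy server-side policy table, for illustration only."""

    def __init__(self, use_host_id: bool) -> None:
        self.use_host_id = use_host_id
        self._blocked: Set[Key] = set()

    def _key(self, src_ip: str, host_id: Optional[str]) -> Key:
        # Without HOST_ID, every host behind the address maps to one key.
        return (src_ip, host_id if self.use_host_id else None)

    def block(self, src_ip: str, host_id: Optional[str] = None) -> None:
        self._blocked.add(self._key(src_ip, host_id))

    def is_blocked(self, src_ip: str, host_id: Optional[str] = None) -> bool:
        return self._key(src_ip, host_id) in self._blocked


# Two hosts share 192.0.2.1 behind a CGN; "host-a" is the spammer.
ip_only = Blocklist(use_host_id=False)
ip_only.block("192.0.2.1", "host-a")

with_host_id = Blocklist(use_host_id=True)
with_host_id.block("192.0.2.1", "host-a")
```

In this sketch `ip_only.is_blocked("192.0.2.1", "host-b")` is true (collateral damage), while `with_host_id.is_blocked("192.0.2.1", "host-b")` is false.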


>Is it sufficient to identify the sending host? application? user?

Med: I added this sentence:

   HOST_ID does not reveal the identity of a user, a subscriber or an
   application.


>It should also note that there may be issues with the fact that some IP
>addresses will be shared and others may not.  How does that impact the
>performance of these mechanisms?

Med: The document assumes the address sharing function injects the host identifier. BTW, there is already a performance criterion listed in Figure 3.

>
>* I would like some text in the Objective text to explain why such
>sorting is needed.  This relates back to the Context description in
>Section 1.1.

Med: The new text is:

   Policies relying on source IP address which are enforced by some
   servers will be applied to all hosts sharing the same IP address.
   For example, blacklisting the IP address of a spammer host will
   result in all other hosts sharing that address having their access to
   the requested service restricted.  [RFC6269] describes the issues in
   detail.  Therefore, due to address sharing, servers need an extra
   information than the source IP address to differentiate the sending
   host.  We call HOST_ID this information.

>
>* I don't think there needs to be a description of a Requirement in this
>document any more, so that text can be removed.

Med: Done.

>
>6. Section 3.1 should be removed.  This is simply an analysis of the
>mechanisms, so there is no new work which needs requirements defined at
>this point.

Med: Section 3.1 was added as a result of a review from privacy people. I do think it is useful to keep it. Perhaps the text could be moved to the Security Considerations section?

>
>7. In Section 4.1.2, it would be good to describe any issues that the
>approach has with the original use of the Identification field for
>fragmentation reassembly.  If a middlebox changes the ID field, weird
>things can/will happen if those packets are fragmented somewhere.

Med: We thought that a reference to draft-ietf-intarea-ipv4-id-update (now RFC6864) was sufficient. The impact of middleboxes is already discussed in that document (see Section 5.3).

>
>8. I don't see a need for a forward reference in Section 4.2.2. I would
>suggest simply stating that the IP Option approach will support any/all
>transport protocols.

Med: Done.


>
>9. In Section 4.3.2...
>
>* I would like to see some description of what risk(s) may arise with a
>TCP option, even though they are apparently low probability.

Med: The main risk we had in mind is session failure due to handling an unknown TCP option. Are you suggesting this text should be expanded?

   The risk related to handling a new TCP Option is low as measured in
   [Options].

>
>* Additionally, the text contains "0,103%", which I assume should be
>"0.103%" (i.e., 1/10th of 1%).

Med: Fixed. Thanks.

>
>* The third bullet mentions that having several NATs in the path may
>cause issues for a TCP option.  Isn't this true for other approaches
>discussed in the document?  These should be identified as well.

Med: Some proposals (e.g., XFF, Forwarded-For) allow several host identifiers to be conveyed. This is already mentioned in the text:

   When several address sharing devices are crossed, XFF/Forwarded-For
   header can convey the list of IP addresses (e.g., Figure 1).  The
   origin HOST_ID can be exposed to the target server.

For some proposals (e.g., IP Option), this point is not mentioned because the analysis shows these proposals are non-starters.

For the TCP option, the loss of the original host_id may not be a problem as the target usage is between proxies of a CGN and server. Only the information leaked in the last leg is likely to be useful.
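
Illustration (not part of the draft): a minimal Python sketch of how a list of addresses accumulates in an X-Forwarded-For value and how a server could read the origin HOST_ID from it. It assumes the de facto convention that the value is a comma-separated list, that each address sharing device appends the address of the peer it received the request from, and that the leftmost entry is therefore the origin. The function names are hypothetical. A real server should treat the value as untrusted input, since any upstream party can forge it.

```python
from typing import List, Optional


def parse_forwarded_for(header_value: str) -> List[str]:
    """Split an X-Forwarded-For value into its individual entries."""
    return [part.strip() for part in header_value.split(",") if part.strip()]


def append_hop(header_value: str, peer_ip: str) -> str:
    """What an address sharing device would do: append the peer it saw."""
    hops = parse_forwarded_for(header_value)
    hops.append(peer_ip)
    return ", ".join(hops)


def origin_host_id(header_value: str) -> Optional[str]:
    """Leftmost entry, i.e. the origin under the convention assumed above."""
    hops = parse_forwarded_for(header_value)
    return hops[0] if hops else None


# A request crosses two address sharing devices in sequence.
value = append_hop("", "10.0.0.5")        # first device saw the host
value = append_hop(value, "100.64.0.1")   # second device saw the first one
```

After the two hops, `value` holds both addresses and `origin_host_id(value)` returns the first one, which matches the statement that the origin HOST_ID can still be exposed to the target server.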

>
>10. In Section 4.5.1, I would suggest adding some text that describes
>how to interpret Figure 2.

Med: Done.

>
>11. Is Section 4.6 theoretical or is there a specific reference that can
>be added for this technique?

Med: Added a ref to RFC6346.

>
>12. Section 4.7.2 should clearly state that HIP is an ideal solution for
>this identification problem, even though the document states there is a
>high cost for deployment. I would also like to see some description of
>why HIP does not work if "the address sharing function is required to
>act as a UDP/TCP-HIP relay".

Med: The current text says:

"If the address sharing function is required to act as a UDP/TCP-HIP relay, this is not a viable option."

This requires ALL servers on the Internet to be HIP-enabled, which is clearly not viable for a deployable solution.

>
>13. Section 4.8.2
>
>* The text says that the ICMP approach is viable for TCP and UDP.  Any
>reason why it may be an issue for other transport protocols (e.g., SCTP
>or RTP)?

Med: The ICMP approach can work for any transport protocol that makes use of a port number. We mentioned TCP and UDP because these are the widely deployed transport protocols. I updated the text as follows:

OLD:

   o  This ICMP proposal is valid for both UDP and TCP.  Address sharing
      function may be configurable with the transport protocol which is
      allowed to trigger those ICMP messages.

NEW:

   o  This ICMP proposal is valid for any transport protocol that uses a
      port number.  Address sharing function may be configurable with
      the transport protocol which is allowed to trigger those ICMP
      messages.


>
>* I would also like to see some text describing why the approach is not
>compatible with cascading NATs.

Med: The main reason is that each NAT in the path will generate an ICMP message. These messages will be translated by the downstream NATs. The remote server will receive multiple ICMP messages and will need to decide which host identifier to use.
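
Illustration (not part of the draft): a deliberately simplified Python model of the cascading case. It assumes that each NAT emits one ICMP message revealing the source address it saw before translation, and it ignores the fact that downstream NATs also translate those ICMP messages. The `Nat` class, the function name, and the addresses are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Nat:
    name: str
    external_ip: str  # address this NAT puts on outgoing packets


def host_ids_seen_by_server(host_ip: str, nats: List[Nat]) -> List[str]:
    """Return one candidate HOST_ID per NAT on the path, in path order."""
    candidates: List[str] = []
    src = host_ip
    for nat in nats:
        candidates.append(src)   # this NAT reveals the source it observed
        src = nat.external_ip    # the next hop only sees the translated source
    return candidates


single = host_ids_seen_by_server("10.0.0.5", [Nat("cgn", "192.0.2.1")])
cascaded = host_ids_seen_by_server(
    "10.0.0.5",
    [Nat("cpe", "100.64.0.1"), Nat("cgn", "192.0.2.1")],
)
```

With a single NAT the server receives one candidate. With two NATs in cascade it receives two, and nothing in the messages themselves tells it which one identifies the originating host.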

>
>* The last bullet mentions FMC and Open WiFi with no context or
>references.  These should either have references or their mention should
>be removed since they don't add much to the description.  The same goes
>for their mention in Section 4.9.2 (8th bullet).

Med: I updated the text with a reference to a document where the problem is described in detail:

OLD:

   o  In some scenarios (e.g., Fixed-Mobile Convergence, Open WiFi,
      etc.), HOST_ID should be interpreted by intermediate devices which
      embed Policy Enforcement Points (PEP, [RFC2753]) responsible for
      granting access to some services.  These PEPs need to inspect all
      received packets in order to find the companion (traffic) messages
      to be correlated with ICMP messages conveying HOST_IDs.  This
      induces more complexity to these intermediate devices.

NEW:

   o  In some scenarios (e.g., Section 3 of
      [I-D.boucadair-pcp-nat-reveal]), HOST_ID should be interpreted by
      intermediate devices which embed Policy Enforcement Points (PEP,
      [RFC2753]) responsible for granting access to some services.
      These PEPs need to inspect all received packets in order to find
      the companion (traffic) messages to be correlated with ICMP
      messages conveying HOST_IDs.  This induces more complexity to
      these intermediate devices.

I also updated the text in Section 4.9.2.

>
>14. In Section 4.9.2 (3rd bullet), is the solution to publish this info
>in DNS or is that just an example approach?  This should be clarified.

Med: DNS is mentioned as an example. I updated the text as follows:

OLD:

   o  A hint should be provided to the ultimate server (or intermediate
      nodes) the address sharing function implements IDENT protocol.
      This can be achieved by publishing this capability using DNS.

NEW:

   o  A hint should be provided to the ultimate server (or intermediate
      nodes) the address sharing function implements IDENT protocol.  A
      solution example is to publish this capability using DNS; other
      solutions can be envisaged.


>
>15. Section 5
>
>* Shouldn't there be an additional metric that covers the impact/cost of
>needing client or middlebox code changes?

Med: For almost all solutions, the host identifier is not injected by the client. HOST_ID injection is done by an address sharing function.
The cost of the change in the address sharing function will depend on the capabilities supported by that device: a NAT device that rewrites packets can (in theory) inject L3/L4 information without extra cost, but inspecting packets to inject an application-related header would require new features. We focused on the expected performance impact rather than the expected induced cost.

>
>* Where did the 100% success ratio for IP-ID come from?  There have been
>documented cases of OSes setting the Identification field to zero.  If
>that is true, the success ratio can't be 100% can it?

Med: The IP-ID tweaking is implemented in the address sharing function, not in the host/OS. In theory, if the address sharing function follows the rules for the IP-ID field, failure is unlikely.

>
>* Given the goal of this document to describe these identification
>mechanisms, I don't see the need for the last bulleted list.

Med: The intent of that text is to provide a kind of conclusion. I have no problem removing it if you prefer.

>
>Regards,
>Brian