Re: [Ila] LISP for ILA

"Alberto Rodriguez Natal (natal)" <natal@cisco.com> Tue, 13 March 2018 19:29 UTC

From: "Alberto Rodriguez Natal (natal)" <natal@cisco.com>
To: Tom Herbert <tom@quantonium.net>
CC: "ila@ietf.org" <ila@ietf.org>, "lisp@ietf.org" <lisp@ietf.org>, "Fabio Maino (fmaino)" <fmaino@cisco.com>, Albert Cabellos <acabello@ac.upc.edu>, "Vina Ermagan (vermagan)" <vermagan@cisco.com>
Date: Tue, 13 Mar 2018 19:28:58 +0000
Message-ID: <F920CAE2-9042-41DF-B013-E8FE6F891596@cisco.com>
References: <F1093230-C087-4168-9C5F-8DA7AB677677@cisco.com> <CAPDqMer58nxEixtH=JuZh9WgM0xKkEQYEjwZ6zg3wTjD76gOHQ@mail.gmail.com>
In-Reply-To: <CAPDqMer58nxEixtH=JuZh9WgM0xKkEQYEjwZ6zg3wTjD76gOHQ@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ila/3CtphqHxLDen2iFwmay3Ay2X52w>

Hi Tom,

Apologies for the delayed response. Thanks for your time reading the draft and for the feedback. See some comments inline.

On 3/5/18, 4:42 PM, "Tom Herbert" <tom@quantonium.net> wrote:

    Thanks for posting the draft!
    
    Overall, I think the approach is straightforward, and it's very nice
    that there is no change required to the ILA architecture.
    
    I have some concerns about the LISP control plane in terms of
    DOSability and scalability. Btw, LISP is not in Linux kernel because
    of concerns about DOSability, so there was some prior discussion on
    this topic in related mailing lists.
    
    From the draft: "When an ILA-N has to send traffic towards a remote
    Identifier for which it does not have the associated Locator, it has
    to obtain it first from a MS."
    
    This is not actually true. The forwarding cache in the ILA-N is a
    routing optimization, if there is no entry on the cache then the
    packet is forwarded. If it needs to be transformed then that will be
    done by an ILA-R in the path. Until the cache is populated the routing
    might be sub-optimal but packets still flow.
    
As you point out below, the draft does not say otherwise. Sending the traffic to an ILA-R while the mapping is being retrieved is certainly an option. We'll update the text to make this clearer.
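
To make the cache-miss behavior under discussion concrete, here is a minimal sketch of an ILA-N that never queues or drops on a miss: it forwards the packet unmodified on its SIR address and resolves the mapping in the background. The class and method names (IlaNode, forward, on_map_reply) are hypothetical and not taken from the draft, and suppressing duplicate Map-Requests for a pending identifier is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class IlaNode:
    """Toy ILA-N model (hypothetical; not the draft's data structures)."""
    cache: Dict[str, str] = field(default_factory=dict)     # identifier -> locator
    pending: Set[str] = field(default_factory=set)          # resolutions in flight
    map_requests: List[str] = field(default_factory=list)   # Map-Requests "sent"

    def forward(self, identifier: str) -> Tuple[str, Optional[str]]:
        """Decide how to forward one packet; never queue or drop it."""
        locator = self.cache.get(identifier)
        if locator is not None:
            # Cache hit: transform the address and send straight to the locator.
            return ("transformed", locator)
        # Cache miss: ask the mapping system once per identifier (assumption),
        # and meanwhile send the packet on its SIR address so that an ILA-R
        # in the path can transform it.
        if identifier not in self.pending:
            self.pending.add(identifier)
            self.map_requests.append(identifier)
        return ("sir", None)

    def on_map_reply(self, identifier: str, locator: str) -> None:
        """Install the resolved mapping so later packets take the direct path."""
        self.pending.discard(identifier)
        self.cache[identifier] = locator
```

With this shape, the first packets to an unknown identifier take the possibly sub-optimal path through the ILA-R, and packets sent after the Map-Reply arrives are transformed locally.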

    This is reflected below in: "While the mapping is being resolved via
    the Map-Request/  Map-Reply process, the ILA-N can send the data
    packets to the underlay using the SIR address."
    
    I think it should be assumed in ILA that not queuing packets and not
    dropping packets because of resolution are requirements (too much
    latency hit).

IMHO, these should not be hard requirements. Leveraging ILA-Rs for mapping resolution comes with its own set of tradeoffs. An operator should be able to decide which set of tradeoffs makes sense for their particular scenario.
    
    If the map request is sent and the packet is forwarded, that means
    that a packet received at the ILA-N can generate two packets to be
    forwarded in the network. An obvious DOS attack is for a host to send
    packets to random destinations in the network to try to generate cache
    misses.
    Section 8.2 discusses this, but the solution to implement heavy
    hitters counters is not detailed. It would be nice to see more detail
    how this would work and how it will mitigate the DOS attack.
    
Heavy-hitter counters are a well-known technique for mitigating DoS attacks in the data plane (and are used well beyond LISP). There are several papers on the topic in the literature; see [1] for a recent example. Regarding LISP in particular, you can find some research on modeling the LISP map-cache in [2][3]. Following that work, we did some designs on how to apply heavy-hitter counters to the LISP map-cache back in the day. We'll try to make that research available as well.
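
As a rough illustration of the general idea only (this is not the design referred to above, nor the algorithm from [1]), the sketch below uses the classic Misra-Gries summary to track which sources generate the most cache misses with a bounded number of counters, and stops issuing Map-Requests for a source once its estimate reaches a threshold. The names MisraGries and should_send_map_request, the choice of keying on the source, and the threshold policy are all assumptions made for the example.

```python
from typing import Dict


class MisraGries:
    """Approximate frequency summary that keeps at most k counters."""

    def __init__(self, k: int) -> None:
        if k < 1:
            raise ValueError("k must be at least 1")
        self.k = k
        self.counters: Dict[str, int] = {}

    def add(self, key: str) -> None:
        if key in self.counters:
            self.counters[key] += 1
        elif len(self.counters) < self.k:
            self.counters[key] = 1
        else:
            # Table is full: decrement every counter and evict those at zero.
            for other in list(self.counters):
                self.counters[other] -= 1
                if self.counters[other] == 0:
                    del self.counters[other]

    def estimate(self, key: str) -> int:
        """Lower bound on how many times key was added."""
        return self.counters.get(key, 0)


def should_send_map_request(summary: MisraGries, source: str, threshold: int) -> bool:
    """Record a cache miss for source; allow a Map-Request only below threshold.

    Hypothetical policy: the packet itself would still be forwarded on its
    SIR address either way; only the control-plane request is suppressed.
    """
    summary.add(source)
    return summary.estimate(source) < threshold
```

Because the summary only ever undercounts, a well-behaved source is not throttled by mistake; the cost is that a sufficiently distributed attack can evict counters, which is one reason a real design needs more care than this sketch.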

    In ILAMP, a redirect method is defined. On a cache miss the packet is
    forwarded and no other action is taken. If an ILA-R does
    transformation it may send back a mapping redirect informing the ILA-N
    of a transformation. The redirects must be completely secure (one
    reason I'm partial to TCP) and are only sent to inform an ILA-N about
    a positive response. To a large extent this neutralizes the above
    random address DOS attack. There are other means of attack on the
    cache, but the exposure is narrowed I believe.
    
That model is supported in LISP via Map-Notify messages. However, moving the mapping resolution to the ILA-R comes at a cost: it puts more load (in terms of both data and control plane) on an architectural component that is not easy to scale out, since doing so requires (for instance) reconfiguring the underlay topology.
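
For reference, the redirect-style model being compared here can be sketched as follows: the ILA-R sends a control message to the originating ILA-N only when it actually transformed a packet, so destinations with no mapping generate no control traffic. The class IlaRouter, its handle method, and the tuple used to represent a notification are hypothetical; this does not reproduce the Map-Notify or ILAMP message formats.

```python
from typing import Dict, List, Optional, Tuple


class IlaRouter:
    """Toy ILA-R that notifies ILA-Ns only about positive lookups."""

    def __init__(self, mappings: Dict[str, str]) -> None:
        self.mappings = mappings  # identifier -> locator
        # Each entry is (ila_n, identifier, locator); stands in for a
        # Map-Notify or redirect message sent back to the ILA-N.
        self.notifications: List[Tuple[str, str, str]] = []

    def handle(self, source_ila_n: str, identifier: str) -> Optional[str]:
        """Transform a packet if a mapping exists and notify the sender's ILA-N."""
        locator = self.mappings.get(identifier)
        if locator is None:
            # No mapping: nothing is transformed and no control message is sent.
            return None
        self.notifications.append((source_ila_n, identifier, locator))
        return locator
```

The sketch also makes the cost visible: every miss in the network lands on handle, so both the lookups and the notifications concentrate on the ILA-R.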

    "LISP as defined in [I-D.ietf-lisp-rfc6833bis] runs over a UDP
    transport, however the exact same signaling can be used over a TCP
    transport without affecting the protocol operation."
    
    What is the status of TCP support? I believe the trend in datacenter
    control protocols is towards TCP and even RPC. Integrated security,
    congestion control, authentication, and tooling are strong points in
    favor of TCP. Is it reasonable to say that TCP is the preferred
    protocol? Can the LISP message easily be converted to RPC (REST,
    Thrift, GRPC, ...)?
    
LISP can run as is over TCP. It can also be extended with the mechanisms described in [4] when a reliable transport is in place. If TCP makes more sense for your particular scenario, then you can make it your preferred transport. In general, which transport to use will depend on the characteristics of each individual deployment. On your last point, please note that OpenDaylight already supports LISP over REST [5].

    Looking at the map-reply message format, I am concerned about its
    size. By my count, it's 40 bytes to provide one record with one
    locator where record and locator are 8 bytes. If we need to scale a
    system to billions of nodes this overhead could be an issue even if
    it's the control plane. Is there any plan to have a compressed version
    of this? For instance, if there is only one RLOC returned wouldn't the
    priorities and weights be useless?
    
One thing that we can (and should) discuss is the best way to encode ILA Identifiers/Locators into LISP messages. Regarding removing fields from the Map-Reply, I'm not sure that saving a few bits is worth the cost of reducing protocol functionality, increasing signaling machinery, and adding parsing complexity, especially if you are planning to use an RPC version of the protocol later.

Thanks again for your comments Tom. This is an interesting discussion :)

Best,
Alberto

[1] https://arxiv.org/pdf/1611.04825.pdf
[2] https://arxiv.org/pdf/1312.1378.pdf
[3] http://personals.ac.upc.edu/fcoras/publications/2015-fcoras-scalability.pdf
[4] https://tools.ietf.org/html/draft-kouvelas-lisp-map-server-reliable-transport-04
[5] https://wiki.opendaylight.org/view/OpenDaylight_Lisp_Flow_Mapping:Architecture