Re: [MBONED] WGLC for draft-ietf-mboned-driad-amt-discovery

"Holland, Jake" <jholland@akamai.com> Mon, 15 April 2019 23:06 UTC

From: "Holland, Jake" <jholland@akamai.com>
To: Leonard Giuliano <lenny=40juniper.net@dmarc.ietf.org>, MBONED WG <mboned@ietf.org>
Date: Mon, 15 Apr 2019 23:06:25 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/mboned/slVtdG8-qn3YehLV5zxRxCtAdns>

Thanks Lenny, much appreciated.  <jh>Responses inline.</jh>

On 2019-04-15, 09:10, "Leonard Giuliano" <lenny=40juniper.net@dmarc.ietf.org> wrote:

    
    <chair hat off>
    
    Overall, I think this doc is very thorough, clearly written and
    addresses a much needed area of specification for AMT.  Some comments:
    
    Sect 2.3.2: should the definition of connection completion take into 
    consideration traffic health as well?  That is, the relay is up and happy, 
    but has no multicast connectivity to the source, hence you could have a 
    blackhole.  At the very least, should it be completion of the 3-way 
    handshake?

<jh>
I'm not completely sure what you mean by "3-way handshake" here, but I'm assuming
you mean the one mentioned in RFC 7450, particularly sections 5.1.3 and 5.1.4:
https://tools.ietf.org/html/rfc7450#section-5.1.3
https://tools.ietf.org/html/rfc7450#section-5.1.4

The definition in section 2.3.2 of this draft calls the connection complete
during the 2nd part of this 3-way handshake, which is receipt of the Membership
Query message.  (Since we're talking about a gateway-side decision, I don't
think there's anything the client knows about after the 3rd part of the handshake,
before starting to receive data traffic.)

I agree with you that it would be nice to have information about multicast
connectivity to the source, but I don't think this can be safely discovered
while probing multiple connections in parallel (as described in the Happy
Eyeballs part).  If we actually forward the subscription to one or more (S,G)s
to multiple relays, we might start getting traffic from all of them.  That
traffic might be more than we should receive, or could routinely result in
forwarding multiple copies of packets unless an implementation takes specific
steps to avoid it.

Therefore, this doc allows multiple connections to start in parallel up until
receipt of the Membership Query (which is stage 2 of the 3-way handshake).  The
gateway then picks the most preferred of those connections and sends it the
Membership Update that includes a subscription to data traffic.  After
subscribing, it uses the Traffic Health heuristics (section 2.5.4) to decide
whether it needs to restart discovery with a hold-down for the relay whose
traffic had bad health.
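
As a rough illustration of that sequence (a sketch only, not text from the
draft): the Python below assumes hypothetical callbacks got_membership_query()
and traffic_healthy() standing in for the real AMT message exchange and for the
section 2.5.4 heuristics, plus a made-up Relay type with a numeric preference.
The hold-down is simplified to a set with no timer.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Set


@dataclass(frozen=True)
class Relay:
    address: str
    preference: int  # hypothetical rank: lower value = more preferred


def choose_relay(candidates: Iterable[Relay],
                 got_membership_query: Callable[[Relay], bool],
                 held_down: Optional[Set[str]] = None) -> Optional[Relay]:
    """Pick the most preferred relay that reached stage 2 of the handshake."""
    held = held_down or set()
    # Probing may run against many relays in parallel; no subscription
    # has been sent yet, so no data traffic is triggered here.
    responsive = [r for r in candidates
                  if r.address not in held and got_membership_query(r)]
    return min(responsive, key=lambda r: r.preference) if responsive else None


def discover(candidates: Iterable[Relay],
             got_membership_query: Callable[[Relay], bool],
             traffic_healthy: Callable[[Relay], bool],
             max_attempts: int = 3) -> Optional[Relay]:
    """Subscribe through one relay at a time, restarting on bad health."""
    pool = list(candidates)
    held_down: Set[str] = set()
    for _ in range(max_attempts):
        relay = choose_relay(pool, got_membership_query, held_down)
        if relay is None:
            return None
        # Only the chosen relay would get the Membership Update carrying
        # the (S,G) subscription; traffic_healthy() models what follows.
        if traffic_healthy(relay):
            return relay
        held_down.add(relay.address)  # simplified hold-down, no expiry
    return None
```

The point of the split is that only one relay at a time is ever asked to
forward data, so parallel probing never produces duplicate streams.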

Do you think I need a reference to 2.5.4 in section 2.3.2 to make this more
clear?  Or is there a deeper objection here?
</jh>
    
    Sect 2.3.2: “See Section 2.5.5 for further …”
            -“See Section 2.5.5 of this document for further…” to eliminate 
    confusion, as when I first read this, I wasn't sure if it was referring to 
    RFCs 7450 or 8305 (turns out, neither).

<jh>
That sounds fine to me, my local copy is updated to add "of this document" after
the section reference and before "for further...".
</jh>
    
    Sect 2.4.1: How about #6- The application layer includes a suggested relay
    address (as a hint)
            -this is what we’ve done in the VLC with AMT GW build.
    Specifically, VLC has a configurable AMT relay address, which uses a
    well-known FQDN (amt-relay.m2icast.net) which has multiple A records of
    known, healthy relays.  Or is this scenario covered by #3?

<jh>
This scenario is covered by #3.

Do you think we need a definition of "administrative configuration" in section
1.2.2 or something (and maybe "administratively" added before "configured" in
#3)?  I had assumed the concept was reasonably well understood, but if this
wasn't clear, maybe it's not obvious enough in the text.

(Any WG opinions here on whether #3 is already clear enough on allowing this
usage?)
</jh>
    
    Sect 2.4.2: I found this sect a little tough to follow.  There are 3
    enumerated options, but the text that follows includes other options (like
    admin config).  Also, I found it curious that you have Global Anycast so
    high in the list of prefs (before DRIAD).  Global Anycast seems very
    unlikely to ever be a good deployment option since it’s so vulnerable to
    DoS (recall Mikael and my comments in the mtg in Prague)
<jh>
My presentation in Prague focused largely on why I felt the update to put the
global anycast IP before DRIAD was necessary.  For easy reference, here's a
link to the video (from 3m44s into my presentation, where I begin directly
addressing this point):
https://www.youtube.com/watch?v=jIDYHFpJYV8&t=55m44s

If that was unclear, I guess I'd like to get some more specific questions about
it?

As painful as it is to see myself speak, I re-watched that whole thing, and I
heard Toerless give some comments speaking directly to this question (to which
I responded in another thread [1]), but I think Mikael's comment was not about
this, and I didn't see any mic comments from you about this.  Were there some
other comments you meant to recall?
[1] https://mailarchive.ietf.org/arch/msg/mboned/pgw9SLTUvjSGZ4jGKDjYmyheFOA
</jh>
    
    Anyway, could this section just include a simple list of all the options
    in order of pref?  Something like:
    
    1) DNS-SD
    2) DRIAD
    3) Admin config of GW or App level
    4) Global Anycast address

<jh>
I like this idea, thanks, I think it will make things clearer here.

However, I don't think the ordering you've given is right.  In the absence
of administrative config, I think (pending further discussion of the above
comment) the order would look like this:

   1) DNS-SD
   2) Global Anycast (mainly to support local usage!)
   3) DRIAD

I guess one way to look at this is to just put admin config in the very front:
   0) Administrative config

However, I think in general, administrative config is also capable of doing
things like suppressing one or more of these steps, or changing the ordering
of these steps, or adding other steps that take account of other information
about the network to influence ordering.

In that sense, I'm not sure just sticking "administrative config" on the front
of the list makes as much sense as keeping it outside this list, in order to be
a super-override for the list as a whole.

So my proposed update to incorporate your excellent suggestion for making a
numbered short summary list for easy reference is to make this change:

OLD:
   Accordingly, AMT gateways SHOULD by default prefer relays first by
   DNS-SD if available, then with the anycast addresses defined in
   Section 7 of [RFC7450] (namely: 192.52.193.1 and 2001:3::1), then by
   DRIAD as described in this document (in precedence order, as
   described in Section 4.2.1).

   This default behavior MAY be overridden by administrative
   configuration where other behavior is more appropriate for the
   gateway within its network.

NEW:
   Accordingly, AMT gateways SHOULD by default prefer relays in this
   order:

      1. DNS-SD
      2. Anycast addresses from Section 7 of [RFC7450]
      3. DRIAD

   This default behavior MAY be overridden by administrative
   configuration where other behavior is more appropriate for the
   gateway within its network.
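
To make the intended semantics concrete, here is a minimal sketch (my own
illustration, not an implementation from the draft).  The method labels
"dns-sd", "anycast" and "driad", the order_relays() function and its
admin_order parameter are all hypothetical names; the only values taken from
the text are the RFC 7450 anycast addresses.

```python
from typing import Dict, List, Optional, Sequence

# Default preference from the proposed text: DNS-SD, then anycast, then DRIAD.
DEFAULT_ORDER = ("dns-sd", "anycast", "driad")


def order_relays(found: Dict[str, List[str]],
                 admin_order: Optional[Sequence[str]] = None) -> List[str]:
    """Flatten discovered relay addresses into one preference-ordered list.

    `found` maps a discovery method label to the addresses it produced.
    `admin_order`, when given, replaces the default order entirely, so it
    can reorder methods or suppress them by leaving them out.
    """
    order = DEFAULT_ORDER if admin_order is None else admin_order
    ordered: List[str] = []
    for method in order:
        for addr in found.get(method, []):
            if addr not in ordered:  # keep the first (most preferred) copy
                ordered.append(addr)
    return ordered
```

Treating administrative configuration as a replacement for the whole order,
rather than as a step 0, is what lets it suppress or rearrange the other steps.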

Additionally, I'll remove the numbers on the long-form explanations
above this piece in 2.4.2, since I think multiple different numbering
schemes in the same section would add confusion.

If you think it's better, I could also put this in front of the big
explanations, take out "Accordingly, ", and add in something like "the
reasoning for this preference ordering is described below".  That's not
done in my local copy, but if I get responses with "yes, that's better"
here I'm happy to make the change.  Whatever's most clear would be great.

(Or if that doesn't address your concerns, can you suggest some text that
would, which takes into account the above considerations?  Thanks.)
</jh>
    
    Sect 3.2.1: 1st para, last sentence, “… by finding a A or AAAA records..”
            -“ by finding an A or AAAA record” or “by finding A or AAAA
    records”
<jh>
Fixed locally, thanks.
</jh>
    
    
    Other Relay discovery options- as I mentioned, in the VLC build with AMT,
    we have a configurable option for the relay address with a well-known fqdn
    with multiple A records as the default.  It will then receive all the A
    records as an ordered list and try to use one at a time until it receives
    data.  This method provides relay discovery and resilience, but not
    optimality.  In Prague, got a suggestion from Tom P that you could get
    optimality by pinging each of the relays from the list of A records and
    choosing the one with the lowest latency (or perhaps joining all relays
    and then selecting the one with the healthiest stream and pruning the
    others).  Do you think these options should be mentioned anywhere in this
    doc?
<jh>
I like this idea, and I agree it's a good concept to get into the doc somewhere.
I think it works well as a heuristic that should be mentioned somewhere.

I think recommending ping specifically is a bad idea (it's not so clear that the
ICMP path RTT will be the same as the UDP path RTT), though I guess it's maybe
reasonable in a lot of places.  RTT of the response between sending the Request
packet and receiving the Membership Query packet (plus history of that metric
when it's known) sounds like a better approach in general, so I guess if we're
putting something in about this, I'd like it to capture that usage, or I'd at
least prefer to avoid a mention of ping in particular as compared with other
measurements.
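
For illustration only, a gateway could keep that metric roughly like this.
The RelayRttHistory class and its method names are hypothetical, and the
exponentially weighted average (with the 1/8 gain borrowed from TCP's smoothed
RTT) is my assumption about how "history" might be kept, not something the
draft specifies.

```python
from typing import Dict, Optional


class RelayRttHistory:
    """Smoothed Request -> Membership Query round-trip time per relay."""

    def __init__(self, alpha: float = 0.125) -> None:
        self.alpha = alpha  # weight given to the newest sample (assumed)
        self._smoothed: Dict[str, float] = {}

    def record(self, relay: str, request_sent: float,
               query_received: float) -> float:
        """Fold one handshake measurement (timestamps in seconds) into history."""
        sample = query_received - request_sent
        if sample < 0:
            raise ValueError("Membership Query cannot precede the Request")
        previous = self._smoothed.get(relay)
        if previous is None:
            self._smoothed[relay] = sample
        else:
            self._smoothed[relay] = ((1 - self.alpha) * previous
                                     + self.alpha * sample)
        return self._smoothed[relay]

    def fastest(self) -> Optional[str]:
        """Relay with the lowest smoothed RTT, or None with no history."""
        if not self._smoothed:
            return None
        return min(self._smoothed, key=self._smoothed.get)
```

Because the samples come from the AMT handshake itself, they follow the same
UDP path as the tunnel, which is the concern with ICMP ping.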

What do you think of a change in section 2.4.2 to integrate this suggestion?
Would this cover what you're looking for?

OLD:
   Among relay addresses that still have an equivalent preference after
   the above orderings, a gateway MUST make a non-deterministic choice
   for relay preference ordering, in order to support load balancing by
   DNS configurations that provide many relay options.  (Note that
   gateways not implementing a Happy Eyeballs algorithm are not required
   to use the Destination Address Selection ordering, but are still
   required to use non-deterministic ordering among equally preferred
   relays.)

NEW:
   Among relay addresses that still have an equivalent preference after
   the above orderings, a gateway MUST make a non-deterministic choice
   for relay preference ordering, in order to support load balancing by
   DNS configurations that provide many relay options.

   The gateway MAY introduce a bias in the non-deterministic choice
   according to network topology or timing information obtained out of
   band or from a historical record.  The collection of this information
   is out of scope for this document, but a gateway in possession of
   such information MAY use it to prefer topologically closer relays.
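
One way to read "biased non-deterministic choice" is a weighted random
ordering.  The sketch below is an assumption on my part, not an algorithm from
the draft: shuffle_equal_relays() and its weights argument are hypothetical,
and how the weights are obtained stays out of scope, as in the text.

```python
import random
from typing import Dict, List, Optional, Sequence


def shuffle_equal_relays(relays: Sequence[str],
                         weights: Optional[Dict[str, float]] = None,
                         rng: Optional[random.Random] = None) -> List[str]:
    """Randomly order equally preferred relays, optionally with a bias.

    Relays missing from `weights` get weight 1.0; all weights must be
    positive.  A larger weight makes a relay more likely to come earlier,
    but every ordering stays possible, so load is still spread out.
    """
    if rng is None:
        rng = random.Random()
    pool = list(relays)
    w = [(weights or {}).get(r, 1.0) for r in pool]
    ordered: List[str] = []
    while pool:
        # Weighted draw without replacement: pick one index, then remove it.
        i = rng.choices(range(len(pool)), weights=w, k=1)[0]
        ordered.append(pool.pop(i))
        w.pop(i)
    return ordered
```

With no weights this reduces to a uniform shuffle, which matches the MUST in
the first paragraph; the weights only express the MAY in the second.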

(This suggestion also cuts out what I think, having re-read this bit, is
probably a useless side note that happens to be in the same spot.  If somebody
doesn't like that edit, please let me know as a separate point.)
</jh>

<jh>
Thanks very much for your comments, and please anyone feel free to jump
in here with opinions and responses.

Best regards,
Jake
</jh>    
    
    On Fri, 12 Apr 2019, Leonard Giuliano wrote:
    
    | 
    | In Prague, there appeared to be solid support to initiate last call, so we
    | would like to officially begin working group last call for
    | draft-ietf-mboned-driad-amt-discovery.  Please post whether you support/oppose
    | the advancement of this draft as well as any comments you may have to the list
    | by May 3.  Also, please note if you are aware of any IPR involved in this
    | draft (we must hear from the author about IPR).
    | 
    | Most recent version of the draft can be found here:
    | 
    | https://datatracker.ietf.org/doc/draft-ietf-mboned-driad-amt-discovery/
    | 
    