Re: [dhcwg] WG Adoption call for draft-gandhewar-dhc-relay-initiated-release and draft-gandhewar-dhc-v6-relay-initiated-release (Expires Oct 27, 2015)

Sunil Gandhewar <sgandhewar@juniper.net> Sat, 17 October 2015 12:35 UTC

From: Sunil Gandhewar <sgandhewar@juniper.net>
To: "dhcwg@ietf.org" <dhcwg@ietf.org>
Date: Sat, 17 Oct 2015 12:34:32 +0000
Archived-At: <http://mailarchive.ietf.org/arch/msg/dhcwg/2udIiZGLmzhAUqFafaOZ3f66cVA>
Cc: Sunil Gandhewar <sgandhewar@juniper.net>, "Bernie Volz (volz)" <volz@cisco.com>, Ted Lemon <ted.lemon@nominum.com>
Subject: Re: [dhcwg] WG Adoption call for draft-gandhewar-dhc-relay-initiated-release and draft-gandhewar-dhc-v6-relay-initiated-release (Expires Oct 27, 2015)

Hi All,

Thank you all for your replies.

In this email please find my response to all the emails so far. Please let me know if I missed anything.

Section 1.1 of draft-gandhewar-dhc-v6-relay-initiated-release-01 (https://tools.ietf.org/html/draft-gandhewar-dhc-v6-relay-initiated-release-01) describes multiple DHCPv6 scenarios where this functionality is needed, e.g. a replaced device, a client that has moved, frequent logins and logouts at Wi-Fi centers, etc. The problem scenario, where the client still remembers its lease, can arise in only one of the cases described. It happens if the Relay wrongly generates a Relay Initiated Release because it mistakes a network disconnect for the client being unavailable.

Section 1.3 describes many possible setups and configurations where this functionality is applicable, and also points out where it is not applicable. To address the non-applicable scenarios, the draft suggests multiple solutions, e.g. granular configurable knobs at the Relay with which the administrator can control when a Relay Initiated Release is generated. It also gives multiple examples of how to handle the case where one still configures it in such a setup, e.g.
1. Have liveness detection between the client and the Relay, e.g. by running BFD.
2. Have asymmetric leases at the Relay. The Server gives out longer leases, e.g. a few hours. The Relay acts as a proxy and gives out short leases, e.g. 15 min, to the clients. This reduces the Server's burden of handling frequent renews. I describe this in more detail below. When the Relay identifies that a Client did not renew its lease, it knows the client is gone, but it has no way to communicate this to the Server.
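
The asymmetric-lease idea in item 2 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the draft's mechanism or any vendor's implementation: the class and method names, the `release_enabled` knob, and the 900-second short lease are all hypothetical, and the relay's own renewal of the long lease with the server is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Binding:
    client_id: str
    server_expiry: float  # absolute time at which the long server lease ends
    client_expiry: float  # absolute time at which the short client lease ends


class AsymmetricLeaseRelay:
    """Hypothetical relay that proxies long server leases as short client leases."""

    def __init__(self, short_lease=900.0, release_enabled=True):
        self.short_lease = short_lease          # assumed 15 min client lease
        self.release_enabled = release_enabled  # assumed administrator knob
        self.bindings = {}
        self.released = []  # clients for which a release would be generated

    def on_server_lease(self, client_id, now, server_lease):
        # Hand the client a short lease, never longer than the server's lease.
        short = min(self.short_lease, server_lease)
        self.bindings[client_id] = Binding(client_id, now + server_lease, now + short)

    def on_client_renew(self, client_id, now):
        # The relay answers the renew itself; the server is not contacted.
        binding = self.bindings.get(client_id)
        if binding is None:
            return False
        binding.client_expiry = min(now + self.short_lease, binding.server_expiry)
        return True

    def expire(self, now):
        # Clients that did not renew their short lease are considered gone.
        gone = [cid for cid, b in self.bindings.items() if b.client_expiry <= now]
        for cid in gone:
            del self.bindings[cid]
            if self.release_enabled:
                # This is the point where a Relay Initiated Release would be sent.
                self.released.append(cid)
        return gone
```

With `release_enabled` set to False the relay still drops its own state, but the server keeps the binding until the long lease runs out, which is the gap described in item 2.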

I think these are all different ways to identify that it is really the client that is gone and not the network. However, I think DHCP should provide a way (infrastructure) that can be used to clear such bindings. Should DHCP enforce which mechanism the Relay uses to identify that the client is really gone? Section 1.3 also points out that there are no issues for DSL-based service providers, as the client re-establishes DHCP once the underlying PPP is established. The non-applicable scenario does not even occur, so none of the above-mentioned solutions are needed for this type of service provider.


>>>> I agree.   I don't remember support being shown at IETF-93, actually.   I said that I thought it was an interesting idea, but that should not be construed as support for the working group doing the work.

Sorry, Ted, if I misunderstood. During my presentation at IETF-93 you said: "In case this comes up after meetecho dies, I favor adoption of v4 draft", hence I took it as support. Does this not mean support?


Some of the support emails are from vendors who have already deployed the solution and have been using it for a few years. The solution proposed in the draft was initially requested for clearing bindings administratively when the relay and server had somehow gone out of sync. Synchronizing client state across all the network devices is a pain point for many Service Providers. This scenario is included in the draft. Then another Service Provider wanted to use this functionality when replacing set-top boxes, and wanted it done automatically rather than administratively. Next came a requirement from another Service Provider whose relay detects that a client has moved from one network to another. It then became applicable to frequent logins and logouts at Wi-Fi gateways. Although some of these supporting vendors may have joined the WG only recently, they know the IETF and RFCs very well; they may simply be unfamiliar with the WG process and its culture of responding by email. But if you look at the email addresses, they are all authentic, and the senders have deployed the solution in their networks.


>>> We see a claim that 95% of clients don't release before disconnecting, which seems likely to be true, but we don't hear why this is a problem

I thought I described it in section 1.1, where I clarified that not releasing stale bindings reduces the Service Providers' subscription rate. Service Providers need to deploy more BNG boxes to support the same number of customers, because scaling is limited by the resources on the BNG. Please let me know if my wording is incorrect or needs to be rephrased. The first paragraph of that section is pasted again at the bottom of this email for convenience.


>>>> It appears to be the case that this document is actually motivated by a use case where a WiFi provider wants to charge by the time unit

No, that's not the case. The box has limited capacity to scale. If stale clients are left lying around, new clients will be denied access because all the resources on the box are exhausted.


>>>> "why this particular solution is a better solution than using shorter lease times"

A BNG may host either a Server or a Relay. A BNG usually supports in the range of 512K clients or even more. With short leases, the server has to deal with the login rate as well as frequent renews. This causes an even bigger performance problem.
Some vendors have implemented asymmetric leases, i.e. the Relay takes a longer lease from the server and gives out short leases to the clients. Since the relay deals with a smaller number of clients than the Server, it can handle the frequent renews. The problem is that when the relay detects that a client did not renew, it knows the client is gone but cannot initiate a Release without the support of this draft. An asymmetric lease at the relay helps here, and I pointed this out in section 1.3.
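
To put rough numbers on the renew load, here is a back-of-the-envelope sketch. It assumes every client renews once per half lease time (a common T1 setting, but an assumption here), reads "512K" as 512,000 clients, and picks 15 minutes and 4 hours as example short and long leases; none of these figures are measurements.

```python
def renews_per_second(clients, lease_seconds, renew_fraction=0.5):
    """Average renew rate if each client renews once per renew_fraction * lease."""
    renew_interval = lease_seconds * renew_fraction
    return clients / renew_interval


CLIENTS = 512_000           # assumed reading of "512K clients"
SHORT_LEASE = 15 * 60       # assumed short lease: 15 minutes
LONG_LEASE = 4 * 60 * 60    # assumed long lease: 4 hours

short_rate = renews_per_second(CLIENTS, SHORT_LEASE)
long_rate = renews_per_second(CLIENTS, LONG_LEASE)

print(f"short leases: {short_rate:.0f} renews/s")
print(f"long leases:  {long_rate:.0f} renews/s")
print(f"ratio:        {short_rate / long_rate:.0f}x")
```

Under these assumptions the renew rate scales inversely with the lease time, so moving the short leases to the relay shifts that multiple of the renew traffic away from the server.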


>>>> From reading the draft, I think there is: 1) insufficient motivation for this approach, 2) no discussion of limitations of existing ways of dealing with the issue (e.g., short least times), and 3) no clear guidance on when/where this approach should (or should not) be used.
1. Motivation is addressed in Section 1.1 of https://tools.ietf.org/html/draft-gandhewar-dhc-v6-relay-initiated-release-01. Please let me know if this needs to be changed.
2. Short lease times at the server burden it with logins and frequent renews, causing further performance issues. I described this further above.
3. Is section 1.3 not enough? Please help in rewriting it; I am open and willing to accept contributions.
What I am not getting is why everyone reiterates the same point, that the information is not in the draft. What am I missing here? Is it the language? Can't these sections be rephrased instead of rejecting the draft itself?


>>>>Yes, I too have concerns for this and feel it could easily be misused and result in a mess (many clients with duplicate leases). This is one reason I have been pushing to clarify where this can and cannot be used. (For example, it is 'safe' to use if the client is communicating over a circuit that gets torn down - and for the client to 'return' a new circuit needs to be started. But even then it may have issues if there are other devices behind the device the terminates the circuit - as they be unaware that the lease was 'released' on their behalf.)

Bernie, does section 1.3 address your concern about where this is applicable and where it is not? I have been requesting that you let me know if this can be done differently. Please help in rewriting it; I am open and willing to accept contributions.
By making things configurable at the Relay, won't it isolate the situation? It is not always possible to have a foolproof solution that is applicable in all possible configurations and works ideally. But having an applicability section helps, so that the rest of the Service Providers are not deprived of the solution.


I request all of you to please read sections 1.1, 1.2 and 1.3 of https://tools.ietf.org/html/draft-gandhewar-dhc-v6-relay-initiated-release-01 once more and let me know.
If you have any other suggestions on how to solve the problem or how to rephrase these sections, please send me your ideas and contributions so that we can update the draft accordingly. By not adopting the draft we are punishing all the other Service Providers who will not face such issues in their networks, and depriving them of the usefulness of the functionality. There are many Service Providers who are in need of this draft. If there is something missing or something that can be done in a better way, why not come together and contribute to make it a better solution? I welcome all ideas, solutions and contributions.


>>>>To Brian's comment (separate email), it is interesting that none of the material from Slide 4 of https://www.ietf.org/proceedings/93/slides/slides-93-dhc-2.pdf made it into the draft as motivation for the original Relay Release feature? Or did I miss it?
It's already there, isn't it? Please see the text below. Brian, please let me know what is missing.

1.1. Problem Description
While providing an IPv6 address or IPv6 Prefix to the DHCPv6 Client,
Service Providers (e.g. Broadband Service Providers), creates a
logical interface per client, programs various routes (e.g. access
routes, framed routes) for the client to access the network and
services, attaches services (e.g. voice, video, data), maintains
policy, applies QoS. Along with these resources there is a need for
memory and bandwidth per client. Since all these resources are
limited on a network device (e.g. Broadband Network Gateway), it
defines the scaling capacity of the device. Since the availability
of the IPv6 addresses is large, subscription rate for the Service
Providers is thus limited by the availability of the resources on
their network device.


Regards,
Sunil Gandhewar
Juniper Networks, Inc.
sgandhewar@juniper.net