[dhcwg] WGLC comments on draft-ietf-dhc-dhcpv6-failover-protocol-01
Tomek Mrugalski <tomasz.mrugalski@gmail.com> Fri, 27 May 2016 18:18 UTC
Return-Path: <tomasz.mrugalski@gmail.com>
X-Original-To: dhcwg@ietfa.amsl.com
Delivered-To: dhcwg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id CBF6E12B01D for <dhcwg@ietfa.amsl.com>; Fri, 27 May 2016 11:18:24 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.7
X-Spam-Level:
X-Spam-Status: No, score=-2.7 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id SJTzpGRyJHES for <dhcwg@ietfa.amsl.com>; Fri, 27 May 2016 11:18:21 -0700 (PDT)
Received: from mail-lf0-x22c.google.com (mail-lf0-x22c.google.com [IPv6:2a00:1450:4010:c07::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 65B4912D7A6 for <dhcwg@ietf.org>; Fri, 27 May 2016 11:17:42 -0700 (PDT)
Received: by mail-lf0-x22c.google.com with SMTP id s64so25298117lfe.0 for <dhcwg@ietf.org>; Fri, 27 May 2016 11:17:42 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-transfer-encoding; bh=K+J4X1zKDwMkMfue1W3twpi+qIcWcgVuDoJmz/Tm0As=; b=lKDelMW/n3ZTmwDQPlISH6Myl12zzfoYpUMznqmUC++MHS9uz34oozu9YMCJa/Tagi OdL29G9nAs+7d5lezxU+HV2Ev7S8g5vECjjUwmvCwsUxtLtqSb3DtLm7ievNg+yVZukP ba0FE6clX14SXurCQdwQbUoCW0lixsXxH5kX/Kp8xafu1Tu2JBuf7DgPIdcrBB2gvgIw VeGy4uY1FG8Wh5HhGamktXxw7nuJWtdYgT9j4d0q4JIpStUOiu6/9cytJUFrgqQ6k+AZ 6IRVlGvhFSOz5B649kRFXM2UHlqFwpkjWSQWDcZqujv1nJ8e2GPyemlloluqPsvSUd9d dYvA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding; bh=K+J4X1zKDwMkMfue1W3twpi+qIcWcgVuDoJmz/Tm0As=; b=TFoxHiyn58bXHSXFY56KOQGc/oUBgGPkVeV2pjTETFZ+OhD79Xeewm5xk0Hs4Z3kgY No9ka32iUkgfPaZEIvmTmh7qcN1izPfGpJrJhq2MWfgBksZkY7VdP4jS5zOkVH3FJSLv 58VgYvXMx3jSxOrET1gplequxpC7615pDC8oZAACRwsPtcLq9bPXgmX+fHW7cXvQsGml vT7TogGZClacrXRoKa3xSkzZ/ivF9u6uhtF6XwRIB/pwj2tTM4glJ6X185J/05g9lt6t X6U8NSby0F0Lw0J+GnIvGtCXmgxjDowqaW2qTUsRLfv6eKCTLa6PB8Id4Ak3uuA3F9Cz GIIg==
X-Gm-Message-State: ALyK8tKsmAc9ESkwNimFtGNuwgQOjiyXxAPCgmB3rSx9ejpyvzrrZPcS3MwwTiFUCG8UYQ==
X-Received: by 10.25.126.210 with SMTP id z201mr5017113lfc.65.1464373060268; Fri, 27 May 2016 11:17:40 -0700 (PDT)
Received: from [10.0.0.100] (088156132194.dynamic-ww-04.vectranet.pl. [88.156.132.194]) by smtp.googlemail.com with ESMTPSA id f72sm760460lji.36.2016.05.27.11.17.38 for <dhcwg@ietf.org> (version=TLSv1/SSLv3 cipher=OTHER); Fri, 27 May 2016 11:17:39 -0700 (PDT)
To: dhcwg@ietf.org
References: <0a748e2ecef446ad8962a84e275e17c2@XCH-ALN-003.cisco.com>
From: Tomek Mrugalski <tomasz.mrugalski@gmail.com>
X-Enigmail-Draft-Status: N1110
Message-ID: <57488F41.1090906@gmail.com>
Date: Fri, 27 May 2016 20:17:37 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0
MIME-Version: 1.0
In-Reply-To: <0a748e2ecef446ad8962a84e275e17c2@XCH-ALN-003.cisco.com>
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 8bit
Archived-At: <http://mailarchive.ietf.org/arch/msg/dhcwg/n-w-6cgafypV3JzRQFy3oNVF8nE>
Subject: [dhcwg] WGLC comments on draft-ietf-dhc-dhcpv6-failover-protocol-01
X-BeenThere: dhcwg@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: <dhcwg.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dhcwg>, <mailto:dhcwg-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/dhcwg/>
List-Post: <mailto:dhcwg@ietf.org>
List-Help: <mailto:dhcwg-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dhcwg>, <mailto:dhcwg-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 27 May 2016 18:18:24 -0000
On 28.04.2016 21:25, Bernie Volz (volz) wrote: > This message starts the DHC Working Group Last Call to advance > draft-ietf-dhc-dhcpv6-failover-protocol-01, DHCPv6 Failover Protocol. > This document’s intended status is Proposed Standard. At present, there > is no IPR file against this document. > > Please send your comments by May 22, 2016. If you do not feel this > document should advance, please state your reasons why. It definitely took much longer than I initially anticipated, but here it goes. While my name is still on this I-D, I have not been involved in this draft for at least a year. All the praise for this work should go to Kim. As such, please consider this review as if it was provided by non-author. I did read the draft in its entirety. It is long and complex, but very well written with concepts, implementation details and reasons for design choices being well explained. Here are my comments. Many of them are editorial in nature, so please forgive I did not segregate them out. Abstract: to provide for DHCPv6 failover => to provide DHCPv6 failover Section 1: Add "Active-active mode may be defined in the future." Section 3: To make it consistent, we should change all instances of Binding status (3 ocurrences) to binding-status (31 ocurrences). server when it is in the BACKUP state. => server when it is in the FREE-BACKUP state. (there's no BACKUP state) and BACKUP on the secondary server => and FREE-BACKUP on the secondary server Section 4.2.1 used for allocating individual DHCP IPv6 addresses => used for allocating individual IPv6 addresses When an IPv6 address goes PENDING-FREE Section 7.2 it is owned => When an IPv6 address goes PENDING-FREE (see Section 7.2) it is owned Section 4.2.1 says: An operational partner MUST NOT lease IPv6 addresses that were assigned to its downed partner and later expired or were released or declined by a client unless it is in PARTNER-DOWN state. When it is in PARTNER-DOWN state, a server SHOULD allocate from its own pool. It SHOULD NOT use its partner's pool. This text is inconsistent. Either we use SHOULD/SHOULD NOT everywhere or MUST/MUST NOT. Also, the last sentence should be clarified to "It SHOULD NOT use its partner's pool to allocate new leases.". I like the approach to allocate odd addresses on one partner and even on another. This implies exactly 50/50 split between them (we previously talked about configuration parameter for that), but that's ok. If someone is running out of IPv6 addresses, he has bigger problems than misbehaving failover. Section 4.3: "will have no record at all of the lease to the DHCP client." => "will have no record at all of the lease being assigned to the DHCP client." Section 5.1 The text does not define how to set transaction-id. There's reference to Section 5.1 of RFC5460, which in turn says that the meaning of the fields are defined in RFC3315. That RFC says that clients set them to random value. There are no clients in failover protocol. I suppose we need a text that whoever initiates the exchange, must set the transaction-id. How about setting it to increasing value. This is better than using the usual random, which has a slim, but non-zero chance of repeat itself. Random could work most of the time, but every couple months the PRNG would spit the same value sufficiently close to each other and then failover could skip an update. Q: section 5.2 says that msg-type is different than the one in DHCPv6. Why? We currently have 23 types defined. Do we really expect that many more to be defined? My thoughts about this are explained below when commenting Section 11. Section 5.3.9 I think CONNECTACK is not the right name for this message, as it can also be used to reject connection. ACK in DHCP world definitely means a positive acknowledgement. So better name would be CONNECT-REPLY or maybe CONNECT-RESPONSE? The same is true for BNDACK as BNDACK may actually contain NAK response. BND-REPLY or BINDING-REPLY seems better suited. Generic comment for Sections 5.4.*. Most of the options do not have proper names and they're referred to by the constant values. Most (all?) DHCPv6 options have proper names that are used to refer them. Do you think it would be useful to include them? Also, for each option there's an explanatory text that says what the content of the option is. It would be very useful to have one sentence for each option that explains when and where this option may appear. Section 5.4.3 There should be a text similar to: "Three encapsulated options are defined at this time, but additional options may be defined in the future." Sections 5.4.4 - 5.4.6 Extra text should be added: "This option may appear only in OPTION_F_DNS_REMOVAL_INFO" Section 5.4.7 I'm unclear whether this is the greatest lifetime for all leases ever processed by the server? Or is it supposed to be an encapsulated option and it describes a single lease only? Section 5.4.8 Getting the value of this option may be tricky. It depends on many things, but one of the limiting factors in Linux are socket buffers. They're limited by the number of bytes rather than number of packets. Section 5.4.12 When is this option sent? After connection with partner is reestablished? Section 5.4.13 "associated with this IPv6 address." => "associated with this IPv6 address or prefix." Section 5.4.14 The figure there has extra 32bits field called protocol-version. I think that's an editing artifact and should be removed. Also, do you think it would be reasonable to add a text that in general the major version must match or the servers won't be able to interoperate? This is just a proposal for consideration. In Kea we're using database versioning with major/minor numbers. Minor numbers are for backward compatible changes, like little tweaks. Once we change something significant, we bump up the major number. It's only an idea. I don't insist on it in any way. Section 5.4.15 I would recommend renaming this option to OPTION_F_KEEPALIVE_TIME. Keep-alive is immediately understood, while receive time may mean many things. Section 5.5 Description of AddressInUse status says: One client on one server has leases that are in conflict with the leases that the client has on another server. Alternatively, the address could be associated with a different IA_ID on each server. That's IAID. IA_ID looks like a new type of IA container, which confused me at first. Section 6.1.2 The text says that MCLT received in the CONNECT message should be compared with the configured MCLT value. Does it make sense at all to allow MCLT configuration on the secondary side? Removing this ability and accepting whatever primary sends seems the way to go here. One less parameter to misconfigure. Anyway, if you mistrust the values primary may send, why establish the relationship at all? Section 7.3 liftime => lifetime The time when the this server... => The time when this server... Section 7.4 The following text: Every BNDUPD message contains information about either a single client binding in an OPTION_CLIENT_DATA option, or a single prefix lease in an OPTION_IAPREFIX option. is a bit confusing and took me a while to understand what it says. How about this rewording: Every BNDUPD message contains information about either a single client binding in an OPTION_CLIENT_DATA option that include IAADDR or IAPREFIX options associated with that client, or a single prefix lease in an OPTION_IAPREFIX option for prefixes that are currently not associated with any clients. This section also mentions that OPTION_VSS MUST appear in BNDUDP message if certain conditions are met. I disagree with this. Support for RFC6607 is not mandatory and there may be valid, working failover deployments that don't support VSS. Hence I don't agree with the MUST here. I think we should to reword the list slightly. The sentence before the list says "MUST contain at least the data shown below in its client-options section", but then OPTION_F_RECONFIGURE_DATA has lower case "if any". I think it would be better to rephrase this list as: The OPTION_CLIENT_DATA option: - MUST contain option 1 - MAY contain option 2 - MUST NOT contain option 3 ... The list also mentions IA_TA. At the first glance it does not make much sense to use failover to provide lease stability for addresses that are temporary by nature. Do you think failover should cover temporary addresses at all? There's a tiny bit of usefulness to prevent duplicates in already assigned temporary addresses. But is it worth the effort? The following text is a bit confusing: OPTION_STATUS_CODE containing an error code, or containing a success code if a message is required. If the information in the corresponding OPTION_IA_NA, OPTION_IA_TA, or OPTION_IA_PD option in the BNDUPD was accepted, and no status message was required (which is the usual case), no OPTION_STATUS_CODE option appears. The first sentence suggests that there are cases when success status code sometimes is necessary, but the second one says it does not appear. There's no normative language, I admit, but nevertheless this should be clarified (something like status code may appear, but it's perfectly ok to skip it for success status). Section 7.7 partner.. (remove extra dot). The phrase NAK is used in section 7.7 only. Other sections refer to this condition using the word reject or rejected. Maybe we could make the naming consistent? Or at least add a text explaining that NAK means a rejected update? Section 7.7 ends with a list of 3 bullets. It's unclear what they relate to. Is this a continuation of the previous list? Then they should be 3, 4, and 5. If not, then some leading sentence should be added explaining what this list describes. Figure 4 - extra empty lines between counters could be useful here. It's confusing whether the (uncorrected) comment refers to the line above or below. Section 8.5.2 "communications fails". This one is genuinely confusing for non-native speaker, I must admit that. Shouldn't this be either "communication fails" or "communications fail"? Merriam-Webster dictionaly says that Communication is plural, but may be used as singular or plural. Communications is plural, but there's an example that says "Communications is...". Source: http://www.merriam-webster.com/dictionary/communications?show=0&t=1292594132 Section 8.6.2 "This is to allow any IPv6 addresses..." => "This is to allow any IPv6 addresses or prefixes..." Section 8.7.1 The text says that in the RECOVER-DONE state the server may respond to RENEW requests and extend existing leases. It MUST NOT allocate any additional leases when in RECOVER-DONE state. Ok, so how should the server behave if a Renew comes in and there's no lease for it. Respond with NoAddrsAvail or not repond at all? On one hand, not responding seems safer, because the client will likely retransmit and there's non-trivial chance that the server will be back in Normal state by then and everyone will be happy. On the other hand, responding with NoAddrsAvail status has the benefit of potentially providing other configuration options. If you don't have any preference, I think going with no response is the way to go. Regardless of the choice, the behavior should be clarified. Section 11 "These values should start at 20, in order to reduce the chance of collisions with existing DHCPv4 Messages using the same ports as DHCPv6 failover.". I don't like that approach for several reasons. First, the arbitrarily chosen 20 is already outdated (we have standard DHCP message types up to 23 defined now). No matter what values you start from, there will likely be a conflict some time in the future. But this is a minor thing. Second, more important issue is the confusion it brings. When you ask what message type 1 is, every DHCP engineer with a clue will instantly tell it's Solicit. With this proposal, the answer will be "in what context: dhcp proper or dhcp failover"? This has more implications than it looks at the first glance. Many logging routines in the code would have to be updated to pass extra info whether the message should be interpreted as DHCP or DHCP failover message. Also, think about cases where you use tools like wireshark to inspect existing traffic. They would have to be extended with extra logic to do conditional parsing based on the TCP port number. To make it more convoluted, we already have cases of TCP-based traffic in bulk and active leasequery. We could have more if draft-volz-dhc-relay-server-security evolves towards TLS solution. Oh, and the failover solution *is* using some messages from the standard DHCP. STARTTLS is used when establishing connection. It happens to have message code 23, which directly conflicts with proposal to start the codes from 20. So what's the problem with using existing message type registry anyway? With the proposed 12 messages, we'd have 35 message types defined in DHCPv6. There's still 220 more available until we run out of types. And seriously, if we ever define 200 message types, it would mean something somewhere went horribly wrong. I would very much prefer to use regular message types. Failover is complicated enough without extra message type tricks. -------------------- Partners send CONTACT messages as a mechanism to keep the connection alive. Do you think it would make sense to get rid of this message and send STATE messages instead? The benefit is that the server would have more up-to-date configuration that its partner is in specific state. There would be one less message type, so it would make the spec marginally simpler. I think this could be useful to detect cases when one way of communication fails, but the other is still functional. On the other hand, CONTACT message is probably empty, so sending STATE with all the options would generate more traffic even during normal, healthy operation. -------------------- Ok, that's it. Overall, I think Kim did a fantastic job with this draft. There are some nits to be improved, but only a handful of technical comments to be addressed. For an almost 90 pages long draft this is constitutes a great shape. Thank you for this work, Kim. Tomek
- Re: [dhcwg] WGLC comments on draft-ietf-dhc-dhcpv… Kim Kinnear
- [dhcwg] WGLC on draft-ietf-dhc-dhcpv6-failover-pr… Bernie Volz (volz)
- Re: [dhcwg] WGLC on draft-ietf-dhc-dhcpv6-failove… Bernie Volz (volz)
- [dhcwg] WGLC comments on draft-ietf-dhc-dhcpv6-fa… Tomek Mrugalski