Re: [dhcwg] Mirja Kühlewind's No Objection on draft-ietf-dhc-dhcpv6-failover-protocol-04: (with COMMENT)

Suresh Krishnan <> Tue, 28 February 2017 03:45 UTC

From: Suresh Krishnan <>
To: "Mirja Kuehlewind (IETF)" <>
Date: Tue, 28 Feb 2017 03:45:20 +0000
Cc: Bernie Volz <>, The IESG <>, dhcwg <>, kkinnear <>
Subject: Re: [dhcwg] Mirja Kühlewind's No Objection on draft-ietf-dhc-dhcpv6-failover-protocol-04: (with COMMENT)

Thanks Mirja. Yours was the last pending comment. I will go ahead and approve the document.


> On Feb 27, 2017, at 3:43 PM, Mirja Kuehlewind (IETF) <> wrote:
> Great! Thanks!
>> On 27.02.2017, at 20:36, kkinnear <> wrote:
>> Mirja,
>> Thanks for your comment.  I will make two (identical) changes based
>> on your suggestion:
>> I will expand the first occurrence of the use of MAX_UNACKED_BNDUPD
>> in Section 6.1.1 "Sending a CONNECT message", from the current:
>>>  o  OPTION_F_MAX_UNACKED_BNDUPD containing the maximum number of BNDUPD
>>>     messages that this server is prepared to accept over the failover
>>>     connection without causing the connection to block.
>> to the following:
>>  o  OPTION_F_MAX_UNACKED_BNDUPD containing the maximum number of BNDUPD
>>     messages that this server is prepared to accept over the failover
>>     connection without causing the connection to block.  This is to
>>     implement application level flow control over the connection, so
>>     that a flood of BNDUPD messages does not cause the connection to block
>>     and thereby prevent other messages from being transmitted 
>>     over the connection and received by the failover partner.
>> I will also change the second place where OPTION_F_MAX_UNACKED_BNDUPD
>> is transmitted in Section 6.1.2 "Receiving a CONNECT message", where it
>> discusses creating a CONNECTREPLY message, to say the same thing (as it
>> currently has the same "current" text).
>> Thanks -- Kim
>>> On Feb 27, 2017, at 2:09 PM, Mirja Kuehlewind (IETF) <> wrote:
>>> Hi Kim,
>>> sorry for my late reply. Thanks for the explanation. Makes sense to me. I think slightly more explanation in the draft would be good, to make clear that TCP blocking itself is not the problem; rather, one kind of application-layer message can block another kind, which can lead to total blocking, given that only one TCP connection is used for all kinds of messages (to reduce connection-management complexity). Currently it reads a little as if there were a problem with TCP, which is not really the case.
>>> Thanks!
>>> Mirja
>>>> On 02.02.2017, at 18:43, kkinnear <> wrote:
>>>> Mirja,
>>>> More comments, below...
>>>>> On Feb 2, 2017, at 12:07 PM, Mirja Kuehlewind (IETF) <> wrote:
>>>>> [... removed already handled issue -- Kim]
>>>>>> 6.1.  Creating Connections
>>>>>>> - Also not really clear to me is why OPTION_F_MAX_UNACKED_BNDUPD  is
>>>>>>> needed and how the server should know the right value. I guess you would
>>>>>>> want to calculate this based on the send buffer, however, not all message
>>>>>>> have the same size and as such I don't know how to calculate that. And is
>>>>>>> that really needed? If messages will not be accepted by the receiver-side
>>>>>>> server, the receive window will be zero and the socket on the sending
>>>>>>> side will be blocked; no additional message can be sent. What will be
>>>>>>> different if the sender knows in advance when it could potentially happen
>>>>>>> (but also might not if the other end processes the messages quickly and
>>>>>>> there is no excessive loss).
>>>>>> 	The intent here is to keep the TCP connection unblocked, so
>>>>>> 	that information can flow in both directions.  If one
>>>>>> 	direction is maxed out, it shouldn't keep information from
>>>>>> 	flowing in the other direction.  At a TCP level it won't, but
>>>>>> 	at an application level it will.  Much of the failover
>>>>>> 	information flow involves one server sending a BNDUPD and then
>>>>>> 	the partner sends a BNDREPLY.  If one server sends more
>>>>>> 	BNDUPD's than the other server can absorb, the TCP connection
>>>>>> 	will block.  This will mean that any BNDREPLY's from the
>>>>>> 	server that sent the BNDUPD's will also be blocked.  Ideally,
>>>>>> 	the BNDUPD->BNDREPLY flow from each server to the other would
>>>>>> 	be independent, and the OPTION_F_MAX_UNACKED_BNDUPD count is
>>>>>> 	designed to help that be true.
>>>>> So you mean this is purely an application parameter saying I will not process more than X messages at once (before sending out a BNDREPLY). So this is rather independent of any socket buffer configuration, except that the buffer needs to be large enough to handle at least X (max-size) messages, which may be a good thing to note as well.
>>>> 	This is an application parameter saying that I can accept up
>>>> 	to X messages at once without blocking the TCP connection.
>>>> 	That isn't in conflict with what you said, but is focused a
>>>> 	bit differently.  It is independent of any socket buffer
>>>> 	configuration -- this is application level flow control.
>>>>> However, this basically means that the sender anyway needs a way to cache BNDUPD messages that it is not yet allowed to send. Why not just implicitly set this value to 1 and say you can't send another BNDUPD while a BNDREPLY is still outstanding? I would guess it's rather unlikely that you need to send more than one message at once, no?
>>>> 	Servers frequently need to send far more than one BNDUPD at once.
>>>> 	The most extreme typical case is when one server is updating a
>>>> 	partner which has been down with information about what has
>>>> 	been happening while the partner was down.  This will generate
>>>> 	thousands to tens of thousands of BNDUPD's.  When one server
>>>> 	has lost its stable storage completely and needs to
>>>> 	essentially be initialized by the other server, millions of
>>>> 	BNDUPD's may need to flow across the link.
>>>> 	Doing them one at a time, while technically correct, typically
>>>> 	leaves a lot of performance on the table and could easily
>>>> 	extend the time before the servers synchronize from seconds to
>>>> 	tens of minutes (and possibly hours).  Many DHCP servers are
>>>> 	multi-threaded and can process multiple BNDUPD's at the same
>>>> 	time (though they may batch up the writes to the disk).  Thus,
>>>> 	we would expect that most servers implementing this protocol
>>>> 	would set this value to something substantial.
>>>>>> 	Additionally, there are messages other than BNDUPD/BNDREPLY
>>>>>> 	(e.g. STATE, DISCONNECT, UPDDONE) that are important to
>>>>>> 	transmit from one server to the other and not have backed up
>>>>>> 	behind a blocked TCP connection that has been overloaded with
>>>>>> 	BNDUPD's for the partner to process.
>>>>>> 	We could have created a separate TCP connection for these
>>>>>> 	control messages, but the overhead of doing that (and
>>>>>> 	specifying that) was great enough that it seemed like using
>>>>>> 	the application-level flow control of the
>>>>>> 	OPTION_F_MAX_UNACKED_BNDUPD was a good tradeoff.
>>>>> I would actually say that the overhead is rather low. Maybe one should discuss this option at least as one potential implementation possibility. The only hard requirement is that the receiver side would be able to process messages coming from different connections from the same endpoint, which I assume would be easy given that you already have to handle connections from different endpoints, no?
>>>> 	Having different implementation possibilities in something as
>>>> 	basic as connection management in a protocol already this
>>>> 	complex is something we have tried hard to avoid, and we could
>>>> 	only justify it if it were necessary to solve a very pressing
>>>> 	problem.
>>>> Thanks -- Kim
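The application-level flow control discussed in this thread (capping the number of BNDUPDs outstanding without a BNDREPLY at the partner's advertised OPTION_F_MAX_UNACKED_BNDUPD) could be sketched roughly as below. This is a minimal illustration, not code from the draft; the class and method names (FailoverSender, queue_bndupd, on_bndreply) are invented for the example.

```python
from collections import deque

class FailoverSender:
    """Throttles BNDUPD messages so the partner's advertised
    OPTION_F_MAX_UNACKED_BNDUPD limit is never exceeded, keeping the
    single TCP connection free for other message types (STATE,
    DISCONNECT, UPDDONE, and the sender's own BNDREPLYs)."""

    def __init__(self, max_unacked_bndupd):
        self.max_unacked = max_unacked_bndupd  # partner's advertised limit
        self.unacked = 0                       # BNDUPDs sent, no BNDREPLY yet
        self.pending = deque()                 # BNDUPDs waiting locally

    def queue_bndupd(self, msg):
        """Queue a BNDUPD; return whatever the window allows to send now."""
        self.pending.append(msg)
        return self._drain()

    def on_bndreply(self):
        """A BNDREPLY arrived: free one window slot, send what now fits."""
        self.unacked -= 1
        return self._drain()

    def _drain(self):
        """Send as many queued BNDUPDs as the window allows."""
        sent = []
        while self.pending and self.unacked < self.max_unacked:
            sent.append(self.pending.popleft())
            self.unacked += 1
        return sent  # caller writes these to the TCP connection
```

With a limit of 2, for example, a third BNDUPD is cached locally until a BNDREPLY frees a window slot, so the TCP connection never fills with unprocessed BNDUPDs and other failover messages keep flowing.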
> _______________________________________________
> dhcwg mailing list