Re: [conex] Fwd: Review: draft-ietf-conex-destopt-06

Mirja Kühlewind, Mon, 25 August 2014 17:36 UTC

To: Bob Briscoe
Cc: Carlos Ucendo, ConEx IETF list

On 14.08.2014 21:15, Bob Briscoe wrote:
> Mirja,
> At 17:51 14/08/2014, Mirja Kühlewind wrote:
>> Hi again,
>> sorry, I pressed send before I was done... see further below.
>> On 14.08.2014 18:41, Mirja Kühlewind wrote:
>>> Hi Bob,
>>> see inline
>>>>>> * Added subsection of intro on experiment goals: criteria for success
>>>>>> and duration
>>>>> I believe most of the text actually should go in the tcp mods draft.
>>>>> I'm not sure whether there is consensus in the meantime to have such
>>>>> a section in every exp document. But if not, I would rather not have
>>>>> it in this document because I'm not sure how to define whether an
>>>>> experiment was successful. In fact the CDO is the only approach that
>>>>> could fulfill our requirements, so there is no other option. And
>>>>> whether the coding itself with the four bits is useful or not is not
>>>>> really a question for this document, but maybe more for the whole
>>>>> mechanism (incl. auditing and policing) or maybe the tcp mods
>>>>> document...?
>>>> The solution to this would be to refer to one doc that has the expt
>>>> goals. However, I believe each doc can be seen as a separate piece of a
>>>> bigger expt.
>>>> * The expt that this doc describes is a choice of encoding (see below -
>>>> there are other choices).
>>>> * The main expt that the TCP doc describes is how to set credit.
>>> So is it okay if I just add one sentence in the intro (instead of having
>>> an own section):
>>> "This specification is experimental to allow the IETF to assess whether
>>> the decision to implement the ConEx signal as a destination option
>>> fulfills the requirements stated in this document, as well as to
>>> evaluate the proposed encoding of the ConEx signals."
>>> Does this work for you?
> Why resist the good practice of defining full criteria for success and
> duration of experiment? This doc (not TCP) is the primary basis of any
> ConEx experiment. I don't mind if it's not in a separate section, but
> these elements should be in an expt RFC.
> I think "requirements in abstract-mech" would be more objective.
> <RANT>I would like it recorded somewhere that it is going to take many
> years to find a scenario where a v6-only experiment in traffic
> management (on a tiny %age of the traffic) makes any sense whatsoever.
> It will be near-impossible to measure whether there is any benefit.
> </RANT>

Okay, I added one more paragraph to the intro based on our initial proposal:
"The duration of this experiment is expected to be no less than two 
years from publication of this document as infrastructure is needed to 
be set up to determine the outcome of this experiment. Given ConEx is 
only chartered for IPv6, it might take longer to find a suitable test 
scenario where only IPv6 traffic is managed using ConEx."

>>>>>> ==Requirements==
>>>>>> * Referred to abstract-mech for requirements, explained that it would
>>>>>> be hard to satisfy them all, and explained which one wasn't satisfied
>>>>>> (visibility in outer), referring to section on fast-path performance.
>>>>> Took over some of your text at the beginning of this section (and also
>>>>> reused some text we already used to have there).
>>>>> Do you really think that the two requirements you've added are needed?
>>>>> Because both basically say that the ConEx coding should encode the
>>>>> Conex signal, which is, from my point of view, the whole purpose of
>>>>> this document.
>>>> Perhaps you're too close to ConEx. If these reqs hadn't been defined in
>>>> abstract-mech, you wouldn't know what "the ConEx signal" was. These
>>>> reqs say what the component parts of the ConEx signal are. It could say
>>>> you need separate signals for ECN-credit and loss-credit. It could say
>>>> ConEx nodes must successfully negotiate ECN (then re-ECN would be a
>>>> solution).
>>>> The alternative is to refer to abstract-mech for all the requirements
>>>> and not list them here, or list the reqs in abstract-mech that are
>>>> relevant to the network layer encoding.
>>>>> I do have the feeling that the other requirements listed here are on a
>>>>> slightly different level because they are more related to deployment
>>>>> issues.
>>>> You could say that (but you don't). That's why it seemed odd that you
>>>> listed some requirements, but not the basic ones.
>>> Okay, the intent of these requirements was not to rephrase what is
>>> already written down in the abstract-mech because I assume that people
>>> reading this document do not really care why the ConEx signals look as
>>> they do but just want to know which ConEx signals are there (and what
>>> their meaning is).
>>> The reason to have the requirements listed is simply to justify the
>>> potentially awkward choice to encode them in a destination option.
>>> I now have the following text:
>>> "A set of requirements for an ideal concrete ConEx wire protocol is given
>>> in <xref target="I-D.ietf-ConEx-abstract-mech"/>. In the ConEx working
>>> group it was recognized that it will be difficult to find an encoding in
>>> IPv6 that satisfies all requirements. The choice in this document to
>>> implement the ConEx information in a destination option satisfies most
>>> of those requirements, briefly summarized below:"
>>> + I added your paragraph on visibility at the end.
>>> Okay?
> How about not including the 2 extra requirements and changing your last
> sentence now I understand what you want to say:
> "The choice in this document to implement the ConEx information in a
> destination option aims to satisfy those requirements that constrain the
> placement of ConEx information:"

>>>>> And regarding tunneling: you are right that we need to give more
>>>>> advice on tunneling. Shouldn't we just say that one MUST copy the
>>>>> inner ConEx Option to the outer header (to solve the visibility
>>>>> problem)?
>>>> You can't say MUST, because IPv6 dest opts are meant to be e2e only, so
>>>> no IPv6 tunnel currently even understands what a dest opt is, let alone
>>>> copies any dest opt to the outer. A MUST would ring huge alarm bells in
>>>> the vendor community.
>>>> I added it as a MAY, purely because it might be considered a
>>>> performance optimisation. I'm still worried about the size of the
>>>> alarm bells that will ring. It could prevent this draft progressing
>>>> thru the IESG. Suresh will know best how this might be received in
>>>> the IESG.
>>> Okay, got it; see further below...
>>>>>> ==CDO==
>>>>>> * Specified precisely which IP header is included in the byte count.
>>>>> So you suggest to not include any options?
>>>> I didn't say that, did I? If the wording I used is ambiguous, pls
>>>> fix it.
>>>>> Why? I'd say you either include all bits because all of them
>>>>> contribute to congestion, or none of the IP header bits because that's
>>>>> just the overhead you can't avoid if you want to send anything. Also
>>>>> of course you can generate a larger percentage of overhead if you send
>>>>> smaller packets.
>>>> That's why I think it is reasonable to include the IP header (and its
>>>> options) that immediately encapsulates the ConEx dest opt.
>>> Okay, I misread your text.
>>> I just thought, if you detect a CDO in the header you are currently
>>> looking at, you will simply look at the payload length and next header
>>> fields of this header and then add another 40 bytes for the header
>>> itself. But in fact that might be wrong if you have another IP header
>>> encapsulated in the payload... so what is actually the right number of
>>> bytes here?
>>> I was about to write the following but I'm not sure if that is actually
>>> right and/or clear:
>>> "... IP packet (including the IP header that carries the CDO and all
>>> associated options)..."?
>>> That just doesn't say whether you should look at what the next header
>>> is and subtract all other IP header bytes you can find. Is this needed?
> I'm happy with something like your last sentence. For precision I
> suggest rewording:
> IP packet (including the IP header that directly encapsulates the CDO
> and everything that IP header encapsulates).

> It doesn't have to go searching for any more deeply encapsulated CDO. If
> there is one, that will be dealt with by whatever higher layer it was
> written in, at the point when the outer layers have been peeled off and
> some function at that higher layer is processing these inner dest opts.
My point actually was that you will get a different number of marked 
bytes if you look at a point with or without encapsulation. But as 
usually all packets (at least of the same flow) get encapsulated or not, 
the ratio between marked and not-marked bytes should still be the same, 
so I guess that's fine.
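
To illustrate the ratio argument, here is a minimal sketch (not taken from the draft). It assumes the agreed counting rule, i.e. the 40-byte IPv6 header that directly carries the CDO plus its Payload Length, and it models an observation point that also counts encapsulation overhead with a hypothetical `extra_outer_bytes` parameter. The function names are made up for this example.

```python
from fractions import Fraction

IPV6_HEADER_LEN = 40  # fixed size of the IPv6 base header in bytes


def conex_byte_count(payload_length: int) -> int:
    """Bytes attributed to one packet: the IPv6 header that directly
    carries the CDO plus everything that header encapsulates."""
    return IPV6_HEADER_LEN + payload_length


def marked_ratio(packets, extra_outer_bytes: int = 0) -> Fraction:
    """Share of ConEx-marked bytes in a flow.

    packets: iterable of (payload_length, is_marked) tuples.
    extra_outer_bytes: hypothetical per-packet overhead that is also
    counted when observing at a point where packets are encapsulated.
    """
    total = marked = 0
    for payload_length, is_marked in packets:
        size = conex_byte_count(payload_length) + extra_outer_bytes
        total += size
        if is_marked:
            marked += size
    return Fraction(marked, total)


# One marked packet out of ten, all of equal size.
flow = [(1000, True)] + [(1000, False)] * 9
print(marked_ratio(flow), marked_ratio(flow, extra_outer_bytes=40))
```

With equal-sized packets the ratio is identical at both observation points, even though the absolute number of marked bytes differs. With mixed packet sizes the ratio would generally shift a little, because a fixed overhead weighs more on small packets.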

>>>>>> * Suggested deleting example of Not-ConEx-capable packets (see
>>>>>> separate thread to conex-tcp-modifications authors about TCP pure
>>>>>> ACKs).
>>>>> I can remove the example but I'm not sure why you are suggesting
>>>>> this. If you actually imply that the X bit should never be zero, then
>>>>> we have to discuss if the X bit is needed at all.
>>>> I have never thought the X flag was needed. There's probably some email
>>>> on the list somewhere in the past from me that says that.
>>>> As I put in one of the comment bubbles:
>>>> "The only need I can see for the X-flag is if
>>>> the Reserved field gets used in future for
>>>> something in addition to ConEx. Then there
>>>> would be a need to identify packets that
>>>> are not ConEx-capable but still carry the
>>>> CDO option (for the new reason)."
>>>> Can anyone think of a use for the X flag?
>>> I thought the X bit unset means: I'm a ConEx-aware sender and I want to
>>> follow the rules but I don't have any feedback for this (control) data,
>>> so I'm unable to give you useful ConEx information, and if you use this
>>> packet for your estimation of the current congestion level, you might
>>> underestimate it.
>>> Doesn't that make sense...?
> Not to me. What does "feedback for this (control) data" mean? Feedback
> is about a path used by a 5-tuple. This control data is about to be sent
> over such a path. If the sender has feedback about that path, the
> feedback applies to everything sent over the path, at the IP layer,
> whatever categorisation the next packet has at L4.
If you do not get any feedback on a path, e.g. a receiver only sending
ACKs, you will never be able to send any ConEx markings. So what's the
point of marking a packet as ConEx-enabled?

Further note, in the TCP mods we only look at the payload because we 
assume, for simplification, all packets have the same size. Therefore a 
packet that carries no data would not decrease the CEG/LEG. If ACKs 
should get marked, we need to rewrite all this stuff in the tcp mods doc...

> (Even if control data is somehow being sent over a different path, e.g.
> using MPTCP or something, and there has never been feedback over that
> path, then that would warrant Credit, not absence of ConEx.)
I don't think credit helps here. Note that credit cannot replace
ConEx markings anymore. And if you only send a small amount of control
data, it is not very likely that your packets get dropped, and thus
you probably do not send any credit.

>>>>>> ==Fast-path==
>>>>>> * CDO as first destination option: changed from MUST to SHOULD (with
>>>>>> an example of when not to).
>>>>> I believe this really needs to be a MUST. I know that might restrict
>>>>> the use of ConEx with potential other options that might have the same
>>>>> requirement (for different reasons). But if you don't put a MUST here,
>>>>> you cannot implement it the suggested way in the fast path.
>>>> A SHOULD still means it will be the first option in all current
>>>> implementations. However, I suggest a SHOULD, precisely because
>>>> performance reasons are not absolute, so they don't require a MUST. If
>>>> another dest opt cannot work at all unless it is first, that would be a
>>>> valid reason for CDO coming second, because it still works, it's /just/
>>>> slower.
>>>> The IESG will (rightly) be very wary of any draft that says an option
>>>> MUST be the first option.
>>>> I suggested the following text after this: "(This is not stated as a
>>>> 'MUST', because some future destination option might need to be
>>>> placed first for functional rather than just performance reasons.)"
>>> So our fast path implementation must simply assume that there is no CDO
>>> in case it cannot find it as the first option. Otherwise all non-ConEx
>>> packets would need to go to the slow path to make sure there is no ConEx
>>> option. That means to me that this must be a MUST...?
> OK, I see the problem, but how much of a performance problem would it
> really be for the fast path of a ConEx function to step along dest opts
> until it gets to CDO then stops (rather than stop if CDO is not first)?
So that's the difference between looking at one bit at a defined
position and having a chain of conditional look-ups where the length is
unknown. I believe that is something you would avoid implementing in
the fast path as the processing time is not fixed anymore... that would
be my guess but I'm not an expert in this area.
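
The two approaches can be contrasted with a small sketch (an illustration only, not an implementation from the draft). It parses a raw IPv6 Destination Options header, which starts with Next Header and Hdr Ext Len bytes followed by TLV-encoded options. `CDO_OPTION_TYPE` is a placeholder value assumed for this example, and both function names are hypothetical.

```python
CDO_OPTION_TYPE = 0x1E  # placeholder code point, assumed for illustration
PAD1 = 0                # Pad1 is a single byte with no length field


def cdo_is_first(dest_opts: bytes) -> bool:
    """Fixed-position check: one comparison at a known offset."""
    return len(dest_opts) > 2 and dest_opts[2] == CDO_OPTION_TYPE


def find_cdo(dest_opts: bytes):
    """Walk the TLV chain; the number of steps depends on the packet.

    Returns the byte offset of the CDO inside the header, or None.
    """
    if len(dest_opts) < 2:
        return None
    # Hdr Ext Len counts 8-byte units beyond the first 8 bytes.
    end = min(len(dest_opts), (dest_opts[1] + 1) * 8)
    i = 2
    while i < end:
        opt_type = dest_opts[i]
        if opt_type == CDO_OPTION_TYPE:
            return i
        if opt_type == PAD1:
            i += 1
            continue
        if i + 1 >= end:
            return None  # truncated option
        i += 2 + dest_opts[i + 1]  # skip type, length and option data
    return None


# CDO first: next header 59 (no next header), Hdr Ext Len 0, then the CDO.
first = bytes([59, 0, CDO_OPTION_TYPE, 4, 0, 0, 0, 0])
# CDO second: a PadN option (type 1, 4 data bytes) precedes it.
second = bytes([59, 1, 1, 4, 0, 0, 0, 0,
                CDO_OPTION_TYPE, 4, 0, 0, 0, 0, 1, 0])
print(cdo_is_first(first), find_cdo(first))
print(cdo_is_first(second), find_cdo(second))
```

`cdo_is_first` always costs the same, whereas `find_cdo` loops once per preceding option, which is the variable processing time referred to above.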

> Then "CDO SHOULD be first" would give no different performance to "CDO
> MUST be first", if CDO actually was first. If CDO had to be placed
> second on a certain packet, "CDO SHOULD be first" would take just one
> more op than "CDO MUST be first".
> Note: I've just re-read the spec of the IPv6 header. We need to specify
> that CDO goes in the "Destination Options (before routing header)", not
> the "Destination Options (before upper-layer header)". Then it won't be
> encrypted by an ESP header.
Thanks. I wasn't fully aware of this. But as I understand it, the
difference is whether the intermediate nodes listed in the routing header
should process this option or not. In our case it is probably not
important which one we choose as it should be processed by none of the
receivers. Where did you read that the latter one is not encrypted though?

If so, I can simply add one sentence to the first paragraph of section 4:
"The CDO MUST be placed in the destination option before routing header
such that it does not get encrypted and can be read by intermediate
ConEx-aware nodes."
And then remove the first paragraph of the IPsec section (and probably
move the other paragraph somewhere else so that the section is removed).
>>>>>> ==IPsec compatibility==
>>>>>> * Suggested ConEx counts the AH header, and the outer tunnel mode
>>>>>> header, with reasoning.
>>>>> Yes, need to be more precise. Will add.
>>>> This one wasn't just clarity. I've actually contradicted what was said,
>>>> so pls make sure there wasn't a good reason for why it was like it was.
>>>> I was most concerned about suggesting this change, because it was the
>>>> only one that caused a technical difference.
>>> Ohh, I didn't read your comments carefully and was just looking at the
>>> text changes... this whole accounting is a mess :-(
> I don't think it has to be, if we keep to the rule we just agreed above.
>>> Maybe we should only account the IPv6 header itself and the destination
>>> options...?
> Why? I really don't understand why the IPsec accounting was written like
> it was. Pls explain.
The problem with tunneling is that the number of ConEx-marked bytes
might be different depending on where on the path you look at the
packets. But I guess that's less of a problem than I initially thought.
If so, I guess I can remove this paragraph about accounting in the IPsec
section (if still needed at all).

>>> Moreover, isn't this the same case as with tunneling in general?
>>> Only if the node that does the encapsulation is ConEx-aware can it copy
>>> the CDO, otherwise it will not be visible anymore.
>>> So this should either be a SHOULD, or we have to say something like: if
>>> the node is ConEx-aware it MUST copy the CDO...?
>> And then we can say the same thing for tunneling in general...?
> That's surely a circular argument. What would make a tunnel endpoint
> into a ConEx-aware tunnel endpoint, so that it would have to copy the
> CDO? It would only become ConEx-aware if it had code added to look for
> the CDO, and why would it have that code added unless it was going to do
> something with CDO? That's why I think my 'MAY copy as a performance
> optimisation' formula is the best we can do.
What you say above is the point. If the node does not know anything
about ConEx, it simply cannot copy the option, which is the case for all
currently existing nodes. So we cannot say MUST in general. But if the
node does know that ConEx exists for any reason, it really must copy the
CDO...? But you're right that this is a little pathological. I'm willing
to change it if that helps understanding or is less confusing.

> There is no point trying to fix the IPv6 facilities for tunnelling new
> extension headers. The people whose job it was to design this didn't do
> their job. Their design is now burned into IPv6 hardware processors
> everywhere. Full stop.
> All we can hope to do is ensure that CDO is not encrypted with ESP. That
> is feasible.
> Whatever we do, in many cases, the IPv6 header containing the CDO will
> be encapsulated in other IP headers. So ConEx functions will just have
> to live with that. To find CDO, they will have to look for an IP header
> that encapsulates an upper layer protocol header. And even then, they
> will have to look one level deeper in case IP headers start again.
> There's loads of kit these days that has to do that anyway (e.g. CGNATs
> looking for the transport header or DPI looking for the app-layer). This
> is all we can hope to do at this experimental stage.
I guess we should write this point more explicitly:
"A network node that assesses ConEx information SHOULD search for 
encapsulated IP headers until a CDO is found or no further IP headers 
can be found." (should or SHOULD?)
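
A minimal sketch of such a search (illustrative only, under several assumptions): it handles only IPv6-in-IPv6 nesting and the Hop-by-Hop, Routing and Destination Options extension headers, stops at anything else, and uses a placeholder `CDO_OPTION_TYPE`. The helper names are hypothetical.

```python
IPV6_HEADER_LEN = 40
NH_HOP_BY_HOP, NH_IPV6, NH_ROUTING, NH_DEST_OPTS = 0, 41, 43, 60
CDO_OPTION_TYPE = 0x1E  # placeholder code point, assumed for illustration


def _has_cdo(ext: bytes) -> bool:
    """Walk the TLV options of one Destination Options header."""
    i = 2
    while i < len(ext):
        if ext[i] == CDO_OPTION_TYPE:
            return True
        if ext[i] == 0:            # Pad1: single byte, no length field
            i += 1
        elif i + 1 < len(ext):
            i += 2 + ext[i + 1]    # skip type, length and data
        else:
            break
    return False


def find_cdo_depth(packet: bytes, max_depth: int = 4):
    """Return the nesting depth (0 = outermost) of the first IPv6 header
    that carries a CDO, or None if no CDO is found."""
    offset = 0
    for depth in range(max_depth):
        if len(packet) < offset + IPV6_HEADER_LEN or packet[offset] >> 4 != 6:
            return None
        next_header = packet[offset + 6]
        offset += IPV6_HEADER_LEN
        while next_header in (NH_HOP_BY_HOP, NH_ROUTING, NH_DEST_OPTS):
            if len(packet) < offset + 2:
                return None
            ext_len = (packet[offset + 1] + 1) * 8
            ext = packet[offset:offset + ext_len]
            if len(ext) < ext_len:
                return None
            if next_header == NH_DEST_OPTS and _has_cdo(ext):
                return depth
            next_header = ext[0]
            offset += ext_len
        if next_header != NH_IPV6:
            return None  # no further IP header to search
    return None


def ipv6(next_header: int, payload: bytes) -> bytes:
    """Build a bare IPv6 header (zeroed addresses) around a payload."""
    return (bytes([0x60, 0, 0, 0]) + len(payload).to_bytes(2, "big")
            + bytes([next_header, 64]) + bytes(32) + payload)


cdo_opts = bytes([59, 0, CDO_OPTION_TYPE, 4, 0, 0, 0, 0])
inner = ipv6(NH_DEST_OPTS, cdo_opts)
print(find_cdo_depth(inner), find_cdo_depth(ipv6(NH_IPV6, inner)))
```

The `max_depth` bound is a design choice of this sketch, so that a malformed or deeply nested packet cannot keep the parser busy indefinitely.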

> We need to prove ConEx is useful, then it can be performance optimised.
> Header parsing performance is generally not a big problem these days.
>>>>>> * Suggested optional copying of CDO to outer, but also a simpler 'Do
>>>>>> not copy CDO' alternative.
>>>>> I don't really get your SHOULD NOT but MAY here...?
>>>> See earlier. Tunnels don't normally understand dest opts, which is
>>>> why I said SHOULD NOT. But the MAY is a performance optimisation. Am I
>>>> helping?
>> Okay, understood. But why SHOULD NOT? Isn't it sufficient to say
>> MAY...? (or even MUST/SHOULD if ConEx-aware...?)
> You're right, we could leave out the SHOULD NOT. I suggest:
> "As with any destination option, an ingress tunnel endpoint will not
> natively copy the CDO when adding an encapsulating outer IP header.
> However, it MAY..."

Done. But one question: Why MAY and not SHOULD? Wouldn't it actually be
nice if all future tunneling nodes copied the header?

> Bob
>> Mirja
>>>>>> ==Security Considerations==
>>>>>> * Added lots, all pointers to where security issues are discussed in
>>>>>> other places (which is what security directorate reviewers need).
>>>>> Okay, I can add that if you think it's necessary (I would say it's
>>>>> just redundant, but you might be right that it helps the sec dir).
>>>> It's not always obvious which aspects relate to security. Especially
>>>> when the security is structural rather than crypto. So I think these
>>>> sentences are useful to sec dir.
>>>>>> ==IANA==
>>>>>> * I think the act bits need to be 00 not 10 to avoid ConEx packets
>>>>>> being dropped by non-ConEx nodes (including by non-ConEx receivers)?
>>>>>> But I'm willing to be corrected.
>>>>> I agree; Will ask Suresh why he has put a 10 though.
>>>> Yes, he's the right guy to check with.
>>>> Bob
>>>>> Thanks,
>>>>> Mirja
>>>>>> Regards
>>>>>> Bob
>>>> {Note 1}
>>>> For anyone watching on the list, the tentative idea that Mirja has
>>>> reminded me of is documented in 11.3.1 of my PhD thesis entitled
>>>> "Covert Markings as a Policer Signal".
>>>> The potential problem: A ConEx policer punishes punishment. If a
>>>> congestion policer starts dropping packets because the user has
>>>> contributed excessively to congestion, in subsequent rounds the user
>>>> has to re-echo 'L' markings for the policer drops as well. This can
>>>> drive the policer further into 'debit'. This might make it difficult
>>>> for the user to get out of trouble once she's started getting into
>>>> trouble.
>>>> The basic idea was that when a congestion policer drops packets
>>>> (because the user is causing more congestion than her allowance), it
>>>> will also remove ConEx markings. Then (if there is some way for the
>>>> receiver to feed this back), the sender knows not to send more ConEx
>>>> marks because these aren't congestion drops, they are policer drops.
>>>> We didn't find that double punishment made it hard to get out of
>>>> trouble in any policer experiments so far, so let's not allow for a
>>>> possible solution to a problem that we probably don't even have. The
>>>> current crop of ConEx drafts are experimental anyway. If this problem
>>>> does surface, then we can reconsider.
>>>> ________________________________________________________________
>>>> Bob Briscoe,                                                  BT

Dipl.-Ing. Mirja Kühlewind
Communication Systems Group
Institute TIK, ETH Zürich
Gloriastrasse 35, 8092 Zürich, Switzerland

Room ETZ G93
phone: +41 44 63 26932