Re: [conex] Fwd: Review: draft-ietf-conex-destopt-06

Bob Briscoe <bob.briscoe@bt.com> Thu, 28 August 2014 20:06 UTC

Date: Thu, 28 Aug 2014 21:05:44 +0100
To: Mirja Kühlewind <mirja.kuehlewind@tik.ee.ethz.ch>
From: Bob Briscoe <bob.briscoe@bt.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/conex/XCuXzy1mMXDCU7JkiS3pBsNs8zc
Cc: Carlos Ucendo <ralli@tid.es>, ConEx IETF list <conex@ietf.org>
Subject: Re: [conex] Fwd: Review: draft-ietf-conex-destopt-06

Mirja,

At 16:43 28/08/2014, Mirja Kühlewind wrote:
>Hi Bob,
>
>again inline...
>
>On 26.08.2014 19:27, Bob Briscoe wrote:
>>>>>>>>>==CDO==
>>>>>>>>
>>>>>>>>>* Specified precisely which IP header is included in the byte
>>>>>>>>>count.
>>>>>>>>So you suggest to not include any options?
>>>>>>>
>>>>>>>I didn't say that, did I? If the wording I used is ambiguous, pls
>>>>>>>fix it.
>>>>>>>
>>>>>>>>Why? I'd say you either include all bits because all of them
>>>>>>>>contribute to congestion, or none of the IP header bits because
>>>>>>>>that's
>>>>>>>>just the overhead you can't avoid if you want to send anything. Also
>>>>>>>>of course you can generate a larger percentage of overhead if you
>>>>>>>>send
>>>>>>>>smaller packets.
>>>>>>>
>>>>>>>That's why I think it is reasonable to include the IP header (and its
>>>>>>>options) that immediately encapsulates the ConEx dest opt.
>>>>>>Okay I misread your text.
>>>>>>
>>>>>>I just thought, if you detect a CDO in the header you are currently
>>>>>>looking at, you will simply look at the payload length and next hop
>>>>>>fields of this header and then add another 40 bytes for the header
>>>>>>itself. But in fact that might be wrong if you have another IP header
>>>>>>encapsulated in the payload... so what is actually the right number of
>>>>>>bytes here?
>>>>>>
>>>>>>I was about to write the following but I'm not sure if that is
>>>>>>actually
>>>>>>right and/or clear:
>>>>>>"... IP packet (including the IP header that carries the CDO and all
>>>>>>associated options)..."?
>>>>>>That just doesn't say if you should look what the next header is and
>>>>>>subtract all other IP header bytes you can find. Is this needed?
>>>>
>>>>I'm happy with something like your last sentence. For precision I
>>>>suggest rewording:
>>>>
>>>>IP packet (including the IP header that directly encapsulates the CDO
>>>>and everything that IP header encapsulates).
>>>Done
>>>
>>>>
>>>>It doesn't have to go searching for any more deeply encapsulated CDO. If
>>>>there is one, that will be dealt with by whatever higher layer it was
>>>>written in, at the point when the outer layers have been peeled off and
>>>>some function at that higher layer is processing these inner dest opts.
>>>My point actually was that you will get a different number of marked
>>>bytes if you look at a point with or without encapsulation. But as
>>>usually all packets (at least of the same flow) get encapsulated or
>>>not, the ratio between marked and not-marked bytes should still be the
>>>same, so I guess that's fine.
>>
>>Yup. When relative proportions are used, there's not a problem.
>>
>>Even where absolute packet size matters (e.g. a congestion policer or
>>more generally a policy device configured with a certain rate of
>>congestion allowance), the proposed definition can work,... as follows
>>
>>There are two possibilities where there might be a more deeply
>>encapsulated CDO:
>>i) Two independent systems are initiating different congestion feedback
>>loops, and they are both reinserting ConEx markings.
>>ii) The inner CDO has at some point been copied to the outer
>>
>>I was talking about case (i) before, where the sender and all policy
>>devices are working at their own layer on their own control loop,
>>unaware of another ConEx control loop at another layer (which is fine).
>>
>>Case (ii) could lead to a bias in the amount of congestion counted. This
>>is why I suggested that the only time a tunnel endpoint copies CDO to
>>the outer is as a performance optimisation. E.g. where an operator is
>>tunnelling packets across its own network and it knows there are ConEx
>>policy devices within its own network, so it saves them the hassle of
>>burying into the tunnel.
>>
>>So case (ii) is also fine, as long as the operator doing the
>>optimisation is aware of both its tunnels and its policy devices, and
>>configures the policy device to make the (constant) allowance for the
>>additional tunnel headers.
>>
>>Perhaps we need to add text where we suggest this optimisation to say
>>that the operator should ensure all packets are decapsulated back how
>>they were before passing on to another operator that is not aware of the
>>optimisation. But, this is getting over-detailed...
>>
>>
>>
>>>>>>>>>* Suggested deleting example of Not-ConEx-capable packets (see
>>>>>>>>>separate thread to conex-tcp-modifications authors about TCP pure
>>>>>>>>>ACKs).
>>>>>>>>I can remove the example but not sure why you are suggesting
>>>>>>>>this. If
>>>>>>>>you actually imply that the X bit should never be zero that we
>>>>>>>>have to
>>>>>>>>discuss if the X bit is needed at all.
>>>>>>>
>>>>>>>I have never thought the X flag was needed. There's probably some
>>>>>>>email
>>>>>>>on the list somewhere in the past from me that says that.
>>>>>>>
>>>>>>>As I put in one of the comment bubbles:
>>>>>>>"The only need I can see for the X-flag is if
>>>>>>>the Reserved field gets used in future for
>>>>>>>something in addition to ConEx. Then there
>>>>>>>would be a need to identify packets that
>>>>>>>are not ConEx-capable but still carry the
>>>>>>>CDO option (for the new reason)."
>>>>>>>
>>>>>>>Can anyone think of a use for the X flag?
>>>>>>I thought the X bit unset means: I'm a ConEx aware sender and i
>>>>>>want to
>>>>>>follow the rules but I don't have any feedback for this (control) data
>>>>>>so I'm unable to give you useful ConEx information and if you use this
>>>>>>packet for your estimation of the current congestion level, you might
>>>>>>underestimate it.
>>>>>>
>>>>>>Doesn't that make sense...?
>>>>
>>>>Not to me. What does "feedback for this (control) data" mean? Feedback
>>>>is about a path used by a 5-tuple. This control data is about to be sent
>>>>over such a path. If the sender has feedback about that path, the
>>>>feedback applies to everything sent over the path, at the IP layer,
>>>>whatever categorisation the next packet has at L4.
>>>If you do not get any feedback on a path, e.g. a receiver only sending
>>>ACKs, you will never be able to send any ConEx markings. So what's the
>>>point about marking a packet as ConEx-enabled?
>>
>>OK, this is a good example for when a ConEx-enabled flag might be
>>useful. However,...
>>
>>...This doesn't justify marking pure ACKs as not-ConEx-enabled. If a
>>sender sends a pure ACK now, all it knows is that it might not have
>>enough feedback to be able to set ConEx markings on a whole sequence of
>>packets later in the flow,... but only if it keeps sending solely pure
>>ACKs from now on. However, a sender can't be sure that it won't have
>>enough feedback in future, because usually an app (let alone the
>>transport layer) cannot predict whether there will be more data to send
>>later, even if it's not sending any now.
>>
>>Once a sender has had no feedback for at least a round trip, it has 2
>>options for subsequent packets:
>>a) turn off ConEx-enabled;
>>b) keep sending packets with ConEx-enabled set, but conservatively add
>>some credit.
>>
>>Even if it subsequently sends some data, it will still have to do (a) or
>>(b) on these data packets, at least for one further round trip, until it
>>gets the feedback. So this is nothing to do with whether the packet
>>being sent is a pure ACK. It is to do with whether feedback has recently
>>been received.
>
>Okay, rewrote the paragraph slightly:
>
>"If the X bit is zero all other three bits are 
>undefined and thus should be ignored and 
>forwarded unchanged by network nodes. The X bit 
>set to zero means that the connection is 
>ConEx-capable but this packet MUST NOT be 
>accounted when determining ConEx information in 
>an audit function. This can be the case if no 
>feedback on the congestion status is (currently) 
>available for e.g. for control packets (not 
>carrying any user data). As an example a TCP 
>receiver that only sends pure ACKs will usually 
>send them as ACK are usually not ECN-capable as 
>ACK usually are not ECN-capable and TCP does not 
>have a mechanism to announce ACK lost. Thus 
>congestion information about ACKs are not available."
>
>Is this okay?

The main problem is saying 'not available *for* 
control packets'. But just changing 'for' to 
'from' would still make this too unclear to be understood.

Also need to:
* Make it clear the example is TCP-specific.
* Focus on loss first, then ECN.
* 'mechanism to announce ACK loss' is not really understandable.
* Avoid 'control packets', which is too general, 
given this is an example, so it can be specific.
* Nit: duplicated word (for e.g. for) and 
duplicated phrase (as ACK are usually not 
ECN-capable as ACK usually are not ECN-capable).

How about:

First 2 sentences unchanged, then...
"This can be the case if no congestion feedback 
is (currently) available e.g. in TCP if one 
endpoint has been receiving data but sending 
nothing but pure ACKs (no user data) for some 
time. This is because pure ACKs do not advance 
the sequence number, so the TCP endpoint 
receiving them cannot reliably tell whether any 
have been lost due to congestion. Pure TCP ACKs 
cannot be ECN-marked either [RFC3168]."
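
To make the rule concrete, here is a minimal sketch of option (a) from earlier in this thread: the sender clears the X flag once it has had no congestion feedback for at least a round trip, whatever kind of packet it is about to send. The function name and its two parameters are invented for illustration, and the start of a flow (where credit applies) is deliberately left out.

```python
def x_flag(seconds_since_feedback: float, rtt: float) -> int:
    """Hypothetical helper: choose the X flag for the next packet.

    Sketch of option (a): the decision depends only on how recently
    congestion feedback arrived, not on whether the packet being sent
    is a pure ACK or carries data.
    """
    # No feedback for at least one round trip: send with X=0.
    return 1 if seconds_since_feedback < rtt else 0


print(x_flag(0.02, 0.1))  # feedback arrived within the last RTT
print(x_flag(0.5, 0.1))   # no feedback for several RTTs
```

The same test would apply to a data packet sent after a long run of pure ACKs, at least until a round of feedback comes back.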




>>>Further note, in the TCP mods we only look at the payload because we
>>>assume, for simplification, all packets have the same size. Therefore
>>>a packet that carries no data would not decrease the CEG/LEG. If ACKs
>>>should get marked, we need to rewrite all this stuff in the tcp mods
>>>doc...
>>
>>I don't think we should avoid changing tcp-mods if it's 'not right'.
>>
>>I hope you see the problem from my explanation above - whether there is
>>enough feedback /now/ to ConEx-mark a packet has nothing to do with
>>whether the packet being sent /now/ is capable of generating feedback
>>/in the next round/.
>>
>>If you want to make a simplifying assumption, it is on the safe side for
>>a sender to assume that all incoming feedback is about packets of the
>>same size. It's not safe for a sender to assume that all packets it is
>>sending are the same size. Anyway, it knows what size it is sending, so
>>it doesn't need this simplification.
>Okay, the assumption is (only) that feedback is 
>based on packets that are the same size. If we 
>send you a packet we of course decrease the 
>LEG/CEG by the actual payload bytes. But 
>under this assumption we simply do not account 
>for headers at all (neither incoming nor 
>outgoing) because we can only estimate 
>the header bits anyway, and therefore simply assume it will 
>even out. Which means if we send a pure ACK we 
>will not decrease the LEG/CEG because there are 
>no payload bytes. I believe that this 
>simplification makes things much simpler and is 
>therefore useful but will not allow for marking pure ACKs...

I thought the earlier definition said that ConEx 
accounts for the size of the IP header that 
contains the CDO and everything within it. Also, 
there's the TCP header size on a pure ACK.

That's the basis on which I am assuming that pure 
ACKs are worth counting. A pure ACK will count as 
at least 86B (and more if there are additional TCP options or IP extensions).

IPv6 header: 40B
CDO dest opt: 6B
TCP header: 40B
Total: 86B

If there are more IP extensions, I guess it will 
be hard for TCP to know though.
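
As a rough sketch of that sum, using only the sizes listed above (they are the figures from this message, not values checked against the draft, and the function name is made up for illustration):

```python
# Header sizes in bytes, as quoted in this message (assumptions, not normative).
IPV6_HEADER = 40
CDO_DEST_OPT = 6
TCP_HEADER = 40


def conex_byte_count(payload_len: int, extra_ext_hdrs: int = 0) -> int:
    """Bytes counted for one packet: the IP header that directly
    encapsulates the CDO, plus everything that IP header encapsulates."""
    return IPV6_HEADER + CDO_DEST_OPT + extra_ext_hdrs + TCP_HEADER + payload_len


print(conex_byte_count(payload_len=0))     # pure ACK: 86
print(conex_byte_count(payload_len=1000))  # data packet: 1086
```

The extra_ext_hdrs parameter stands for any further IP extensions, which is the part TCP may not be able to see.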


>You didn't convince me (yet) that this should be 
>changed but this would need to be changed in the 
>tcp mods doc and not this one anyway.

Agreed (that this would affect tcp-mods, not destopt).

What is 'this' that you aren't yet convinced by?


>>The simplification I propose (that feedback is all about the same size
>>packets, rather than all the sent packets are the same size) is likely
>>to be pretty good, given the receiver doesn't get loss or ECN info about
>>pure ACKs, so they are automatically removed from the set of packets
>>that the sender assumes to be the same size. And if some of the
>>feedback is about smaller data packets, at least this simplification
>>will always be on the safe side.
>>
>>If I correctly understand the simplification you propose, a ConEx sender
>>will more often under-declare congestion than over-declare, which is
>>not safe.
>I don't believe so. Was this just a different 
>understanding of what we proposed or can you explain further...?

I thought you were proposing that a TCP sender 
assumes all the packets it sends are full-sized, 
even if they aren't. But I believe you have said that is not what you proposed.


>>>>(Even if control data is somehow being sent over a different path, e.g.
>>>>using MPTCP or something, and there has never been feedback over that
>>>>path, then that would warrant Credit, not absence of ConEx.)
>>>I don't think credit does help here. Note credit cannot replace
>>>ConEx-markings anymore.
>>
>>OK, I should have said "then that would warrant conservatively sending
>>some credit and corresponding L or E markings."
>>
>>A sender can always send more ConEx marks than actual congestion, and if
>>it doesn't know actual congestion, it can at least hold an initial
>>estimate of what worst-case congestion might be (e.g. 1%).
>>
>>However, I admit that, if it only sends pure ACKs and this estimate is
>>too low, it will never know it is too low, and the audit function might
>>be dropping loads of its pure ACKs. Ug. This is a new issue I hadn't
>>thought of before... so I think we should recommend option a), not b).
>>
>>>  And if you only send a small amount of control data, it is not very
>>>likely that your packets get dropped and thus probably you do not send
>>>any credit.
>>
>>That's reasonable.
>>
>>
>>
>>
>>
>>>>>>>>>==Fast-path==
>>>>>>>>>
>>>>>>>>>* CDO as first destination option: changed from MUST to SHOULD
>>>>>>>>>(with
>>>>>>>>>an example of when not to).
>>>>>>>>I believe this really needs to be a MUST. I know that might restrict
>>>>>>>>the use of ConEx with potential other options that might have the
>>>>>>>>same
>>>>>>>>requirement (for different reasons). But if you don't put a MUST
>>>>>>>>here,
>>>>>>>>you cannot implement the suggested way in the fast path.
>>>>>>>
>>>>>>>A SHOULD still means it will be the first option in all current
>>>>>>>implementations. However, I suggest a SHOULD, precisely because
>>>>>>>performance reasons are not absolute, so they don't require a
>>>>>>>MUST. If
>>>>>>>another dest opt cannot work at all unless it is first, that would
>>>>>>>be a
>>>>>>>valid reason for CDO coming second, because it still works, it's
>>>>>>>/just/
>>>>>>>slower.
>>>>>>>
>>>>>>>The IESG will (rightly) be very wary of any draft that says an option
>>>>>>>MUST be the first option.
>>>>>>>
>>>>>>>I suggested the following text after this: "(This is not
>>>>>>>stated as a 'MUST', because some future destination option might
>>>>>>>need to
>>>>>>>be placed first for functional rather than just performance
>>>>>>>reasons.)"
>>>>>>So our fast path implementation must simply assume that there is no
>>>>>>CDO
>>>>>>in case it cannot find it as the first option. Otherwise all non-ConEx
>>>>>>packets would need to go to the slow path to make sure there is no
>>>>>>ConEx
>>>>>>option. That means to me that this must be a MUST...?
>>>>
>>>>OK, I see the problem, but how much of a performance problem would it
>>>>really be for the fast path of a ConEx function to step along dest opts
>>>>until it gets to CDO then stops (rather than stop if CDO is not first)?
>>>So that's the difference between you looking at one bit at a defined
>>>position or having a chain of conditional look-ups where the length is
>>>unknown. I believe that is something you would avoid to implement in
>>>fast path as the processing time is not fixed anymore... that would be
>>>my guess but I'm not an expert in this area.
>>
>>AFAICT, fast path implementations generally work along sequences of
>>extensions. So I don't think this is a problem. Bear in mind that we are
>>not asking general fast path forwarding implementations to do this. Only
>>ConEx functions specifically written to find the ConEx header.{Note 1}
>>
>>{Note 1} OK, we do suggest that general forwarding functions could do
>>DoS protection using the ConEx header. But that's stated as optional and
>>'aspirational'. If such an experiment proves useful, you never know,
>>there could be demand for ConEx to migrate into the hop-by-hop options
>>(according to the v6 spec, hop-by-hop and dest options share the same
>>option number space, so this would be a straightforward migration, just
>>moving where the CDO is placed, but using the same option number and
>>format).
>
>There might be also further use cases for e.g. 
>traffic management or multipath routing where 
>general forwarding nodes need to access this information.
>
>So what's the solution here?

I think this will get thrown back by the IESG if 
we say 'MUST be first'. And I think 'SHOULD be 
first' is a doable implementation for ConEx-aware 
nodes. That is sufficient for experimental. Any 
experiments where general forwarding nodes access 
ConEx will already be reading a destopt at every 
hop, which is not what was intended, but it would 
be doable just for an experiment that wanted to prove ConEx has wider uses.
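
For what "stepping along dest opts until it gets to CDO" might look like, here is a minimal sketch. It relies only on the generic IPv6 option layout (Next Header, Hdr Ext Len in 8-octet units excluding the first 8 octets, then type/length/data options with Pad1 as a single zero octet). The CDO option type used here is a placeholder assumed purely for illustration, and find_cdo is a made-up name.

```python
PAD1 = 0x00
PADN = 0x01
CDO_TYPE = 0x1E  # placeholder option type, assumed for illustration only


def find_cdo(dest_opts: bytes):
    """Walk one Destination Options header and stop at the CDO.

    Returns (position, data), where position counts the non-padding
    options that precede the CDO (0 means CDO is first), or None if
    this header carries no CDO or is truncated.
    """
    if len(dest_opts) < 2:
        return None
    end = min((dest_opts[1] + 1) * 8, len(dest_opts))
    offset, position = 2, 0
    while offset < end:
        opt_type = dest_opts[offset]
        if opt_type == PAD1:          # Pad1 has no length or data octets
            offset += 1
            continue
        if offset + 2 > end:
            return None               # truncated option header
        opt_len = dest_opts[offset + 1]
        data_start = offset + 2
        if data_start + opt_len > end:
            return None               # truncated option data
        if opt_type == CDO_TYPE:
            return position, dest_opts[data_start:data_start + opt_len]
        if opt_type != PADN:
            position += 1             # a real option came before the CDO
        offset = data_start + opt_len
    return None


cdo_first = bytes([6, 0, CDO_TYPE, 1, 0x80, PADN, 1, 0])
cdo_second = bytes([6, 0, 0x04, 1, 4, CDO_TYPE, 1, 0x80])
print(find_cdo(cdo_first))
print(find_cdo(cdo_second))
```

With CDO first the loop exits on its first option; with CDO second it takes one more iteration, which is the cost of 'SHOULD be first' relative to 'MUST be first'.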

Everyone involved in IPv6 knows that the attempt 
to design extensibility into v6 failed. It won't 
be news to the IESG that we can't add an 
extension that can be processed at every hop on the fast path.

If a destopt is sufficient to prove ConEx useful, 
then implementers will want to satisfy this demand. Then
* either there is even more pressure on the IETF 
to address this failing in v6 (and maybe someone will),
* or ConEx has to continue with this destopt 
solution, just like everyone else is finding hacks round this failing in v6.

But don't ask me. Ask Suresh.



>>>>Then "CDO SHOULD be first" would give no different performance to "CDO
>>>>MUST be first", if CDO actually was first. If CDO had to be placed
>>>>second on a certain packet, "CDO SHOULD be first" would take just one
>>>>more op than "CDO MUST be first".
>>>>
>>>>Note: I've just re-read the spec of the IPv6 header. We need to specify
>>>>that CDO goes in the "Destination Options (before routing header)", not
>>>>the "Destination Options (before upper-layer header)". Then it won't be
>>>>encrypted by an ESP header.
>>>Thanks. I wasn't fully aware of this. But the difference, as I
>>>understand it, is whether intermediate nodes listed in the routing header
>>>should process this option or not. In our case it is probably not important
>>>which one we choose as it should be processed by none of the receivers.
>>
>>You're correct that CDO isn't processed by any of the nodes listed in
>>the routing header as destinations. The phrase "before routing header"
>>is just how its placement is described. We should clarify that this
>>isn't anything to do with the processing of the routing header.
>>
>>>Where did you read that the later one is not encrypted though?
>>
>>ESP encrypts everything after the ESP header, and it comes just before
>>the second dest opts. So it would be no good putting CDO after it.
>>
>>See the ESP spec, on "ESP Header Location":
>><http://tools.ietf.org/html/rfc2406#section-3.1>
>>"  The destination options extension header(s) could appear
>>     either before or after the ESP header depending on the semantics
>>     desired.  However, since ESP protects only fields after the ESP
>>     header, it generally may be desirable to place the destination
>>     options header(s) after the ESP header.
>>"
>Thanks. Wasn't able to find this sentence!
>
>>
>>Also see the IPv6 spec on "Extension Header Order":
>><http://tools.ietf.org/html/rfc2460#section-4.1>
>>
>>I believe one reason there are two places for the dest opt is because if
>>ESP is encrypting everything for the destination, it will normally be
>>expected that the dest opts need to be encrypted too. But this wouldn't
>>work if you have multiple destinations on the path in the routing header
>>(that probably don't hold the relevant key).  Fortunately, this
>>exception is also needed for ConEx.
>>
>>>If so, I can simply add one sentence to the first paragraph of section 4:
>>>"The CDO MUST be placed in the destination option before routing
>>>header such that it does not get encrypted and can be read by
>>>intermediate ConEx-aware nodes."
>>>And then remove the first paragraph of the IPSec section (and probably
>>>move the other paragraph somewhere else so that the section is removed
>>>completely)...?
>>
>>I've lost track of all the proposed changes to the IPsec section. But I
>>think there is value in spelling out exactly how ConEx and IPsec
>>interact, so I wouldn't remove the section completely, even if it
>>repeats info elsewhere.
>
>Okay I just realized that we recommend to use 
>IPsec for authentication but I believe if the 
>ConEx option should not be encrypted by using 
>the respective header, it will also not be 
>authenticated...? So you can have either one of 
>the two...? I believe we still need the IPSec 
>section but right now I'm not sure what to write in there...? Any proposal?

* How to do ConEx when IPsec is also required 
(tunnel & transport modes, and what to count). 
This may all be obvious now, but (IMO) it would 
still be worth spelling out obvious things.
* How to use IPsec to protect the integrity of CDO.



>>>>>>>>>==IPsec compatibility==
>>>>>>>>>
>>>>>>>>>* Suggested ConEx counts the AH header, and the outer tunnel mode
>>>>>>>>>header, with reasoning.
>>>>>>>>Yes, need to be more precise. Will add.
>>>>>>>
>>>>>>>This one wasn't just clarity. I've actually contradicted what was
>>>>>>>said,
>>>>>>>so pls make sure there wasn't a good reason for why it was like it
>>>>>>>was.
>>>>>>>
>>>>>>>I was most concerned about suggesting this change, because it was the
>>>>>>>only one that caused a technical difference.
>>>>>>Ohh, I didn't read your comments carefully and was just looking at the
>>>>>>text changes... this whole accounting is a mess :-(
>>>>
>>>>I don't think it has to be, if we keep to the rule we just agreed above.
>>>>
>>>>>>Maybe we should only account the IPv6 header itself and the
>>>>>>destination
>>>>>>options...?
>>>>
>>>>Why? I really don't understand why the IPsec accounting was written like
>>>>it was. Pls explain.
>>>The problem about tunneling is that the number of ConEx marked bytes
>>>might be different depending on where at the path you look at the
>>>packets. But I guess that's less of a problem than I initially thought. If
>>>so I guess I can remove this paragraph about accounting in the IPSec
>>>section (if still needed at all).
>>
>>If you're saying that the new definition of what to count removes the
>>problem, because counting is no longer dependent on whether there are
>>encapsulating headers, I agree.
>
>Done.
>
>>
>>>>>>Moreover, isn't this here the same case as with tunneling in
>>>>>>general.
>>>>>>Only if the node that does the encapsulation is ConEx-aware it can
>>>>>>copy
>>>>>>the CDO, otherwise it will be not visible anymore.
>>>>>>
>>>>>>So this should either be a should, or we have to say something
>>>>>>like: if
>>>>>>the node is ConEx-aware it MUST copy the CDO...?
>>>>>And then we can say the same thing for tunneling in general...?
>>>>
>>>>That's surely a circular argument. What would make a tunnel endpoint
>>>>into a ConEx-aware tunnel endpoint, so that it would have to copy the
>>>>CDO? It would only become ConEx-aware if it had code added to look for
>>>>the CDO, and why would it have that code added unless it was going to do
>>>>something with CDO? That's why I think my 'MAY copy as a performance
>>>>optimisation' formula is the best we can do.
>>>What you say above is the point. If the node does not know anything
>>>about ConEx, it simply cannot copy the option, which is the case for
>>>all currently existent nodes. So we cannot say MUST in general. But if
>>>the node does know that ConEx exists for any reason, it really must
>>>copy the CDO...? But you're right that this is a little pathological. I'm willing
>>>to change if that helps understanding/is less confusing.
>>
>>I think we're talking past each other. Given we cannot copy CDO to the
>>outer everywhere, for consistency I don't think that copying CDO to the
>>outer at all is a good idea, UNLESS it's done deliberately as part of an
>>operator's whole approach to handling ConEx. Ie. tunnel endpoints SHOULD
>>NOT copy CDO to the outer by default, but they MAY copy CDO to the outer
>>for a specific purpose (e.g. optimisation for ConEx functions elsewhere
>>in the same operator's network).
>Now understood.
>
>I've tried to make this point a little more clear, not sure if I succeeded:
>"As with any destination option, an ingress 
>tunnel endpoint will not natively copy the CDO 
>when adding an encapsulating outer IP header. In 
>general an ingress tunnel SHOULD not copy the 
>CDO to the outer header as this would changed 
>the number of bytes that would be accounted. 
>However, it MAY copy the CDO to the outer in 
>order to facilitate visibility by subsequent 
>on-path ConEx functions if the tunnel ingree is 
>aware of these nodes and theses nodes are aware 
>of the tunneling. This trades off the 
>performance of ConEx functions against that of tunnel processing. "

OK. Rather than implying that equipment has 
evolved conscious awareness, a better formulation would be something like:
"..the configuration of the tunnel ingress and 
the ConEx nodes is co-ordinated."

Nits:
s/SHOULD not/SHOULD NOT/
s/accounted/counted/
   (in English, accounted is not a transitive 
verb, it has to have 'for' after it)
s/ingree/ingress/
s/theses/these/






>>>>There is no point trying to fix the IPv6 facilities for tunnelling new
>>>>extension headers. The people whose job it was to design this didn't do
>>>>their job. Their design is now burned into IPv6 hardware processors
>>>>everywhere. Full stop.
>>>>
>>>>All we can hope to do is ensure that CDO is not encrypted with ESP. That
>>>>is feasible.
>>>>
>>>>Whatever we do, in many cases, the IPv6 header containing the CDO will
>>>>be encapsulated in other IP headers. So ConEx functions will just have
>>>>to live with that. To find CDO, they will have to look for an IP header
>>>>that encapsulates an upper layer protocol header. And even then, they
>>>>will have to look one level deeper in case IP headers start again.
>>>>There's loads of kit these days that has to do that anyway (e.g. CGNATs
>>>>looking for the transport header or DPI looking for the app-layer). This
>>>>is all we can hope to do at this experimental stage.
>>>I guess we should write this point more explicitly:
>>>"A network node that assesses ConEx information SHOULD search for
>>>encapsulated IP headers until a CDO is found or no further IP headers
>>>can be found." (should or SHOULD?)
>>
>>SHOULD. But I wouldn't word the last clause like that, because searching
>>until no further IP headers can be found could go on and on and on and on.
>>
>>How about:
>>"A network node that assesses ConEx information SHOULD search for
>>encapsulated IP headers until a CDO is found. At any specific network
>>location, the maximum necessary depth of search is likely to be the same
>>for all packets." ?
>
>Done.
>
>>
>>
>>
>>>>We need to prove ConEx is useful, then it can be performance optimised.
>>>>Header parsing performance is generally not a big problem these days.
>>>>
>>>>
>>>>>>>>>* Suggested optional copying of CDO to outer, but also a simpler
>>>>>>>>>'Do
>>>>>>>>>not copy CDO' alternative.
>>>>>>>>I don't really get your SHOULD NOT but MAY here...?
>>>>>>>
>>>>>>>See earlier. Tunnels don't normally understand dest opts, which is
>>>>>>>why I
>>>>>>>said SHOULD NOT. But the MAY is a performance optimisation. Am I
>>>>>>>helping?
>>>>>Okay, understood. But why SHOULD NOT? Isn't it sufficient to say
>>>>>MAY...? (or even MUST/SHOULD if ConEx-aware...?)
>>>>
>>>>You're right, we could leave out the SHOULD NOT. I suggest:
>>>>
>>>>"As with any destination option, an ingress tunnel endpoint will not
>>>>natively copy the CDO when adding an encapsulating outer IP header.
>>>>However, it MAY..."
>>>
>>>Done. But one question: Why MAY and not SHOULD? Wouldn't it actually
>>>be nice if all future tunneling nodes would copy the header.
>>
>>See above about consistency, and ensuring it only happens if the
>>operator is fully aware of the consequences. If it can't be a large
>>majority, it's best not to be any (IMO).
>
>See text above. Or do you want to add more text to this?

Nope, no need for more.

We're getting there!
But we really do need Suresh's expert eye on this.


Cheers


Bob


>Mirja
>
>>
>>HTH
>>(Delayed 'cos it was a public holiday in the UK yesterday.)
>>
>>
>>Bob
>>
>>
>>
>>
>>
>>>>Bob
>>>>
>>>>
>>>>>Mirja
>>>>>
>>>>>
>>>>>>>>>==Security Considerations==
>>>>>>>>>
>>>>>>>>>* Added lots, all pointers to where security issues are discussed
>>>>>>>>>in other places (which is what security directorate reviewers
>>>>>>>>>need).
>>>>>>>>Okay I can add that if you think it's necessary (I would say it's
>>>>>>>>just redundant, but you might be right that it just helps the sec
>>>>>>>>dir).
>>>>>>>
>>>>>>>It's not always obvious which aspects relate to security. Especially
>>>>>>>when the security is structural rather than crypto. So I think these
>>>>>>>sentences are useful to sec dir.
>>>>>>>
>>>>>>>
>>>>>>>>>==IANA==
>>>>>>>>>
>>>>>>>>>* I think the act bits need to be 00 not 10 to avoid ConEx packets
>>>>>>>>>being dropped by non-ConEx nodes (including by non-ConEx
>>>>>>>>>receivers)? But I'm willing to be corrected.
>>>>>>>>I agree; Will ask Suresh why he has put a 10 though.
>>>>>>>
>>>>>>>Yes, he's the right guy to check with.
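
For readers unfamiliar with the 'act' bits discussed above: in the IPv6 specification (RFC 2460, Section 4.2), the two highest-order bits of an option type tell a node what to do when it does not recognise the option, and the third-highest bit says whether the option data may change en route. The sketch below decodes those bits. The function names are hypothetical, and the option type values in the usage note are arbitrary examples rather than assigned codepoints.

```python
# Action for an unrecognised option, keyed by the two high-order 'act' bits.
ACTIONS = {
    0b00: "skip over this option and continue processing the header",
    0b01: "discard the packet",
    0b10: "discard the packet and send an ICMP Parameter Problem message",
    0b11: "discard the packet and send an ICMP Parameter Problem message "
          "only if the destination address was not multicast",
}


def unrecognised_action(option_type: int) -> str:
    """Return what a node that does not know option_type must do."""
    return ACTIONS[(option_type >> 6) & 0b11]


def may_change_en_route(option_type: int) -> bool:
    """Return True if the 'chg' bit (third-highest bit) is set."""
    return bool(option_type & 0x20)
```

With these definitions, `unrecognised_action(0b00011110)` reports that the option is skipped, whereas `unrecognised_action(0b10011110)` reports that the packet is discarded. That discard is the reason given above for preferring 00 over 10.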
>>>>>>>
>>>>>>>
>>>>>>>Bob
>>>>>>>
>>>>>>>
>>>>>>>>Thanks,
>>>>>>>>Mirja
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Regards
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Bob
>>>>>>>
>>>>>>>{Note 1}
>>>>>>>For anyone watching on the list, the tentative idea that Mirja has
>>>>>>>reminded me of is documented in 11.3.1 of my PhD thesis entitled
>>>>>>>"Covert Markings as a Policer Signal".
>>>>>>>
>>>>>>>The potential problem: A ConEx policer punishes punishment. If a
>>>>>>>congestion policer starts dropping packets because the user has
>>>>>>>contributed excessively to congestion, in subsequent rounds the user
>>>>>>>has to re-echo 'L' markings for the policer drops as well. This can
>>>>>>>drive the policer further into 'debit'. This might make it difficult
>>>>>>>for the user to get out of trouble once she's started getting into
>>>>>>>trouble.
>>>>>>>
>>>>>>>The basic idea was that when a congestion policer drops packets
>>>>>>>(because the user is causing more congestion than her allowance), it
>>>>>>>will also remove ConEx markings. Then (if there is some way for the
>>>>>>>receiver to feed this back), the sender knows not to send more ConEx
>>>>>>>marks because these aren't congestion drops, they are policer drops.
>>>>>>>
>>>>>>>We didn't find that double punishment made it hard to get out of
>>>>>>>trouble in any policer experiments so far, so let's not allow for a
>>>>>>>possible solution to a problem that we probably don't even have. The
>>>>>>>current crop of ConEx drafts are experimental anyway. If this problem
>>>>>>>does surface, then we can reconsider.
>>>>>>>
>>>>>>>
>>>>>
>>>>>>>________________________________________________________________
>>>>>>>Bob Briscoe,                                                  BT
>>>>>
>>>>>--
>>>>>------------------------------------------
>>>>>Dipl.-Ing. Mirja Kühlewind
>>>>>Communication Systems Group
>>>>>Institute TIK, ETH Zürich
>>>>>Gloriastrasse 35, 8092 Zürich, Switzerland
>>>>>
>>>>>Room ETZ G93
>>>>>phone: +41 44 63 26932
>>>>>email: mirja.kuehlewind@tik.ee.ethz.ch
>>>>>------------------------------------------

________________________________________________________________
Bob Briscoe,                                                  BT