Re: [iccrg] [tsvwg] Policing and L4S WAS RE: BBR and other congestion control mechanisms

Karen Elisabeth Egede Nielsen <karen.nielsen@tieto.com> Sun, 04 December 2016 13:25 UTC

Return-Path: <karen.nielsen@tieto.com>
X-Original-To: iccrg@ietfa.amsl.com
Delivered-To: iccrg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 309871294F7 for <iccrg@ietfa.amsl.com>; Sun, 4 Dec 2016 05:25:43 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.69
X-Spam-Level:
X-Spam-Status: No, score=-2.69 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001, T_KAM_HTML_FONT_INVALID=0.01] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=tieto.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id E3ITaYqP6NFZ for <iccrg@ietfa.amsl.com>; Sun, 4 Dec 2016 05:25:38 -0800 (PST)
Received: from mail-io0-x22e.google.com (mail-io0-x22e.google.com [IPv6:2607:f8b0:4001:c06::22e]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id D41D31294F1 for <iccrg@irtf.org>; Sun, 4 Dec 2016 05:25:37 -0800 (PST)
Received: by mail-io0-x22e.google.com with SMTP id m5so418308360ioe.3 for <iccrg@irtf.org>; Sun, 04 Dec 2016 05:25:37 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tieto.com; s=google; h=from:references:in-reply-to:mime-version:thread-index:date :message-id:subject:to:cc; bh=JvObAgUsaZyv2NG97rWK7IAjdPLOdTzs7GoRn46CiF8=; b=z0Tfz/ad8V286QFNUAEp8hNWGdj14wpb8xzTzRONJzdH0bhy9TqKuVUA3k+rVbj631 ScjOvVUra/2tzNt8RzvxCxJz5gWVq49Ld15b+UT7xoRCiyCEHGhqa0jv8H8HuxSSBQ5+ jsfAcmda8qezt3Wv5ncAAKXLwnQMQ9gC5gfB4=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:references:in-reply-to:mime-version :thread-index:date:message-id:subject:to:cc; bh=JvObAgUsaZyv2NG97rWK7IAjdPLOdTzs7GoRn46CiF8=; b=Xq36wpFRVqaj8UShJDRP8HfC9M+fOacs0BnVVqJOZMBYzO5gVgYuSkuIXEjk2ZqVap vfHXIPxSFzTCAimf3NLP3vKbrh8abDU23hjWtFAe1GuMyNLlqR7J60Hu26i5wSTQUj57 yKfqv7i4xkgIzRBcRlNbinUKcpPFU3f6rrrCIagkjIvAocg68P81jKtO0waa0wuHm0R+ NA9gKDXehcaKEwy30vCElGH9dQ+eH6jf6KAXgbgY+ByA59PDciRPEvRrq1XaW/cmT/OT PHO30m8h2m449qddcIz4mZ6z1x9zBAMwZzVB5vQYlnkt4RKOaTsBF4EJD6x0xHeWZf6n tp8w==
X-Gm-Message-State: AKaTC02YcS14wVu4/rvC5d/evDUJsq9FTJ6Ngjyhxs5wPwhsVlRIgn8KzOZnNumAltezmQEarNzLrmXSUxHZof4gUZXuRAO5LTEaPK9n4ecDT6RYAvUVr1w9PL7poJi2Mhu6GO6f
X-Received: by 10.107.204.5 with SMTP id c5mr42747814iog.143.1480857936208; Sun, 04 Dec 2016 05:25:36 -0800 (PST)
From: Karen Elisabeth Egede Nielsen <karen.nielsen@tieto.com>
References: <73e2a785b099e897990704d3fdd8c078@mail.gmail.com> <6f068e27c5c0c61a87c5daed27cb0da3.squirrel@server.dnsblock1.com> <c97583d1d72545e5ab7455323afc115b@mail.gmail.com> <CE03DB3D7B45C245BCA0D243277949362F74E34B@MX307CL04.corp.emc.com> <7599733a-d728-0d7d-3615-e8d3b4ec70c6@bobbriscoe.net>
In-Reply-To: <7599733a-d728-0d7d-3615-e8d3b4ec70c6@bobbriscoe.net>
MIME-Version: 1.0
X-Mailer: Microsoft Outlook 15.0
Thread-Index: AQHBY0DbWwZKHzmeTxjjyulSO/JaxAHVsgGtAPyDYwgCI8Q2MgKBTbZaoN3bVvA=
Date: Sun, 04 Dec 2016 14:25:33 +0100
Message-ID: <b01082ceac99a176e7e753ebf2694a3e@mail.gmail.com>
To: Bob Briscoe <in@bobbriscoe.net>, "Black, David" <David.Black@dell.com>
Content-Type: multipart/alternative; boundary="94eb2c1b7a6c4042600542d51b78"
X-DomainID: tieto.com
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/UMyP6EWyy4zSiEVYDplskaKQ3Ks>
Cc: iccrg@irtf.org, tsvwg@ietf.org
Subject: Re: [iccrg] [tsvwg] Policing and L4S WAS RE: BBR and other congestion control mechanisms
X-BeenThere: iccrg@irtf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussions of Internet Congestion Control Research Group \(ICCRG\)" <iccrg.irtf.org>
List-Unsubscribe: <https://www.irtf.org/mailman/options/iccrg>, <mailto:iccrg-request@irtf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/iccrg/>
List-Post: <mailto:iccrg@irtf.org>
List-Help: <mailto:iccrg-request@irtf.org?subject=help>
List-Subscribe: <https://www.irtf.org/mailman/listinfo/iccrg>, <mailto:iccrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Sun, 04 Dec 2016 13:25:43 -0000

Hi,



Just to be clear: I support the ECN work and its potential adoption.

My comments are not meant to disqualify the work; rather the opposite.



Inline below.



BR, Karen



*From:* iccrg [mailto:iccrg-bounces@irtf.org] *On Behalf Of *Bob Briscoe
*Sent:* 1. december 2016 02:23
*To:* Black, David <David.Black@dell.com>; Karen Elisabeth Egede Nielsen <karen.nielsen@tieto.com>
*Cc:* iccrg@irtf.org; tsvwg@ietf.org
*Subject:* Re: [iccrg] [tsvwg] Policing and L4S WAS RE: BBR and other
congestion control mechanisms



David,

On 19/11/16 04:19, Black, David wrote:

Commenting as an individual (not WG chair), at least for now ...



If I read this correctly, that L4S is not compatible with
Diffserv-style bandwidth

policing and token buckets in particular,

Well, no. L4S is not incompatible with Diffserv-style bandwidth policing.

Because, on loss, the L4S requirements say a source must fall back to a
Classic loss response (e.g. Reno, Cubic, or something equivalent if it's a
real-time transport). So a token bucket (tb) bandwidth policer (2-colour,
3-colour, 1-rate, 2-rate, etc.) will work just fine to limit an L4S source.
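For concreteness, here is a minimal sketch (with made-up rate and burst parameters) of a single-rate, two-colour token bucket drop policer of the kind discussed here. Note that nothing in its logic ever sets an ECN mark: an out-of-contract packet is signalled with loss, which is exactly why an L4S source behind such a policer falls back to its Classic loss response.

```python
class TokenBucketPolicer:
    """Single-rate, two-colour token-bucket drop policer (illustrative).

    Tokens accrue at the contracted rate up to a burst depth.  A packet
    that finds enough tokens is forwarded ("green"); otherwise it is
    dropped ("red").  There is no ECN marking anywhere in this logic,
    so an L4S flow traversing it sees loss, not CE marks.
    """

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0        # token fill rate, bytes/s
        self.burst = float(burst_bytes)   # bucket depth, bytes
        self.tokens = float(burst_bytes)
        self.last = 0.0

    def police(self, now, pkt_len):
        # Refill for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= pkt_len:
            self.tokens -= pkt_len
            return "forward"              # in-contract ("green")
        return "drop"                     # out-of-contract ("red")

# 1500-byte packets every 1 ms = 12 Mb/s offered against an 8 Mb/s contract:
policer = TokenBucketPolicer(rate_bps=8_000_000, burst_bytes=15_000)
verdicts = [policer.police(now=i * 0.001, pkt_len=1500) for i in range(100)]
```

Once the initial burst allowance is spent, the excess 4 Mb/s is removed purely by drops.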

I think what Karen meant was the other way round: That an L4S source will
not get L4S service from a tb bandwidth policer,

*[Karen Elisabeth Egede Nielsen] Yes that was what I was addressing.*

just as it won't get L4S service from a non-L4S bottleneck router or
switch. Precisely because L4S falls back to non-L4S (Classic) behaviour in
both cases; whenever it gets a loss.

*[Karen Elisabeth Egede Nielsen] And you’re doing a great deal to provide
mechanisms that routers can deploy to support good interworking with
ECN/L4S.*

*We need the same for policers.*

*It is not really enough that ECN/L4S does no worse than Classic where
there is a drop policer, if this can often be the case.*



Interaction with Policing is in the L4S architecture doc (which used to be
the L4S problem statement):
See section 8.1 Traffic (Non-)Policing
<https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-arch-00#section-8.1>,
which is in section 8 'Security Considerations'.

Section 8.2 'Latency Friendliness'
<https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-arch-00#section-8.2>
explains that some operators might introduce burst policing for L4S.
However, 8.2 recommends that we start without policing and instead we
recommend ways to get a good service while minimizing burstiness. I call it
'latency friendliness', because it's based on "social pressure"; somewhat
like the traditional approach to TCP friendliness, where we recommend good
algorithms in the expectation they will avoid the need for policing. But it
doesn't preclude operators deploying policers.

*[Karen Elisabeth Egede Nielsen] I think that ECN would be much more
powerful if one could devise how to make policing interwork with it.*

*I would rather see ECN not as a replacement for Diffserv, but as a
supplement, as also indicated in section 5.2 of the draft (the underlined
parts below):*



Diffserv:  Diffserv addresses the problem of bandwidth apportionment

      for important traffic as well as queuing latency for delay-

      sensitive traffic.  L4S solely addresses the problem of queuing

      latency.  Diffserv will still be necessary where important traffic

      requires priority (e.g. for commercial reasons, or for protection

      of critical infrastructure traffic).  Nonetheless, if there are


^^^^^^^^^^^^^^^^^^^^^^

      Diffserv classes for important traffic, the L4S approach can

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

      provide low latency for _all_ traffic within each Diffserv class^

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

      (including the case where there is only one Diffserv class).



*Something else: suddenly in this paragraph it is written that* *“L4S
solely addresses the problem of queuing latency.”* *As also indicated
elsewhere in the draft, keeping loss down is also a primary part of L4S
and a prerequisite for L4S to work well.*



*I think that the draft could better motivate the issue with the policers
and the necessity to deploy L4S-friendly policers (as you discuss below).*

*This was my point.*



*BR, Karen*



then the assertion that L4S is universally

applicable to all Internet traffic just took a serious hit.

I guess you're referring to the opening sentence of the abstract of the L4S
architecture: "...a new service that the Internet could provide to
eventually replace best efforts for all traffic...". That doesn't mean that
when a sender turns on the ECT(1) codepoint it magically creates an L4S
implementation in every bottleneck and every policer that the sender
traverses.

Nonetheless, the point of L4S is for performance to be so good, that it
magically makes operators want to deploy an L4S AQM at their bottlenecks
(and to deploy more L4S-friendly policers).

I might suggest considering

which PHBs are more vs. less appropriate for L4S, as token buckets are rather

common out there and will have to be dealt with as part of incremental

deployment of L4S.  Karen - thank you for pointing this out.

I think you have got the wrong end of the stick. When we said "for all
traffic", we meant the default L4S service could give the service that the
traffic requires (that is, for example, the EF or AF /service/, not the EF
or AF /per hop behaviour/). We didn't mean the traffic has to be marked EF
or AF to get that service, we just meant that the default L4S service would
be /roughly/ as good as EF or AF, without having to implement EF or AF.

The inclusion of EF is probably an overclaim - there's no explicit
mechanism in L4S to give any guarantee of EF-like service. Nonetheless, I
included EF in my list, because the Internet has always proved good at
meeting performance goals without being able to guarantee that it will.

That said, L4S gives only the low queueing delay feature of Diffserv, but
not the "priority over bandwidth during anomalies" feature that some
Diffserv classes give over others. We will need to make that clear - it's
not clear at the moment.

For bandwidth priority, it would be appropriate to use relevant non-default
Diffserv PHBs, and it would be appropriate to use L4S within those classes.
I have analysed this for the specific case of the classes used by my
previous employer, but not for all the Diffserv classes that the IETF has
ever dreamed of. So yes, we need to suggest which PHBs are appropriate for
L4S. I'd be happy to work with someone doing that, but it's not my priority
at the moment. I'll certainly add an open issues section in the
architecture doc and include this so it doesn't get lost.






In addition to Karen's concern, I have another concern with the text quoted from

the L4S architecture draft -  it contains an assertion that networks
ought to just

ignore the L4S identifier and pass that identifier through unchanged
for traffic that

does not receive L4S service.  The analogous exhortation for DSCPs has failed

miserably for operational reasons - bleaching to zero is entirely too common so

that traffic is marked for the service it is supposed to (authorized
to) receive from

the network, or as stated in Section 2 of draft-ietf-tsvwg-diffserv-intercon:



   RFC2474's recommendation to forward traffic with unrecognized DSCPs

   with Default (best effort) service without rewriting the DSCP has

   proven to be a poor operational practice.  Network operation and

   management are simplified when there is a 1-1 match between the DSCP

   marked on the packet and the forwarding treatment (PHB) applied by

   network nodes.  When this is done, CS0 (the all-zero DSCP) is the

   only DSCP used for Default forwarding of best effort traffic, and a

   common practice is to remark to CS0 any traffic received with

   unrecognized or unsupported DSCPs at network edges.



If the L4S identifier selects a distinct low latency network service,
it will become a

good network operational practice to remove that identifier from traffic that is

not authorized to receive that low latency service.

Bleaching ECT(1) cannot be ruled out as a future scenario. But we have
tried to do as much as we can to head off this possibility:
1) By explaining exactly how operators can offer exclusivity without
needing to bleach
2) We are starting from a position where the ECN field is near-universally
not bleached. So to bleach, an operator will have to actively destroy
something they currently support. Operators know they would risk referral
to their local regulator for that.
3) The ECN field is part of the Internet's congestion control system, so
there is a higher bar to tampering with ECN than there was for Diffserv.

That last point is subtle.
L4S is not a "network service" like Diffserv. L4S is a service largely
induced by the sender behaviour, with the network playing a minor but
important complementary role (by notifying congestion at a shallow queue
threshold). So, unlike Diffserv, the ECN field is not just a request down
the layers asking the network for a particular service. It is also the
place where congestion notification is carried up the layers, from network
to transport.


Sorry this reply took a while. When you sent it, I had no laptop (my
charger failed on the Wed of the IETF). I got it fixed, but I'm only just
getting back to those emails I missed.

Cheers


Bob







Thanks, --David



-----Original Message-----

From: tsvwg [mailto:tsvwg-bounces@ietf.org] On Behalf Of Karen Elisabeth Egede Nielsen

Sent: Friday, November 18, 2016 4:55 AM

To: in@bobbriscoe.net

Cc: iccrg@irtf.org; tsvwg@ietf.org; Michael Welzl

Subject: Re: [tsvwg] Policing and L4S WAS RE: [iccrg] BBR and other congestion

control mechanisms



Hi Bob,



Yes, well aware that L4S will only be as good as the best ISP on the path.



[Karen Elisabeth Egede Nielsen] yes.



One would hope that an operator deploying an L4S AQM would ensure its

policers were compatible, but of course your point is that traffic can

traverse

other operators too.



[Karen Elisabeth Egede Nielsen] I wish :-). I am not close to having that
grand overview at all; I am simply trying to understand the dependencies
and how to make this work in a product.



Using token bucket policing with drop to enforce SLAs pops up as an obvious

issue then.



This is perfectly fine and as it should be, BUT it is actually a very
important point when looking at the QoS features being developed for
products, e.g., smart NICs. If we have token buckets there, or don't get
ECN into the shaper logic, then the interworking of Diffserv QoS and ECN
will really not be possible as indicated in the draft.



I am generally assuming that operators would initially want to provide

exclusive access to L4S for their own (probably paying) customers.

l4s-arch says:



   Certain network operators might choose to restrict access to the L4S

   class, perhaps only to customers who have paid a premium.  In the

   packet classifier (item 2 in Figure 1), they could identify such

   customers using some other field than ECN (e.g. source address

   range), and just ignore the L4S identifier for non-paying customers.

   This would ensure that the L4S identifier survives end-to-end even

   though the service does not have to be supported at every hop.  Such

   arrangements would only require simple registered/not-registered

   packet classification, rather than the managed application-specific

   traffic policing against customer-specific traffic contracts that

   Diffserv requires.





[Karen Elisabeth Egede Nielsen] Yes, I saw this text, but it doesn't
really say that the ECN/L4S service requires that the operators do not
ingress-BW-police the traffic - sure, this is clear from a technical point
of view (also finally even for me now ..), but as BW policing is really a
basic part of the Diffserv and QoS architecture, I think it could be good
to stress that point in Section 5.2.

You may use L4S to regulate traffic within a Diffserv class, but the
traffic must not be subject to non-ECN-regulated drop at any point - not
in potential prior Diffserv classification steps either.
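The registered/not-registered classification quoted from l4s-arch above can be sketched as follows. All names, the example prefix, and the two-queue model are my own assumptions for illustration, not anything specified in the draft:

```python
import ipaddress

# Hypothetical registered (e.g. paying) customer prefix -- an assumption
# for illustration, not from the draft.
REGISTERED_PREFIXES = [ipaddress.ip_network("203.0.113.0/24")]

ECT1, CE = 0b01, 0b11   # ECN codepoints that carry the L4S identifier

def classify(src_ip, ecn):
    """Select the queue per the quoted text: only registered sources get
    the L4S queue; everyone else is queued as Classic, but their ECN
    field is ignored rather than bleached, so the L4S identifier
    survives end-to-end."""
    registered = any(ipaddress.ip_address(src_ip) in p
                     for p in REGISTERED_PREFIXES)
    if registered and ecn in (ECT1, CE):
        return "l4s-queue"
    return "classic-queue"
```

The point of the sketch is that the classifier keys on the source address, not the ECN field, so denying the service never requires rewriting ECN.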



As an example, if you like: in the OpenStack QoS API there is (as yet
only) a simple BW policer function. Products are being built on this. If
we want to provide L4S, we had better be aware either to disable the
function or to see that it is done properly (..).

Virtual services that are afforded L4S may not have a physical NIC at
their sole disposal, and BW policing is being used to control the shared
access to the NIC resources. Better to make clear from the start that if
one ever wants to be able to provide L4S, then these BW policing functions
must eventually implement ECN-enabled AQM.



BR, Karen



Bob





On Thu, November 17, 2016 12:36 pm, Karen Elisabeth Egede Nielsen wrote:

Hi Bob, All,





Very interesting from my perspective. Thanks a lot for this

presentation and mail discussion.



From the point of view of AccECN/L4S, I wonder whether it would be worth
including some text about token bucket policing and L4S in the L4S
Architecture draft, draft-briscoe-tsvwg-l4s-arch-00.





There are some considerations there already in section 5.2 and section
8.1, but would it be relevant to make it more explicit that the
following statement (section 5.2)



Nonetheless, if there are

Diffserv classes for important traffic, the L4S approach can provide

low latency for _all_ traffic within each Diffserv class (including

the case where there is only one Diffserv class).



is only really truly valid if the ingress Diffserv
classification/policing/shaping itself - e.g., a BW policer - does not
implement drop, and, in the case of a shaper (and queuing), ECN must be
implemented by the shaper as well? I suppose the latter somehow falls
under the ECN recommendations for AQM, so perhaps these considerations
(and the differences from a policer that does drop) are already totally
clear to everybody.



ECN is, in a sense, the alternative to making the CC aware of latency
increase, but as pointed out here, with plain-drop token bucket policing
we really have neither.



Thanks





BR, Karen





-----Original Message-----

From: iccrg [mailto:iccrg-bounces@irtf.org] On Behalf Of Bob Briscoe

Sent: 17. november 2016 08:35

To: Michael Welzl <michawe@ifi.uio.no>

Cc: iccrg@irtf.org; Yuchung Cheng <ycheng@google.com>

Subject: Re: [iccrg] BBR and other congestion control mechanisms





Michael,





On Thu, November 17, 2016 1:49 am, Michael Welzl wrote:



Hi,







An add-on to my previous email below:







Mechanisms like Veno and CTCP, if I get this right, only decide that

a loss is due to congestion when it also comes with delay growth. I

just witnessed your presentation in MAPRG about policing, and how

BBR interacts with it:

https://www.ietf.org/proceedings/97/slides/slides-97-maprg-traffic-policing-in-the-internet-yuchung-cheng-and-neal-cardwell-00.pdf

(Thanks for giving this presentation, this was very interesting.)



So this would mean that Veno, CTCP etc. would get it wrong in the

face of a token bucket - there would be loss without delay growth,

yet it IS due to a rate limit. Given the prevalence of policing that

you have shown, and how BBR nicely interacts with it, I find this

very interesting (and good); I’m not sure if this has been

considered in the design of any congestion control before?  Does

anyone know of any other examples?

1/ I would be interested if there is enough data in the BBR dataset

to zoom in on clients in specific ISPs. I know at least one traffic

management vendor uses a congestion policer that limits the rate of

heavy users, but the rate at which it limits varies all the time,

because it is the outcome of a congestion limit, not a rate limit. More

precisely, it enforces a "congestion rate token bucket" limit on the

per-user contribution to shared congestion.
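A rough sketch of such a "congestion rate token bucket", under my own assumptions about how the vendor's mechanism might work (the class name and all parameters are invented): tokens are consumed not by raw bytes, but by the user's bytes weighted by the bottleneck's current loss/marking probability, so the tolerated bit rate floats with the congestion level.

```python
class CongestionPolicer:
    """Sketch of a per-user congestion-rate token bucket (hypothetical).

    Tokens are drained by the user's contribution to shared congestion
    -- bytes sent times the bottleneck's current loss/marking
    probability -- not by bytes alone, so the tolerated bit rate rises
    when the network is uncongested and falls when it is congested."""

    def __init__(self, congestion_rate_Bps, burst_bytes):
        self.fill_rate = congestion_rate_Bps   # allowed congestion-bytes/s
        self.burst = float(burst_bytes)
        self.tokens = float(burst_bytes)
        self.last = 0.0

    def update(self, now, bytes_sent, congestion_prob):
        # Refill for the elapsed time, then charge the congestion share.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        self.tokens -= bytes_sent * congestion_prob
        return self.tokens >= 0   # False => user exceeded the allowance

policer = CongestionPolicer(congestion_rate_Bps=100, burst_bytes=1_000)
# 1 MB/s at 0.01% marking stays within the allowance; at 1% it does not:
within = policer.update(1.0, bytes_sent=1_000_000, congestion_prob=0.0001)
exceeded = policer.update(2.0, bytes_sent=1_000_000, congestion_prob=0.01)
```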



This would look similar to the varying capacity of a scheduler, but

distinguished from 3G/LTE or WiFi by only tiny additional delay,

because  the measurement of shared congestion in the policer is based

on an AQM on a very high bandwidth link (typically 10G or 40Gb/s), so

queuing delay is very low before loss appears.





2/ HULL (High throughput, ultra-low latency) uses DCTCP at the sender. In

the network it marks ECN on the real packets based on the length of a

'phantom' queue, which is a number incremented by the size of every

arriving packet, and decremented slightly more slowly than the drain

rate of the real queue. This keeps the real queue extremely short.



This is an example where there is zero delay build-up before the

intended operating point. Because of the lack of delay, it also uses

hardware pacing on the sender NIC, because there is no queuing delay

to space out TCP's ACK clock otherwise.
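The phantom-queue mechanism just described can be sketched as follows (the drain fraction, marking threshold, and link rate are illustrative values, not HULL's actual parameters):

```python
class PhantomQueue:
    """Sketch of the phantom queue described above (illustrative values).

    The counter is incremented by the size of every arriving packet and
    drained slightly more slowly than the real link, so it grows -- and
    triggers ECN marking -- even while the real queue stays near-empty."""

    def __init__(self, link_Bps, drain_fraction=0.95, mark_thresh_bytes=3_000):
        self.drain = link_Bps * drain_fraction   # phantom drain rate, bytes/s
        self.thresh = mark_thresh_bytes
        self.counter = 0.0
        self.last = 0.0

    def on_packet(self, now, pkt_len):
        # Drain the counter for the elapsed time, then add the arrival.
        self.counter = max(0.0, self.counter - (now - self.last) * self.drain)
        self.last = now
        self.counter += pkt_len
        return self.counter > self.thresh   # True => set CE on this packet

# 1500-byte packets arriving at exactly the link rate (~10 Mb/s here):
# the real queue would stay empty, yet the phantom counter creeps up
# and marking begins.
q = PhantomQueue(link_Bps=1_250_000)
marks = [q.on_packet(i * 0.0012, 1500) for i in range(100)]
```

Because the phantom drain is 5% slower than the link, a flow running at line rate accumulates phantom backlog and gets marked before any real queue forms.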





I don't know whether any DC operators are using HULL (does anyone?).





You would not want to put Not-ECT packets (like current BBR) into

such a queue structure.





Bob





Cheers,

Michael











On Nov 17, 2016, at 9:57 AM, Michael Welzl <michawe@ifi.uio.no>

wrote:







Hi Yuchung,









On Nov 17, 2016, at 8:22 AM, Yuchung Cheng <ycheng@google.com> wrote:











On Wed, Nov 16, 2016 at 5:14 PM, Michael Welzl <michawe@ifi.uio.no> wrote:





Dear all,







Thanks very much to the authors of BBR for an interesting

presentation yesterday! (see

https://www.ietf.org/proceedings/97/slides/slides-97-iccrg-bbr-congestion-control-01.pdf
if you missed it).



I’d be interested to know how BBR compares to some other TCP

variants. For example, TIMELY:



http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p537.pdf








Any idea, anyone?







( I tried, honestly, I did - but I just can’t manage to hold back

a tiny bit of sarcasm here:  ...is BBR also 133 times faster than

TIMELY?  :-)  )





I'd like to explain further about the 133x result of BBR vs Cubic first.





This is a spot-test. It is not a general quantification of BBR's

improvement on our B4 network (we'll publish that data in a

separate report). This is an 8MB RPC every 30s on a warmed

connection on east-US <-> west-EU using the lowest QoS (best

effort). That particular path has quite a bit of burst induced

losses. In this case, Cubic just died due to 1/sqrt(p). Notice the

network path is fairly different from the typical Internet access:

it's +10G from start to end, with little buffers (compared to BDP)

on the way to build persistent large queues. The losses are caused

by burst traffic (of the same or higher QoS). What the 133x number

highlighted was that when loss does not indicate a persistent queue,

using it as the sole congestion signal is brittle. It is not to say

loss signal is completely useless, but it does not always indicate

self-induced queues.
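The 1/sqrt(p) point above can be illustrated numerically with the classic Mathis et al. steady-state throughput model (a Reno-style approximation used only as a stand-in; Cubic's exact loss response differs, but the scaling message is the same):

```python
from math import sqrt

def mathis_throughput_bps(mss_bytes, rtt_s, loss_prob):
    """Mathis et al. model: rate ≈ (MSS/RTT) * sqrt(3/2) / sqrt(p).
    A Reno-style approximation, used here only to show the 1/sqrt(p)
    scaling of loss-based congestion control."""
    return (mss_bytes * 8 / rtt_s) * sqrt(1.5) / sqrt(loss_prob)

# Transcontinental path: RTT 100 ms, MSS 1460 bytes.
low_loss = mathis_throughput_bps(1460, 0.1, 1e-6)   # 0.0001% loss
high_loss = mathis_throughput_bps(1460, 0.1, 1e-2)  # 1% burst loss
# A 10,000x rise in loss probability costs sqrt(10,000) = 100x in
# throughput, even if none of that loss reflects a persistent queue.
```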

Thanks for this information!









Comparing TIMELY and BBR is at one level comparing apples to

oranges. TIMELY is designed for a very specific hw/transport

(RDMA/kernel-bypass) and networks (intra-DC) with the goal to

reduce tail latency. Hence it uses specific NIC features like HW

timestamps to perform its control.

Well … I understand it was made for intra-DC communication, and it

uses these features to get the necessary timer granularity. I can’t

see why this wouldn’t make it applicable to networks with larger

RTTs?







BBR is designed to be a generic good transport to achieve high

tput with adequate queue. It faces an unknown network and sole

feedback is TCP acks, hence it must first perform an exponential

search to get a basic sense of the network. It also needs to deal

with all sorts of adverse issues in the wild Internet as we've

described.

Right… E.g. I guess Timely wouldn’t work so well when facing

losses.







A more interesting comparison would be vs other Internet-CC like

Vegas (which we are working on).





No, I think that’s a terribly uninteresting comparison to do. I’d

even call it ridiculous, as many many MANY mechanisms work better

than Vegas, after all it's now 16 years old.

May I propose comparing against the state of the art, not the state

of 2000?







Some of the direct follow-ups on Vegas for example, like FAST? Or

delay-based algorithms that also use a bit more information than

just delay?



E.g. Westwood comes to mind, as a mechanism that also looks at the

rate at which ACKs arrive. AFAIK this one is readily available in

Linux for comparison.



Also, some TCP variants try to distinguish between random packet

loss and congestion-induced packet loss - so similar to BBR, you

wouldn’t overreact to random losses with them. Veno is an example,

and CTCP (Coded TCP, not Compound TCP) too.







Cheers,

Michael







_______________________________________________

iccrg mailing list iccrg@irtf.org

https://www.irtf.org/mailman/listinfo/iccrg





















-- 

________________________________________________________________

Bob Briscoe                               http://bobbriscoe.net/