Re: [Detnet] The scalabilities of the queuing solutions in ADN

Toerless, thanks for the careful review.
I appreciate your interest.
I have commented inline, marked with "JJ:".

Best,
Jinoo

On Fri, Dec 16, 2022 at 4:02 PM Toerless Eckert <tte@cs.fau.de> wrote:

> On Tue, Dec 13, 2022 at 02:21:48PM +0900, 정진우_교수_휴먼지능정보공학전공 wrote:
> > Dear Toerless, and members of DetNet WG,
> >
> > Thanks for pointing out the scalability issue on the queuing solutions,
> > at yesterday's interim meeting.
> > I would like to add some comments, which I couldn't make due to limited
> > time.
> > We have to always be deterministic, you know :)
>
> I very much liked your presentation and would have liked it if we had a lot
> more time.

JJ: Thanks a lot. I hope we will have more time in the future.

>
>
> Hopefully the time constraints for discussions in online meetings will be
> less if/when we have a design team. I'd like a format like they have in
> the MPLS WG.
>
> > The ATS IR has to maintain individual flow states, and it is the limiting
> > factor on its scalability.
> > However, the IR has only one queue per input port,
>
>
>      PE1 ---->             --> P4 --> PE3
>                P1 ----> P2
>      PE2 ---->             --> P5 --> PE4
>
> Assume we are just looking at two aggregate flows, F1: PE1->PE3 and F2:
> PE2->PE4.
> Normally, we use one IR on P2 for F1 and another for F2. This is how i read
> the UBS calculus paper. If we wanted to have just one IR for both
> flows, this looks as if it could result in a different calculus. Where
> would i find the calculus description of that (either explaining how
> it does or does not result in the same or different latency calculus for
> the P1->P2 hop for F1 and F2)?
>

JJ: The per-input-port queues are placed at an output port module.
To be precise, there is one queue per input/output port pair.
In P2, F1 and F2 are sent to different output ports and thus to separate queues.
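
The queue placement described above can be illustrated with a minimal sketch. It assumes one FIFO per (input port, output port) pair held at the output side; the class name, method name, and port labels are hypothetical and only mirror the P1/P2/P4/P5 topology in this thread.

```python
from collections import defaultdict, deque


class OutputPortModule:
    """Hypothetical model: one FIFO per (input port, output port) pair."""

    def __init__(self):
        # Queues are created lazily, keyed by the port pair.
        self.queues = defaultdict(deque)

    def enqueue(self, in_port, out_port, packet):
        self.queues[(in_port, out_port)].append(packet)


# At P2, both aggregates arrive from P1 but leave on different output ports,
# so they land in different queues.
p2 = OutputPortModule()
p2.enqueue("from_P1", "to_P4", "F1-pkt")
p2.enqueue("from_P1", "to_P5", "F2-pkt")
```

The number of queues in this model grows with the number of port pairs, not with the number of flows.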

>
> The way i imagine the challenge here is that i would have to configure the
> IR against the sum of the bitrates of F1 and F2 and the sum of the
> burst sizes of F1 and F2. But then the burstiness of F1 on the IR of P4
> would not be the original burstiness of F1 anymore, but the burstiness of
> that aggregate. I guess the IR on P4 could bring down that burstiness
> again, but i can't fathom how the calculus for the buffering required
> by that IR would stay unchanged over the non-aggregated case.
> So it seems this aggregation approach might at least have implications
> on the buffering and latency calculus, so i'd like to see that calculus.
>
> Certainly also more work for a controller having to take this all into
> account ... i guess.
>

JJ: Let's say F1 is a flow aggregate (FA), and f1 is a flow within F1.
The IR regulates f1 (not F1) to restore its original shape according to its
initial parameters.
The max burst of F1 therefore remains the sum of the max bursts of its
flows.
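
As a rough illustration of the point above, here is a minimal sketch of a single FIFO whose head packet is checked against its own flow's token bucket. It assumes each flow is described by a (rate, burst) pair and that no packet is larger than its flow's burst; all names are hypothetical, and this is not the normative ATS algorithm.

```python
from dataclasses import dataclass


@dataclass
class FlowState:
    rate: float        # sustained rate, bits per second
    burst: float       # maximum burst, bits
    tokens: float      # tokens currently available, bits
    last: float = 0.0  # time of the last state update, seconds


def eligibility_time(state, now, size):
    """Earliest time >= now at which `size` bits conform; updates the flow state.

    Assumes size <= state.burst.
    """
    tokens = min(state.burst, state.tokens + (now - state.last) * state.rate)
    wait = max(0.0, (size - tokens) / state.rate)
    state.tokens = tokens + wait * state.rate - size
    state.last = now + wait
    return now + wait


def regulate(fifo, flows):
    """Release packets in FIFO order; only the head packet is ever examined."""
    releases = []
    prev = 0.0
    for arrival, flow_id, size in fifo:
        now = max(arrival, prev)  # a packet cannot overtake the one before it
        prev = eligibility_time(flows[flow_id], now, size)
        releases.append(prev)
    return releases


def aggregate_burst(flows):
    """Max burst of the aggregate when every flow is regulated individually."""
    return sum(f.burst for f in flows.values())
```

Each packet costs one state lookup and one state write, regardless of how many flows are active, and the aggregate's burst stays at the sum of the per-flow bursts.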

> > therefore I think it is feasible to be implemented in core routers.
> > The only real-time operation here is to look up the flow table and to
> > overwrite once per packet,
> > no matter how many flows are active. (This was also stated by the
> > authors of ATS.)
>
> As written in draft-eckert-detnet-bounded-latency-problems, the main issue
> is the question of cost of the high-speed read/write cycle to the
> same register (IR memory) when we go beyond 1 Gbps routers into
> 10, 100, 1000 Gbps interfaces. And the fact that maybe just some
> low percentage of traffic may be DetNet but the memory speed needs
> to support the full link rates.

JJ: Yes, I agree. A fast memory access would be crucial.
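
To put a number on the memory speed concern, the following back-of-the-envelope sketch computes the time available per packet when a link carries back-to-back minimum-size Ethernet frames. It assumes 64-byte frames plus 8 bytes of preamble/SFD and a 12-byte inter-frame gap; the function name is hypothetical.

```python
# Minimum on-wire footprint of one Ethernet frame, in bytes:
# 64 (minimum frame) + 8 (preamble and SFD) + 12 (inter-frame gap).
MIN_WIRE_BYTES = 64 + 8 + 12


def per_packet_budget_ns(link_gbps):
    """Time between back-to-back minimum-size frames, in nanoseconds."""
    bits = MIN_WIRE_BYTES * 8
    return bits / link_gbps  # bits / (Gbit/s) gives nanoseconds


for gbps in (1, 10, 100, 1000):
    print(f"{gbps:>5} Gbps: {per_packet_budget_ns(gbps):.3f} ns per packet")
```

Under these assumptions the budget for one read/write cycle on the IR state shrinks from 672 ns at 1 Gbps to roughly 6.7 ns at 100 Gbps.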

>
>
> > But I understand if there are millions of flows, then simply looking up
> > can take time.
>
> Lookup alone can be easily pipelined and parallelized. Otherwise we couldn't
> build internet routers with millions of BGP forwarding lookup entries. The
> read/write cycle for IR state at smallest-packet-size speed is more challenging.
>
> > The FAIR (Flow aggregate - IR) framework can be considered as a
> > generalized ATS.
>
> Did you provide the references to the appropriate docs somewhere?
> (hopefully we could start to build e.g. a wiki page for the references
>  to all the queuing mechanisms we want to consider).
>

JJ: Yes. The presentation material has the reference [FAIR].

>
> > Therefore it has the same scalability as the ATS.
>
> ? Seems like you're claiming above that it has better scalability?
>
> > However, in the FAIR framework, the IRs can be placed only where they are
> > necessary.
> > Moreover, the IR can be implemented independently of queuing and
> > forwarding functions,
> > for example in a separate device, in a more flexible way.
> >
> > The PFAR (Per port FA regulator) has only one queue per input port,
> > and maintains a flow aggregate (FA) state (per input port),
>
> One of the core aspects of UBS that i felt to be very useful was the
> per-hop priority of a flow because it did allow differentiation of
> bounded latency for flows in a fine-grained manner (arguably at good
> cost to the Admission Controller).
>

JJ: The ATS (or UBS) for a flow in one hop is a combination of a FIFO
system and an IR.
The level of flow protection or the service differentiation depends on how
the FIFO system behaves, not the regulator.
If it is just a single FIFO queue, then the flows are mixed and their
latencies are indistinguishable.
But the FIFO system can be more complex to provide differentiated services.
For example, as in the TSN, the FIFO system could be defined per class.

>
> > instead of individual flow states. It is therefore more scalable.
> > As flows become active and inactive, the FA state parameters
> > have to be modified, but the modification process is not a real-time
> > operation
> > and can be executed in bulk.
>
> That of course is the other big underlying issue of ATS: service
> providers have, in their majority, all attempted (because of
> bad experience) to remove the need for dynamic state change in
> P router forwarding because of the performance, operational, and reliability
> challenges they have experienced. They accept it (because
> there is no way around it) when it's because of a network topology
> change, but they certainly would prefer solutions that can live without
> this factor when it's just for subscriber traffic changes.
>

JJ: Yes. Dynamic flow state change is one of our big challenges.

>
> > The C-SCORE (Work-conserving stateless core)
> > enjoys the best flow protection and the best scalability.
> > Usually the classical fair queuing algorithms (like Virtual Clock)
> > require per-flow queues,
> > but the C-SCORE can be implemented, as it is suggested in the
> > draft-joung-detnet-asynch-detnet-framework-01, with per-input port
> > queues.
> > The scheduler only has to decide which HoQ packet has the smallest finish
> > time.
>
> I couldn't quickly determine from your draft what the normative
> reference for C-SCORE is. Is your draft the original spec?
>

JJ: No, the reference is currently under construction. :)
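
Until that reference is available, the scheduling decision discussed in this thread (serve the head-of-queue packet with the smallest finish time among the per-input-port queues) can be sketched as follows. This is only an illustration: it assumes the finish time is already present in each packet's metadata, says nothing about how C-SCORE computes or updates it, and uses hypothetical names throughout.

```python
from collections import deque


def pick_next(queues):
    """Return the input port whose head-of-queue packet has the smallest
    finish time, or None if every queue is empty."""
    candidates = [
        (q[0]["finish_time"], port) for port, q in queues.items() if q
    ]
    if not candidates:
        return None
    _, port = min(candidates)
    return port


# One FIFO per input port; each packet carries its finish time as metadata.
queues = {
    "in1": deque([{"finish_time": 5.0}]),
    "in2": deque([{"finish_time": 3.0}, {"finish_time": 9.0}]),
    "in3": deque(),
}
first = pick_next(queues)
queues[first].popleft()
second = pick_next(queues)
```

The scheduler compares at most one packet per input port, so the cost of the decision depends on the port count rather than on the number of flows.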

>
> > The drawback of C-SCORE is that packets are required to carry metadata.
> > And that metadata has to be updated in real-time as the packet is being
> > transmitted,
> > just like the flow state in ATS has to be updated.
> >
> > The CQF and its variations seem to be scalable at the packet level,
> > but I think the difficulty is in the schedulability.
> > I just wonder how the millions of dynamic flows can be scheduled to fit
> > in the slots, in real operating environments.
> > Larger slots would make the slot scheduling easier, but introduce more
> > latency and jitter.
>
> TCQF (draft-eckert-detnet-tcqf) is our short-term proposal for high-speed
> networks with 100 Gbps links
> and faster. Short-term because it does not require standardizing new
> headers and it was proven to be implementable and deployable through
> real-router product implementation and simulation. The simulation
> validation reference is in our draft, the deployment is described
> for example in the paper referenced in https://ceni.org.cn/406.html
> (sorry, Chinese). With the higher speed of interfaces (more likely
> 400 Gbps by the time any solution rolls out), i think it is easy to imagine
> how one can start supporting a reasonable number of flows.

JJ: Thanks for the URL. Could you specify how many flows there are in the
emulation?

> Of course, i am all for also looking for any mechanisms that have even
> better properties,
> and i am a big fan of achieving further enhancements in the forwarding
> plane through metadata in packets, but that is in my experience just
> a lot longer process to successful deployment, so maybe we could consider
> different phases of standardization targets (short term, longer term). And
> when it comes to high-speed router implementation feasibility, real
> hardware validation seems like a very prudent thing to have for making
> decisions in standardization.
>

JJ: I think it could be interesting to have a variety of solutions, as long
as they are effective.
Cheers.

>
> >
> > Best regards,
> > Jinoo
>
>
>
> --
> ---
> tte@cs.fau.de
>