Re: [alto] ALTO service query spanning multiple domains (ECS)

Jensen Zhang <jingxuan.n.zhang@gmail.com> Tue, 13 September 2022 12:51 UTC

From: Jensen Zhang <jingxuan.n.zhang@gmail.com>
Date: Tue, 13 Sep 2022 20:51:29 +0800
Message-ID: <CAAbpuyrYwTEetHPazA2MAC9kV2=n9M2jqmt7DZBTJbFW=CedLQ@mail.gmail.com>
To: "Y. Richard Yang" <yry@cs.yale.edu>
Cc: IETF ALTO <alto@ietf.org>, Kai Gao <kaigao@scu.edu.cn>
Archived-At: <https://mailarchive.ietf.org/arch/msg/alto/90eVBJbJF0b4JQli30mhP05aABs>

Hi Richard and all,

Thanks for the heads-up. Our approach can be seen as a proactive mode of
the multi-domain query described in the previous email. Instead of
searching per (srcIP, dstIP) pair, the algorithm works at the granularity
of IP prefixes.

We consider the case where each domain operates an ALTO server, and the
ALTO server can be discovered from the IP address of the border router. We
also assume that each domain D knows the IP prefixes (P_(D,1), ...,
P_(D,K)) it owns (i.e., Net(P_(D,x)) = D) and the ingress port (i.e.,
access point) of each prefix. Currently the algorithm only works for
prefix-based routing, but it can be extended to handle tunnels, as
reported in a related study [KATRA].

There are two independent processes: 1. prefix path discovery, and 2. ALTO
queries.

Prefix path discovery works as the reverse process of BGP. Initially, each
domain D_i announces to each neighbor D_j 1) the cross product of Ps
(prefixes owned by D_i) and Pd (prefixes announced to D_i by D_j), and
2) the ingress port at D_j for each valid (Ps, Pd) pair, which is the peer
of the egress port for that pair in D_i. On receiving a message that some
(Ps, Pd) enters D_j at a given ingress, D_j updates its local cross
products and finds the ingress of (Ps, Pd) at the next domain. This
process repeats in each domain until convergence (guaranteed if the
routing is correct, i.e., no loops or blackholes). The pseudocode below
shows the high-level process but simplifies some details (like handling
withdrawals triggered by local network updates).

Initialize (D_i):
    // Construct the initial (srcPrefix, ingress, dstPrefix) cross products
    P_in = {prefixes owned by D_i}
    Ingress_in = {ingress for each prefix in D_i}
    P_out = {prefixes announced from all neighbors}
    Local_in = cross-product(zip(P_in, Ingress_in), P_out)
    Local_out = {} // empty set
    // Announce to neighbors
    Synchronize (D_i, Local_in)

Synchronize (D_i, CrossProduct):
    // Get the egress port for each (srcPrefix, ingress, dstPrefix) tuple
    // Note that the prefixes may be split or merged based on the local FIB
    M = lookup-local-fib-and-group-by-egress-port(CrossProduct)
    // Announce to neighbors
    foreach (egress, {(Ps, Pd)}) in M:
        next_ingress = get-peer-address(egress)
        D_j = server-discovery(next_ingress)
        Local_out = Local_out U {(Ps, egress, Pd)}
        emit-message(D_j, {(Ps, Pd)}, next_ingress)

Update (D_i, {(Ps, Pd)}, ingress):
    // On an update from a neighbor that (Ps, Pd) arrives in D_i at ingress,
    // construct a new tuple (Ps, ingress, Pd)
    Delta = {(Ps, ingress, Pd)}
    Local_in = Local_in U Delta
    // Find the local paths for {(Ps, ingress, Pd)} and (iteratively)
    // announce the pairs to neighbors
    Synchronize (D_i, Delta)
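
To make the message flow concrete, here is a minimal Python sketch of the
three procedures on a toy line topology A - B - C. It is only an
illustration under simplifying assumptions that are not part of the
design: the FIB is a plain dict from destination prefix to next-hop domain
name, prefixes are never split or merged, a link label such as "A->B"
stands in for both the egress port and the peer ingress address, and an
in-memory queue replaces emit-message and server discovery. All names
(Domain, discover, the prefixes) are hypothetical.

```python
from collections import deque

class Domain:
    def __init__(self, name, owned, fib):
        self.name = name
        self.owned = set(owned)   # prefixes owned by this domain
        self.fib = dict(fib)      # dst prefix -> next-hop domain name (toy FIB)
        self.local_in = set()     # (Ps, ingress, Pd) originating or entering here
        self.local_out = set()    # (Ps, egress, Pd) leaving toward a neighbor

def synchronize(domain, triples, queue):
    """Group triples by egress and queue one announcement per neighbor."""
    by_next_hop = {}
    for ps, _ingress, pd in triples:
        nxt = domain.fib.get(pd)
        if nxt is None:           # Pd is local (or unroutable): path ends here
            continue
        by_next_hop.setdefault(nxt, set()).add((ps, pd))
    for nxt, pairs in by_next_hop.items():
        link = f"{domain.name}->{nxt}"   # stands in for egress and peer ingress
        domain.local_out |= {(ps, link, pd) for ps, pd in pairs}
        queue.append((nxt, pairs, link))

def update(domain, pairs, ingress, queue):
    """Record newly learned (Ps, ingress, Pd) tuples and propagate them."""
    delta = {(ps, ingress, pd) for ps, pd in pairs} - domain.local_in
    domain.local_in |= delta
    if delta:                     # only new tuples are re-announced
        synchronize(domain, delta, queue)

def discover(domains):
    """Run Initialize on every domain, then deliver messages until quiescent."""
    queue = deque()
    for d in domains.values():
        initial = {(ps, "local", pd) for ps in d.owned for pd in d.fib}
        d.local_in |= initial
        synchronize(d, initial, queue)
    while queue:
        target, pairs, ingress = queue.popleft()
        update(domains[target], pairs, ingress, queue)

PA, PB, PC = "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"
domains = {
    "A": Domain("A", [PA], {PB: "B", PC: "B"}),
    "B": Domain("B", [PB], {PA: "A", PC: "C"}),
    "C": Domain("C", [PC], {PA: "B", PB: "B"}),
}
discover(domains)
```

After the run, domain B has learned that traffic from A's prefix to C's
prefix enters at "A->B" and leaves at "B->C", which is the state the query
phase relies on. Subtracting already-known tuples in update is what makes
this toy version terminate; it is a choice of the sketch, not something
stated in the pseudocode.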

After the prefix path discovery phase, each domain can determine whether a
given (srcIP, dstIP) pair traverses its network and, if so, along which
path. Thus, the query part becomes relatively simple:

Global-Query({(srcIP, dstIP)}):
    foreach D_i:
        metrics_i = Local-Query(D_i, {(srcIP, dstIP)})
    metrics = merge({metrics_i})
    return metrics

Local-Query(D_i, {(srcIP, dstIP)}):
    foreach (srcIP_k, dstIP_k):
        ingress_k = lookup-ingress(Local_in, srcIP_k, dstIP_k)
        path_k = look-up-fib-local-path(srcIP_k, ingress_k, dstIP_k)
    metrics_i = extract_metrics({path_k})
    return metrics_i
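
As a rough illustration of the query phase, the Python sketch below looks
up each (srcIP, dstIP) pair in every domain's Local_in table and merges
the per-domain values with a metric-dependent function (sum for an
additive metric such as latency, min for a bottleneck metric such as
bandwidth, as noted in the quoted email). The data layout is an
assumption made for the example: each domain is a dict with a "local_in"
set and a hypothetical "metric" table keyed by (ingress, Pd), which
stands in for the FIB path lookup and metric extraction. The lookup
returns the first matching entry, so it assumes non-overlapping prefixes;
a real table would need longest-prefix matching.

```python
import ipaddress

def lookup_entry(local_in, src_ip, dst_ip):
    """Return (ingress, Pd) of an entry covering the pair, or None."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for ps, ingress, pd in local_in:
        if src in ipaddress.ip_network(ps) and dst in ipaddress.ip_network(pd):
            return ingress, pd
    return None

def local_query(domain, pairs):
    """Per-pair metric inside one domain; pairs not crossing it are omitted."""
    metrics = {}
    for src_ip, dst_ip in pairs:
        entry = lookup_entry(domain["local_in"], src_ip, dst_ip)
        if entry is not None:
            metrics[(src_ip, dst_ip)] = domain["metric"][entry]
    return metrics

def global_query(domains, pairs, combine=sum):
    """Collect per-domain values for each pair and merge them with combine."""
    per_pair = {pair: [] for pair in pairs}
    for domain in domains:
        for pair, value in local_query(domain, pairs).items():
            per_pair[pair].append(value)
    return {pair: combine(vals) for pair, vals in per_pair.items() if vals}

PA, PC = "10.0.1.0/24", "10.0.3.0/24"
domains = [
    {"local_in": {(PA, "local", PC)}, "metric": {("local", PC): 2}},
    {"local_in": {(PA, "A->B", PC)}, "metric": {("A->B", PC): 5}},
    {"local_in": {(PA, "B->C", PC)}, "metric": {("B->C", PC): 1}},
]
pairs = [("10.0.1.7", "10.0.3.9")]
additive = global_query(domains, pairs)         # e.g., latency: 2 + 5 + 1
bottleneck = global_query(domains, pairs, min)  # e.g., bandwidth: min(2, 5, 1)
```

Because every domain answers from its own Local_in table, the per-domain
lookups are independent of each other and could be issued in parallel;
the sketch runs them sequentially only for brevity.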

An improvement is to broadcast not the complete prefix set but only the
prefixes that can be queried. This can be done by replacing

P_in = {prefixes owned by D_i}
Ingress_in = {ingress for each prefix in D_i}
P_out = {prefixes announced from all neighbors}

with

P_in = {prefixes owned by D_i and in Interested_in}
Ingress_in = {ingress for each prefix in D_i}
P_out = {prefixes announced from all neighbors and in Interested_out}

where Interested_in and Interested_out are the prefixes declared by the
clients, which together contain all IP addresses that will appear in an
ALTO query.
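
One possible reading of "owned by D_i and in Interested_in" is a prefix
intersection, sketched below in Python with the standard ipaddress
module. The function name restrict and the example prefixes are
hypothetical. Two CIDR prefixes are either disjoint or nested, so the
intersection of an overlapping pair is simply the more specific prefix.

```python
import ipaddress

def restrict(prefixes, interested):
    """Intersect two prefix sets; returns the surviving prefixes as strings."""
    owned = [ipaddress.ip_network(p) for p in prefixes]
    wanted = [ipaddress.ip_network(q) for q in interested]
    result = set()
    for p in owned:
        for q in wanted:
            # Overlapping CIDR prefixes are nested: keep the longer one.
            if p.version == q.version and p.overlaps(q):
                result.add(str(p if p.prefixlen >= q.prefixlen else q))
    return result

# A client interested in one /24 narrows an owned /16 down to that /24.
p_in = restrict(["10.0.0.0/16", "192.168.1.0/24"], ["10.0.3.0/24"])
```

With this restriction applied to both P_in and P_out, the initial cross
product only contains pairs a client could actually ask about.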

[KATRA]: https://www.usenix.org/conference/nsdi22/presentation/beckett

Best regards,
Kai and Jensen


On Tue, Sep 13, 2022 at 10:51 AM Y. Richard Yang <yry@cs.yale.edu> wrote:

> Hi all,
>
> During the weekly meeting last week, we went over the details when
> deploying ALTO in a multi-domain setting, say the FTS/Rucio setting
> supporting the TCN deployment [1]. Below is the endpoint cost service (ECS)
> case, and it was suggested that we post it to the WG mailing list to update
> the WG and get potential feedback.
>
> Problem: An ALTO client queries the endpoint cost from srcIP to dstIP for
> a given performance metric (e.g., latency). Consider the case that the
> srcIP and dstIP belong to different networks, with the whole layer-3 path
> as the list [ip[0], ip[1], ..., ip[N]], where ip[0] = srcIP and ip[N] =
> dstIP. Define Net(ip) as the function that maps an IP address to the
> network that owns the IP---ignore the complexity such as anycast since the
> deployment does not have this case. Then Net(srcIP) != Net(dstIP), if it is
> multi-domain. Consider the initial deployment that we have only an ALTO
> server for each network; that is, it provides ALTO service for only
> Net(srcIP) == Net(dstIP). Then, there is not a single ALTO server that can
> provide the answer.
>
> Basic solution (one src-dst flow): Map the list [ip[0], ..., ip[N]] to a
> list of segments, where each segment starts with an IP address, and ends
> with the first IP address in the sequence that leaves the network of the
> start IP address. Hence, the basic query framework at an aggregation ALTO
> client:
> - alto-ecs(srcIP, dstIP, metric)
>   metrics = EMPTY
>   ingressIP = srcIP
>   do {
>       alto-server = server-discovery(ingressIP)
>       (metricVal, egressIP)
>               = alto-server.query(ingressIP, srcIP, dstIP, metric)
>       metrics.add(metricVal)
>       ingressIP = egressIP
>   } while (egressIP != dstIP)
>
> The preceding assumes a procedure that collects segment attributes, and it
> can be a single pass composition using a metric-dependent function (e.g.,
> latency is addition, and bw is min).
>
> Multi-flow queries: ALTO ECS supports the querying of multiple src-dst
> pairs. A simple solution is to query each src-dst pair one-by-one. Such a
> query is necessary because the routing can be dependent on packet
> attributes (srcIP, dstIP) and a pseudo packet attribute (ingressIP), and
> the ALTO client cannot reuse the results. To allow reuse (both in
> multi-flow queries and caching of past queries), it helps if the ALTO
> server indicates equivalence classes, which Kai and Jensen investigated.
>
> A revision of the protocol using caching and equivalence classes is:
> alto-server-cache: indexed by ALTO server, <attribute, mask> pairs
> - alto-ecs(srcIP, dstIP, metric)
>   metrics = EMPTY
>   ingressIP = srcIP
>   do {
>       alto-server = server-discovery(ingressIP)
>       if (alto-server-cache.match(alto-server, ingressIP, srcIP, dstIP))
>          use cache results
>       else
>          (metricVal, egressIP; ingressIPMask, srcIPMask, dstIPMask)
>                  = alto-server.query(ingressIP, srcIP, dstIP, metric)
>          alto-server-cache.add(alto-server, <ingressIP, ingressIPMask>,
>                     <srcIP, srcIPMask>, <dstIP, dstIPMask>)
>       metrics.add(metricVal)
>       ingressIP = egressIP
>   } while (egressIP != dstIP)
>
> The mask design is a special case. For the general case, the most flexible
> equivalence class may use predicates (e.g., supporting identifying the
> lower entries of longest prefix matching). This issue can benefit from
> more benchmarking; if there are any related pointers, the team will
> appreciate them.
>
> In the next email, Kai and Jensen will post a slightly different design
> supporting a map oriented service.
>
> Cheers,
>
> [1] Transport control networking: optimizing efficiency and control of
> data transport for data-intensive networks,
> https://dl.acm.org/doi/abs/10.1145/3538401.3548550