Re: [drinks] Comment on today's drinks discussion

Otmar Lendl <lendl@nic.at> Mon, 10 August 2009 15:40 UTC

PFAUTZ, PENN L, ATTCORP wrote:
> 
> I've had a lingering concern about the disconnect between what Speermint
> has proposed (LUF/LRF)and the route that drinks has taken. Since the
> will of the design team seemed to be to get on with a simple protocol
> directed toward provisioning DNS RRs I let that ride. Monday's session,
> however brought up things like more abstraction of routing elements
> which suggests to me assumptions about the nature of interconnections
> and my issues with the original ESPP I-D.
> 
> Brian Rosen's comment about number "ownership" relations also seemed to
> suggest another complexity that the protocol would try to incorporate. 
> 
> I get concerned about a protocol that either makes a lot of specific
> assumptions about the nature of the registry and the interconnection
> framework and/or becomes bloated by trying to incorporate the panoply of
> possible cases.
> 

Penn,

One of the problems I see is that these assumptions regarding the
"interconnection framework" are never spelled out. I don't see a vision of
how this might all fit together other than in very restricted situations.

Anyway, on a recent train ride I tried to write up a summary of what bugs
me about the Speermint/Drinks design. Here it is:

--------------------------------------

Some comments on the direction Speermint and Drinks are taking

First of all, why do we need these WGs at all? The quick answer is that
VoIP interconnection based on plain SIP and ENUM did not work out as
envisioned by the authors of the respective RFCs. There are a number of
reasons for that (see draft-lendl-speermint-background), and I don’t expect
that the IETF can do anything to change this.

What happens in the ITU/ETSI/3GPP area? The PSTN interconnection used to be
simple before the era of telecom liberalization: you had one incumbent per
country and close to a full mesh of interconnection links between all
countries. In the GSM world, the possibility of roaming led to a large
number (scaling with O(n*n)) of contracts between operators. In both the
fixed line business, as well as in the mobile telephony world, the number
of operators has markedly increased over the last few years. A full mesh of
links just is not possible any longer. Even if the underlying network does
not need dedicated links any more, just doing contracts between all
possible pairs is no longer a viable option. Recent developments in the GRX
and IPX demonstrate this: The introduction of "hubs" was necessary to get
the quadratic scaling under control. The time of full meshes is over.
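
To put numbers on the quadratic scaling: a full mesh among n operators needs
n(n-1)/2 bilateral agreements, while a hub model needs roughly one per
operator. A minimal sketch follows; the single-hub model is an assumed
simplification, as real GRX/IPX setups involve several interconnected hubs.

```python
def full_mesh_agreements(n: int) -> int:
    """Bilateral agreements needed so every pair of n operators is directly connected."""
    return n * (n - 1) // 2


def single_hub_agreements(n: int) -> int:
    """Agreements needed if every operator contracts only with one shared hub.

    Assumption for illustration: exactly one hub, one contract per operator.
    """
    return n


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, full_mesh_agreements(n), single_hub_agreements(n))
```

For 100 operators that is 4950 contracts in the full mesh against 100 via the
hub.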

Call routing was rather simple in the full mesh world (be it PSTN or
RFC 3263 SIP): it only needed some directory service to map Public
Identities (PI = phone numbers or SIP URIs) to operators. In a lot of
cases, these directories are static simple mappings like "route anything
starting with +49 to Deutsche Telekom".
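
Such a static directory is little more than a longest-prefix lookup. A sketch,
with the table contents being hypothetical placeholders (only the +49 example
comes from the text above):

```python
# Hypothetical static directory: E.164 prefix -> operator.
DIRECTORY = {
    "+49": "Deutsche Telekom",
    "+43": "Example Operator AT",   # placeholder entry
    "+4366": "Example Mobile AT",   # placeholder, more specific prefix
}


def lookup_operator(number, directory=DIRECTORY):
    """Return the operator for the longest matching prefix, or None."""
    # Try the full number first, then ever shorter prefixes.
    for length in range(len(number), 0, -1):
        operator = directory.get(number[:length])
        if operator is not None:
            return operator
    return None
```

Note that the answer names an operator, not a path: this only works if the
caller is directly connected to whoever the directory returns.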

This is no longer sufficient. Any solution to the current world-wide call
routing problem needs to cope with arbitrary interconnection graphs, not
just the trivial case of full meshes. A directory will not suffice any
more: we need a full blown routing algorithm.

I repeat: The current graph of interconnection between carriers has no
special properties any more. We have a text-book routing problem to solve.
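
As an illustration of why a directory alone no longer suffices: once the
interconnection graph is arbitrary, each carrier needs a next hop towards
destinations it is not directly connected to. The sketch below uses a
breadth-first search for a fewest-hops path. The graph and the fewest-hops
metric are assumptions for illustration only; real carrier routing would be
driven by cost and policy.

```python
from collections import deque

# Hypothetical interconnection graph: carrier -> directly connected carriers.
LINKS = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}


def next_hop(source, destination, links=LINKS):
    """Return the first hop on a fewest-hops path, or None if unreachable."""
    if source == destination:
        return source
    # first_hop[x] = neighbour of source through which x was first reached
    first_hop = {source: None}
    queue = deque()
    for neighbour in sorted(links.get(source, ())):
        first_hop[neighbour] = neighbour
        queue.append(neighbour)
    while queue:
        carrier = queue.popleft()
        if carrier == destination:
            return first_hop[carrier]
        for neighbour in sorted(links.get(carrier, ())):
            if neighbour not in first_hop:
                first_hop[neighbour] = first_hop[carrier]
                queue.append(neighbour)
    return None
```

Here carrier A reaches E only via transit through B and D, something no
PI-to-operator directory can express.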

This is not what Speermint is supposed to solve. As the name says: It is
supposed to do “peering” and not “routing”. In other words, Speermint
covers what two operators need to do to exchange calls between their direct
customers.

As the need to cover transit is clear, it has crept into a good number of
Speermint documents. This just amounts to “we need to consider these
scenarios”, but not to “here is how you solve them”.

Why is Speermint restricted to peering? My guess is the following:

    * The driving force behind the establishment of this working group is
the set of US MSOs. Their motivation is simple: enable direct call routing
amongst them. They do not care about transit.

    * Doing a routing protocol for VoIP violates the IETF end-to-end
principle and is thus not politically correct.

    * A full routing architecture plus routing protocol for VoIP is a huge
task and to be successful needs buy-in from traditional carriers. This is
more than the IETF is willing to tackle.

    * Peering can be stacked on existing protocols as implemented in the
SIP devices: all the reachability/routing information is stored in ENUM
trees which the SIP devices query. The only protocol work relates to how
these ENUM trees get populated. Thus DRINKS.

What’s not to like?

    * Transit will be back.

      It’s an illusion that we can build a system solely for peering
setups. This will get used for transit. The foundations built by Speermint
will not be able to really support transit. I predict utterly messy and
unstable deployments down the road as transit gets bolted onto peering.

      It is far better protocol design to see peering as a simple special
case of transit.

      What DRINKS is doing is the SIP equivalent of a provisioning protocol
to drive proxy-ARP servers for the interconnection of LANs. Nice and simple
if everybody you ever want to reach is just a hop away, but completely
unsuitable for transit.

    * ENUM will not be enough.

      ENUM is a simple lookup protocol: The input is an E.164 number and
the result is an ordered set of service-type/URI pairs. ENUM is not a SIP
device control protocol. It is way too limited in the information the SIP
device can pack into the _query_ (e.g.: source URI, source trunk
information, SIP device ID, media requested, ...). You can fudge more
information into the ENUM _answers_ by excessive use of URI parameters, but
the query is limited by the underlying DNS protocol.

    * Excessive use of Registries.

      The LERG and LNP databases are known solutions. The default solution
for data-interchange problems in the PSTN world seems to be registries.
(Remember the old adage about every problem looking like a nail when all
you have is a hammer?)

      In the Internet architecture, central registries are sometimes a
necessary evil and are kept as small as possible. In the VoIP routing case,
registries are certainly part of a solution, but only to manage a shared
namespace (domain registries for URIs and ENUM registries for E.164 numbers).

    * LUF / LRF confusion.

      In a rare moment of sanity, Speermint came up with the distinction
between LUF and LRF. In a nutshell: the LUF (Lookup Function) maps the
public identifier to some aggregation concept like "destination group",
"routing key", "destination domain",... whereas the LRF (Location Routing
Function) takes this identifier as input and finds the next SIP hop towards
that destination.

      This is a common concept in the Internet: typically a domain-name is
first mapped to an IP address, and then the IP routing algorithm finds the
next hop towards the host offering services for that domain.

      The basic concept behind this is the difference between a "name"
(identifying what you want, regardless where it is) and an "address"
(identifying where something is, preferably amenable to aggregation). The
result of the LRF is the Session Establishment Data (SED) which is a
"route" (identifying the next hop towards the destination).

      The LUF is best implemented by an on-demand lookup function to a
central database (which does not preclude local replicas for performance
reasons). The LUF is querier-independent; everybody will get the same answer.

      The LRF is the lookup into a local routing database. No external
database needs to be involved. But you need either a lot of manual work or
a routing protocol between the interconnection-partners to build up the
local routing tables.

    * Too much information in the Registry.

      As should be clear from the preceding remarks, I consider it to be a
serious mistake that the current DRINKS design completely merges LUF and
LRF into one single protocol.

      This overloads the registry with data that should not be there.

    * No consideration for multiple Registries.

      If registries are pure LUFs, then there are natural ways to
distribute the namespace over multiple registries: Both in the E.164/ENUM
case and the URI/domains case, the hierarchical DNS can delegate parts of
the global namespace to local registries. The actual call routing is not
affected as the LUF registries do not contain routing data. There is no
need for a registry-registry protocol to synchronize "routes" or "routing
groups".

      If the registries are mixed LUF/LRF ones, then multiple registries
lead to interesting routing problems: Unless every carrier is a customer of
every registry, then achieving worldwide VoIP routing gets tricky. Either:

          o Transit carriers re-announce PIs they learned from one registry
into all other registries they are connected to. That means that the same PI
will appear in each registry multiple times and the registry needs to
decide which path to announce with what preference to carriers in need of
transit. Additionally, I don't see how you can prevent routing loops in
this scenario.

          o We invent a registry-registry protocol that avoids these
re-announcements. That just opens another can of worms: will there be a
full mesh between the registries? What kind of routing policies will the
registries implement on behalf of their customers?

    * Routing is a core competence of a carrier.

      I just cannot imagine that any serious transit carrier will want to
outsource its transit routing algorithm to a foreign party.

      This is not an issue as long as the registry will contain only pure
peering data.
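
To make the LUF/LRF separation argued for in the points above concrete, here
is a rough sketch in which the shared registry answers only the LUF question
and each carrier keeps its own LRF table. All names, numbers, domains and
URIs are hypothetical, and the class layout is just one way to draw the line.

```python
# LUF: shared, querier-independent data (hypothetical entries).
LUF_REGISTRY = {
    "+4312345678": "carrier-x.example",
    "sip:alice@example.org": "example.org",
}


def luf(public_identity):
    """Map a public identity to a destination domain; same answer for everybody."""
    return LUF_REGISTRY.get(public_identity)


class Carrier:
    """A carrier with a private LRF table: destination domain -> next SIP hop."""

    def __init__(self, name, routes):
        self.name = name
        self.routes = dict(routes)  # local routing data, never put in the registry

    def lrf(self, destination_domain):
        """Return this carrier's next hop (its SED) for the domain, or None."""
        return self.routes.get(destination_domain)

    def resolve(self, public_identity):
        """Run the LUF first, then the carrier-local LRF."""
        domain = luf(public_identity)
        if domain is None:
            return None
        return self.lrf(domain)


# Two carriers: same LUF answer, different next hops.
peer = Carrier("peer", {"carrier-x.example": "sip:sbc.carrier-x.example"})
far = Carrier("far", {"carrier-x.example": "sip:sbc.transit1.example"})
```

The registry never sees either routing table, so adding a second registry for
another part of the namespace changes nothing about how `peer` or `far` route.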

__Summary__

Both working-groups are building a complex solution for a very restricted
special case of the generic VoIP routing problem. This will work only in
tightly restricted setups.

But it will block a clean solution for the underlying problem and will
encourage ugly hacks in the long term.

This is like designing the IBM PC with its 640kB limit while you already
know that this will lead to EMS, XMS, himem.sys and all these kludges once
you want to do more with the design than just run VisiCalc.

/ol
-- 
// Otmar Lendl <lendl@nic.at>, T: +43 1 5056416 - 33, F: - 933 //