Re: [Int-area] IP Protocol number allocation request for Transparent Inter Process Communication (TIPC) protocol

Joseph Touch <touch@strayalpha.com> Mon, 23 March 2020 15:42 UTC

From: Joseph Touch <touch@strayalpha.com>
In-Reply-To: <ab1de07e-6284-5fe6-ef0d-46303f996354@redhat.com>
Date: Mon, 23 Mar 2020 08:41:37 -0700
Cc: int-area <int-area@ietf.org>, Suresh Krishnan <suresh@kaloom.com>
Message-Id: <6BD67898-E2AB-4628-9A0D-4AEAC790EFA0@strayalpha.com>
References: <DC440B28-DA08-499F-8A2A-7A8ACF880724@kaloom.com> <A6B82786-FB50-4AAA-8D69-0A55FEB5DC3B@strayalpha.com> <4bad2d30-0220-a836-451d-b01fdba4d098@redhat.com> <0C774D74-89A9-44CB-BCE7-A0ACC138C10F@strayalpha.com> <4cd43b9b-f7fa-0fc5-3ba9-11a735268288@redhat.com> <BAAD573B-497C-4F86-AF7A-776781698717@strayalpha.com> <eb054946-0bbe-ce6b-3a7d-6e2630ae4c6f@redhat.com> <E206BEE8-C157-4733-924F-649C94321E03@strayalpha.com> <ab1de07e-6284-5fe6-ef0d-46303f996354@redhat.com>
To: Jon Maloy <jmaloy@redhat.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/int-area/tP5AdtVl1YIpZp2I8ImubVn2SQc>

Jon,

First, if you’re going to come to the IETF asking for something as core as an IP protocol number, you need to be able to explain your system to us in our terms.

That means explaining things below in the following terms:
	UDP/TCP and nearly anything over IP = transport
	IP = network
	Ethernet = link

UDP/IP isn’t a link layer to us; what you’re really asking for, FWIW, is to be a *transport protocol*, but that’s not quite what you want either (see below).

Second, if you want an IP protocol number, your system has to “buy-in” to the IP model in which IP unicast addresses are endpoints, not logical identifiers.

>> ...
>> Type in www.google.com
>> 
>> Now type in its IPv6 address.
>> 
>> Now see if you remember google’s website DNS or its IPv6 address. That’s what the DNS was originally intended for.
> Yes. But this case also demonstrates that both DNS names and the IP address may be location independent. We have no clue whether a call will end up in a server farm in the US or Europe, let alone which server it will be handled on. So, even though the original purpose of DNS may have been something else, it has clearly followed the obvious path of becoming a tool for location independence. This is good, but not good enough for our purposes.

Please be more specific in what you’re seeking then.

>> 
>>>> 	DNS names are no more or less location-independent than IP addresses.
>>>> 
>>>> This is also why DNS was invented...
>>>> 
>>>> 	False. The reason the DNS exists has nothing to do with location. It’s simply string substitution for convenience, or at least was ONLY that originally.
>>> 
>>> I think you just supported my case for a location independent addressing scheme.
>> 
>> I am - but then I’m baffled why you want to run direct over IP. Ethernet has location independent addresses; IP does not* (see next part).
> 
> When I am talking about location independence I am always talking about what the socket programmer/user sees.

IP isn’t about that. It’s about what the network sees.

> We don't want him to handle IP addresses, and we probably don't want him to hard code DNS names either.

Please clarify - do you want to hard-code anything? Or have the user type it in?

> But, at some level further down in the stack we never get around translating location independent addresses to some form of location dependent ditto in order to transmit the packets to the right node and socket. Be it MAC, IPv4, IPv6 or anything else.
> 
> This is what we do in TIPC :
> 
> Socket Layer:            {service type, service instance}                 {port number}

The Internet uses service names for that (e.g., HTTP, HTTPS, etc).

If service name lookup over the Internet using DNS is too slow, then replace it with a different lookup mechanism or implementation. But it’s still the equivalent of DNS and DNS SRV records at that point.
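
As a rough illustration of that point (not anything TIPC or the DNS itself specifies), a local in-memory table can play the role of DNS SRV records: it maps a service name to candidate endpoints and orders them by priority. The service name, addresses, port, and the `resolve` helper below are all hypothetical, and the ordering is a simplification of real SRV weighted selection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    """SRV-like record: lower priority value is preferred."""
    priority: int
    weight: int
    port: int
    target: str


# Hypothetical local registry standing in for a DNS zone.
REGISTRY = {
    "_tipc._udp.cluster.example": [
        SrvRecord(priority=20, weight=5, port=50000, target="192.0.2.11"),
        SrvRecord(priority=10, weight=5, port=50000, target="192.0.2.10"),
    ],
}


def resolve(service):
    """Return candidate endpoints, most preferred first.

    Simplification: ties on priority are broken by higher weight,
    not by the weighted random choice that SRV processing uses.
    """
    records = REGISTRY.get(service, [])
    return sorted(records, key=lambda r: (r.priority, -r.weight))
```

The lookup is a dictionary access plus a sort, so it can be made as fast as needed, but the caller still sees the same name-to-endpoint mapping that SRV records provide.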

> ------------------                                  |                                                          A
>                                                        v                                                          |
> TIPC Binding Table:  {port number, node number}                                   |

Please explain what a node number is...

> -------------------------                          |                                                          |
>                                                        v                                                          |
> TIPC Link Layer:            {UDP port, IP address}                       {UDP port, IP address} 
> -----------------------             or {MAC address}                                or {MAC address}

How is a UDP port different from your port?

How is a node number different from your number?

> The {UDP port, IP address} tuple (or MAC address) at the link layer are never visible to the user,

That’s how Internet protocols already work...

> and may change on-the-fly without him ever noticing. 

That’s where you lose me. You want IP, but this isn’t IP. This is Ethernet, at least as I use it.

> The same is true for the {port number, node number} tuple,

Why?

If everything in your system changes on the fly, what stays the same?

> although the user here has the option to use those directly, at the expense of location transparency.
> So, our request is simply about enabling us to use a third mapping at the link layer, an IP address only. This does not in any way interfere with the location transparency that is already provided at the socket level.

My point is that you’re not showing us how this helps. You simply want something - I understand that. But you have to show you NEED it. Everything you’re saying is a reason why you actually don’t want or need it.

Further, let’s say you get an IP protocol number. Why wouldn’t that be among the many things here that need to “change on the fly” too?

> 
>> 
>>> This was one of the original motivations for developing TIPC in the first place.  A programmer using TIPC can hard code his service addresses if he wants to, ignoring the number of or location of the corresponding endpoints, even as those move around or scale up/down quite fast.
>> 
>> Anycast gives you location independent addresses at the cost of doing discovery “inside the network layer”.
> 
> Yes, and that is what we do. But for this to be of any use, that discovery/translation has to be blistering fast, and that is also what we do.

You don’t need an IP protocol number for that….

> 
>> 
>> However, even if you have those addresses, you still need to identify the service types (which is what we use ports for).
> 
> UDP (at the link level) has only one service type in this case: “TIPC"

That’s an identifier for your service - you can easily add whatever additional identifiers you want inside that and demux to support dozens or even billions of different sub-services.

> At the socket level we are using TIPC service addresses for this, i.e., a {service type, service instance} tuple, each element being a 32-bit integer.

That, IMO, is an ID that belongs *inside* UDP port TIPC. That is YOUR service type/instance, not the Internet’s. The Internet should consider this all a single TIPC service.
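
A minimal sketch of what "inside UDP port TIPC" could look like, assuming the {service type, service instance} tuple is carried as two 32-bit big-endian integers at the front of the UDP payload. The header layout and the `encode`, `decode`, and `Demux` names are assumptions for illustration, not the TIPC wire format.

```python
import struct

# Assumed header: 32-bit service type, 32-bit service instance,
# both in network byte order, followed by the application payload.
HEADER = struct.Struct("!II")


def encode(service_type, instance, payload):
    """Prefix the payload with the service identifier."""
    return HEADER.pack(service_type, instance) + payload


def decode(datagram):
    """Split a datagram into (service_type, instance, payload)."""
    if len(datagram) < HEADER.size:
        raise ValueError("datagram shorter than service header")
    service_type, instance = HEADER.unpack_from(datagram)
    return service_type, instance, datagram[HEADER.size:]


class Demux:
    """Dispatch datagrams received on one UDP port to sub-services."""

    def __init__(self):
        self.handlers = {}

    def bind(self, service_type, instance, handler):
        self.handlers[(service_type, instance)] = handler

    def dispatch(self, datagram):
        service_type, instance, payload = decode(datagram)
        handler = self.handlers.get((service_type, instance))
        if handler is None:
            return None  # no such sub-service; caller decides what to do
        return handler(payload)
```

To the Internet this is all one service on one port; the 64 bits of identifier are interpreted only by the endpoints.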

>> 
>> ——
>> 
>> I’m still stuck at why you want to run direct over IP. If you want Ethernet that bridges across routers, GRE does that.
> 
> Yes, we could use VxLAN or Geneve or whatever. But that always comes at a cost both in performance and maintenance.

I can’t speak for IP protocol numbers, but Internet transport port numbers are not assigned for performance reasons.

> We want TIPC to be both performant and really simple to use.

You seem to have a lot of competing goals. You should consider the rule of home contractors - fast, cheap, good - pick two. The same applies to nearly all systems design decisions.

> 
>> If you want loc-independent addresses for services, UDP over IP using anycast does that.
> 
> Again yes, but IP is normally not location independent inside clusters. 8.8.8.8 may be perceived as location independent, but 192.168.100.17 is typically not. And UDP has well-known limitations:
> 
> 1) - UDP has 16-bit port numbers, a number space which has to be strictly managed.
>     - TIPC has a 32-bit+32-bit service address instead. This is what we want
>       to extend to 128+128 bits, so that nobody ever needs to register a 
>       well-known address for TIPC. At least not for the purpose of 
>       avoiding collisions.

What you want, IMO, is a field at the front of a UDP TIPC port packet. YOUR service IDs are not the Internet transport port services; they’re components of what the Internet architecture considers “the application layer” (which is merely whatever runs over UDP/TCP/SCTP/DCCP).

> 2) - UDP is best effort. 
>     - Standard TIPC anycast is "better than best" effort, because packets will 
>       never be lost in transport. Due to lack of socket level flow control, there 
>       is still a risk of seeing messages being dropped, though.
>     - Group anycast DOES have end-to-end flow control, so such messages 
>       will never be lost or disordered.

Raw UDP isn’t what you seek, so do what you want *over* UDP. Nothing stops you and nothing makes your protocol offer only what UDP has. E.g., see QUIC.
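
To make "do what you want over UDP" concrete, here is a small sketch of one piece of such a layer: a receiver that uses sequence numbers to deliver messages in order, drop duplicates, and report a cumulative ACK. It is not QUIC or TIPC; the `InOrderReceiver` class and its interface are hypothetical, and retransmission timers and flow control are left out.

```python
class InOrderReceiver:
    """Reorder and de-duplicate messages carried in unreliable datagrams."""

    def __init__(self):
        self.next_seq = 0   # next sequence number to deliver
        self.pending = {}   # out-of-order messages, keyed by sequence number

    def receive(self, seq, data):
        """Accept one message.

        Returns (delivered, ack): the messages now deliverable in order,
        and the cumulative ACK (the next sequence number expected).
        """
        if seq >= self.next_seq:
            # Keep the first copy; later duplicates are ignored.
            self.pending.setdefault(seq, data)
        delivered = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered, self.next_seq
```

A sender that retransmits anything not yet covered by the cumulative ACK, paired with this receiver, gives ordered and loss-free delivery on top of best-effort datagrams.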

> 3) Furthermore, we have reliable multicast and broadcast using the same
>     address type. There is no way you can get that with UDP.

See my response to #2.

>> 
>> What is the specific gain of needing IP but not allowing a transport? AFAICT, it’s all down to GSO - which is an implementation. If GSO doesn’t do what you want, it would be useful to take your issues there or edit the code yourself and submit the patches.
> 
> In that respect this is only an implementation issue, as you say, but it is not a TIPC only one. 

Perhaps, but you’re the only one asking for a new IP protocol number to solve it.

> The slides referred to me by Tom Herbert describe GSO on large UDP messages, but they don’t describe how we go one step further and do it on the inner messages, or how we identify those as being TIPC in the first place. Furthermore, we would have to re-write the host level GSO support, which I am highly uncertain that the Linux network community would accept, given that everything needed already is there (i.e., if we only have a proper protocol number.)

So let me get this straight:

- you want an IP protocol number, a limited resource of the entire global Internet
- because you’re concerned that Linux won’t take your code?

I strongly suggest trying that first and if it fails, then perhaps make your own Linux release or patch.

I.e., this is not an Internet protocol problem.

> GSO is only one of the reasons for our request. There are more reasons:
> - Performance. The difference is not dramatic, but clearly measurable.
>   Terminating sockets in kernel space comes at a cost.

That’s an implementation issue...

> - The need to be able to register a new socket type, which will map down
>   to a (compatible) TIPC v3 protocol.

That’s another implementation issue.

> - Acceptance. We want to have TIPC recognized as a part of the IP protocol
>   family, controlled by IETF, like most other protocols.

It already is, but it’s recognized for what it is to the Internet protocol family - a service, not a transport. 

Also, FWIW, making it be its own transport will only ensure it won’t get through most firewalls.

Joe