MIME-Version: 1.0
In-Reply-To: <6689CCA2-CDC4-44AF-BFD4-270EA6E154F4@trammell.ch>
References: <CALx6S37p4aXhhXf0THRFde8R6Vaf+ouYO2jDz+pKWiXbAa5w4Q@mail.gmail.com>
 <6689CCA2-CDC4-44AF-BFD4-270EA6E154F4@trammell.ch>
Date: Tue, 22 Dec 2015 10:49:35 -0800
Message-ID: <CALx6S36Se5LZ0=6-d+d6O8kOWLGeo=-OCgyjhsm2sFLVj4b2Lw@mail.gmail.com>
From: Tom Herbert <tom@herbertland.com>
To: Brian Trammell <ietf@trammell.ch>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Archived-At: <http://mailarchive.ietf.org/arch/msg/stackevo-discuss/ScTbEfxUprM6TNCM1OI72mcmZFY>
Cc: stackevo-discuss@iab.org
Subject: Re: [Stackevo-discuss] Scope of stackevo and ossification in DC

>> Similar to the use of protocols on the Internet we are hitting the
>> transport protocol ossification problem in the data center.
>> Specifically, performance optimizations in networking devices only
>> support TCP or UDP, and without these optimizations this negatively
>> impacts our use of other protocols.
>
> Right. But this is a fundamental problem, I think. NIC offloads reach
> pretty deeply into the transport protocol, and as such won't work with
> new transports whether they're encrypted or not until those new
> transports. A question: for NIC offload, how much of the win comes
> from segmentation offloading, and how much comes from other trickery?
> If the biggest win really is bundling a bunch of packets into a single
> context switch, then how would the performance of the current offload
> architecture compare with a smart library on top of approaches like
> netmap?
>
Segmentation offload (RX and TX) is considered a win because it reduces
the number of packets that need to be processed through the various
layers of the stack. This becomes really evident with deep layering,
such as we see with network virtualization. However, most of the
benefit can be achieved with software mechanisms, and LRO (RX
segmentation offload) is pretty controversial since the device is
compressing TCP headers, which has had some insidious effects.
Checksum offload and RSS are the critical offloads we need.
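
To make concrete what checksum offload takes off the CPU, here is a
minimal sketch of the 16-bit ones' complement checksum used by TCP,
UDP, and IPv4 (RFC 1071). The function name is made up for
illustration, and a real TCP or UDP checksum also covers a
pseudo-header, which this sketch leaves out.

```python
def internet_checksum(data: bytes) -> int:
    """Return the RFC 1071 checksum of data (hypothetical helper name)."""
    # Pad odd-length input with a zero byte, as RFC 1071 describes.
    if len(data) % 2:
        data += b"\x00"

    # Sum the data as big-endian 16-bit words.
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]

    # Fold any carries back into the low 16 bits (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    # The checksum is the ones' complement of the folded sum.
    return ~total & 0xFFFF


# Example bytes from RFC 1071, section 3.
sample = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(sample)))  # 0x220d
```

In software this loop has to touch every byte of every packet, which
is the per-packet cost that checksum offload removes.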

>> One example of this is the need
>> for fine grained ECMP which has become a driver behind many of the
>> foo-over-UDP proposals (e.g. MPLS/UDP, GRE/UDP, ...).
>
> So this is a separate issue -- ECMP is a (semi-elegant) hack,
> predicated on the assumption that things on a five-tuple need to stay
> together and things on separate five-tuples don't. NAT + TCP (any
> reordering-intolerant transport, really) makes this assumption more or
> less hold. Driving it in the opposite direction -- using knowledge
> that there's ECMP on path to do cheap traffic engineering -- leads to
> the unintended consequences that foo-over-udp brings with it.
>
> What you really want architecturally is a way for the network layer
> (at a gateway) to explicitly say "keep these packets together" and
> "it's okay to split these packets apart". It'd be even better if we
> had a way to request/measure/enforce actual path diversity without
> manually managing tunnels, but this is sadly explicitly a non-feature
> of our routing protocols. In any case this seems to have a harder
> incremental deployment story than simple transport state exposure.
>
Using the IPv6 flow label for ECMP (RFC 6438) solves the ECMP/RSS
problem. When the flow label is part of the hash input, devices don't
need to parse beyond the IPv6 header to switch packets, and we don't
need the overhead of UDP encapsulation just to get good ECMP.
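
As a rough sketch of what flow-label-based ECMP could look like in a
forwarding device: hash the source address, destination address, and
20-bit flow label, all of which sit in the fixed IPv6 header, and use
the result to pick a next hop. The function name, the use of SHA-256,
and the next-hop names are assumptions for illustration only; real
devices use their own, typically much cheaper, hash functions.

```python
import hashlib
import ipaddress


def ecmp_next_hop(src: str, dst: str, flow_label: int, next_hops: list) -> str:
    """Pick a next hop from IPv6 header fields only (hypothetical helper)."""
    # The IPv6 flow label is a 20-bit field.
    if not 0 <= flow_label < (1 << 20):
        raise ValueError("flow label must fit in 20 bits")
    if not next_hops:
        raise ValueError("need at least one next hop")

    # Hash input: 16-byte source, 16-byte destination, 3-byte flow label.
    # Nothing beyond the fixed IPv6 header is parsed.
    key = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + flow_label.to_bytes(3, "big")
    )

    # SHA-256 is an arbitrary stand-in for a device's hash function.
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Packets of one flow (same addresses and flow label) always map to
# the same path; a different flow label may map to a different path.
paths = ["hop-a", "hop-b", "hop-c", "hop-d"]
first = ecmp_next_hop("2001:db8::1", "2001:db8::2", 0x12345, paths)
again = ecmp_next_hop("2001:db8::1", "2001:db8::2", 0x12345, paths)
assert first == again
```

The encapsulated protocol never enters the hash, so this works the
same for any transport, as long as the sender sets a distinct flow
label per flow.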

>> This problem is likely a proper subset of the general problem, but
>> might be more amenable to some "simpler" solutions. Is this within
>> scope of stackevo?
>
> It very much seems to be, yes. Let's keep this discussion going on
> this list...
>
Protocol ossification is also now in the vernacular of Linux
networking: https://lwn.net/Articles/667059/

Tom

