[tcpm] hystart++ -03 review

Mark Allman <mallman@icir.org> Tue, 12 May 2020 19:54 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/pzM9k9wMMOVs4xNZiSnn_ljoF1w>

I read version -03 of the HyStart++ document and my view differs
from many, I guess.  IMHO, this document is not ready for adoption.

The draft is pretty anemic and I am not sure there is enough there
to actually judge whether it is or is not a good idea.  It seems
like perhaps the document could get there, but at this point I think
the authors have some work to do.  Review below.

allman




Section 1:

  - Nit: RFC 793 does not define slow start; remove the ref.  (Could
    add a ref to the original Jacobson paper.)

  - HyStart++ is "widely deployed", but is it widely used?  Is it
    enabled or just there for the enabling?  The document makes it
    sound like we have a lot of experience with this somehow, but
    none of that is shared (see below).

  - Nit: Spell out "LSS" on first use.

Section 3:

  - Nit: Totally fine to crib the definitions from RFC 5681, but
    it'd be good form to note that is where they came from.

  - Nit: Also, since you use rwnd in one of the definitions and
    you're copying things, you might as well copy that one, too.

Section 4:

  - There is so much unexplained magic in here.  When setting
    RttThresh, what is magic about 1/8th of the lastRoundMinRtt?
    Why clamp this between 4 msec and 16 msec?  Would that really be
    reasonable for all networks?

    We need magic numbers sometimes and as long as they're
    reasonably chosen and motivated we have to live with having
    them around.  But, the draft really needs to explain these
    choices, how they were made and how broadly applicable they
    might be.  Further, the document suggests one MAY vary the
    constants, but it'd be great if it gave some sort of advice on
    how one might decide if different constants were OK or not OK.

  - The algorithm (series of equations) is here and clear.  But, the
    document really needs to give us a flavor of the why.  Talk us
    through what is happening and why in here, please.
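
To make the question about the constants concrete, here is a minimal
sketch of the RttThresh computation as described above: one eighth of
lastRoundMinRtt, clamped between 4 msec and 16 msec.  The function and
constant names are hypothetical and chosen only for illustration; this
is not code from the draft or from any implementation, and the sample
RTT values are arbitrary.

```python
# Illustrative only: names and structure are assumptions, not text from the draft.
MIN_RTT_THRESH_MS = 4.0   # lower clamp mentioned in the review
MAX_RTT_THRESH_MS = 16.0  # upper clamp mentioned in the review
RTT_DIVISOR = 8           # "1/8th of the lastRoundMinRtt"


def rtt_thresh_ms(last_round_min_rtt_ms: float) -> float:
    """Return one eighth of the RTT, clamped to [4 ms, 16 ms]."""
    raw = last_round_min_rtt_ms / RTT_DIVISOR
    return max(MIN_RTT_THRESH_MS, min(raw, MAX_RTT_THRESH_MS))


def thresh_fraction(last_round_min_rtt_ms: float) -> float:
    """Express the clamped threshold as a fraction of the path RTT."""
    return rtt_thresh_ms(last_round_min_rtt_ms) / last_round_min_rtt_ms


if __name__ == "__main__":
    # Arbitrary sample RTTs spanning short and long paths.
    for rtt in (1.0, 10.0, 32.0, 64.0, 128.0, 600.0):
        print(f"minRTT={rtt:6.1f} ms  thresh={rtt_thresh_ms(rtt):5.2f} ms  "
              f"fraction={thresh_fraction(rtt):.3f}")
```

With these constants the 1/8 factor only applies for RTTs between
32 msec and 128 msec.  Below that range the threshold is pinned at
4 msec (four times the RTT itself on a 1 msec path), and above it the
threshold is pinned at 16 msec (under 3% of the RTT on a 600 msec
path).  That spread is why the choice of constants needs motivation.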

General / meta:

  - The draft offers a little intuition on why this is a good idea
    (i.e., to prevent massive slow start over-shoot).  But, there is
    no sketch of evidence that shows it is effective.  I see this
    was previously mentioned on the mailing list and Praveen said:

      We presented lab measurement data at the last IETF. We are in
      the process of doing some A/B experiments at scale with real
      world workloads. We did fairness comparisons in our lab
      measurements between Hystart and non-Hystart and we didn't see
      any problems.

    The bar ought to be higher than "didn't see any problems".  It
    is great that there is some testing done and planned.  I'd like
    to see this all sketched in the document itself.  I am not even
    close to convinced that this is a good idea.

  - In some sense, this idea makes TCP more conservative (i.e., we
    stop exponential increase sooner than RFC 5681 would suggest).
    So, the bar is likely not huge for saying this is "OK".  But, I
    think we should have some notion that this helps *something*
    before agreeing to suggest it as somehow a reasonable thing to
    do in an RFC.  Right now this document doesn't concretely show
    that HyStart++ is useful in any way.