Re: Need your help: different connection IDs in the same datagram

Ian Swett <ianswett@google.com> Wed, 15 July 2020 21:17 UTC

From: Ian Swett <ianswett@google.com>
Date: Wed, 15 Jul 2020 17:17:18 -0400
Message-ID: <CAKcm_gP24=x64QwcCdki-jxFXsLKiTPH8UYFdGOG3H1YJgEbOQ@mail.gmail.com>
Subject: Re: Need your help: different connection IDs in the same datagram
To: Mike Bishop <mbishop@evequefou.be>
Cc: Martin Thomson <mt@lowentropy.net>, IETF QUIC WG <quic@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/b0o4nwxWhZbdjYEG78HIK861VZA>

I forgot there's a contradiction, because I thought we disallowed sending
mixed-CID packets.  So I'd prefer changing to MUST NOT coalesce packets
with different CIDs.  Even if you generate them and then coalesce
them (which we do), that's not that hard to enforce in the coalescing code.
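
To illustrate how small such a check can be, here is a minimal sketch of
a coalescing step that refuses to mix CIDs. It is not taken from any real
implementation: struct out_packet, struct out_datagram, can_coalesce()
and try_coalesce() are hypothetical names, and the sketch assumes each
packet's DCID and encoded size are already known when it is packaged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CID_LEN 20  /* QUIC version 1 caps connection IDs at 20 bytes */

/* Hypothetical view of a packet that is waiting to be packaged. */
struct out_packet {
    uint8_t dcid[MAX_CID_LEN];
    size_t  dcid_len;
    size_t  size;               /* encoded size in bytes */
};

/* Hypothetical datagram under construction. */
struct out_datagram {
    uint8_t dcid[MAX_CID_LEN];  /* DCID of the first packet added */
    size_t  dcid_len;
    size_t  used;               /* bytes already occupied */
    size_t  max_size;           /* datagram size limit */
    size_t  n_packets;
};

/* A packet may join the datagram only if it fits and, unless it is the
 * first packet, carries the same DCID as the packets already in it. */
static bool
can_coalesce(const struct out_datagram *dg, const struct out_packet *pkt)
{
    assert(dg && pkt);
    assert(pkt->dcid_len <= MAX_CID_LEN);
    assert(dg->used <= dg->max_size);

    if (pkt->size > dg->max_size - dg->used)
        return false;           /* the size check that already exists */
    if (dg->n_packets == 0)
        return true;            /* first packet sets the datagram's DCID */
    return dg->dcid_len == pkt->dcid_len
        && memcmp(dg->dcid, pkt->dcid, pkt->dcid_len) == 0;
}

/* Adds the packet if allowed. On false, the caller flushes the current
 * datagram and starts a new one, as it would for a packet that is too big. */
static bool
try_coalesce(struct out_datagram *dg, const struct out_packet *pkt)
{
    if (!can_coalesce(dg, pkt))
        return false;
    if (dg->n_packets == 0) {
        memcpy(dg->dcid, pkt->dcid, pkt->dcid_len);
        dg->dcid_len = pkt->dcid_len;
    }
    dg->used += pkt->size;
    dg->n_packets++;
    return true;
}
```

The CID comparison sits next to the size check, so a mismatch can reuse
whatever path already handles a packet that does not fit.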

On Wed, Jul 15, 2020 at 4:42 PM Mike Bishop <mbishop@evequefou.be> wrote:

> Fundamentally, I think there has to be a change, because we currently have
> an inconsistent mandate – mixed-CID packets are acceptable to send, but
> SHOULD be dropped on receipt.
>
>
>
> First, there’s the privacy argument, in that the CIDs in the same datagram
> will become linked to external observers.  I think Marten has already
> argued convincingly why this will be rare during a typical handshake;
> Christian and Kazuho have argued that a privacy-sensitive implementation
> will need to do a CID jump once the handshake is confirmed, at which point
> you’re mostly not coalescing packets anyway.  So this is a mild argument
> for not mixing, but I don’t think it’s dispositive.
>
>
>
> Second, the implementation arguments appear to boil down to two camps:
>
>    - Implementation X generates packets independently, then packages them
>    into datagrams.  Since all packets waiting for packaging are from the same
>    connection, there’s currently nothing to check to see whether they’re
>    allowed in the same datagram.  Requiring the CIDs to match would require a
>    new check and a code path for *not* coalescing packets in certain
>    cases.
>    - Implementation Y consumes packets within a datagram independently,
>    so the validation has to be done at the datagram level before doing any
>    packet-level activities.  A requirement that can be evaluated solely on the
>    contents of the datagram, independent of any connection state, is more
>    efficient.
>
>
>
> Of these two, I currently find the latter slightly more persuasive.  The
> first is a check that can be done between packets without needing to access
> any connection state, and there are already presumably code paths for
> handling when a waiting packet can’t go in the datagram currently being
> constructed (e.g. it’s too large to fit the remaining MTU).  However, I’m
> sure someone with such an implementation could tell me why it’s more
> complicated than that.  😊
>
>
>
> Neither of the resolutions seems more technically correct than the other;
> we just need to pick one.
>
>
>
> *From:* QUIC <quic-bounces@ietf.org> *On Behalf Of * Ian Swett
> *Sent:* Wednesday, July 15, 2020 3:31 PM
> *To:* Martin Thomson <mt@lowentropy.net>; IETF QUIC WG <quic@ietf.org>
> *Subject:* Re: Need your help: different connection IDs in the same
> datagram
>
>
>
> I don't think this change would be difficult for our implementation, but I
> also don't see it as necessary.  Given where we are in the process, that
> alone argues for not changing it, I believe.
>
>
>
> On Wed, Jul 15, 2020 at 2:02 PM Dmitri Tikhonov <
> dtikhonov@litespeedtech.com> wrote:
>
> On Wed, Jul 15, 2020 at 05:23:57PM +1000, Martin Thomson wrote:
> > There has been some opposition to the proposed resolution in PR 3870.
> >
> > Apparently, for some, having multiple connection IDs in the same
> > datagram complicates processing.  I don't understand this objection.
> > It seems to me more difficult to retain state across packets than it
> > is to process each atomically.  I was hoping that Christian or Nick
> > can explain more about how this affects them.
>
> I can provide an example from lsquic.  The datagram is parsed into
> QUIC packets in one function, lsquic_engine_packet_in():
>
>
> https://github.com/litespeedtech/lsquic/blob/v2.18.1/src/liblsquic/lsquic_engine.c#L2781L2816
>
> Each QUIC packet is processed by process_packet_in(), where a
> connection is looked up:
>
>
> https://github.com/litespeedtech/lsquic/blob/v2.18.1/src/liblsquic/lsquic_engine.c#L1352L1360
>
> The DCID check is performed in lsquic_engine_packet_in(), before
> process_packet_in() is called:
>
>
> https://github.com/litespeedtech/lsquic/blob/v2.18.1/src/liblsquic/lsquic_engine.c#L2793L2806
>
> The DCID information is readily available in the datagram parsing
> loop, while connection information is not.
>
> For lsquic to support the proposed change, it would have to remember
> the current connection and then either ask it whether it is indeed the
> owner of the next DCID (A) or look up the DCID in the global hash (B):
>
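>     conn = NULL;
>     /* Walk each QUIC packet coalesced in the UDP datagram. */
>     while (quic_packet = parse_udp(pointers)) {
>         dcid = parse(quic_packet);
>         if (conn)
>         {
>   #if VARIANT_A
>             /* (A) ask the remembered connection whether it owns this CID */
>             if (!conn_owns_scid(conn, dcid))
>   #else
>             /* (B) look the DCID up in the global hash */
>             if (conn != lookup_by_dcid(dcid))
>   #endif
>                 continue;   /* packet is not for this connection: skip it */
>         }
>         conn = process_packet(quic_packet);
>     }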
>
> Not that it could not be done, of course, but it is both extra work
> to modify lsquic and a less efficient mechanism: what was a simple
> CID comparison is now a hash lookup.
>
> That's why I argued [1] for having solid rationale behind the change
> rather than a personal preference.
>
>   - Dmitri.
>
> 1.
> https://github.com/quicwg/base-drafts/issues/3800#issuecomment-656851626
>
>
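
For comparison, the "simple CID comparison" on the receive side that
Dmitri refers to can be sketched as below. This is an illustration only,
not lsquic's code: struct cid, cid_eq() and filter_by_first_dcid() are
hypothetical names, and the sketch assumes the DCIDs of the packets in a
datagram have already been parsed into an array. The point is that the
check needs nothing beyond the contents of the datagram itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CID_LEN 20  /* QUIC version 1 caps connection IDs at 20 bytes */

/* Hypothetical parsed connection ID. */
struct cid {
    uint8_t buf[MAX_CID_LEN];
    size_t  len;
};

static bool
cid_eq(const struct cid *a, const struct cid *b)
{
    return a->len == b->len && memcmp(a->buf, b->buf, a->len) == 0;
}

/* Marks which packets of a datagram to keep: the first packet, plus every
 * later packet whose DCID equals the first packet's DCID.  No connection
 * lookup is involved.  Returns the number of packets kept. */
static size_t
filter_by_first_dcid(const struct cid *dcids, size_t n, bool *keep)
{
    size_t i, kept = 0;

    assert(n == 0 || (dcids && keep));
    for (i = 0; i < n; i++) {
        assert(dcids[i].len <= MAX_CID_LEN);
        keep[i] = (i == 0) || cid_eq(&dcids[0], &dcids[i]);
        if (keep[i])
            kept++;
    }
    return kept;
}
```

Accepting mixed CIDs would replace the cid_eq() call with a per-packet
query against connection state, which is the extra cost described above.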