Re: qlog, AckFrame, and ack_delay

Lucas Pardue <lucaspardue.24.7@gmail.com> Tue, 14 November 2023 20:03 UTC

Return-Path: <lucaspardue.24.7@gmail.com>
X-Original-To: quic@ietfa.amsl.com
Delivered-To: quic@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D1F2AC151710 for <quic@ietfa.amsl.com>; Tue, 14 Nov 2023 12:03:10 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.855
X-Spam-Level:
X-Spam-Status: No, score=-1.855 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_ZEN_BLOCKED_OPENDNS=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01, URIBL_DBL_BLOCKED_OPENDNS=0.001, URIBL_ZEN_BLOCKED_OPENDNS=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([50.223.129.194]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id rinmf-v_eDXk for <quic@ietfa.amsl.com>; Tue, 14 Nov 2023 12:03:06 -0800 (PST)
Received: from mail-oa1-x34.google.com (mail-oa1-x34.google.com [IPv6:2001:4860:4864:20::34]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 0ABA6C15C2AA for <quic@ietf.org>; Tue, 14 Nov 2023 12:01:28 -0800 (PST)
Received: by mail-oa1-x34.google.com with SMTP id 586e51a60fabf-1e9c42fc0c9so2958277fac.1 for <quic@ietf.org>; Tue, 14 Nov 2023 12:01:28 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1699992087; x=1700596887; darn=ietf.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=HymzL0aClsudQtUMcf80Dz1dAThe7rjyflDanDVEdKs=; b=aOX9Et9BkyC/ppwJEnTXZtHG+Y/8dttWZV0EOBFYfNh2ErVlBWB3gSzfxv/XVGIYCC sbwRBYLoXiwRBfk3LN9f7TFl4yYEFiikTJqVbrCuysLRNoDO6eRGfpwhsMrYEgi1Yt8G 5z/Ts4FBDrO+xvdDlYeePVe6UAcUndkPRkfk6081GZNX2aTT3o2wJoml/qujWetlSuh8 QQNGgu4NtBvXt6QdxUHjje/5T4fu+llYq+F1ZkWmRonplQeXrilybN6omFZlhI2um6PD aYKwdI0VIxO5uDv2CBm/IOXkVRc6oKa9S0tINNoVGWM2Ex4DN3FaYbym064lqL8Zp/p2 vfWA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699992087; x=1700596887; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=HymzL0aClsudQtUMcf80Dz1dAThe7rjyflDanDVEdKs=; b=YH4eBdi5UPYtKwE5cdnQe1gBZZAyDp6T/nmgCq/72OR+kwXiUugwA19Swrqgp5v8JK 1LZozW/0Ok+3dAqTZMKoAsvd09JJITSeXV4jptAQCr8BpEjG+6GJJV09Acdq0AF3zvP9 r60mtoINESkXLUg+G0tqaMGn67qHjFfdNqwELST3VwXsKJ+nUJF7Iz1i9oboY500QAdA p+hlProvLhZl0H3Lr9eApedKz3M5Y4i9GhkahhrDJEwv4n/VktHdzUnPVp45gwBwESon pvB0Od+Qhvzq+kMDpRmnAhKLOh63Q8BMEQoLNz3k5BZluvtRcz6pwGWf+em8oc83kFQd RDMQ==
X-Gm-Message-State: AOJu0Yy1TwGo9uu1BGDNPSUbKzJdnTm+TRjqVNjYYlAlgL98MxubbhN9 NNf+4KNWDLE3JTFEZbC966R754Io4HROKnlNjCA=
X-Google-Smtp-Source: AGHT+IG/Emvu2Ah2s/pU9zwRrei+scd72KkE0BCTMFn+1NoiuckP1KmD+4r7PM+PtvUohCSIt/KMZKJTZEsfUMZkM3I=
X-Received: by 2002:a05:6871:329b:b0:1f0:8120:88a6 with SMTP id mp27-20020a056871329b00b001f0812088a6mr11699468oac.58.1699992087099; Tue, 14 Nov 2023 12:01:27 -0800 (PST)
MIME-Version: 1.0
References: <CAGgfL4tDDb7jZ-e-_STZ8SFPESyuPtqfFesQZOjzX_5i0XpAdw@mail.gmail.com> <CAOYVs2p2Z1JMa97uXF0SYYzsw_AOFP0-G9R+PVfeMwRw46Gw-w@mail.gmail.com> <CAGgfL4u=T-f=Uft=0-cLN9Jte4RETwYVq1UsmDVUER5geDnMiw@mail.gmail.com> <CALGR9oZU7ppYX8WDob_NXjy2exn5rNT4zLojKtaqP7b2P3gTag@mail.gmail.com> <CAGgfL4vrkSgCbOtPOFZ0c6XEKSe5JPxApbKh5KkMHP0VZGPAZg@mail.gmail.com> <CAOoMnrg35P2F5qm_5Xqwr0DE4uFQ8OVKSY3atCJ_MTCV_VkhbQ@mail.gmail.com> <CAOYVs2o6Ks5CrrrrHew06szbdSuiDqa=RU5-zVUy4-rkAE6AbQ@mail.gmail.com> <CAGgfL4uvsmd7uSfbbQg3zrXgLjXZmEYkXZ0LdZVcaPhCc=778g@mail.gmail.com>
In-Reply-To: <CAGgfL4uvsmd7uSfbbQg3zrXgLjXZmEYkXZ0LdZVcaPhCc=778g@mail.gmail.com>
From: Lucas Pardue <lucaspardue.24.7@gmail.com>
Date: Tue, 14 Nov 2023 20:01:13 +0000
Message-ID: <CALGR9oY7aq3cCnDkvOV3q7jGDcyXYKKFKBr9gs6CNi-y6LbeKg@mail.gmail.com>
Subject: Re: qlog, AckFrame, and ack_delay
To: Damien Neil <dneil@google.com>
Cc: Marten Seemann <martenseemann@gmail.com>, Robin Marx <marx.robin@gmail.com>, QUIC WG <quic@ietf.org>
Content-Type: multipart/alternative; boundary="00000000000078fdae060a224082"
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/RQz5eWlmjST2w0EiGbQp_1tlvXw>
X-BeenThere: quic@ietf.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: Main mailing list of the IETF QUIC working group <quic.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/quic>, <mailto:quic-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/quic/>
List-Post: <mailto:quic@ietf.org>
List-Help: <mailto:quic-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/quic>, <mailto:quic-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 14 Nov 2023 20:03:10 -0000

On Tue, 14 Nov 2023, 16:25 Damien Neil, <dneil@google.com> wrote:

> I agree with Marten that logging the raw frame isn't a great option. You
> don't want to have to stick a frame parser in every qlog consumer.
>

Why not?

Applications that use QUIC increasingly make use of QUIC's variable length
encoding. For example, the HTTP capsule protocol is basically like QUIC
frames layered over an HTTP stream. And unlike QUIC, HTTP intermediaries do
not need to understand the contents of all Capsules. Tooling that can
operate on raw data can be very useful for people that want to write such
tooling. By the time varint decoding is added, theres not much effort
needed IMO. Tooling that doesn't want to do the work doesn't have to.


> I'd still like to know how a producer is supposed to log ack_delay prior
> to knowing the peer's max_ack_delay parameter. Are producers just expected
> to not log the ACK Delay in that case? That seems unfortunate; the value
> can't be used, of course, but that's not a reason not to log it for
> debugging.
>

It would help to better understand the use case. How can debugging use the
value? Do we expect all tools to offer this and hence standardizing
something has value. If not, logging endpoints can always insert custom
fields into any element. And custom tooling can consume it.

The scenario where TPs are not ever available opens up a load of issues for
tools. For instance, there will be no idea about stream limits or
extensions in use.

Diving into an active connection without a full transcript of what it has
been doing doesn't seem particularly useful to me. I've lost count of the
number of partial logs (in various formats) that people have shared with me
to debug something and it never resulted in indentifying a root cause.

But maybe I'm overlooking other use cases.

Cheers
Lucas


> On Tue, Nov 14, 2023 at 2:19 AM Marten Seemann <martenseemann@gmail.com>
> wrote:
>
>> I had always assumed that RawInfo only applied to the payload of the
>> STREAM and DATAGRAM frame. Looking at the definition of RawInfo, it looks
>> like my assumption was likely wrong :)
>>
>> While I can see potential value in logging the data sent in STREAM /
>> DATAGRAM frames, I don't think there's anything useful that can be learned
>> from logging the wire representation of the frame. The only conceivable
>> piece of information one could learn from that is how many bytes were used
>> for encoding the integer values. I don't see how this would be useful.
>> Logging the entire frame means that a qlog consumer who is interested in
>> the payload of these frames now needs to implement a QUIC frame parser.
>>
>> Instead of adding a RawInfo field to all frames, I'd therefore like to
>> propose to change the definition of STREAM and DATAGRAM frames, such that
>> we only enable logging of the payload of these frames.
>>
>> On Tue, 14 Nov 2023 at 10:08, Robin Marx <marx.robin@gmail.com> wrote:
>>
>>> Hello all,
>>>
>>> Sorry for the late reply from my end; I had some troubles posting to the
>>> list and my original reply got lost in moderation.
>>>
>>> I can appreciate both sides of the argument here.
>>> Originally I tended more towards Damien's suggestion and opened a PR for
>>> it at https://github.com/quicwg/qlog/pull/337.
>>> However, I think Marten and Lucas make good arguments for things like
>>> HTTP/3 header compression and packet number encoding needing "global"
>>> persistent state to decode.
>>> I also feel the unscaled ack value isn't really useful in tooling (which
>>> is the main use case for qlog).
>>>
>>> I also follow Lucas' argument that a good fallback option would be to
>>> log the raw bytes using the "? raw: RawInfo" field.
>>> However, for this I see that we don't have this explicitly defined for
>>> all possible frames (including AckFrame) only on some (like StreamFrame or
>>> UnknownFrame).
>>> As such, if this option is chosen, we should add the explicit "raw"
>>> field to all (most?) frames as well imo.
>>> I'm guessing that would also meet Marten's approval?
>>>
>>> With best regards,
>>> Robin
>>>
>>>
>>> On Fri, Nov 10, 2023 at 6:50 PM Damien Neil <dneil=
>>> 40google.com@dmarc.ietf.org> wrote:
>>>
>>>> I believe the ACK Delay field is unique in the QUIC wire protocol in
>>>> being a field which can't be interpreted without information from a
>>>> separate protocol layer. I don't think this change leads to anything else.
>>>>
>>>> More precisely: Viewed as an integer, ACK Delay can be interpreted by
>>>> considering the ACK frame alone. Viewed as a duration, it requires
>>>> additional information exchanged in the TLS handshake. ACK Delay is the
>>>> only field I know of which has this duality.
>>>>
>>>> I can write a function which takes a QUIC packet payload and outputs a
>>>> list of qlog $QuicFrames, with the sole exception of the ACK Delay field
>>>> which requires access to the ack_delay_exponent. And as the
>>>> ack_delay_exponent is not known early in the handshake, it's currently
>>>> impossible to fully log early ACK frames. (Probably the right choice under
>>>> the current design is to not emit the ack_delay field in this case.)
>>>>
>>>> On Fri, Nov 10, 2023 at 9:20 AM Lucas Pardue <
>>>> lucaspardue.24.7@gmail.com> wrote:
>>>>
>>>>> These use cases seem like they are more general than just one field in
>>>>> one frame type.
>>>>>
>>>>> I don't want to commit to doing 1 thing if it's the start of a string
>>>>> of work that updates many other events.
>>>>>
>>>>> Would you be willing to survey all current events (modulo qpack, which
>>>>> we are removing) in order understand the full scope of change?
>>>>>
>>>>> Cheers
>>>>> Lucas
>>>>>
>>>>> On Fri, 10 Nov 2023, 17:31 Damien Neil, <dneil=
>>>>> 40google.com@dmarc.ietf.org> wrote:
>>>>>
>>>>>> Consider an endpoint processing an ACK frame in an Initial packet
>>>>>> received before transport parameters have been received. This can happen
>>>>>> when, for example, a client's Initial CRYPTO flight is too large to fit in
>>>>>> a single datagram; the server will send an ACK for the first Initial packet
>>>>>> before it has the ability to send transport parameters in the Handshake
>>>>>> flight. Or even in the case where the client Initial CRYPTO flight fits in
>>>>>> one datagram, the server may respond with an ACK in an Initial packet prior
>>>>>> to sending a Handshake packet.
>>>>>>
>>>>>> In this case, the client is processing an ACK frame that may contain
>>>>>> a non-zero ACK Delay value, but has no ability to interpret it because it
>>>>>> doesn't know the peer's ack_delay_exponent. I forget whether it's permitted
>>>>>> for an endpoint to send a non-zero ACK Delay in an Initial packet, but even
>>>>>> if it isn't, the recipient may want to log the value.
>>>>>>
>>>>>> Or one could imagine a tool which converts a pcap packet capture to a
>>>>>> qlog file; in this case, the tool may have access to a key log to permit it
>>>>>> to decrypt packets, but may be processing a section of the log that does
>>>>>> not include the handshake.
>>>>>>
>>>>>> On Fri, Nov 10, 2023 at 12:48 AM Marten Seemann <
>>>>>> martenseemann@gmail.com> wrote:
>>>>>>
>>>>>>> There's a tradeoff here: Giving writers of qlog files more
>>>>>>> flexibility comes at a cost to consumers of qlog files, who now need to
>>>>>>> support multiple representations. There's a lot of value in having only a
>>>>>>> single way to log something.
>>>>>>>
>>>>>>> I'm not sure the proposal for unscaled_ack_delay strikes the right
>>>>>>> balance here. For a consumer of a qlog file, I can't think of a single
>>>>>>> scenario where the unscaled_ack_delay would provide any advantage over the
>>>>>>> actual value, so introducing this option would purely to make the writer's
>>>>>>> life easier. And I'm struggling to see why logging the ack_delay would
>>>>>>> place a big burden on the writer, since a QUIC stack will need to decode
>>>>>>> this field at some point anyway.
>>>>>>>
>>>>>>> On Thu, 9 Nov 2023 at 22:44, Damien Neil <dneil=
>>>>>>> 40google.com@dmarc.ietf.org> wrote:
>>>>>>>
>>>>>>>> The qlog AckFrame type includes the ack delay as a float32 number
>>>>>>>> of milliseconds:
>>>>>>>>
>>>>>>>> AckFrame = {
>>>>>>>>     frame_type: "ack"
>>>>>>>>
>>>>>>>>     ; in ms
>>>>>>>>     ? ack_delay: float32
>>>>>>>>     ; ...
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> https://www.ietf.org/archive/id/draft-ietf-quic-qlog-quic-events-06.html#section-8.12.3
>>>>>>>>
>>>>>>>> Given a serialized ack frame, determining the delay as a duration
>>>>>>>> requires knowing the ack_delay_exponent. In some cases, the logging
>>>>>>>> endpoint may not have this available (if receiving an ack before transport
>>>>>>>> parameters have been received). Even when available, it may not be easily
>>>>>>>> accessible at the point of logging. For example, in my own implementation,
>>>>>>>> I'd like to be able to convert a packet payload to a series of qlog event
>>>>>>>> frames without needing to reference persistent connection state.
>>>>>>>>
>>>>>>>> I think there should be an alternative to log the raw value of the
>>>>>>>> ACK Delay field:
>>>>>>>>
>>>>>>>> AckFrame = {
>>>>>>>>     frame_type: "ack"
>>>>>>>>
>>>>>>>>     ; in ms
>>>>>>>>     ? ack_delay: float32
>>>>>>>>
>>>>>>>>     ; integer value of the ACK Delay field, not scaled by the
>>>>>>>> ack_delay_exponent
>>>>>>>>     ? unscaled_ack_delay: uint64
>>>>>>>>
>>>>>>>>     ; ...
>>>>>>>> }
>>>>>>>>
>>>>>>>> - Damien
>>>>>>>>
>>>>>>>
>>>
>>> --
>>> Marx Robin
>>> +32 (0)497 72 86 94 <+32%20497%2072%2086%2094>
>>>
>>