Re: Comments on QLOG drafts

Robin Marx <marx.robin@gmail.com> Fri, 19 January 2024 12:01 UTC

Return-Path: <marx.robin@gmail.com>
X-Original-To: quic@ietfa.amsl.com
Delivered-To: quic@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1C8F3C14F696 for <quic@ietfa.amsl.com>; Fri, 19 Jan 2024 04:01:44 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.105
X-Spam-Level:
X-Spam-Status: No, score=-2.105 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_ZEN_BLOCKED_OPENDNS=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01, URIBL_BLOCKED=0.001, URIBL_DBL_BLOCKED_OPENDNS=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([50.223.129.194]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Z7-CWK-fIBG7 for <quic@ietfa.amsl.com>; Fri, 19 Jan 2024 04:01:39 -0800 (PST)
Received: from mail-ot1-x32d.google.com (mail-ot1-x32d.google.com [IPv6:2607:f8b0:4864:20::32d]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 3D09BC14F68D for <quic@ietf.org>; Fri, 19 Jan 2024 04:01:39 -0800 (PST)
Received: by mail-ot1-x32d.google.com with SMTP id 46e09a7af769-6ddeec84e35so396519a34.3 for <quic@ietf.org>; Fri, 19 Jan 2024 04:01:39 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1705665698; x=1706270498; darn=ietf.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=hFN8fs3bmNioLxSY300OLKmI/OBp6BFcLTa4SgXZXTs=; b=gsBPO/bohxeqnT2ET6hdw6pErjrWEkyshEVvE1XYSHR6StYovwnZCevxkFjPglWHcm MmH9TgV52OYrGWBPTnfljwFqANJPEXkFPhLzmIg8HJyBSb2GAJKEGLagLc+65owxgnpK ZUFVOeSHj9bqZYv25O3ZlBW7KXIbRFHn6zyOsb265hkLqkBLF1PaSbMNLGL0UWDKn6e4 GaxUvTwoB5YiA9ck2xipo0irIN4ZsVPGH+ISZcYLhAwBFG7IDRBlXqRv7lbtRT29vczO kprysE58eZt6utW6pkY6srAGNnbzGlKzxizpbr4auNCgu3/R5nBMl6xMQQwLuovJqMnC 8KdA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1705665698; x=1706270498; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=hFN8fs3bmNioLxSY300OLKmI/OBp6BFcLTa4SgXZXTs=; b=SyOv5xEpmHDAM4AHaPO2oXJSEvQlYxl4C0NpXiBIAEZLuI5+COMI2GfkI8kvN7gBe+ HXIwYYY1Bgc6cxvrXvcGvtuUXiXRm9szEzqmNWL8tzN2rvi5glKPB5IwoyXbopfbhI8u nTqTit35kmj84F8zzGKM5atLhCX5MRNj5PqDFe2YvstYTdMPGyytBt4EVnUbfkPAVK4W qnTmyMJtGBIAyZ2jstL2dnlgXeD0oNcTlFlOBuBIP6YZEc1FF/M8r7UeADqKoGd2mrh+ jU4G6PQfX8NuHLEXWExD6QENSp3f3fHMMf/4l8+ml4q6Dbp7ZqMAbnAkOBXL64fbRWlc 3RxQ==
X-Gm-Message-State: AOJu0YwB5b+wEjvvOCha2eLF21qqqPp2qtygeuDfjlFXzqPz54LycQaE u6xVhWeVrOuLQaFjXmYIxlNvoNYmnE7mtX4G93dLjGaZDUJ2R6mkMgL2BpofguKw3WRVJkWBK9D ilImLGInSWtzx7u4hAuWo9JegDDDFt3ZQ5Ek=
X-Google-Smtp-Source: AGHT+IGixM1nbCbFEPv0UCsgiOB+Qd0imxynxn1HYyoByvjcUesG/TlMXHFnApQMa/wzviSWt5JVA18ZMm9gHN8mM9A=
X-Received: by 2002:a05:6830:438f:b0:6dd:ed82:f46c with SMTP id s15-20020a056830438f00b006dded82f46cmr2778080otv.10.1705665697992; Fri, 19 Jan 2024 04:01:37 -0800 (PST)
MIME-Version: 1.0
References: <ZalprxQ7TESSfBN2@camelot.lhh.devever.net>
In-Reply-To: <ZalprxQ7TESSfBN2@camelot.lhh.devever.net>
From: Robin Marx <marx.robin@gmail.com>
Date: Fri, 19 Jan 2024 13:01:26 +0100
Message-ID: <CAOoMnrjUgmZ85XdnYFcfeD62X7ksS346nkAz6a87pAezwwqkDg@mail.gmail.com>
Subject: Re: Comments on QLOG drafts
To: Hugo Landau <hlandau@devever.net>
Cc: quic@ietf.org
Content-Type: multipart/alternative; boundary="00000000000008d422060f4b3e46"
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/_g3Soh4KIwc4oqnnDLUu5NCa0zM>
X-BeenThere: quic@ietf.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: Main mailing list of the IETF QUIC working group <quic.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/quic>, <mailto:quic-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/quic/>
List-Post: <mailto:quic@ietf.org>
List-Help: <mailto:quic-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/quic>, <mailto:quic-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Jan 2024 12:01:44 -0000

Hello Hugo,

Thank you for this very thorough review of the QLOG documents!

Some of your remarks were clear (editorial) errors and I have changed them
in a single commit here:
https://github.com/quicwg/qlog/commit/21a3de1ec3dbcef1c636d91d841fa597f683c8f9


Others also indicate valid problems, but will require additional discussion
and/or work before being addressed in a PR.
I have opened individual issues for them in the github repo at:
https://github.com/quicwg/qlog/issues. Hopefully you're willing to continue
the discussion there.

Finally, for a few remarks, I'll offer my comments here, since I personally
don't think they should be addressed (but might warrant further discussion
in the wg through this mailing list):

1. For connectivity:spin_bit_updated's category: while I agree this is
somewhat QUIC specific, the same could be set for the connection_id_changed
event. This feels more like bikeshedding, so unless others have strong
opinions, I'd keep it at connectivity.

2. For quic:datagrams_sent and logging other ToS fields: I think ECN is
special here since it has a clear (potential) impact on transport-level
congestion control. I don't agree that other IP-level fields should be
logged here, and rather should get an IP-specific qlog event somewhere down
the line.
    Similarly, IIUC QUIC implementations should ALWAYS request the "don't
fragment treatment" from an OS? If not, then I also agree this should not
be part of the datagrams_* events but rather part of a more general
"configuration" field (which we removed a while back due to no concrete use
cases ;))

3. For recovery:congestion_state_updated and uppercase-ness of the ECN
trigger value: ecn is consistently lowercase as field _name_ (as are all
field names in qlog), but we have other instances where ECN-related
_values_ are uppercase (and this is also a _value_, so I'd keep it at ECN)

4. For recovery:loss_timer_updated and clarifying the "timer_type": we
currently have a comment pointing to RFC 9002 there; I would assume this is
enough?

5. For AckFrame and not being able to log the receipt of ECN data without
also logging the ECN values: I personally don't think this is sensitive
enough data to need a way to log this without revealing the values.
Additionally, one can arguably infer this through
recovery:ecn_state_updated that they "should be there" (if not scrubbed by
the network).

6. Several error codes should be uint64 instead of uint32: I think this was
mainly because uint64 is a bit of a nuisance because it can be either an
actual uint64 or text string due to wanting to support "I-JSON" in
webbrowsers. Agreed that according to the spec, these should be uint64 and
have changed this in the commit linked above.


Thank you again for your time on this.
It's very good to get input from a "new implementer" on the docs now that
they've progressed so much from the initial points where most people
implemented the format.

With best regards,
Robin


On Thu, Jan 18, 2024 at 7:49 PM Hugo Landau <hlandau@devever.net> wrote:

> Hi,
>
> I've completed a review of the QLOG drafts and have some comments. Hope
> these
> are helpful.
>
> Comments on draft-ietf-quic-qlog-main-schema.md
> ===============================================
>
>     ## Custom fields\
>
> Trailing backslash in this section title
>
> Comments on draft-ietf-quic-qlog-quic-events.md
> ===============================================
>
> connectivity:connection_closed:
>
>   This is missing the frame type field in the QUIC CONNECTION_CLOSE
>   frame which is rather useful information.
>
>   connection_code can be a string or a uint32, but QUIC uses a variable
>   length integer here and thus this can be up to 62 bits. Same for the
>   application_code.
>
>   It is up to the decoder to infer by presence of connection_code or
>   application_code to infer whether this was an application or transport
>   CONNECTION_CLOSE. But this requires it to log such a value. It would
>   be nice if this can be communicated without necessarily knowing the
>   actual value. Consider either adding a field to distinguish between
>   transport and app errors or replacing the "error" field of "trigger"
>   with "app_error" and "transport_error".
>
> connectivity:connection_id_updated:
>
>   It would be nice to be able to optionally log CID-associated sequence
>   numbers and the type of CID source (ODCID, Initial CID,
>   preferred_address transport parameter, NCID frame, etc.)
>
> connectivity:spin_bit_updated:
>
>   This is very QUIC-specific, so I question the value of it being defined
>   as a generic connectivity event.
>
> connectivity:mtu_updated:
>
>   I suppose it is a bit of an asinine remark, but IPv6 jumbograms mean
>   theoretically you could do things with datagrams greater than 64k.
>   uint32?
>
> quic:version_information:
>
>   Maybe clarify that if the client receives a version negotiation packet
>   with no viable overlap, it logs a version_information event with no
>   chosen_version?
>
> quic:alpn_information:
>
>   RFC 7301 states that ALPN protocol strings are byte strings, not text.
>   How should ALPN protocol strings which are not valid UTF-8 or not
>   valid text be logged?
>
> quic:parameters_set:
>
>   As things stand, there is no facility for logging the reception of
> unknown
>   transport parameters by an endpoint. Since transport parameters are used
> to
>   negotiate extensions to QUIC, having logged data about the presence of
>   transport parameters sent by a peer but not understood, and the
> consequential
>   ability to get information about the prevalence of any given unsupported
>   transport parameter in incoming traffic, seems like it would be quite
> useful.
>
>   I think this event needs more clarify in terms of the distinction between
>   attempting to negotiate a feature and it being enabled. It should be
> possible
>   to log the feature set a local or remote endpoint is requesting and it
>   should be possible to log the actually negotiated feature set. Possibly
>   this can be resolved via simple clarification. If a client connects to
>   a server, does it start by emitting parameters_set (owner=local) with
> (e.g.)
>   max_datagram_frame_size set, and then when the server responds without
>   it, emit parameters_set (owner=remote) without it? I guess this allows
>   the necessary information to be inferred.
>
>   If the absence of a field indicates it was not negotiated after receiving
>   transport parameters from a server, this creates a problem if this event
> has
>   to be used to log other parameters determined at different times (which
> seems
>   plausible for `tls_cipher`, `early_data_enabled`, etc.), as then logging
>   those parameters could be misinterpreted as a sign that the transport
>   parameters have been received and a feature has not been negotiated.
>   Either the cause of a parameters_set event should be clarified (maybe a
> trigger
>   field) or possibly this should be split into different event types.
>
>   I think it would be valuable to have the ability to log information
>   such as spin bit policy (e.g. enabled, disabled due to random chance,
>   disabled administratively), independent of the actual spin_bit_updated
>   event).
>
> quic:packet_sent:
>
>   The meaning of `is_coalesced` should be clarified. I assume this is
>   set to true unless a packet is both the first and last packet in a
>   datagram. However that is not necessarily obvious when a packet is
>   being generated; another option is to define this as being true for
>   every packet but the first packet in a datagram.
>
> quic:packet_received:
>
>   See comment on is_coalesced above
>
> quic:packet_dropped:
>
>   What is the intended trigger category here when an EL is discarded and
>   buffered packets are discarded as a consequence? `decryption_failure`
>   could match but feels subtly different, as I guess that covers the
>   case where a packet is received after an EL is discarded. IMO it would
>   be better to have a different trigger category for this, and add
>   language clarifying this case.
>
> quic:packet_acked:
>
>   `PacketNumberSpace.application_data`, as this is by far the most
> prevalent packet
>   number space a typical QUIC connection will use.
>
>   Editorial note: The usage of the . notation
>   `PacketNumberSpace.application_data` seems slightly novel here. Just
>   write "application_data"?
>
> quic:datagrams_sent:
>
>   It is a bit minor, but the optional logging of the non-ECN bits in the
> ToS byte
>   (as a separate field) might be useful. Implementations which don't do
> anything
>   special here would omit it, and specifying a field would indicate that
> the
>   implementation has requested a specific ToS from the OS.
>
>   I also think it would be potentially valuable if an implementation could
>   note if it has requested 'Don't Fragment' treatment from an OS, but since
>   an implementation can be expected to be consistent about this, this
> doesn't
>   seem like something which should be logged on a per-datagram basis,
> perhaps
>   elsewhere.
>
> quic:datagrams_received:
>
>   Comments on ToS above also apply here
>
> quic:stream_state_updated:
>
>   I think this needs to be clarified a little that an event logged here
> with both
>   a `stream_id` field and a `stream_side` field is expressing a state for
> that
>   component of the stream and from the perspective of the specified
> vantage point
>   (also, what about the network vantage point?); whereas an event logged
> here
>   with a `stream_id` field but no `stream_side` field is expressing a
> state for
>   both stream components as a whole.
>
>   If my interpretation here is correct, I think it would also help to note
> that
>   the valid values of StreamState depend on the value *and* presence of
>   stream_side.
>
> quic:frames_processed:
>
>     The `frame_processed` event can be used to signal internal state
> change not
>     resulting directly from the actual `parsing` of a frame (e.g., the
> frame
>     could have been parsed, data put into a buffer, then later processed,
> then
>     logged with this event).
>
>   Editorial note: use of backticks around parsing seems off here.
>
>      The `packet_received` event can convey all constituent frames. It is
> not
>      expected that the `frames_processed` event will also be used for a
>      redundant purpose. Rather, implementations can use this event to avoid
>      having to log full packets or to convey extra information about when
> frames
>      are processed (for example, frame processing is deferred for any
> reason).
>
>   Editorial note: add "if" before "frame processing is deferred for any
> reason"
>
> quic:stream_data_moved:
>
>   A brief explanation of each from/to value here and the intended meaning
> would
>   be a big improvement. Also clarify that raw here refers to the referenced
>   stream application data (and does not include frame headers etc.).
>
>   Consider having a way to report when the end-of-stream (FIN) condition is
>   reported to an application. One option here is to have an optional "fin"
> field
>   here.
>
>   There does not appear to be any way to communicate when a stream reset
> event is
>   delivered to an application in an equivalent way, which would be useful.
>
> quic:datagram_data_moved:
>
>   As with stream_data_moved, clarify meaning of the from/to fields here.
>
> security:key_updated:
>
>   Clarify use of the generation field here, which is introducing a new
> concept
>   not directly touched on by the QUIC RFCs, which just deal in the Key
> Phase bit.
>   Provide examples for the first few generation numbers (for example,
> presumably
>   the first 1-RTT EL key has a generation of 0). Also clarify how this
> interacts
>   with the temporal behaviour of the key update process (since old keys are
>   retained to handle reordered packets). Presumably once a new key exists,
> it is
>   logged with a new generation number and the old key is eventually given a
>   key_discarded event with the old generation number. But this should be
>   clarified.
>
> security:key_discarded:
>
>   As above.
>
> recovery:parameters_set:
>
>     do, for some reason, change these parameters during execution, MAY
> emit the
>     `parameters_set` event twice.
>
>   Twice? Seems like this should say "more than once".
>   Maximum datagram size could change repeatedly.
>
>   A minor point, but I suppose it may be useful to have a field for a name
> of a
>   congestion control algorithm here, defined as "newreno" / text (with
> "newreno"
>   being the algorithm specified in RFC 9002).
>
> recovery:congestion_state_updated:
>
>   It is a little bit inconsistent here for ECN to be uppercase when the
>   "ecn" field in other events is lowercase.
>
> recovery:loss_timer_updated:
>
>   Probably have a little bit of text clarifying the meaning of
>   `timer_type` here.
>
> AckFrame:
>
>   It is worth noting that this doesn't allow a QLOG implementation to
>   log whether a peer included ECN data in an ACK frame without actually
>   including that data. i.e., a QLOG decoder has to infer presence of ECN
>   from the presence of the ect1/ect0/ce fields. Perhaps this is too
>   minor to address, but I thought I'd point it out.
>
> ResetStreamFrame:
>
>   error_code should be uint64 here.
>
> StopSendingFrame:
>
>   error_code should be uint64 here.
>
> CryptoFrame:
>
>   StreamFrame has an optional raw field here, so CryptoFrame should too.
>
> StreamFrame:
>
>   This format has ambiguity as to whether explicit length and/or offset
>   fields were used. It is potentially useful to know this, or the
>   numeric value of the stream type field.
>
>   Being able to know the header length for overhead calculation would
>   also be valuable IMO.
>
>   It may be worth explicitly clarifying that it is not possible to use
>   this object to log information about a stream frame without logging
>   the value of the FIN bit, as the presence or absence of this bit is
>   inferred from presence or absence of the FIN field.
>
> ConnectionCloseFrame:
>
>   Is there a reason error_code is limited to uint32 here when
>   error_code_value is correctly specified as uint64? Understanding some
>   of the rationale behind the encoding of error codes in this
>   specification would be helpful.
>
> Yours,
> Hugo Landau
>
>

-- 
Marx Robin
+32 (0)497 72 86 94