Re: Some remarks on draft-marx-qlog-main-schema-02

Robin MARX <robin.marx@uhasselt.be> Mon, 08 March 2021 12:49 UTC

From: Robin MARX <robin.marx@uhasselt.be>
Date: Mon, 08 Mar 2021 13:48:41 +0100
Message-ID: <CAC7UV9bEn+MASA6uNY-51J=BQ0GptZjM3Xwm5KB=LU4c_BxiEw@mail.gmail.com>
Subject: Re: Some remarks on draft-marx-qlog-main-schema-02
To: Pieter Lexis <pieter.lexis@powerdns.com>
Cc: IETF QUIC WG <quic@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/GlJ9ziyO4oLBmrCWT6SA_v35X2s>

Hello Pieter,

Thank you very much for your extensive and certainly valid feedback.

I agree the approach up until now has been relatively "ad-hoc", as it has
grown organically over time for QUIC and HTTP/3.
Especially as we're aiming to (properly) start generalizing qlog to other
protocols and use cases,
we need to have discussions about serialization formats and the datatype
language used in the drafts.

I have several presentations on qlog planned for this IETF week (in
dispatch, maprg, tsvwg, iccrg, saag and opsawg),
with the explicit goal of getting some outside feedback and interest.
I specifically mention the format/definition language there as one of the
main challenges.
I'm well aware that some of this has already been solved for other use
cases;
we simply did not have enough experience with all the options to make
these decisions earlier.
Your feedback was exactly the type of thing we were hoping for by
soliciting outside viewpoints.

Some concrete personal thoughts on your points:

1) My personal main goal for the datatype definition language in the
documents would be to allow automatically generating
schema definitions for multiple serialization formats / programming
languages. I assume CDDL and/or YANG allow that, so they
seem like good candidates. Existing mappings to e.g., JSON/CBOR would
definitely also come in handy there.
I wonder if there are also (non-standard) mappings to e.g., protocol
buffers/flatbuffers for YANG?

2) I partly disagree that a serialization format indicator is unneeded. In
our tooling, we currently support 4 completely different file types,
that (can) all use the .json extension (and the same MIME types) by
default. I agree that conceptually it's not needed, but practically
it's very useful to have. Of course, that might become moot if JSON isn't
the main format going forward.
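
To illustrate why an in-file indicator helps when several file types share
the .json extension, here is a minimal Python sketch. It is not taken from
our tooling; the field name "qlog_format" and the "unknown" fallback are
assumptions made for this example, and it only handles files that are a
single JSON document.

```python
import json


def detect_format(text, field="qlog_format", default="unknown"):
    """Return the value of a top-level format indicator field, if present.

    `field` is a hypothetical indicator name used for illustration only.
    """
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        # Not a single JSON document (for example a line-delimited file).
        return default
    if isinstance(doc, dict):
        return str(doc.get(field, default))
    # Top-level arrays or scalars carry no indicator field.
    return default
```

A tool could call detect_format() on every .json file it is handed and pick
a parser based on the result, instead of guessing from the extension or the
MIME type.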

3) I can appreciate that NDJSON is not an IETF RFC, but I'm also not yet
sure we want to move to e.g., CBOR as the default format,
as it removes easy human readability. The main thing that has become clear
over the past months is that streaming should be the main use case
(instead of full-file storage/transfer, which we assumed previously), so I
would also prefer having a standardized format for that of course.
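
As a rough sketch of what the streaming use case looks like with a
line-delimited JSON format such as NDJSON: each line is a complete JSON
value, so a reader can process events one at a time without loading the
whole file. The function name and the event field used in the usage note
are assumptions for illustration, not part of any qlog draft.

```python
import io
import json


def iter_ndjson(stream):
    """Yield one parsed JSON value per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if not line:
            # Skip blank lines between records.
            continue
        yield json.loads(line)
```

For example, iterating over iter_ndjson(io.StringIO('{"name": "a"}\n{"name": "b"}\n'))
yields two dictionaries in order, and the same loop works unchanged on an
open file or a socket wrapped as a text stream.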

4) I agree the concrete API endpoints / environment variables might have to
be split out of the main document (if we keep them at all).
I do note that having a default environment variable name (QLOGDIR) has
been useful, as most implementations support this, which is handy for newer
users.
The well-known URL however has so far not been used by any deployment, as far as I know.
In general I think there's value in having some recommendations for this,
but agree those might not belong in the main spec.
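
As an illustration of the QLOGDIR convention, an implementation might
resolve its output path roughly as follows. This is a sketch under
assumptions: the one-file-per-connection naming and the ".qlog" suffix are
made up for this example, and only the QLOGDIR variable name comes from the
discussion above.

```python
import os
from pathlib import Path


def qlog_output_path(connection_id, environ=None):
    """Return the path to write a qlog file to, or None if logging is off.

    Logging is treated as enabled only when QLOGDIR is set and non-empty.
    The file naming scheme here is hypothetical.
    """
    env = os.environ if environ is None else environ
    qlog_dir = env.get("QLOGDIR")
    if not qlog_dir:
        return None
    return Path(qlog_dir) / f"{connection_id}.qlog"
```

Passing the environment as a parameter keeps the function easy to test; in
normal use it falls back to the process environment.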

Thank you again for your extensive feedback.
I hope you will be part of the continued discussions on this in the future.

With best regards,
Robin


On Fri, 5 Mar 2021 at 17:07, Pieter Lexis <pieter.lexis@powerdns.com> wrote:

> Hello Quic-WG, Robin,
>
> Someone pointed me to draft-marx-qlog-main-schema-02 because "You showed
> interest in structured logging". I've had a quick read and have some
> initial thoughts.
>
> The first thought was "yes, there is need for a specified schema for
> logs, that can be serialized to a variety of formats". However, the
> draft is a bit hand-wavy about the schema and instantiated format.
>
> For starters, section 1.1 notes the use of a datatype language "inspired
> loosely by the "TypeScript" language". This language is not an IETF
> standard. The IETF has standardized at least 3 data definition languages:
>
> 1. ABNF as RFC 5234 [1]
> 2. Concise Data Definition Language (CDDL) as RFC 8610 [2]
> 3. YANG as RFC 7950 [3]
>
> Apart from ABNF, both CDDL and YANG have specified how to convert the
> instantiated data to JSON (RFC 7951[4] for YANG, CDDL in its own RFC). I
> would highly recommend the author to choose either YANG or CDDL to
> define all qlog structures.
>
> Skipping over the schema definition, section 4 deals with the
> serialization of qlog.
>
> The schema has a field that contains the serialization format. But this
> serialization is actually metadata. It is up to the parties exchanging
> the serialized data to agree on the format (possibly using
> Accept/Content-Type headers when using HTTP to transfer and a
> file-extension when stored on disk).
>
> Section 4.1 should be superfluous if the author uses either CDDL or YANG
> as a modeling language, as those have defined how to serialize data.
>
> Section 4.2 then uses a non-IETF serialization format (NDJSON) to
> accomplish the streaming property of qlog. In the DNS world, the C-DNS
> (RFC 8618[5]) logging format is specified using CDDL, uses CBOR (RFC
> 8949[6]) as its primary 'storage' mechanism, using tables inside blocks
> to 'compress' repeated data. It implements streaming on a specific level
> of the schema. Using such an approach in qlog would mitigate the need of
> the "optimization" section (4.3). It is up to the tooling to translate
> from CBOR to JSON or any other format the user or tools can read.
>
> Section 5 then goes into how tools should behave, down to the use of
> certain environment variables. This is needlessly restrictive and
> stifles any attempt to differentiate between the multitude of tools that
> could be developed.
>
> I hope the WG and author consider these reservations on the draft
> seriously.
>
> Best regards,
>
> Pieter Lexis
>
> 1 - https://tools.ietf.org/html/rfc5234
> 2 - https://tools.ietf.org/html/rfc8610
> 3 - https://tools.ietf.org/html/rfc7950
> 4 - https://tools.ietf.org/html/rfc7951
> 5 - https://tools.ietf.org/html/rfc8618
> 6 - https://tools.ietf.org/html/rfc8949
> --
> Pieter Lexis
> PowerDNS.COM BV -- https://www.powerdns.com
>
>

-- 

dr. Robin Marx
Postdoc researcher - Web protocols
Expertise centre for Digital Media

T +32(0)11 26 84 79 - GSM +32(0)497 72 86 94

www.uhasselt.be
Universiteit Hasselt - Campus Diepenbeek
Agoralaan Gebouw D - B-3590 Diepenbeek
Kantoor EDM-2.05