Some remarks on draft-marx-qlog-main-schema-02

Pieter Lexis <pieter.lexis@powerdns.com> Fri, 05 March 2021 16:07 UTC

To: quic@ietf.org
From: Pieter Lexis <pieter.lexis@powerdns.com>
Subject: Some remarks on draft-marx-qlog-main-schema-02
Message-ID: <f1387b85-93c2-51fa-1f6a-5dfcc79c0ae2@powerdns.com>
Date: Fri, 05 Mar 2021 17:07:04 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/H_inwB1oLh4fr1-i7JyUb-8U5Gg>

Hello Quic-WG, Robin,

Someone pointed me to draft-marx-qlog-main-schema-02 because "You showed
interest in structured logging". I've had a quick read and have some
initial thoughts.

My first thought was "yes, there is a need for a specified schema for
logs that can be serialized to a variety of formats". However, the
draft is a bit hand-wavy about both the schema and the instantiated
format.

For starters, section 1.1 notes the use of a datatype language "inspired
loosely by the 'TypeScript' language". This language is not an IETF
standard. The IETF has standardized at least three data definition
languages:

1. ABNF as RFC 5234 [1]
2. Concise Data Definition Language (CDDL) as RFC 8610 [2]
3. YANG as RFC 7950 [3]

Unlike ABNF, both CDDL and YANG specify how to convert instantiated
data to JSON (RFC 7951 [4] for YANG, CDDL in its own RFC). I would
strongly recommend that the author choose either YANG or CDDL to define
all qlog structures.

Skipping over the schema definition, section 4 deals with the
serialization of qlog.

The schema has a field that contains the serialization format. But the
serialization format is metadata about the log, not part of the logged
data. It is up to the parties exchanging the serialized data to agree on
the format (possibly using Accept/Content-Type headers when transferring
over HTTP, and a file extension when stored on disk).
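
To illustrate what agreeing on the format out of band could look like,
here is a minimal Python sketch in which a reader picks a decoder from
the HTTP Content-Type or the file extension instead of from a field
inside the log. The function name, the mappings and the choice of
extensions are hypothetical and are not taken from the draft.

```python
# Hypothetical lookup tables; the draft does not define these identifiers.
DECODERS_BY_MEDIA_TYPE = {
    "application/json": "json",
    "application/cbor": "cbor",
}
DECODERS_BY_EXTENSION = {
    ".json": "json",
    ".cbor": "cbor",
}


def pick_decoder(content_type=None, filename=None):
    """Return the decoder name agreed on outside the payload itself."""
    if content_type is not None:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        media_type = content_type.split(";", 1)[0].strip().lower()
        if media_type in DECODERS_BY_MEDIA_TYPE:
            return DECODERS_BY_MEDIA_TYPE[media_type]
    if filename is not None:
        for extension, decoder in DECODERS_BY_EXTENSION.items():
            if filename.lower().endswith(extension):
                return decoder
    raise ValueError("serialization format was not agreed on out of band")
```

With this split, the log payload itself never needs to state how it was
serialized.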

Section 4.1 should be superfluous if the author uses either CDDL or YANG
as a modeling language, as those have defined how to serialize data.

Section 4.2 then uses a non-IETF serialization format (NDJSON) to
achieve the streaming property of qlog. In the DNS world, the C-DNS
logging format (RFC 8618 [5]) is specified using CDDL and uses CBOR (RFC
8949 [6]) as its primary 'storage' mechanism, with tables inside blocks
to 'compress' repeated data. It implements streaming at a specific level
of the schema. Using such an approach in qlog would remove the need for
the "optimization" section (4.3). It is then up to the tooling to
translate from CBOR to JSON or any other format the user or tools can
read.
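
As a rough illustration of the tables-inside-blocks idea (this is not
the actual C-DNS structure), the Python sketch below groups events into
blocks, stores each distinct field value once in a per-block table, and
lets the events refer to table indices. Every block is self-contained,
so a writer can emit blocks one at a time; that is the level at which
streaming happens. The field names, the block size and the function
names are made up, values are assumed to be strings, and JSON stands in
for CBOR only because the Python standard library has no CBOR encoder.

```python
import json


def make_block(events):
    """Build one self-contained block: a value table plus index-encoded events."""
    table = []   # each distinct value is stored once per block
    index = {}   # value -> position in the table
    items = []
    for event in events:
        encoded = {}
        for key, value in event.items():
            if value not in index:
                index[value] = len(table)
                table.append(value)
            encoded[key] = index[value]
        items.append(encoded)
    return {"table": table, "items": items}


def stream_blocks(events, block_size=2):
    """Yield one serialized block per block_size events."""
    for start in range(0, len(events), block_size):
        yield json.dumps(make_block(events[start:start + block_size]))


def read_block(serialized):
    """Reverse make_block: expand table indices back into full events."""
    block = json.loads(serialized)
    return [
        {key: block["table"][i] for key, i in item.items()}
        for item in block["items"]
    ]


# Hypothetical events with a repeated connection identifier.
events = [
    {"name": "packet_sent", "connection": "abcd1234"},
    {"name": "packet_received", "connection": "abcd1234"},
    {"name": "packet_sent", "connection": "abcd1234"},
]
blocks = list(stream_blocks(events))
restored = [event for block in blocks for event in read_block(block)]
assert restored == events
```

A reader only ever needs one block in memory, and a translation tool can
expand each block back into plain JSON events for human consumption.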

Section 5 then goes into how tools should behave, down to the use of
certain environment variables. This is needlessly restrictive and
stifles any attempt to differentiate between the multitude of tools that
could be developed.

I hope the WG and the author will seriously consider these reservations
about the draft.

Best regards,

Pieter Lexis

[1] https://tools.ietf.org/html/rfc5234
[2] https://tools.ietf.org/html/rfc8610
[3] https://tools.ietf.org/html/rfc7950
[4] https://tools.ietf.org/html/rfc7951
[5] https://tools.ietf.org/html/rfc8618
[6] https://tools.ietf.org/html/rfc8949
-- 
Pieter Lexis
PowerDNS.COM BV -- https://www.powerdns.com