Re: New Version Notification for draft-kazuho-httpbis-selftrace-00.txt

Kazuho Oku <kazuhooku@gmail.com> Sun, 15 August 2021 06:20 UTC

From: Kazuho Oku <kazuhooku@gmail.com>
Date: Sun, 15 Aug 2021 15:20:16 +0900
Message-ID: <CANatvzx_O_38nU3wyD6UCtFRfBSarT4=NO45yOQMbSOe0oCK=g@mail.gmail.com>
Subject: Re: New Version Notification for draft-kazuho-httpbis-selftrace-00.txt
To: Robin MARX <robin.marx@uhasselt.be>
Cc: IETF QUIC WG <quic@ietf.org>, HTTP Working Group <ietf-http-wg@w3.org>, Jana Iyengar <jri.ietf@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/V255DD8M0mykzkPJdUkUM5xyPkU>

Hello Robin,

Thank you for your comments. My responses are inline.

On Sat, 14 Aug 2021 at 1:28, Robin MARX <robin.marx@uhasselt.be> wrote:

> Hello Kazuho,
>
> Thanks a lot for writing this up and sharing it.
> This concept has been part of the qlog docs since the very start [1], but
> has simultaneously been something that no-one has implemented yet, so it's
> good to see a concurrent proposal and an actual POC that seems to work
> quite nicely!
> I think this capability will be core to allow people to diagnose QUIC in
> the wild, as we've seen just this week with some of the problems Wix had to
> properly identify H3 vs H2 gains on their Fastly deployment [2].
>
> A couple of questions/remarks:
>
> 1) The trace stream currently starts from when the request for the
> well-known URL is received, thus missing all events before that (e.g.,
> handshake details).
>     I understand this is a tradeoff (you don't want to keep full traces
> for all connections just in case they are requested), but I feel we might
> want a way to indicate at the very start (e.g., transport parameter) that a
> trace will be requested so we can request that info as well?
>

That's a good point.

While our PoC starts collecting the trace at the moment the server
receives the request for the well-known URI, I do not think we are tied to
doing that. I also think that we probably do not need a transport parameter
(TP).

A server can allocate a fixed-size buffer for each connection, retaining
the first N events that occur on that connection. When it receives a
request for the well-known URI, it can first send the events recorded so
far and then the events that follow. Recording a few events during the
startup of a connection is probably alright, as the cost of logging would
be negligible compared to that of the TLS handshake.
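
As a rough illustration of the buffering idea above, here is a minimal
sketch in Python. It is not taken from the h2o PoC; the class name
ConnectionTrace, its methods, the capacity value and the event names are
all hypothetical and chosen only for this example.

```python
from typing import Callable, List, Optional


class ConnectionTrace:
    """Keeps the first `capacity` events of a connection, then streams live ones.

    Hypothetical helper for illustration; not part of h2o or the draft.
    """

    def __init__(self, capacity: int = 64) -> None:
        self.capacity = capacity          # N: assumed per-connection limit
        self.startup: List[str] = []      # first N events, kept until requested
        self.sink: Optional[Callable[[str], None]] = None  # set once a trace is requested

    def record(self, event: str) -> None:
        if self.sink is not None:
            self.sink(event)              # a trace request is active: stream directly
        elif len(self.startup) < self.capacity:
            self.startup.append(event)    # still within the startup budget
        # otherwise the event is dropped: no trace was requested and the buffer is full

    def attach(self, sink: Callable[[str], None]) -> None:
        """Called when the request for the well-known URI arrives."""
        for event in self.startup:        # replay the retained startup events first
            sink(event)
        self.startup = []
        self.sink = sink                  # later events go straight to the response


trace = ConnectionTrace(capacity=3)
for e in ["initial_received", "handshake_sent", "handshake_done", "stream_opened"]:
    trace.record(e)                       # the fourth event is dropped (buffer full)

sent: List[str] = []
trace.attach(sent.append)                 # the client requests the trace
trace.record("ack_received")              # streamed live from now on
print(sent)
```

The per-connection cost is bounded by the capacity, and events that occur
between the buffer filling up and the trace request are simply lost, which
is the tradeoff this approach accepts.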


> 2) There are downsides to loading the trace over the same connection, some
> of which you note in the text and in the email (overhead for the "real"
> connection). The qlog text allows fetching of traces of connection A over
> connection B by CID, but that of course has other tradeoffs.
>     I feel your method is probably best, IF we can find a solution for
> point 1). e.g., if you don't want to impact real connection, you request
> trace at start, let the connection run, and then fetch full trace at the
> end instead of streaming during.
>     This does introduce some extra (resource exhaustion) attack vectors
> that need mitigations etc.
>
> 3) Relatedly, your POC seems to assume the browser will have just a single
> connection and a trace request in a second tab will auto map to that
> connection.
>     That works fine for the POC, but for a real deployment that lets
> end-users fetch traces, you'd need built-in browser/client support to
> select a specific connection / fetch traces for all connections to a given
> origin.
>      Not a big problem ofc, and things like Chrome's netlog export already
> do this, but still a practical hurdle.
>

Right. I would hope that it would be possible to implement this as a
browser extension at least (with the assumption being that requests from a
browser extension would be coalesced with other requests going to the same
authority).


> 4) Any reason in particular you're not streaming qlog? I assume it's
> because you don't log qlog directly but instead use a converter and that
> converter adds too much overhead to do on the fly?
>     Not that it really matters or that I feel we should limit to qlog, but
> it does bring up the question of how an automated client setup (e.g., via
> WebPageTest-alike tooling) would identify what the server sends back.
>     You of course already know this because you made the qlog issue for it
> [3], but good to bring it up on the list as well.
>

I would argue that tracing a program is different from producing an
interchange format used for analysing transport issues. What we emit
is the trace of h2o, which *can* be converted to qlog.

To give an example, we might have a call graph of functions like this:

quic::on_quic_ack // processing of a QUIC ACK frame
  -> quic::on_quic_ack_one_pn // processing of a particular packet number being acked
    -> h3::on_buffer_shift // some bytes are removed from the send buffer
      -> proxy::on_upstream_unblock // the proxy is unblocked from reading more data from upstream
        -> proxy::read_data // proxy reads data from upstream, queued in the receive buffer
          -> h3::notify_data_ready // h3 layer is notified that there is more data to be sent
            -> quic::notify_data_ready // QUIC stack is notified that there is more data to be sent

For the purpose of analysis, we want to emit traces that preserve these
kinds of call graphs. In other words, we want to log events in the order
in which they happen, regardless of the layer in which they happen.
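
To make the idea of a single ordered log across layers concrete, here is a
small Python sketch. It is only an illustration and does not reflect how
h2o actually emits traces: the traced() helper and the global event list
are hypothetical, and only the event names are borrowed from the call
graph above.

```python
from contextlib import contextmanager
from typing import Iterator, List, Tuple

# (nesting depth, event name), appended in order of occurrence across all layers
events: List[Tuple[int, str]] = []
_depth = 0


@contextmanager
def traced(name: str) -> Iterator[None]:
    """Record `name` on entry and track nesting so the call graph is preserved."""
    global _depth
    events.append((_depth, name))
    _depth += 1
    try:
        yield
    finally:
        _depth -= 1


def on_quic_ack(packet_numbers: List[int]) -> None:
    # QUIC-, H3- and proxy-level events all land in the same ordered log.
    with traced("quic::on_quic_ack"):
        for pn in packet_numbers:
            with traced(f"quic::on_quic_ack_one_pn pn={pn}"):
                with traced("h3::on_buffer_shift"):
                    with traced("proxy::on_upstream_unblock"):
                        pass


on_quic_ack([7, 8])
for depth, name in events:
    print("  " * depth + name)
```

Because every layer appends to the same list, the H3 and proxy events show
up once per acknowledged packet number, nested under the ACK processing
that caused them.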

In this example, H3- and proxy-level events can happen for each PN being
ACKed. However, I do not recall if it was possible with qlog to emit an H3
event while processing an ACK frame.

I could well be wrong about how we could use qlog, but regardless, the
broader point is that we do not want our tracing capabilities to be
constrained by the limits of qlog. Emitting traces in our own format
preserves the most information with minimal effort. If necessary, we can
post-process the traces into qlog format to use the tools developed by the
community.


> 5) I wonder if this should be a completely separate document, a separate
> document part of the qlog effort, or part of the qlog documents.
>     As said in 4), I don't feel this necessarily should be limited to
> qlog, but a lot of the privacy issues+mitigations inherent to exposing logs
> will be discussed for qlog and should probably be referenced for this
> approach as well.
>     Currently, you seem to skirt some of this by saying it doesn't matter
> because the client is the one requesting the logs, but I don't quite agree
> that's enough.
>     This could be used to ask end-users to capture a trace of a
> problematic connection and upload it for analysis. If the end-user isn't
> very technical, they might not know which info they're exposing. Even if
> they are, they probably don't want to go through the trouble of sanitizing
> the logs themselves.
>     Put differently: this should probably either be restricted to expose
> no privacy-sensitive info at all (or at least discuss the issues) or allow
> explicit selection of a "privacy level" (the approach we'll probably take
> with qlog is to define multiple levels of obfuscation/omission for
> different use cases).
>
> I am very excited by the proposal and would love for some large
> deployments to offer this service.
> I feel that wouldn't just be revolutionary to many academic efforts, but
> also enable better client-aided debugging and to allow users to assess
> bottlenecks in their setups.
>
> With best regards,
> Robin
>
> [1]:
> https://datatracker.ietf.org/doc/html/draft-ietf-quic-qlog-main-schema-00#section-7.2
> [2]: https://twitter.com/alonkochba/status/1424403252284694528?s=20
> [3]: https://github.com/quicwg/qlog/issues/158
>
>
> On Fri, 13 Aug 2021 at 08:15, Kazuho Oku <kazuhooku@gmail.com> wrote:
>
>> Hello folks,
>>
>> Today Jana and I have submitted a tiny I-D called
>> draft-kazuho-httpbis-selftrace.
>>
>> The draft specifies a well-known URI to be used for providing a trace of
>> a particular HTTP/3 connection (e.g., qlog) on that same HTTP/3 connection.
>>
>> One of the biggest hurdles in analyzing HTTP/3 performance issues is
>> obtaining traces that show the symptoms. That is because clients being
>> affected by issues have to coordinate with the server operators to collect
>> the traces.
>>
>> This draft solves the problem by defining a well-known URI for serving a
>> trace to the client on the HTTP connection that the client is using. When a
>> user sees an issue, they can collect the traces themselves and provide it
>> to the server operator.
>>
>> We have already implemented the feature in h2o, and doing so was easy,
>> assuming that the underlying QUIC stack already defines callbacks for
>> collecting trace events, see lib/handler/self_trace.c of
>> https://github.com/h2o/h2o/pull/2765.
>>
>> We also have a public endpoint; to try it out, first open
>> https://ora1.kazuhooku.com/test/self-trace/video-only.html (which starts
>> streaming a video), then open
>> https://ora1.kazuhooku.com/.well-known/self-trace. While the video is
>> being served, you would see the trace flowing through the well-known URI.
>>
>> At the moment, we are using a custom JSON format for the trace, but when
>> gzip compression is applied on-the-fly, the overhead of sending a trace
>> alongside ordinary HTTP responses is less than 10%. Therefore, we tend to
>> believe that this approach would work well in practice.
>>
>> Please let us know what you think - your feedback is very welcome.
>>
>> ---------- Forwarded message ---------
>> From: <internet-drafts@ietf.org>
>> Date: Fri, 13 Aug 2021 14:53
>> Subject: New Version Notification for
>> draft-kazuho-httpbis-selftrace-00.txt
>> To: Jana Iyengar <jri.ietf@gmail.com>, Kazuho Oku <kazuhooku@gmail.com>
>>
>>
>>
>> A new version of I-D, draft-kazuho-httpbis-selftrace-00.txt
>> has been successfully submitted by Kazuho Oku and posted to the
>> IETF repository.
>>
>> Name:           draft-kazuho-httpbis-selftrace
>> Revision:       00
>> Title:          Self-Tracing for HTTP
>> Document date:  2021-08-13
>> Group:          Individual Submission
>> Pages:          5
>> URL:
>> https://www.ietf.org/archive/id/draft-kazuho-httpbis-selftrace-00.txt
>> Status:
>> https://datatracker.ietf.org/doc/draft-kazuho-httpbis-selftrace/
>> Htmlized:
>> https://datatracker.ietf.org/doc/html/draft-kazuho-httpbis-selftrace
>>
>>
>> Abstract:
>>    This document registers a "Well-Known URI" for exposing state of an
>>    HTTP connection to the peer using formats such as qlog schema [QLOG].
>>
>>
>>
>>
>> The IETF Secretariat
>>
>>
>>
>>
>> --
>> Kazuho Oku
>>
>
>
> --
>
> dr. Robin Marx
> Postdoc researcher - Web protocols
> Expertise centre for Digital Media
>
> *Cellphone *+32(0)497 72 86 94
>
> www.uhasselt.be
> Universiteit Hasselt - Campus Diepenbeek
> Agoralaan Gebouw D - B-3590 Diepenbeek
> Kantoor EDM-2.05
>
>
>

-- 
Kazuho Oku