Re: [Doh] Clarification for a newbie DoH implementor

"Mark Delany" <d5e@xray.emu.st> Sun, 09 June 2019 08:37 UTC

Comments: QMDA 0.3a
Date: Sun, 09 Jun 2019 08:37:24 +0000
Message-ID: <20190609083724.23965.qmail@f3-external.bushwire.net>
From: Mark Delany <d5e@xray.emu.st>
To: doh@ietf.org
References: <20190418071238.68406.qmail@f3-external.bushwire.net> <20190518233815.44249.qmail@f3-external.bushwire.net> <CAHbrMsCMWtzHXZvpodak59RtAkSQC_ZM03oekKj00WqzNkDaaA@mail.gmail.com> <20190519055255.45717.qmail@f3-external.bushwire.net> <CAHbrMsC-1OQnoaYFE5BO8UzDsebo7jhJBfc9F4J-zgeA2FZm7Q@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
In-Reply-To: <CAHbrMsC-1OQnoaYFE5BO8UzDsebo7jhJBfc9F4J-zgeA2FZm7Q@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/doh/qoNp7jVOHE0hW_tFhPCwIzoNris>
Subject: Re: [Doh] Clarification for a newbie DoH implementor
Precedence: list

On 19May19, Ben Schwartz allegedly wrote:

> In practice, none of this seems to be a problem.  DoH servers simply do
> their best to fully answer the query

I know this is getting close to flogging a dead horse, but I'm still troubled by
the silent truncation implied by a DoH Server processing TC=1.

Setting aside clients that connect directly to DoH servers (such as some
browsers), the typical scenario is likely to be existing stub clients using UDP
to talk to a DoH proxy/client which in turn talks to a DoH server, i.e.:

 stub (UDP) ->        DoH proxy (HTTPS) -> DoH server (UDP) -> resolver
 stub (UDP) <-        DoH proxy (HTTPS) <- DoH server (UDP) <- resolver

(Sorry, you'll need to render this in a fixed-width font).

As I understand Ben's response, if the resolver returns TC=1 then a typical DoH
server (or its resolver library) will retry with TCP which makes the flow for a
large response actually look like:

 stub (UDP) ->        DoH proxy (HTTPS) -> DoH server (UDP) -> resolver
                                           DoH server (UDP) <- (TC=1) resolver
                                           DoH server (TCP) -> resolver
 stub (UDP) <- (TC=0) DoH proxy (HTTPS) <- DoH server (TCP) <- (TC=0) resolver

This certainly meets the "do their best to fully answer the query" goal, as the
stub never has to worry about TC=1 and simply gets the "full result".

However... It makes me wonder how a large response that the DoH server fetched
over TCP and sends back via HTTPS can possibly fit into the response the DoH
proxy sends back via UDP to the stub.

Let's say the DoH proxy receives a 5K response over HTTPS. Does it blindly try
to transmit this 5K response over UDP back to the stub and just hope for the
best? Or should it know this is likely to fail and act accordingly? If so, how
should it act?

I can't think of what a DoH proxy can sensibly do in such circumstances apart
from arbitrarily truncating the response down to the UDP size indicated by the
stub and *not* marking the response with TC=1. Thus the stub loses some of the
answer and, more importantly, loses knowledge of the truncation. That seems bad
to me.
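
To make the proxy's dilemma concrete, here is a rough Python sketch of the check
it faces before replying over UDP. The function names, and the idea that the
proxy already knows the stub's advertised UDP size, are assumptions for
illustration only. The 512-byte default and the position of the TC bit come
from the DNS message format.

```python
import struct

DEFAULT_UDP_LIMIT = 512  # limit when the stub sends no EDNS0 OPT record
TC_FLAG = 0x0200         # TC bit within the 16-bit DNS header flags field


def fits_in_udp(response: bytes, stub_udp_limit: int = DEFAULT_UDP_LIMIT) -> bool:
    """Return True if the DNS message can be sent to the stub in one datagram.

    stub_udp_limit is assumed to have been learned from the stub's query.
    """
    return len(response) <= stub_udp_limit


def is_truncated(message: bytes) -> bool:
    """Return True if the TC bit is set in the DNS header (bytes 2-3)."""
    (flags,) = struct.unpack_from("!H", message, 2)
    return bool(flags & TC_FLAG)
```

A 5K response fetched over HTTPS fails fits_in_udp() while is_truncated() is
False, which is exactly the case where the proxy has no good option today.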

As we've discussed previously, it's pointless returning TC=1 to the stub, as the
stub will simply re-issue an identical query as far as the DoH server is
concerned.

Note that I said "identical query as far as the DoH server is concerned". The
same is not the case for the DoH proxy, as the query re-issued from the stub
*can* be told apart from the original since the stub's retry reaches the proxy
via TCP.

Which offers a possible solution.

Since a proxy knows whether the inbound query has come via UDP or TCP it can
annotate the HTTPS request accordingly (let's invent a "Use-TCP"
header). The DoH server acts on this header thus the flow becomes:

 stub (UDP) ->        DoH proxy (HTTPS)           -> DoH server (UDP) -> resolver
 stub (UDP) <- (TC=1) DoH proxy (HTTPS)           <- DoH server (UDP) <- (TC=1) resolver
 stub (TCP) ->        DoH proxy (HTTPS - Use-TCP) -> DoH server (TCP) -> resolver
 stub (TCP) <- (TC=0) DoH proxy (HTTPS)           <- DoH server (TCP) <- (TC=0) resolver

In other words, TC=1 is sent all the way back to the stub for it to deal with.
If the stub re-queries with TCP, the proxy forwards the query with the "Use-TCP"
annotation to the DoH server, which in turn instructs its resolver library to
use TCP.
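
As a rough illustration of both ends of this idea, the Python sketch below shows
a proxy adding the annotation and a DoH server acting on it. "Use-TCP" is the
invented header from above, not a registered one. The header value "1", the
function names, and the "udp"/"tcp" strings are likewise assumptions. Only the
application/dns-message media type comes from the DoH specification.

```python
USE_TCP_HEADER = "Use-TCP"  # hypothetical header, invented in this thread


def build_doh_headers(stub_transport: str) -> dict:
    """Proxy side: build HTTPS request headers for a query from the stub.

    stub_transport is "udp" or "tcp", i.e. how the stub reached the proxy.
    """
    headers = {
        "Content-Type": "application/dns-message",
        "Accept": "application/dns-message",
    }
    if stub_transport == "tcp":
        # The stub has retried over TCP, so ask the server to do the same.
        headers[USE_TCP_HEADER] = "1"
    return headers


def upstream_transport(request_headers: dict) -> str:
    """Server side: pick the transport to use towards the resolver."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    names = {name.lower() for name in request_headers}
    return "tcp" if USE_TCP_HEADER.lower() in names else "udp"
```

With this, upstream_transport(build_doh_headers("udp")) selects UDP and the
same call with "tcp" selects TCP, matching the third and fourth lines of the
flow above.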

Not only does this relieve a DoH server of trying its best to "fully answer the
query" blind, it also avoids the impossibility of transmitting a large response
back to the stub over UDP. Most importantly, truncation is no longer performed
silently. Rather, it is communicated back to the stub, which can then decide
what to do, as has traditionally been the case.
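
One way a proxy might signal truncation on the UDP leg, rather than hide it, is
sketched below in Python. It assumes the proxy answers with only the header and
question section and sets TC=1. That choice, the function name, and dropping
any EDNS0 OPT record from the reply are simplifying assumptions, not something
specified anywhere in this thread.

```python
import struct

TC_FLAG = 0x0200  # TC bit within the 16-bit DNS header flags field


def truncated_reply(response: bytes) -> bytes:
    """Return header plus question section only, with TC=1 set."""
    ident, flags, qdcount = struct.unpack_from("!HHH", response, 0)
    offset = 12  # the fixed DNS header is 12 bytes long
    for _ in range(qdcount):
        # Skip the QNAME: a run of length-prefixed labels ending in a zero byte.
        while True:
            length = response[offset]
            if length == 0:
                offset += 1
                break
            if length & 0xC0 == 0xC0:
                # A compression pointer is two bytes and ends the name.
                offset += 2
                break
            offset += 1 + length
        offset += 4  # QTYPE and QCLASS, two bytes each
    # Rebuild the header with TC set and the three record counts zeroed.
    header = struct.pack("!HHHHHH", ident, flags | TC_FLAG, qdcount, 0, 0, 0)
    return header + response[12:offset]
```

The stub then sees TC=1 and can retry over TCP, at which point the proxy can
pass the full HTTPS response through unmodified.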

I think all stubs should work in this "Use-TCP" scenario whereas clearly they
cannot in the "DoH server transparently handles TC=1" scenario.

The one downside is that a TC=1 response incurs higher latency due to the flow
going all the way back to the stub for a retry. However, given that these TC=1
responses are uncommon, that seems like a minor price to pay for a consistent
outcome.

Mark.
