[Driu] Query distribution vs. query routing
Ted Hardie <ted.ietf@gmail.com> Thu, 19 July 2018 22:57 UTC
From: Ted Hardie <ted.ietf@gmail.com>
Date: Thu, 19 Jul 2018 18:57:02 -0400
Message-ID: <CA+9kkMB+yODJARiFqyestCRtX6Od-Ryz7qkjhWmn5ToqYBOLgA@mail.gmail.com>
To: driu@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/driu/mmKBircD2UfqZJXPB9XkYHKgVEQ>
Subject: [Driu] Query distribution vs. query routing
Howdy,

The mic line interaction and related jabber traffic around Mark's draft were interesting, and they nicely illustrated just how confused I can be. Sorry for sharing that experience with the group.

Where I think it's useful to dive in a little further is in the distinction between query distribution and query routing.

Let's start by drawing a parallel: a device on a LAN with a single router has a default next hop destination; that router may also have a default next hop destination. In both cases, all traffic leaving one device will go through the other (LAN device to LAN router; LAN router to default upstream destination). In the router case, it's pretty common for there to be two different upstream possibilities, with the router choosing between them for packets as they arrive from the device. That choice may be a simple algorithm that load balances between them, like ECMP, or it may be made by consulting a mapping that tells it which upstream is a better route to a specific destination network (that mapping being derived from data distributed by some routing protocol).

At a hideously gross approximation, a host talking to a recursive resolver is in a parallel situation. It can have a single upstream resolver it talks to, or it can have more than one. If it has more than one, it may direct queries to them as a load balancing function, or as a function of the names one serves, usually because one is configured to serve the local names in a split DNS situation. That configuration is the simple routing information in the parallel here.

What I heard in the jabber room as explanatory text was that this proposal suggests allowing the default upstream server to name other default upstream servers, which could share the load of the query stream. There would be benefits to the upstreams in load-shedding and benefits to the device in reducing the information revealed to any particular upstream.
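
To make the two client behaviours described above concrete, here is a minimal sketch in Python. It is only an illustration, not anything taken from the draft: the resolver names, the suffix table, and the function names are all hypothetical. Query distribution ignores the query name and simply rotates across the default upstreams; query routing consults a split-DNS style table and falls back to a default.

```python
from itertools import cycle

# Hypothetical resolver names, for illustration only.
DEFAULT_UPSTREAMS = ["resolver-a.example", "resolver-b.example"]

# Split-DNS style "routing table": name suffix -> resolver that serves it.
SUFFIX_ROUTES = {"corp.example": "internal-resolver.example"}


def make_distributor(upstreams):
    """Query distribution: rotate through upstreams, ignoring the name."""
    rotation = cycle(upstreams)
    return lambda qname: next(rotation)


def route_query(qname, routes, default):
    """Query routing: pick the resolver with the longest matching suffix."""
    labels = qname.rstrip(".").lower().split(".")
    # Try the full name first, then progressively shorter suffixes.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in routes:
            return routes[suffix]
    return default


if __name__ == "__main__":
    distribute = make_distributor(DEFAULT_UPSTREAMS)
    # Same name, different upstream each time: pure load sharing.
    print(distribute("www.example.com"), distribute("www.example.com"))
    # The name decides the upstream: routing.
    print(route_query("wiki.corp.example", SUFFIX_ROUTES, DEFAULT_UPSTREAMS[0]))
    print(route_query("www.example.com", SUFFIX_ROUTES, DEFAULT_UPSTREAMS[0]))
```

In the first function the name being queried never influences the choice of upstream; in the second it is the only thing that does. That is the distinction the rest of this note leans on.
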
I have some serious concerns about whether the default upstream servers it names could be scoped to avoid this becoming a DoS vector, but even putting that aside, I think the incentive is poor. The local device already has a known good upstream with an established TCP/TLS/HTTP session going, and in a lot of cases it's going to want to avoid the latency of setting up new sessions just for load balancing.

What I originally thought was in the proposal was closer to distributing routing information. There, the server isn't distributing new default servers; it is distributing information about servers that are suited to answering queries about specific names. The choice to use them, in other words, wouldn't be based on load balancing, but on query routing: sending a query to the best available query responder from a latency or authority perspective. There's a whole bunch of work around securing routing information that we'd have to import if we went down this path. Down this path split DNS becomes fragmented DNS, in which you may find some answers are "reachable" and some are not.

I can also read the document to be recommending a hybrid mode, in which every DOH responder agrees to be a default query path but still tells you which names it is well suited to resolve (giving you a situation like that in which a router gets multiple routing announcements that include both specific networks and a default route). There are a bunch of interesting possibilities there, both in optimizations and attacks. Ultimately, though, the incentive works only when this actually results in quicker connection setups and when both the DOH client and DOH responder care a great deal about that timing.

What I think that turns out to mean, and this is from a very dusty crystal ball, is that a DOH responder will have to maintain a very robust DNS recursive resolver infrastructure if it agrees to be a default query path in order to attract query traffic bound for its network.
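
As a rough sketch of how a client might act in the hybrid mode described above, the Python below assumes a most-specific-suffix-wins rule (by analogy with longest-prefix match in IP routing) and falls back to the fastest responder offering a default path. The `Announcement` structure, its fields, and the selection rule are all assumptions made for illustration; the draft under discussion may define something quite different.

```python
from dataclasses import dataclass


@dataclass
class Announcement:
    """What one hypothetical DOH responder advertises to the client."""
    responder: str
    suffixes: tuple     # names it claims to be well suited to resolve
    is_default: bool    # whether it also offers to be a default query path
    latency_ms: float   # measured by the client, not advertised


def match_length(qname, suffix):
    """Label count of suffix if it matches qname on a label boundary, else 0."""
    q = qname.rstrip(".").lower().split(".")
    s = suffix.rstrip(".").lower().split(".")
    return len(s) if q[-len(s):] == s else 0


def select_responder(qname, announcements):
    """Prefer the most specific suffix match; otherwise the fastest default."""
    best, best_len = None, 0
    for ann in announcements:
        for suffix in ann.suffixes:
            n = match_length(qname, suffix)
            if n > best_len:
                best, best_len = ann, n
    if best is not None:
        return best.responder
    defaults = [a for a in announcements if a.is_default]
    if not defaults:
        return None
    return min(defaults, key=lambda a: a.latency_ms).responder
```

Under these assumptions a responder receives traffic in only two ways: by announcing more specific names than anyone else, or by being the fastest default.
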
Robust infrastructure of that kind means a big cache, good connections to other DNS services, the lot. If the responder doesn't have it, its responses to "default upstream" queries will be slower than those from other potential services, and a DNS client will eventually cease using it (presuming the client is still testing optimization by query response time). If it still gets any queries, it will be along the "fragmented DNS" query model, where it gets only queries where the latency of its answers for specific networks is very important.

I think those later in the line who were pointing out the risks of consolidation were right, in other words, and maybe didn't even go far enough. I think the end game of this model is that the user has no control over where the queries go and the heuristic system underneath them ends up sending them to the site willing to offer the highest number of names (more specific routes) and the biggest DNS query infrastructure. That's going to land everything behind a few CDNs, unless I miss my guess.

I'm sure I've made several additional howlers in this note, for which I apologize early,

Ted