Re: [DNSOP] Fwd: New Version Notification for draft-muks-dnsop-dns-thundering-herd-00.txt

Mukund Sivaraman <muks@mukund.org> Fri, 26 June 2020 12:01 UTC

From: Mukund Sivaraman <muks@mukund.org>
To: Paul Wouters <paul@nohats.ca>
Cc: Cricket Liu <cricket@infobox.com>, dnsop@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/ahwb8fg9Rqa3-4Dg5LcoSHdPqSM>

Hi Paul

On Thu, Jun 25, 2020 at 02:29:03PM -0400, Paul Wouters wrote:
> On Thu, 25 Jun 2020, Mukund Sivaraman wrote:
> 
> > For whoever is interested, this is a description of a pattern of queries
> > noticed at busy public resolvers that has led to issues in at least 4
> > different sites in the last 2 months.
> > 
> > The current revision is a work in progress. We are still developing some
> > mitigations for NIOS, and some more introductory text also has to be
> > added.
> 
> I would add a more explicit section on using prefetching of frequently
> asked queries, which mitigates (eliminates) the period when an answer is not
> available in the cache.

The resolvers in question had prefetching enabled, and it didn't help
mitigate the problem. These are very busy public resolvers, and their
clients are for the most part static participants (in other words,
clients don't join and leave the group in large numbers within a TTL
interval). So the clients align themselves into herds at the expiry
time, and they show up in an instantaneous spike. The spikes were
observed to recur with a period equal to the TTL of the answer.
Prefetching is not triggered because there's often no lone client that
comes in a few seconds before the answer expires. The answer expires,
and the spike that follows causes trouble.
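
The alignment described above can be illustrated with a toy simulation.
This is only a sketch under stated assumptions, not code from the draft
or from any resolver: the function name simulate() and all of its
parameters are hypothetical, time advances in whole seconds, every
client caches the answer for the remaining TTL it was handed and
re-queries the moment that expires, and prefetching is modelled as a
refresh triggered by a cache hit that arrives within prefetch_window
seconds of expiry.

```python
def simulate(num_clients=1000, ttl=60, prefetch_window=5, duration=300,
             extra_query_times=()):
    """Toy model (hypothetical): return (spikes, prefetches).

    spikes is a list of (time, arrivals) for ticks where queries hit an
    expired cache entry; prefetches counts refreshes triggered by a hit
    that arrived inside the prefetch window.
    """
    cache_expiry = 0                    # resolver entry starts out expired
    client_expiry = [0] * num_clients   # every client needs the answer at t=0
    extra = set(extra_query_times)      # lone queries from non-caching clients
    spikes = []
    prefetches = 0

    for now in range(duration):
        # Herd members whose locally cached answer has run out.
        herd = [i for i, exp in enumerate(client_expiry) if exp <= now]
        arrivals = len(herd) + (1 if now in extra else 0)
        if arrivals == 0:
            continue

        if now >= cache_expiry:
            # Cache miss: everything in this tick lands on an expired entry.
            spikes.append((now, arrivals))
            cache_expiry = now + ttl
        elif cache_expiry - now <= prefetch_window:
            # Cache hit close to expiry: refresh before the entry runs out.
            prefetches += 1
            cache_expiry = now + ttl

        # Clients keep the answer until the resolver's entry expires.
        for i in herd:
            client_expiry[i] = cache_expiry

    return spikes, prefetches


if __name__ == "__main__":
    print(simulate())                          # herd only, no early client
    print(simulate(extra_query_times=[57]))    # one lone query before expiry
```

With the herd alone, every arrival coincides with the expiry, so no
query ever falls inside the prefetch window and the miss spikes recur
once per TTL. Adding a single lone query a few seconds before the first
expiry lets the model refresh the entry once, which is the case that
rarely occurs when the client population is static.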

Your suggestion to mention prefetching is good though and I will add
notes on what happens with prefetching.

		Mukund