Re: [DNSOP] sentinel and timing?

Joe Abley <jabley@hopcount.ca> Thu, 08 February 2018 12:51 UTC

From: Joe Abley <jabley@hopcount.ca>
X-Mailer: iPad Mail (15D60)
In-Reply-To: <alpine.LRH.2.21.1802080059480.6658@bofh.nohats.ca>
Date: Thu, 8 Feb 2018 07:51:18 -0500
Cc: Robert Story <rstory@isi.edu>, dnsop <dnsop@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <7816D681-7A97-466C-A77F-7A0CC87C4F8F@hopcount.ca>
References: <alpine.LRH.2.21.1802071035280.6369@bofh.nohats.ca> <20180207215502.46daf6bc@titan.int.futz.org> <alpine.LRH.2.21.1802080059480.6658@bofh.nohats.ca>
To: Paul Wouters <paul@nohats.ca>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/4rJgd9ExjDJVlrH7HBIkJMC3xiE>
Subject: Re: [DNSOP] sentinel and timing?

Hi Paul,

(with apologies for breakfast/iPad MIME crime that surely follows)

> On Feb 8, 2018, at 01:02, Paul Wouters <paul@nohats.ca> wrote:
> 
>> On Wed, 7 Feb 2018, Robert Story wrote:
>> 
>>> On Wed 2018-02-07 10:43:16-0500 Paul wrote:
>>> How about using this query to also encode an
>>> uptime-processstartedtime value? Maybe with accuracy reduced to
>>> minutes. I think that would return valuable data.
>> 
>> -1 for feature creep and the technical reasons Joe mentioned.
> 
> We have a giant hole in our understanding of why there are updated
> nameservers running the latest software with the older keys. We
> need to gain understanding and we know we need more data.

I don't disagree with the need for more data, but I think the hole you mention is not so giant. As far as I can tell it's a result of:

1. RFC5011 support not being turned on in nameservers that have been upgraded but whose older, DNSSEC-validating configuration has been preserved across updates (most cases), and

2. RFC5011 support exercising a code path that requires a writable, persistent filesystem to store an updated trust anchor, which turns out not to be available (fewer, but some cases).
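
To make the two cases above concrete, here is a minimal Python sketch of the checks an operator might run against a BIND9 installation. It assumes that a static trust anchor shows up as a `trusted-keys` statement in named.conf, that RFC5011 tracking shows up as a `managed-keys` statement or `dnssec-validation auto;`, and that the updated trust anchor is written to some directory chosen by the operator. The function names are hypothetical, the text matching is deliberately naive (it does not follow include files or handle block comments), and a successful write test says nothing about whether the filesystem is persistent across restarts.

```python
import os
import re
import tempfile

# Hypothetical helper names for illustration; not part of BIND9 or any tool.


def classify_trust_anchor_config(conf_text):
    """Classify named.conf text as 'static', 'managed', 'both' or 'none'.

    'static'  -> only a trusted-keys statement (case 1: RFC5011 not in use)
    'managed' -> a managed-keys statement or dnssec-validation auto
    """
    static = re.search(r"^\s*trusted-keys\s*\{", conf_text, re.MULTILINE)
    managed = (
        re.search(r"^\s*managed-keys\s*\{", conf_text, re.MULTILINE)
        or re.search(r"dnssec-validation\s+auto\s*;", conf_text)
    )
    if static and managed:
        return "both"
    if static:
        return "static"
    if managed:
        return "managed"
    return "none"


def directory_is_writable(path):
    """Return True if a file can be created in path (case 2).

    This only proves the directory is writable right now; it cannot
    tell whether the contents survive a reboot or container restart.
    """
    try:
        with tempfile.NamedTemporaryFile(dir=path):
            return True
    except OSError:
        return False
```

A configuration classified as 'static' corresponds to the first case, and a 'managed' configuration whose key directory fails the write test corresponds to the second.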

These are both BIND9 problems, which I mean as a compliment since they are indicative of (a) widespread use, (b) early implementation of DNSSEC and (c) a high degree of backwards compatibility in configuration.

The larger question of whether RFC5011 is a practical or sufficient mechanism given this experience is a reasonable one. You may recall I have been a serial advocate for adding standardised bootstrap mechanisms that include fetching a trust anchor out-of-band, for example, which I still think would be a practical remedy even if a slightly inelegant one; unbound-anchor and its use in package and system start scripts is, I think, a key reason why the two problems described above don't show up in unbound.

My sense from the recent KSK rollover/RFC 8145 data collection experience is that the actual impact on end-users from validators dependent on the outgoing KSK is very small. This is hard to quantify with precision, however, because we are not able to measure the state of most resolvers (e.g. those not reporting via RFC 8145 or not validating), nor assess their operational impact (e.g. size of end-user population and impact of validation failures upon them) with any degree of accuracy.

I think that the sentinel approach of measuring end-user impact from the end-user perspective gets us much closer to useful data in general. However, it's not clear to me how even a trusted, accurate sense of uptime across all resolvers would help with those questions.


Joe