[DNSOP] How Slack didn't turn on DNSSEC
John Levine <johnl@taugh.com> Tue, 30 November 2021 18:38 UTC
Date: Tue, 30 Nov 2021 13:38:07 -0500
Message-Id: <20211130183809.04E8230CA390@ary.qy>
From: John Levine <johnl@taugh.com>
To: dnsop@ietf.org
Organization: Taughannock Networks
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/RB4heseYoNGVI5UoX0rz5OIK07s>
Subject: [DNSOP] How Slack didn't turn on DNSSEC
This blog post has been making the rounds. Since it is about a sequence of DNS operational failures, it seems somewhat relevant here.

https://slack.engineering/what-happened-during-slacks-dnssec-rollout/

tl;dr

The first try was rolled back due to what turned out to be an unrelated failure at some ISP.

The second try was rolled back when they found they had a CNAME at a zone apex, which they had never noticed until it caused DNSSEC validation errors.

The third try was rolled back when they found random-looking failures that they eventually tracked down to bugs in Amazon's Route 53 DNS server. They had a wildcard with A but not AAAA records. When someone did an AAAA query, the response was wrong and said there were no records at all, not just no AAAA records. This caused failures for clients of 8.8.8.8, since Google does aggressive NSEC, but not for clients of 1.1.1.1, because Cloudflare doesn't.

They also got some bad advice. For example, yes, the .COM zone adds and deletes records very quickly, but that doesn't mean you can unpublish a DS and just turn off DNSSEC, because the DS record's TTL is a day. Their tooling somehow didn't let them republish the DNSKEY at the zone apex that matched the DS, only a new one that didn't.

It is clear from the blog post that this is a fairly sophisticated group of ops people, who had a reasonable test plan, a bunch of test points set up in dnsviz, and so forth. Neither of these bugs seems very exotic, and both could have been caught by routine tests.

Can or should we offer advice on how to do this better, sort of like RFC 8901 but one level of DNS expertise down?

R's,
John
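To make "routine tests" a little more concrete, here is a minimal sketch in Python of three checks that correspond to the failures above. Everything in it is an assumption for illustration: the in-memory ZONE table, the example.com names, and the function names are hypothetical, and the observed NSEC type list is presumed to have been collected separately (for instance from a query against the authoritative server). It is not what Slack or Route 53 actually run.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical zone data: owner name -> set of record types published there.
ZONE = {
    "example.com": {"SOA", "NS", "CNAME"},   # CNAME at the apex: not allowed
    "*.example.com": {"A"},                  # wildcard with A but no AAAA
    "www.example.com": {"A", "AAAA"},
}

def cname_conflicts(zone):
    """Names where a CNAME coexists with other data.

    Only the DNSSEC records RRSIG and NSEC may share a name with a CNAME.
    """
    allowed = {"CNAME", "RRSIG", "NSEC"}
    return sorted(name for name, types in zone.items()
                  if "CNAME" in types and types - allowed)

def nsec_bitmap_missing(zone, owner, observed_types):
    """Types that exist at owner but are absent from an observed NSEC type list.

    A non-empty result means a validating resolver that caches NSEC records
    aggressively could conclude those types do not exist.
    """
    return sorted(zone.get(owner, set()) - set(observed_types))

def safe_to_unsign_after(ds_removed_at, ds_ttl_seconds):
    """Earliest time cached copies of the DS can all have expired.

    Assumes the DS left the parent zone at ds_removed_at; any extra
    propagation delay on the parent side is not modelled.
    """
    return ds_removed_at + timedelta(seconds=ds_ttl_seconds)

if __name__ == "__main__":
    print(cname_conflicts(ZONE))
    # NSEC type list as it might look in the faulty AAAA response.
    print(nsec_bitmap_missing(ZONE, "*.example.com", {"RRSIG", "NSEC"}))
    removed = datetime(2021, 11, 30, 18, 0, tzinfo=timezone.utc)
    print(safe_to_unsign_after(removed, 86400))
```

The first check could run against zone data before signing. The second needs a real NODATA response from each authoritative server. The third is just a reminder that the wait is governed by the DS TTL, not by how fast the parent zone updates.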