Re: [dmarc-ietf] Doing a tree walk rather than PSL lookup

Alessandro Vesely <vesely@tana.it> Tue, 24 November 2020 18:48 UTC

To: "Murray S. Kucherawy" <superuser@gmail.com>
Cc: IETF DMARC WG <dmarc@ietf.org>
References: <20201123213846.EB14127C8160@ary.qy> <efa0117e-5b17-800d-820d-b5d2413c6075@tana.it> <CAL0qLwZru7q_YxJLj1wXjaLckeajQ0BE4kL6FTqjrPtj=V0Auw@mail.gmail.com>
From: Alessandro Vesely <vesely@tana.it>
Message-ID: <7ab9e796-0385-5206-d202-97522802393b@tana.it>
Date: Tue, 24 Nov 2020 19:47:55 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/2OtZgH0mV6xIAcftd_O4bJ0PuUo>
Subject: Re: [dmarc-ietf] Doing a tree walk rather than PSL lookup

On Tue 24/Nov/2020 17:50:20 +0100 Murray S. Kucherawy wrote:
> On Tue, Nov 24, 2020 at 4:20 AM Alessandro Vesely <vesely@tana.it> wrote:
> 
>>> If I'm going to go to the effort to download and decode a PSL and find
>>> the OD, I'll just use the OD.
>>>
>>> One of the points of the tree walk is to get rid of the PSL processing.
>>
>> The PSL processing is a local lookup on an in-memory suffix tree.  How is
>> it a progress to replace it with a tree walk?  A PSL search is lightning
>> faster than even a single DNS lookup, isn't it?
>>
> 
> Sure, but only if you think the PSL is accurate.  Otherwise you're basing
> your shortcut up the tree on data you don't have reason to trust.  On the
> other hand, a tree walk, while more expensive in terms of queries, isn't a
> heuristic based on possibly stale information.


The PSL is the result of a community-maintained effort.  Its maintainers do not
follow the intricate naming restrictions that ccTLDs might theorize; they
actively track subdomains as they become visible or get noticed.  It is
remarkably good.
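
To make the local lookup concrete, here is a minimal sketch of deriving the
Organizational Domain from PSL-style rules.  It is only an illustration under
stated assumptions: SAMPLE_RULES is a tiny hand-picked set, not the real list;
the function names are hypothetical; and a plain set of strings stands in for
the suffix tree a real implementation would likely use.

```python
# Hypothetical, hand-picked rules for illustration only.
# The real PSL is far longer and changes over time.
SAMPLE_RULES = {"com", "it", "uk", "co.uk", "*.ck", "!www.ck"}


def public_suffix_length(labels, rules):
    """Return how many trailing labels of `labels` form the public suffix."""
    best = 1  # implicit "*" rule: the rightmost label alone is a suffix
    for i in range(len(labels)):
        candidate = labels[i:]
        name = ".".join(candidate)
        if "!" + name in rules:
            # Exception rule wins: the suffix is the rule minus its
            # leftmost label.
            return len(candidate) - 1
        wildcard = ".".join(["*"] + candidate[1:])
        if name in rules or wildcard in rules:
            # Among matching rules, the one with the most labels prevails.
            best = max(best, len(candidate))
    return best


def organizational_domain(domain, rules=SAMPLE_RULES):
    """Return the public suffix plus one label, or None if there is none."""
    labels = domain.lower().rstrip(".").split(".")
    k = public_suffix_length(labels, rules)
    if len(labels) <= k:
        return None  # the name is itself a public suffix
    return ".".join(labels[-(k + 1):])


if __name__ == "__main__":
    for name in ("wmail.tana.it", "mail.example.co.uk", "co.uk", "www.ck"):
        print(name, "->", organizational_domain(name))
```

Everything here runs in memory, with no DNS query at all; the accuracy of the
answer depends entirely on how fresh the rule set is.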

The reason one may end up using stale information is that updates are not
well organized.  Arguably, the PSL is not going to reach a stable state as long
as it is considered a sort of hack.

For one, the CA/Browser Forum took that stance:

On Feb 1, 2013, at 10:25 AM, Phillip wrote:
     The public suffix list is a hack. It should go away. There needs to be a
     mechanism for determining if a domain is a public suffix or not but that
     information should be distributed through the DNS and not through an ad hoc
     list that a third party is meant to be maintaining under ill-defined
     criteria and without the active participation of the TLD operators.
         https://archive.cabforum.org/pipermail/public/2013-February/001146.html

That stance is justified by Section 8.2 of RFC 6454.  However, their current 
Baseline Requirements state the following:

     Determination of what is “registry-controlled” versus the registerable
     portion of a Country Code Top-Level Domain Namespace is not standardized
     at the time of writing and is not a property of the DNS itself. Current
     best practice is to consult a “public suffix list” such as the Public
     Suffix List (PSL), and to retrieve a fresh copy regularly.
        https://cabforum.org/wp-content/uploads/CA-Browser-Forum-BR-1.7.3.pdf

And, notably, the URL Living Standard references the PSL plainly.  It calls
*registrable domain* what we call Organizational Domain.  See:
https://url.spec.whatwg.org/#host-public-suffix


Best
Ale
--