Re: [DNSOP] Clarification question: compression pointers always to names earlier in the packet?

Shane Kerr <shane@time-travellers.org> Wed, 24 October 2018 20:25 UTC

Return-Path: <shane@time-travellers.org>
X-Original-To: dnsop@ietfa.amsl.com
Delivered-To: dnsop@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 75163130DFB for <dnsop@ietfa.amsl.com>; Wed, 24 Oct 2018 13:25:18 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.901
X-Spam-Level:
X-Spam-Status: No, score=-1.901 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Ei9HkrLmz54V for <dnsop@ietfa.amsl.com>; Wed, 24 Oct 2018 13:25:15 -0700 (PDT)
Received: from time-travellers.org (c.time-travellers.nl.eu.org [IPv6:2a02:2770::21a:4aff:fea3:eeaa]) (using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 54F7A128D68 for <dnsop@ietf.org>; Wed, 24 Oct 2018 13:25:15 -0700 (PDT)
Received: from [2001:470:78c8:2:74eb:e519:2d30:7b79] by time-travellers.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.80) (envelope-from <shane@time-travellers.org>) id 1gFPiU-0006gv-Dj for dnsop@ietf.org; Wed, 24 Oct 2018 20:25:22 +0000
To: dnsop@ietf.org
References: <BC2CDF40-4FF0-4111-88B7-04969491D2E0@dukhovni.org> <EC514300-F235-41DB-A413-2F9F8F8B04C8@sinodun.com>
From: Shane Kerr <shane@time-travellers.org>
Message-ID: <f3c89916-6622-c153-f662-9ac42b61d81c@time-travellers.org>
Date: Wed, 24 Oct 2018 22:25:12 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1
MIME-Version: 1.0
In-Reply-To: <EC514300-F235-41DB-A413-2F9F8F8B04C8@sinodun.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/d5y9UquPbw29J2dLbcBl7RmdXzM>
Subject: Re: [DNSOP] Clarification question: compression pointers always to names earlier in the packet?
X-BeenThere: dnsop@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: IETF DNSOP WG mailing list <dnsop.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dnsop>, <mailto:dnsop-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/dnsop/>
List-Post: <mailto:dnsop@ietf.org>
List-Help: <mailto:dnsop-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dnsop>, <mailto:dnsop-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 24 Oct 2018 20:25:18 -0000

John,

On 24/10/2018 15.38, John Dickinson wrote:
> On 24 Oct 2018, at 10:01, Viktor Dukhovni wrote:
> 
>> My reading of RFC 1035 is that DNS name "compression"
>> via "pointers" is restricted to name strictly earlier
>> in the DNS message:
>>
>>    4.1.4. Message compression
>>
>>    In order to reduce the size of messages, the domain system utilizes a
>>    compression scheme which eliminates the repetition of domain names 
>> in a
>>    message.  In this scheme, an entire domain name or a list of labels at
>>    the end of a domain name is replaced with a pointer to a prior 
>> occurance
>>                                                             
>> ---------------
>>    of the same name.
>>
> 
> Not strictly to do with loops but we noticed that not all nameservers 
> use the same compression algorithm. See section 9.1 and appendix B of 
> https://datatracker.ietf.org/doc/draft-ietf-dnsop-dns-capture-format

It's not too surprising. I am pretty sure that producing optimal name 
compression (by size) is the same as one of packing problems, which are 
either NP-complete or NP-hard depending on the particulars:

https://en.wikipedia.org/wiki/Packing_problems

I think that this is only true if you allow re-ordering of RRset as well 
as RR within a set, but still.

If you don't allow re-ordering then I think the overall complexity of 
compressing a message is the same as sorting, O(NlgN), which while a lot 
better is still not great, because you need to search through all prior 
labels to find the longest match when compressing each name.

I think it's due to these fundamentally expensive costs that different 
name servers use the various heuristics you mention in the C-DNS draft 
to try to get a good balance of speed & size.

Cheers,

--
Shane