[core] HREF compression encoding

Jim Schaad <ietf@augustcellars.com> Wed, 22 April 2020 18:20 UTC

From: Jim Schaad <ietf@augustcellars.com>
To: core@ietf.org
Date: Wed, 22 Apr 2020 11:20:41 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/X29PmWjKJuyokXy98_UeFcvTsWU>

I have a slightly different proposal for href compression from the one Klaus
presented at the last interim.  It is based on the fact that URIs have a
relatively fixed pattern, and it keeps plain CBOR encoding rather than
moving to some type of binary encoding.

The standard pattern for a URI is scheme://hostname/path/path/.  For a URI
that keeps to this pattern, all of the tagging can be removed, so
coap://example.com/foo/abc.xyz compresses to

[ "coap", "example.com", "foo", "abc.xyz"]
1 1       1              1      2           = 6 bytes of overhead
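A rough sketch of that split in Python, assuming only the simple
scheme://hostname/path/path pattern (no port, query, or fragment).  The
function name uri_to_array is made up for illustration, and dropping empty
path segments is a simplification of mine, not part of the proposal:

```python
from urllib.parse import urlsplit


def uri_to_array(uri):
    """Split scheme://hostname/path/path into [scheme, host, seg, seg, ...].

    Hypothetical helper: ports, queries and fragments are ignored here,
    and empty path segments (e.g. from a trailing slash) are dropped.
    """
    parts = urlsplit(uri)
    segments = [seg for seg in parts.path.split("/") if seg]
    return [parts.scheme, parts.hostname, *segments]


print(uri_to_array("coap://example.com/foo/abc.xyz"))
# ['coap', 'example.com', 'foo', 'abc.xyz']
```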

 

This is the same amount of overhead as Klaus's binary compression method.
There is a slight loss relative to his method when you want port numbers,
queries, or fragments, as each needs an integer tag inserted, so you get

[ "coap", "example.com", "foo", "bar", Query, "a=b", "c=d", Fragment, "gohere"]
1 1       1              1      1      1      1      1      1         1          = 10 bytes

 

In the binary encoding this would require only 9 bytes.
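The 10-byte tally for the array form can be checked with a small
hand-rolled encoder.  This is only a sketch: the integer values chosen for
Query and Fragment are placeholders of mine (no values are assigned here),
and the encoder covers just the CBOR head forms needed for short text
strings, small unsigned integers, and the outer array:

```python
QUERY, FRAGMENT = 0, 1  # hypothetical marker values, not assigned anywhere


def cbor_head(major, n):
    """Encode a CBOR head for major type `major` with argument `n` (< 65536)."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 256:
        return bytes([(major << 5) | 24, n])
    if n < 65536:
        return bytes([(major << 5) | 25]) + n.to_bytes(2, "big")
    raise ValueError("argument too large for this sketch")


def encode_href(items):
    """Encode a list of text strings and small unsigned ints as a CBOR array."""
    out = bytearray(cbor_head(4, len(items)))          # major type 4: array
    for item in items:
        if isinstance(item, int):
            out += cbor_head(0, item)                  # major type 0: unsigned int
        else:
            data = item.encode("utf-8")
            out += cbor_head(3, len(data)) + data      # major type 3: text string
    return bytes(out)


def overhead(items):
    """Bytes in the encoding beyond the text of the string elements."""
    text = sum(len(i.encode("utf-8")) for i in items if isinstance(i, str))
    return len(encode_href(items)) - text


href = ["coap", "example.com", "foo", "bar", QUERY, "a=b", "c=d", FRAGMENT, "gohere"]
print(overhead(href))  # 10
```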

 

Moving to an IP address adds no additional overhead, as the difference
between a text string and a byte string of length 4 (IPv4) or 16 (IPv6) can
easily be detected.

Relative URIs are encoded using similar tagging, so you end up with

[ Absolute, "foo", "bar" ]
1 1         1      1        = 4 bytes

[ Relative, 2, "foo", "bar" ]
1 1         1  1      1        = 5 bytes

In the binary mode these would be 4 bytes each (I think; there are no
examples in the slides).
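One possible reading of the two forms, sketched in Python.  Everything
here is an assumption made for illustration: the marker values, the
function name resolve, and the semantics (Absolute replaces the whole base
path; the integer after Relative is the number of trailing base path
segments to drop before appending the new ones):

```python
ABSOLUTE, RELATIVE = 2, 3  # hypothetical marker values


def resolve(base, ref):
    """Resolve a compressed reference against a full [scheme, host, seg...] base."""
    scheme, host, *base_path = base
    if ref[0] == ABSOLUTE:
        # Assumed meaning: keep scheme and host, replace the entire path.
        return [scheme, host, *ref[1:]]
    if ref[0] == RELATIVE:
        # Assumed meaning: drop ref[1] trailing segments, then append the rest.
        keep = max(0, len(base_path) - ref[1])
        return [scheme, host, *base_path[:keep], *ref[2:]]
    # Anything else is treated as an already complete array.
    return list(ref)


base = ["coap", "example.com", "a", "b", "c"]
print(resolve(base, [ABSOLUTE, "foo", "bar"]))
# ['coap', 'example.com', 'foo', 'bar']
print(resolve(base, [RELATIVE, 2, "foo", "bar"]))
# ['coap', 'example.com', 'a', 'foo', 'bar']
```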

 

I believe the advantage of this proposal is that no new encoder/decoder is
needed, as this is pure CBOR.  The compressed outputs are of similar length
to the binary version, and I believe processing them will result in
near-identical code sizes.  The code to do absolute and relative processing
as well as generating CBOR options is going to be very similar.
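To give a feel for the processing side, here is a sketch of turning an
already-decoded array back into a URI, including the text string versus
byte string check for the host.  It assumes a generic CBOR decoder has
produced a Python list, handles only the simple scheme/host/path pattern,
and uses function names invented for this example.  An IPv6 address is
taken to be a 16-byte string:

```python
import ipaddress


def host_to_text(host):
    """Render the host element: text is a name, 4 or 16 bytes is an IP address."""
    if isinstance(host, bytes):
        if len(host) not in (4, 16):
            raise ValueError("byte-string host must be 4 or 16 bytes long")
        addr = ipaddress.ip_address(host)  # accepts packed 4- or 16-byte input
        # IPv6 literals are bracketed inside a URI.
        return f"[{addr}]" if addr.version == 6 else str(addr)
    return host


def array_to_uri(items):
    """Rebuild scheme://host/path/path from [scheme, host, seg, seg, ...]."""
    scheme, host, *segments = items
    return f"{scheme}://{host_to_text(host)}/" + "/".join(segments)


print(array_to_uri(["coap", "example.com", "foo", "abc.xyz"]))
# coap://example.com/foo/abc.xyz
print(array_to_uri(["coap", bytes([192, 0, 2, 1]), "foo"]))
# coap://192.0.2.1/foo
```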

 

Jim