Re: [core] HREF compression encoding
Carsten Bormann <cabo@tzi.org> Wed, 06 May 2020 12:19 UTC
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <00ba01d618d2$bc689e60$3539db20$@augustcellars.com>
Date: Wed, 06 May 2020 14:19:06 +0200
Cc: "core@ietf.org WG" <core@ietf.org>
Message-Id: <C2189FF3-2DD2-41E3-9719-789A982E0405@tzi.org>
References: <00ba01d618d2$bc689e60$3539db20$@augustcellars.com>
To: Jim Schaad <ietf@augustcellars.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/B1O1VezNNnUMAkfneNUjXN988f4>
Subject: Re: [core] HREF compression encoding
Hi Jim,

This is an interesting proposal. Klaus and I have just had a chat about it, which led to the following straw man grammar:

    CRI-Reference = [
      (?scheme, ?(host, ?port) // path.type),
      *path,
      ? (QUERY, +query)
      ? (FRAGMENT, fragment)
    ]

    scheme = ((SCHEMETEXT, text .regexp "[a-z][a-z0-9+.-]*")
              // COAP // COAPS // HTTP // HTTPS)
    host = ((HOSTTEXT, text) // bytes .size 4 // bytes .size 16)
    port = 0..65535
    path.type = 0..127
    path = text
    query = text
    fragment = text

    SCHEMETEXT = -1
    COAP = -2
    COAPS = -3
    HTTP = -4
    HTTPS = -5
    HOSTTEXT = true
    QUERY = -6
    FRAGMENT = -7

Obviously, the specific marker values we use could be shuffled a bit, but it seems we can cover the entire spectrum that is covered by the existing syntax.

There are a few things that need to be decided based on perceived frequency of use (e.g., are text-valued host names the likely or the unlikely case? Depending on that, the marker is placed on the host text or on the path sequence, or the absent host could be represented by `false`), so we should come up with some rough estimates here.

Comments welcome...

Grüße, Carsten

> On 2020-04-22, at 20:20, Jim Schaad <ietf@augustcellars.com> wrote:
>
> I had a slightly different proposal to what Klaus presented at the last interim in terms of doing href compression. It is based on the fact that URIs have a relatively fixed pattern and keeps the CBOR coding directly rather than moving to some type of binary encoding.
>
> The standard pattern for a URI is scheme://hostname/path/path/… Using this for compression purposes by removing all of the tagging which keeps to the same pattern you would compress coap://example.com/foo/abc…xyz to
>
> [ “coap”, “example.com”, “foo”, “abc…xyz”]
> 1 1 1 1 2 = 6 bytes of padding
>
> This is the same amount of padding as his binary compression method. There is a slight loss over his method when you want to do port numbers, queries or fragments as they would need to have a integer tag inserted so you get
>
> [ “coap”, “example.com”, “foo”, “bar”, Query, “a=b”, “c=d”, Fragment, “gohere”]
> 1 1 1 1 1 1 1 1 1 1 = 10 bytes
>
> In the binary encoding this would only require 9 bytes
>
> Moving to an IP address adds no additional padding as the difference between a text string, a byte string of length either 4 or 8 can easily be detected. Relative URIs are encoded using similar tagging so you end up with
>
> [ Absolute, “foo”, “bar” ]
> 1 1 1 1 = 4 byte
>
> [ Relative, 2, “foo”, “bar” ]
> 1 1 1 1 1 = 5 bytes
>
> Using the binary mode these would be 4 bytes and 4 bytes respectively (I think as no examples are in the slides)
>
> I believe that the advantage of this proposal is that there is no new encoder/decoder needed as this is pure CBOR. The compressed outputs are of similar lengths as the binary version and processing them I believe will result in near identical code sizes. The code to do absolute and relative processing as well as generating CBOR options is going to be very similar.
>
> Jim
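To get a feel for the "rough estimates" asked for above, the straw man grammar can be exercised with a small script. What follows is only a sketch under stated assumptions, not part of either proposal: the helper names `cri_reference` and `cbor_encode` are hypothetical, only the `scheme`/`host`/`port` branch is covered (the `path.type` alternative for relative references is left out), and the marker values are copied from the straw man and may well be shuffled. The CBOR encoder is a minimal hand-written one that handles just the item types the grammar uses.

```python
import ipaddress

# Marker values copied from the straw man grammar (subject to change).
SCHEMETEXT, COAP, COAPS, HTTP, HTTPS = -1, -2, -3, -4, -5
QUERY, FRAGMENT = -6, -7
HOSTTEXT = True

KNOWN_SCHEMES = {"coap": COAP, "coaps": COAPS, "http": HTTP, "https": HTTPS}


def cri_reference(scheme=None, host=None, port=None,
                  path=(), query=(), fragment=None):
    """Build the array form of a CRI-Reference (hypothetical helper).

    Only the (?scheme, ?(host, ?port)) branch is handled; the
    path.type branch for relative references is omitted.
    """
    items = []
    if scheme is not None:
        if scheme in KNOWN_SCHEMES:
            items.append(KNOWN_SCHEMES[scheme])
        else:
            items += [SCHEMETEXT, scheme]
    if host is not None:
        try:
            # IP literals become 4-byte or 16-byte byte strings.
            items.append(ipaddress.ip_address(host).packed)
        except ValueError:
            # Anything else is a text host name behind the HOSTTEXT marker.
            items += [HOSTTEXT, host]
        if port is not None:
            items.append(port)
    items += list(path)
    if query:
        items += [QUERY, *query]
    if fragment is not None:
        items += [FRAGMENT, fragment]
    return items


def _head(major, n):
    """Encode a CBOR initial byte plus argument for major type `major`."""
    if n < 24:
        return bytes([major << 5 | n])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([major << 5 | info]) + n.to_bytes(size, "big")
    raise ValueError("argument too large")


def cbor_encode(item):
    """Minimal CBOR encoder for bool, int, bytes, str and list only."""
    if item is True:
        return b"\xf5"
    if item is False:
        return b"\xf4"
    if isinstance(item, int):
        return _head(0, item) if item >= 0 else _head(1, -1 - item)
    if isinstance(item, bytes):
        return _head(2, len(item)) + item
    if isinstance(item, str):
        raw = item.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(item, list):
        return _head(4, len(item)) + b"".join(cbor_encode(x) for x in item)
    raise TypeError(f"unsupported item: {item!r}")


if __name__ == "__main__":
    cri = cri_reference("coap", "example.com", path=["foo", "bar"])
    encoded = cbor_encode(cri)
    print(cri, len(encoded), encoded.hex())
```

With these marker assignments, coap://example.com/foo/bar becomes a five-element array that encodes to 23 bytes, of which 17 are the characters of the host name and path segments. Swapping the marker from the host text to the path sequence, as discussed above, would change `cri_reference` only; the encoder stays the same, which makes it easy to compare the alternatives on a sample of real URIs.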
- [core] HREF compression encoding Jim Schaad
- Re: [core] HREF compression encoding Carsten Bormann
- Re: [core] HREF compression encoding Thomas Fossati
- Re: [core] HREF compression encoding Jim Schaad
- Re: [core] HREF compression encoding Christian Amsüss
- Re: [core] HREF compression encoding Jim Schaad