[jose] Using Jose as an encrypted document container.

Phillip Hallam-Baker <phill@hallambaker.com> Wed, 28 December 2016 21:33 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/jose/9Dh0zqua11erxghaWzzWKxUM88g>

I am currently looking at using Jose as the basis for a CMS-like document
container. This is to support the use of proxy re-encryption to enable true
end-to-end Web security, by which I mean that the content on the site is
itself encrypted.

To meet these needs I have to modify Jose to support the following
additional requirements, which the RFCs currently don't address.

1) Single pass processing for encoding and decoding operations.
1a) Bounded state
1b) Support streaming, i.e. length not known in advance

2) Reduce complexity and number of options.

3) Efficient encoding of bodies, i.e. no Base64 overhead.


On the single pass processing, the main change I have had to make is to add
in an unprotected header to announce the digest function to be used for the
signature calculation:

"unprotected": { "dig": "S512" }

This is straightforward but requires that a document that is signed with
multiple keys be signed using a single digest. So it might well be better
to have two entries for signatures, one at the start listing the set of
signing keys and then a second one at the end with the actual values.
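
To illustrate why announcing the digest up front allows single pass
processing with bounded state, here is a minimal sketch. It assumes that
"S512" means SHA-512 (the label is all I have shown above), and the
"digest" field in the trailer is a hypothetical placeholder for where the
signature values would go. Signing itself is left out.

```python
import hashlib
import json

# Assumption: the "dig" label "S512" is read as SHA-512.
DIGESTS = {"S512": "sha512"}


def stream_with_digest(chunks, dig="S512"):
    """Yield a header, then each body chunk, then a trailer with the digest.

    The body is hashed as it passes through, so the only state kept is the
    running hash, and the total length need not be known in advance.
    """
    running = hashlib.new(DIGESTS[dig])
    # The header announces the digest before any body data is emitted.
    yield json.dumps({"unprotected": {"dig": dig}}).encode("utf-8")
    for chunk in chunks:
        running.update(chunk)
        yield chunk
    # Hypothetical trailer: a real one would carry signatures over this value.
    yield json.dumps({"digest": running.hexdigest()}).encode("utf-8")
```

A reader does the mirror image: it sees "dig" first, starts the same hash,
and can check the trailer as soon as the last chunk has gone by.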


To reduce complexity, I am using UDF identifiers as the sole key identifier
mechanism. Using a single identifier for keys at every stage in a system
really simplifies everything. Using fingerprints of the public key
guarantees each key has one identifier.
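
As a rough illustration of the fingerprint idea only, the sketch below
derives one identifier from the bytes of a public key. It is not the UDF
encoding itself: the choice of SHA-512, Base32, the 25 character truncation
and the grouping in fives are all assumptions made for the example, and
key_fingerprint is a hypothetical helper name.

```python
import base64
import hashlib


def key_fingerprint(public_key_bytes: bytes, chars: int = 25) -> str:
    """Return a truncated Base32 digest of the key, grouped in fives.

    Illustrative only: the hash, alphabet and truncation are assumptions,
    not the UDF specification.
    """
    digest = hashlib.sha512(public_key_bytes).digest()
    text = base64.b32encode(digest).decode("ascii").rstrip("=")[:chars]
    return "-".join(text[i:i + 5] for i in range(0, len(text), 5))
```

The same key bytes always give the same string, so the identifier can be
used unchanged at every stage without a separate naming scheme.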

I am also removing as many options as possible. Having a 'simplified' way
to do something only makes the code more complex because now there are two
choices to encode rather than one.


The big problem I am having is with avoiding Base64 encoding of the message
body. There are two approaches I am considering.

1) Use JSON-B

The simplest way to avoid the Base64 overhead of JSON is to extend the JSON
encoding with code points for length-data encoding of binary data and
strings. This is all that JSON-B is. Regular JSON in UTF-8 only ever uses
bytes with values > 127 inside strings, so outside a string all of those
byte values are available as extension codes.
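
The claim about byte values can be checked with a small scanner. This is a
sketch under the assumption that the input is UTF-8 JSON text; the function
name is hypothetical, and the second input in the usage note below is just
an arbitrary high byte, not a real JSON-B tag.

```python
def high_bytes_only_in_strings(data: bytes) -> bool:
    """Return True if every byte > 127 in data lies inside a JSON string."""
    in_string = False
    escaped = False
    for b in data:
        if in_string:
            if escaped:
                escaped = False      # this byte is consumed by the escape
            elif b == 0x5C:          # backslash starts an escape
                escaped = True
            elif b == 0x22:          # unescaped quote closes the string
                in_string = False
        elif b == 0x22:              # quote opens a string
            in_string = True
        elif b > 127:                # high byte outside any string
            return False
    return True
```

Plain JSON such as '{"name": "café"}' encoded as UTF-8 passes, while input
such as b'[1, \x88\x03abc]' does not. A decoder can therefore treat a high
byte seen outside a string as the start of an extension item.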

Since JSON-B is a strict superset of JSON, it is only necessary to
implement one decoder that will accept either as input. This greatly
simplifies negotiation of encodings. It is only necessary to advertise the
encodings that are accepted; a sender does not need to tag content to
specify what was sent.

The document is encoded as an array entry:

[ { JSON-Header } , [ <length> <DataChunk> ] + , {JSON-Trailer} ]

Pro: Simple, easy to implement if you have the tools
Con: Everyone needs to implement the new encoder/decoder


2) Use JSON-Log format approach

The second approach is to split the document container into three parts and
use RFC7464 record separators to delimit the boundaries, thus:

<RS> JSON-Header <LF>
[<length> <DataChunk> ] +
<RS> JSON-Trailer<LF>

In this scheme the header and trailer blocks are mandatory and there must
be at least one entry in the body.

Pro: Can use unmodified JSON encoder/decoder
Con: Have to wrap the result in something that isn't at all like JSON.
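
A minimal sketch of this layout follows. The header and trailer are framed
as RFC 7464 records (RS, JSON text, LF). How <length> is written is not
settled above, so the sketch assumes an ASCII decimal length followed by LF;
that is an assumption for the example, as are the function names.

```python
import io
import json

RS = b"\x1e"  # RFC 7464 record separator
LF = b"\n"


def write_container(out, header, chunks, trailer):
    """Write RS header LF, one or more length-prefixed chunks, RS trailer LF."""
    out.write(RS + json.dumps(header).encode("utf-8") + LF)
    count = 0
    for chunk in chunks:
        # Assumed chunk framing: ASCII decimal length, LF, then the raw bytes.
        out.write(str(len(chunk)).encode("ascii") + LF + chunk)
        count += 1
    if count == 0:
        raise ValueError("body must contain at least one chunk")
    out.write(RS + json.dumps(trailer).encode("utf-8") + LF)


def read_container(inp):
    """Read a container written by write_container in a single pass."""
    if inp.read(1) != RS:
        raise ValueError("missing header record")
    header = json.loads(inp.readline())
    chunks = []
    while True:
        first = inp.read(1)
        if first == RS:              # the trailer record starts here
            break
        if not first:
            raise ValueError("truncated container")
        length = int((first + inp.readline()).decode("ascii"))
        chunks.append(inp.read(length))
    if not chunks:
        raise ValueError("body must contain at least one chunk")
    trailer = json.loads(inp.readline())
    return header, chunks, trailer
```

Because each chunk is length-prefixed, the body may contain RS and LF bytes
without escaping. The reader only looks for RS where a length would
otherwise begin, and a decimal digit can never be mistaken for RS.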


One issue that might tip the choice is the case of comments in a chat log,
or the like, where you might have a sequence of comments encrypted to a
common group key. This would enable true end-to-end encrypted chat and
other asynchronous messaging formats.