[OAUTH-WG] review: draft-ietf-oauth-json-web-token-05

=JeffH <Jeff.Hodges@KingsMountain.com> Tue, 27 November 2012 22:23 UTC


Hi, at IETF 85 in Atlanta I agreed to do a review of 
draft-ietf-oauth-json-web-token-05, and so I have some thoughts below. Also, +1 
to Hannes' comments.

Overall the spec seems to get its idea across, but is pretty rough. Part of this 
is due to the language being convoluted, plus some concepts are only tacitly 
described (with clues scattered throughout the spec), and thus it is difficult 
to understand without multiple passes of this spec as well as [JWE] and [JWS].

For example, a JWT appears to be simply either a JWS or a JWE containing a JWT 
Claims Set, but this is not stated until right before section 3.1 (and it isn't 
stated that clearly).

Immediately below are some overall comments, and then below that some detailed 
comments on various portions of the spec.  I'm not addressing everything I 
noticed due to time constraints (apologies).

HTH

=JeffH
------


JWT terminology:

Some issues seem to me to be caused by defining the JWT to be the base64url 
encoded JSON  object itself and not having terminology to clearly refer to its 
unencoded form.

For example, these two JSON objects together apparently comprise an (unencoded) JWT..

      {"typ":"JWT",
       "alg":"HS256"}

      {"iss":"joe",
       "exp":1300819380,
       "http://example.com/is_root":true}

..but there's no defined way to refer to them given the spec's terminology.

Consider terming the above a JWT and its encoded-string form an Encoded JWT, and 
define them separately. And then there'll be similar definitions for JWT Header 
and JWT Claims Set, e.g.,

    Encoded JWT   A JWT that has been encoded according to the
       process defined in Section X.

    Encoded JWT Header   The encoded-string form of a JWT Header

    Encoded JWT Claims Set   The encoded-string form of a JWT Claims Set

    encoded-string form   The result of applying Base64url encoding to an
       input JSON text.

    JSON Web Token (JWT)  A JWT comprises a JWT Header and a JWT Claims Set. ...

    JWT Header  A JSON object that is a component of a JWT. It denotes the
       cryptographic operations applied to the JWT.  ...

    JWT Claims Set  A JSON object containing a set of claims.  ...


This also gets rid of the use of the "A string representing a JSON object..." 
which I find confusing and potentially misleading (because it is actually "a 
Base64url encoding of a JSON object").
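
To illustrate the proposed JWT / Encoded JWT distinction, here is a minimal 
Python sketch. The helper name encoded_string_form is hypothetical (chosen to 
mirror the suggested terminology, not taken from the spec), and it assumes that 
"Base64url" means the URL-safe alphabet with trailing '=' padding removed, as I 
read [JWS]. The line breaks inside the JSON texts are an arbitrary choice.

```python
import base64

def encoded_string_form(json_text: str) -> str:
    """Base64url-encode the UTF-8 bytes of a JSON text, dropping '=' padding."""
    raw = base64.urlsafe_b64encode(json_text.encode("utf-8"))
    return raw.rstrip(b"=").decode("ascii")

# The unencoded forms: plain JSON texts (what the review calls
# "JWT Header" and "JWT Claims Set").
jwt_header = '{"typ":"JWT",\r\n "alg":"HS256"}'
jwt_claims_set = (
    '{"iss":"joe",\r\n'
    ' "exp":1300819380,\r\n'
    ' "http://example.com/is_root":true}'
)

# The encoded-string forms ("Encoded JWT Header", "Encoded JWT Claims Set").
encoded_jwt_header = encoded_string_form(jwt_header)
encoded_jwt_claims_set = encoded_string_form(jwt_claims_set)

print(encoded_jwt_header)
print(encoded_jwt_claims_set)
```

The point is only that the two forms are different things (a JSON text versus an 
ASCII string derived from it), so they deserve different names.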



UTF-8:

UTF-8 is mentioned in lots of places. It could probably be stated once up near 
the beginning of the spec that all the JSON text is UTF-8 encoded, and all the 
JSON strings are UTF-8 encoded.



Semantics, profiles and relationship to SAML:

The spec does not define any overall JWT semantics (i.e., what any given JWT 
/means/). Semantics are only defined in context of each individual Reserved 
Claim Name.

Thus any application of JWTs will need to profile the JWT spec: specifying the 
claim set(s) contents, and the overall semantics of the resultant JWT(s).  This 
is not explicitly explained in the JWT spec.

In terms of SAML, Appendix B should refer to SAML assertions rather than SAML 
tokens. Also, I'm not sure SAML assertions inherently have more expressivity 
than JWTs. They do have more pre-defined structure and semantics.

Syntactically, it seems one can encode pretty much anything in whatever amount 
in a JWT (one can do the same with SAML assertions), and thus theoretically JWTs 
could be used to accomplish the same things as SAML assertions.

Semantically, SAML assertions are explicitly statements made by a system entity 
about a subject. But by default, a JWT is empty, and has no semantics (this 
isn't stated explicitly). All semantics defined in the JWT spec are particular 
to individual reserved claims, but all reserved claims are optional. Thus an 
application of JWTs to use cases also apropos for SAML assertions will require 
arguably more profiling than that needed to apply SAML assertions.

The token size & complexity comparison seems nominally fine.



Some detailed-but-rough comments and musings on portions of the spec as I was 
reading through it...



 > 2. Terminology


terminology is not alphabetised!


"claim", "claims", "token" should be defined in terminology

suggestion:

      Claim:  an assertion of something as a fact. Here, claims are
         name and value pairs, consisting of a Claim Name and a
         Claim Value.


 >    JSON Web Token (JWT)

   is jwt always a "string" or is it string (only) when it's been serialized 
into one?

 >                    A string representing a set of claims as a JSON
 >       object that is digitally signed or MACed and/or encrypted.

   is it more that it's a set of claims encoded as a JSON object
   that is string-serialized?

   is it /not/ a JWT by definition if it is not ((signed or MACed) and/or 
encrypted) ?   No, because there are Plaintext JWTs, but they aren't in 
terminology (probably should be).

   "parts" in JWT definition is unclear
     are "parts" json objects or arrays unto themselves ?

   the definition assumes knowledge that's presented later. perhaps needs fwd
   reference(s), or perhaps better is to not present as much technical detail
   in the definitions.


 >    JWT Claims Set

   similar comments as to JSON Web Token (JWT)

   the definition says how it is encoded and encrypted, but not how claims are 
mapped into a JSON object


should probably be simply:

    JWT Claims Set: A set of claims expressed as a JSON object, where each
       claim is an object member (i.e., a name/value pair). A claim may have
       a JWT Claims Set as a value.


 >    Claim Name  The name of a member of the JSON object representing a
 >       JWT Claims Set.

should probably be simply:

    Claim Name  The name portion of a claim, expressed as a JSON object member
       name.


 >    Claim Value  The value of a member of the JSON object representing a
 >       JWT Claims Set.

should probably be simply:

    Claim Value  The value portion of a claim, expressed as a JSON object member
       value.



 > 3. JSON Web Token (JWT) Overview

 >    The bytes of the UTF-8 representation of the JWT Claims Set are
 >    digitally signed or MACed in the manner described in JSON Web
 >    Signature (JWS) [JWS] and/or encrypted in the manner described in
 >    JSON Web Encryption (JWE) [JWE].

s/ and/or encrypted / or encrypted and signed /


 >    The contents of the JWT Header describe the cryptographic operations
 >    applied to the JWT Claims Set. If the JWT Header is a JWS Header, the
 >    claims are digitally signed or MACed.  If the JWT Header is a JWE
 >    Header, the claims are encrypted.

What if a JWT is signed AND encrypted?  Hm, from my looking at JWS and JWE 
specs, it seems that in that case one uses JWE because that encompasses both 
encrypt & sign.



 >    A JWT is represented as a JWS or JWE.  The number of parts is
 >    dependent upon the representation of the resulting JWS or JWE.

Does the above mean to say..

    A JWT consists of a JWS or JWE object, which in turn conveys the JWT
    Claims Set. In the case of a JWS, the JWT Claims Set is the JWS
    Payload. In the case of a JWE, the JWT Claims Set is the input
    Plaintext.
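
A rough sketch of the JWS case, assuming the three-part compact form 
(header.payload.signature) and HMAC SHA-256 for "alg":"HS256" as I understand 
[JWS]. The function name make_jws_jwt and the key are made up for illustration; 
this is not a conforming implementation.

```python
import base64
import hashlib
import hmac

def b64url(data: bytes) -> str:
    """Base64url without '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def make_jws_jwt(header_json: str, claims_json: str, key: bytes) -> str:
    """Build a three-part string whose JWS Payload is the JWT Claims Set."""
    signing_input = (
        b64url(header_json.encode("utf-8"))
        + "."
        + b64url(claims_json.encode("utf-8"))
    )
    # MAC over the ASCII bytes of "<encoded header>.<encoded payload>".
    mac = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(mac)

header = '{"typ":"JWT","alg":"HS256"}'
claims = '{"iss":"joe","exp":1300819380,"http://example.com/is_root":true}'
demo_key = b"not-a-real-key"  # hypothetical key, for illustration only

token = make_jws_jwt(header, claims, demo_key)
print(token.count(".") + 1)  # number of parts in this representation
```

Here the second part, once decoded, is exactly the JWT Claims Set text, which is 
the relationship the suggested wording tries to state outright.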




 > 4. JWT Claims
 >
 >
 >    The JWT Claims Set represents a JSON object whose members are the
 >    claims conveyed by the JWT.  The Claim Names within this object MUST
 >    be unique; JWTs with duplicate Claim Names MUST be rejected.

does the above mean to say claim names must be unique amongst the set of claim 
names within any given JWT Claims Set ?  If so, that's only implied by the above 
but should be stated explicitly; as it is, the above is ambiguous.
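
For what it's worth, the per-Claims-Set reading is easy to enforce. A minimal 
Python sketch, assuming uniqueness is checked within each JSON object 
separately (so a nested object may reuse a name from its parent); 
parse_claims_set is a hypothetical helper name:

```python
import json

def parse_claims_set(json_text: str) -> dict:
    """Parse a JWT Claims Set, rejecting duplicate Claim Names.

    The hook runs once per JSON object, so uniqueness is enforced
    within each object, not across nested objects.
    """
    def reject_duplicates(pairs):
        claims = {}
        for name, value in pairs:
            if name in claims:
                raise ValueError(f"duplicate Claim Name: {name!r}")
            claims[name] = value
        return claims

    return json.loads(json_text, object_pairs_hook=reject_duplicates)

ok = parse_claims_set('{"iss":"joe","exp":1300819380}')

try:
    parse_claims_set('{"iss":"joe","iss":"mallory"}')
    rejected = False
except ValueError:
    rejected = True
```

Note that a plain json.loads() would silently keep the last "iss" value, which 
is exactly the kind of behavior the MUST is trying to rule out.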


 > 4.2. Public Claim Names
 >
 >
 >    Claim names can be defined at will by those using JWTs.  However, in

s/Claim names/Public claim names/

 >    order to prevent collisions, any new claim name SHOULD either be
 >    registered in the IANA JSON Web Token Claims registry Section 9.1 or
 >    be a URI that contains a Collision Resistant Namespace.


why should a claim name be a URI if I wish to make use of Collision Resistant 
Namespaces?  For example, if I simply use say UUIDs as claim names..

      {"iss":"joe",
       "3005fa05-e76c-4994-bbc9-65b2ace2305c":"foo"}

..it will be universally unique provided I minted it appropriately (no URI 
syntax is needed).



 > 4.3. Private Claim Names
 >
 >
 >    A producer and consumer of a JWT may agree to any claim name that is
 >    not a Reserved Name Section 4.1 or a Public Name Section 4.2.  Unlike
 >    Public Names, these private names are subject to collision and should
 >    be used with caution.

it seems private claim names are only subject to collision if the implementers 
don't make appropriate use of Collision Resistant Namespaces, i.e. they "can be" 
subject to collision.


the above two sections use "public" and "private" as distinguishing factors, but 
it seems to me the distinguishing factor (given how the above is written) is 
more whether Collision Resistant Namespaces are employed or not.

An implied aspect of public claim names seems to be that it is assumed that they 
are publicly listed/documented/leaked, thus the "public" moniker, and hence 
ought to be universally unique as a matter of course.

and "private" ones seem to be assumed to be obfuscated to all but the agreeing 
parties?  Or they are "private" in only the sense that they are created in the 
context of a private arrangement?



 >
 > 7. Rules for Creating and Validating a JWT
 >
 >
 >    To create a JWT, one MUST perform these steps.  The order of the
 >    steps is not significant in cases where there are no dependencies
 >    between the inputs and outputs of the steps.
 >
 >    1.  Create a JWT Claims Set containing the desired claims.  Note that
 >        white space is explicitly allowed in the representation and no
 >        canonicalization is performed before encoding.


I presume the rationale for allowing white space is that JSON objects are 
unordered (and can contain arbitrary whitespace), thus simple buffer-to-buffer 
comparisons between serialized objects cannot be reliably performed.  If so this 
should be explicitly stated.

It seems that member/value-by-member/value comparisons must always be done, by 
parsing the JSON objects and extracting all members and values; this should be 
stated explicitly in the spec.

I found meager JWT comparison instructions at the very end of Section 7. They 
should probably be in their own subsection. It should probably explicitly say 
that JWTs need to be parsed into their constituent components, and the latter 
must be individually examined/compared.
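
A small Python illustration of why buffer-to-buffer comparison fails while 
member-by-member comparison works. The two texts below are made-up 
serializations of the same claims, differing only in whitespace and member 
order:

```python
import json

a = '{"iss":"joe","exp":1300819380}'
b = '{ "exp": 1300819380,\n  "iss": "joe" }'

# Comparing the serialized forms byte for byte says they differ...
same_bytes = a.encode("utf-8") == b.encode("utf-8")

# ...but parsing and comparing members and values says they match.
same_claims = json.loads(a) == json.loads(b)

print(same_bytes, same_claims)
```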


 >    Comparisons between JSON strings and other Unicode strings MUST be
 >    performed as specified below:

this comparison algorithm seems to be attempting to allow for comparison of 
UTF-8 encoded JSON strings with other Unicode strings in any of the Unicode 
encoding formats, but only implies that; it should be stated.

 >
 >    1.  Remove any JSON applied escaping to produce an array of Unicode
 >        code points.

I don't think (1) is correct.  A JSON string is by default encoded in UTF-8. A 
UTF-8 encoded string is not "an array of Unicode code points" (it's a sequence of 
code units, which must be decoded into code points). I think a step is missing 
here..

    1.  Remove any JSON escaping from the input JSON string.

    1.a  convert the string into a sequence of unicode code points.

..and then compare code point by code point. This overall algorithm /seems/ ok, 
but I'm not sure; it seems there's rationale that's not expressed, e.g., for 
excluding use of Unicode Normalization [USA15]. Also, the algorithm is incomplete 
in that it doesn't stipulate converting the "other Unicode string" into a 
sequence of code points.
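
Here is how I read the intended comparison, as a hedged Python sketch. The 
function names are hypothetical. It assumes both inputs have already been 
decoded from their byte encodings (Python's str is a sequence of code points, 
so a UTF-16 buffer, say, would first need other_bytes.decode("utf-16")), and it 
deliberately applies no Unicode Normalization:

```python
import json
import unicodedata

def code_points(s: str) -> list:
    """Step 1.a: the string as a sequence of Unicode code points."""
    return [ord(ch) for ch in s]

def json_string_equals(json_string_literal: str, other: str) -> bool:
    """Compare a JSON string (quotes and escapes included) to another string."""
    unescaped = json.loads(json_string_literal)  # step 1: remove JSON escaping
    # Compare code point by code point, with no normalization applied.
    return code_points(unescaped) == code_points(other)

precomposed = "caf\u00e9"                              # U+00E9
decomposed = unicodedata.normalize("NFD", precomposed)  # 'e' + U+0301

escaped_matches = json_string_equals('"caf\\u00e9"', precomposed)
nfd_matches = json_string_equals('"caf\\u00e9"', decomposed)
```

The second comparison comes out unequal even though both strings render the 
same, which is the consequence of excluding normalization that the spec ought 
to explain.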




 > 10. Security Considerations
 >
 >
 >    All of the security issues faced by any cryptographic application
 >    must be faced by a JWT/JWS/JWE/JWK agent.  Among these issues are
 >    protecting the user's private key, preventing various attacks, and
 >    helping the user avoid mistakes such as inadvertently encrypting a
 >    message for the wrong recipient.  The entire list of security
 >    considerations is beyond the scope of this document, but some
 >    significant concerns are listed here.
 >
 >    All the security considerations in the JWS specification also apply
 >    to JWT, as do the JWE security considerations when encryption is
 >    employed.  In particular, the JWS JSON Security Considerations and
 >    Unicode Comparison Security Considerations apply equally to the JWT
 >    Claims Set in the same manner that they do to the JWS Header.
 >

dunno if you can get away with sec cons wholly in other docs, and I'm not sure 
it's appropriate given that JWTs are a profile of JWE or JWS.

above needs editorial polish -- there aren't any "significant concerns" 
actually listed here.


---
end