[OAUTH-WG] OAuth v2 Authorization Header Credentials

Mark Lentczner <mzero@google.com> Wed, 27 October 2010 23:42 UTC

From: Mark Lentczner <mzero@google.com>
Date: Wed, 27 Oct 2010 16:44:17 -0700
Message-ID: <AANLkTi=d9jBES+f0ynxuH8+XDPHi4swE8LbnhGdiY1On@mail.gmail.com>
To: oauth@ietf.org

In my work implementing OAuth v2 to draft 10, I have had to review the
definition of credentials included in an Authorization header. I
believe the current definition has some significant issues for
parsers. However, with some small cleanup of the definition, proposed
below, I believe that the credentials can be made robust and more
compatible with existing and future implementations.

= Background =
In draft-ietf-oauth-v2-10 §5.1.1 we find the ABNF for credentials,
which is the non-terminal for the value of the Authorization header:
    credentials    = "OAuth" RWS access-token [ CS 1#auth-param ]
    access-token   = 1*( quoted-char / <"> )

    CS             = OWS "," OWS

    quoted-char    =   "!" / "#" / "$" / "%" / "&" / "'" / "("
                     / ")" / "*" / "+" / "-" / "." / "/" / DIGIT
                     / ":" / "<" / "=" / ">" / "?" / "@" / ALPHA
                     / "[" / "]" / "^" / "_" / "`" / "{" / "|"
                     / "}" / "~" / "\" / "," / ";"

§1.1 of the same draft tells us how to interpret this ABNF:
This document uses the Augmented Backus-Naur Form (ABNF) notation of
[I-D.ietf-httpbis-p1-messaging].  Additionally, the following rules
are included from [RFC2617]: realm, auth-param; from [RFC3986]:
URI-Reference; and from [I-D.ietf-httpbis-p1-messaging]: OWS, RWS, and
quoted-string.

In turn, the I-D.ietf-httpbis-p1-messaging draft defines its ABNF
rules in §1.2, as an extension of the ABNF defined in RFC5234. In
particular, it adds a “#” construct. (*)

The definition of the Authorization header used in OAuth v2 is based
on RFC2617, which in turn is based on RFC2616. There is a potential
clash here between the original specification of HTTP, and the new one
being worked on by the httpbis WG. In particular, the two sets of
standards use different ABNF frameworks.

RFC2616 §14.8 tells us that the Authorization header’s value is the
non-terminal credentials, which it never defines. RFC2617 takes up the
challenge, but defines it multiple times:
    §1.2:    credentials = auth-scheme #auth-param
    §2:      credentials = "Basic" basic-credentials
    §3.2.2:  credentials = "Digest" digest-response

But these aren’t at all consistent given that #auth-param is a list of
key/value pairs, whereas basic-credentials is a base64 encoded string,
and digest-response is a mixture of key/value pairs and strings.

Unfortunately, the httpbis WG’s draft, draft-ietf-httpbis-p7-auth-11,
gums things up in §1.2.2 and §3.2 by referring back to RFC2617 for the
definition of credentials, thus mixing two ABNF forms...

The OAuth v1 specification, RFC5849 §3.5.1, defines credentials that
start with an auth-scheme of “OAuth”, and so any reuse of that
auth-scheme would have to be compatible with it. Alas, it doesn’t use
ABNF to specify its credentials. The prose reads as a comma-separated
list of key/value pairs, where all the keys start with “oauth_” or are
the key “realm”, and all the values must be quoted. This is yet a
fourth form!


= Problems =
In the current OAuth v2 draft 10, there is a clash between the comma
in CS and having comma as part of access-token. Upon encountering a
comma, how do you know if it is part of the token, or introducing the
optional list of auth-param values?

As defined, these v2 credentials may not be parsable by the v1 syntax.
The non-key/value access-token would be illegal. This might be
considered good, and perhaps it was the intention to make this the way
to distinguish the two types of credentials. However, it is flawed
because the equal sign and quote characters are both legal as part of
the access-token. Hence realm="Example" is both a legal OAuth v2
access-token, and the legal start of an OAuth v1 credential. What’s
worse is that comma is allowed in access-token, and so actually, ALL
OAuth v1 credentials could be seen as v2 access-tokens, or not, or
even parsed as some prefix that is the access-token and remainder that
are the auth-params.
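
To make the ambiguity concrete, here is a minimal Python sketch that enumerates the ways the draft 10 grammar could split the text following "OAuth " into an access-token and a remainder. The names ACCESS_TOKEN_CHARS and candidate_splits are illustrative and not from any draft, and the sketch is simplified: it does not validate the remainder as a list of auth-param values.

```python
import string

# Characters allowed in access-token per the draft 10 ABNF quoted above:
# every quoted-char, plus the double quote.
ACCESS_TOKEN_CHARS = set(
    string.ascii_letters
    + string.digits
    + "!#$%&'()*+-./:<=>?@[]^_`{|}~\\,;"
    + '"'
)


def is_access_token(text):
    """True if text is a non-empty run of access-token characters."""
    return bool(text) and all(c in ACCESS_TOKEN_CHARS for c in text)


def candidate_splits(value):
    """Return every (access_token, remainder) reading of value.

    Each comma may either belong to the token or act as the CS
    separator that introduces the auth-param list.
    """
    readings = []
    if is_access_token(value):
        readings.append((value, None))  # the whole value is the token
    for i, c in enumerate(value):
        if c == ",":
            token, rest = value[:i], value[i + 1:].strip()
            if is_access_token(token) and rest:
                readings.append((token, rest))
    return readings


v1_style = 'realm="Example",oauth_token="abc",oauth_signature="xyz"'
for token, rest in candidate_splits(v1_style):
    print(repr(token), "|", repr(rest))
```

For this OAuth v1 style value the sketch reports three readings: the whole value as one access-token, and one split at each of the two commas.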

== Minor Problems ==
The non-terminal access-token as defined includes characters that are
not allowed in a structured header value without being inside a quoted
string. See I-D.ietf-httpbis-p1-messaging §1.2.2. However, that same
draft, for historical reasons, gives headers considerable leeway in
violating this (see §3.2).

The access-token allows quote characters, but not as delimiters. This
is somewhat at odds with the normal header parsing of tokens. It would
be unclear what "foo" means: Is that three characters (as in a
quoted-string) or five characters, including the quotes?


= Possible Solutions =
I am going to assume that the intention of RFC2617, and the eventual
httpbis auth drafts, is that after the scheme, you have a comma
separated list of items, where the items are either strings, or
key/value pairs. I also assume the intention was to make v1 vs. v2
credentials distinguishable syntactically even though they shared the
“OAuth” scheme.

This would imply that quote, equals sign, and comma should be removed
from access-token. But that would remove the pad character (“=”) used
in all Base64 encodings, which seems common for tokens. We can get
around this by either:
    • quoting the access token
    • defining that the first, and only the first, item is not a
      key/value pair, but just a value
    • putting the access token in a key/value pair


= Proposal =
I met with a number of other engineers here at Google who have been
involved in our OAuth efforts for some time. We discussed the merits
of various approaches I came up with, and so now I propose the
following:

== OAuth2 Credentials ==
This follows the basic form as laid out by RFC2617 §1.2, but defined
in terms of the ABNF used in httpbis:
    credentials    = "OAuth2" RWS #auth-param
    auth-param     = auth-key "=" ( token / quoted-string )
    auth-key       = "realm" / "access_token" / future-key
    future-key     = token

N.B.: The values are either tokens or quoted strings, which have
escaping rules defined in httpbis. Notably, the equals sign and
forward slash are not allowed in token, and so base64 encoded values
would have to be placed within double quotes.
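
As a rough illustration of how a consumer might parse the proposed form, here is a Python sketch. The function name parse_oauth2_credentials is hypothetical. The token pattern follows the httpbis tchar set as I understand it, the quoted-string handling is simplified to backslash escapes, and the scheme comparison is case-sensitive for brevity. Empty list elements are skipped, as the "#" rule asks of consumers.

```python
import re

# token and quoted-string, simplified from the httpbis definitions.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
QUOTED = r'"(?:[^"\\]|\\.)*"'
AUTH_PARAM = re.compile(rf"\s*({TOKEN})=({TOKEN}|{QUOTED})\s*")


def parse_oauth2_credentials(header_value):
    """Parse 'OAuth2 key=value, key="value"' into a dict of parameters."""
    parts = header_value.strip().split(None, 1)
    if not parts or parts[0] != "OAuth2":
        raise ValueError("not an OAuth2 credential")
    rest = parts[1] if len(parts) == 2 else ""
    params = {}
    pos = 0
    while pos < len(rest):
        if rest[pos] in ", \t":
            pos += 1  # skip separators and empty list elements
            continue
        match = AUTH_PARAM.match(rest, pos)
        if match is None:
            raise ValueError(f"malformed auth-param at offset {pos}")
        key, value = match.groups()
        if value.startswith('"'):
            # Strip the delimiters and undo backslash escapes.
            value = re.sub(r"\\(.)", r"\1", value[1:-1])
        params[key] = value
        pos = match.end()
        if pos < len(rest) and rest[pos] != ",":
            raise ValueError(f"expected ',' at offset {pos}")
    return params


print(parse_oauth2_credentials('OAuth2 realm="Example", access_token="dGVzdA=="'))
```

With this sketch an unquoted Base64 value such as access_token=dGVzdA== is rejected, because "=" is not a token character; the same value in double quotes parses.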

The future-key rule allows for future extension.  In the spirit of the
rest of the v2 draft, the parameters are named without any prefix. The
scheme has been changed, so that a choice of parsers can be made
up-front, rather than attempting to validate under OAuth v1 rules,
then falling back to some v2 set of rules when that fails.

== Interoperating with Draft 10 Implementations ==
In order to be able to consistently interpret credentials that have
been deployed under draft 10, we recommend the following algorithm:
    if the string "oauth_signature=" is in the credential
        then parse as an OAuth v1 credential
        otherwise the credential is an OAuth v2 draft 10 credential,
        to be parsed as:

        draft10credentials = "OAuth" RWS draft10value
        draft10value       = access-token [ WSP *VCHAR ]
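
The dispatch above could be sketched in Python as follows. The function name classify_oauth_credential and the returned labels are illustrative only. Under my reading of draft10value, the access-token is everything up to the first space or tab, so any commas before that whitespace remain part of the token.

```python
import re


def classify_oauth_credential(header_value):
    """Return (kind, payload) for an Authorization header value.

    kind is "oauth1", "oauth2-draft10", or "not-oauth".
    """
    parts = header_value.strip().split(None, 1)
    if len(parts) != 2 or parts[0] != "OAuth":
        return ("not-oauth", None)
    rest = parts[1]
    if "oauth_signature=" in rest:
        # Hand the parameter list to the existing OAuth v1 parser.
        return ("oauth1", rest)
    # draft10value = access-token [ WSP *VCHAR ]
    # Keep the token and ignore anything after the first space or tab.
    token = re.split(r"[ \t]", rest, maxsplit=1)[0]
    return ("oauth2-draft10", token)


print(classify_oauth_credential('OAuth oauth_token="t", oauth_signature="s"'))
print(classify_oauth_credential("OAuth abc123== ignored-trailer"))
```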

== Alternative ==
If the community deems it infeasible to use a v2 specific auth-scheme,
then these rules need to be altered so that they are compliant with
OAuth v1, yet rigorous. In particular, OAuth v1 requires the double
quotes, but is silent about what is legal inside them, or how quoting
works. This alternative proposal tightens that up:

    credentials    = "OAuth" RWS #auth-param
    auth-param     = auth-key "=" quoted-string
    auth-key       = "realm" / oauth1-key / oauth2-key / future-key

    oauth1-key     = "oauth_" oauth1-param
    oauth1-param   = "consumer_key" / "token" / "signature" / and many others...

    oauth2-key     = "oauth2_" oauth2-param
    oauth2-param   = "access_token"

    future-key     = token
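
Under the grammar above, a parser can sort each key into its family by prefix before handing it on. The Python sketch below is a simplification: classify_auth_key is a hypothetical helper, and it dispatches on the prefix alone rather than checking the key against the enumerated oauth1-param and oauth2-param names.

```python
def classify_auth_key(key):
    """Sort an auth-param key into one of the families in the grammar."""
    if key == "realm":
        return "realm"
    if key.startswith("oauth2_"):
        return "oauth2"
    if key.startswith("oauth_"):
        return "oauth1"
    # Anything else is a future-key, which is any token.
    return "future"


for key in ("realm", "oauth_token", "oauth2_access_token", "scope"):
    print(key, "->", classify_auth_key(key))
```

Because "oauth2_" does not begin with "oauth_", the two prefixes cannot be confused with one another.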

Lastly, if there were significant compatibility issues with introducing
a new prefix to key values in OAuth credentials, then "oauth_" could
be used as the prefix for all keys, and OAuth v2 would rely on its
parameter names not clashing with OAuth v1's.

This all being said, I, and the engineers I talked with, feel that
there would be far fewer compatibility issues if the existing OAuth v1
credential processing code were left as is, and OAuth v2 went with the
new auth-scheme of "OAuth2" for its credentials.

— Mark Lentczner, mzero@google.com

(*) It spells out a very clear definition, which I think is slightly
different than previous such constructs. Nonetheless, note that
consumers must accept lists with missing elements, and possibly
leading and/or trailing commas, none of which contribute elements to
the semantic list. Producers, on the other hand, must not generate
such things.
