Re: [OAUTH-WG] Understanding the reasoning for Base64

Dick Hardt <dick.hardt@gmail.com> Sat, 03 July 2010 19:45 UTC

To: Naitik Shah <n@daaku.org>
Cc: OAuth WG <oauth@ietf.org>

On 2010-07-03, at 12:14 PM, Naitik Shah wrote:

> On Sat, Jul 3, 2010 at 9:42 AM, Dick Hardt <dick.hardt@gmail.com> wrote:
> 
> On 2010-07-03, at 9:13 AM, Naitik Shah wrote:
>> > I think Naitik is saying that accidentally doing base64 and not base64url will send some '+'s along.
>> 
>> if there are '+'s in the token, then it is easy for someone helping to spot the problem. also easy for servers to send back an error message saying, "hey, looks like you are using base64 instead of base64url encoding"
>> 
>> ie, it is easy to detect the error -- urlencoding / decoding is hard to detect as an error
>> 
>> The pluses are not guaranteed. They may or may not be there depending on the data stream you're encoding. If you don't urlencode the JSON, you'll get a "{", if you do it once, you'll get a "%7B", if you do it twice, you'll get a "%257B" -- seems easier to detect.
> 
> Your earlier point was that developers may incorrectly use base64 instead of base64url. If they used base64, and if there is a + / = or % in the string, the server can send a useful note saying what is wrong. There may not be one of those characters depending on the input string, but if there is, then the server can suggest that the error might be the use of base64 instead of base64url. If the token contains ANY character that is not in base64url, then the server can say that it is not base64url encoded.
> 
> That seems pretty foolproof to detect. Note that you should never get any %7B or other encoding in the token as it is URL-safe.
> 
> The thing I was trying to say was that it's less predictable. That it might work just fine when you're experimenting with the API because at that point your token did not contain any pluses, but then suddenly started failing after you sent a link to your app to someone because their encoded token contains a plus. This hit-or-miss to me is worse than being able to tell by looking at the first few characters of the urlencoded JSON blob which will give a definitive answer as to how many times the token has been urlencoded.

I understand your point. I still think base64url encoding makes it really clear that the token is encoded (nothing is legible anymore), allows there to be one encoding format for all data, and makes it easy to support encryption.
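
A rough sketch of the server-side check discussed above, in Python. The function name and the hint strings are made up for illustration, and it assumes tokens are unpadded base64url, so "=" is treated as a sign of plain base64:

```python
import re

# Hint strings are illustrative only; a real server would word these itself.
OK = "ok"
URLENCODED_HINT = "token appears to be url-encoded; base64url needs no url-encoding"
BASE64_HINT = "looks like you are using base64 instead of base64url encoding"
OTHER_HINT = "token contains characters outside the base64url alphabet"


def diagnose_token(token: str) -> str:
    """Return OK, or a hint describing why the token is not base64url."""
    # Assumption: unpadded base64url, i.e. only A-Z a-z 0-9 '-' '_'.
    if re.fullmatch(r"[A-Za-z0-9_-]+", token):
        return OK
    if "%" in token:
        return URLENCODED_HINT
    if any(ch in "+/=" for ch in token):
        return BASE64_HINT
    return OTHER_HINT
```

This also shows the hit-or-miss case raised earlier: plain base64 of b"hello!" is "aGVsbG8h", which contains none of the telltale characters and passes the check, while plain base64 of b"\xfb\xff" is "+/8=" and is flagged.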


> 
>  
>> 
>> When I wrote a sample in Perl, it was pretty easy to make it base64url which then provides a consistent encoding.
>> 
>> Did it involve a string replace call? Or a third party library?
> 
> I used a standard CPAN library.
> 
> 
> Exactly :) I'm imagining our documentation where we want to be library agnostic, and have almost pseudocode-like code snippets. I said this earlier -- while base64 may be common in standard libraries built into languages, the base64url version isn't. In order to not have a "cpan install base64url" (and gem install, easy_install, mvn install..) -- we'd most likely document a str_replace() call in addition to a base64 call. And I'm worried that developers will miss this detail.

Likely they will install an OAuth library that will deal with it if they are going to have to sign rather than using a bearer token (I believe most people will use a bearer token if they can -- soooo much easier!)
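
For reference, the "base64 call plus str_replace()" approach mentioned above might look roughly like this in Python. The helper names are hypothetical, and stripping the "=" padding is an assumption about the token format rather than something settled in this thread:

```python
import base64


def base64url_encode(data: bytes) -> str:
    """base64url built from plain base64 plus two character replacements."""
    standard = base64.b64encode(data).decode("ascii")
    # Swap the two characters that are not URL-safe, then drop padding.
    return standard.replace("+", "-").replace("/", "_").rstrip("=")


def base64url_decode(text: str) -> bytes:
    """Inverse of base64url_encode; restores padding before decoding."""
    padded = text + "=" * (-len(text) % 4)
    return base64.b64decode(padded.replace("-", "+").replace("_", "/"))
```

Python happens to ship base64.urlsafe_b64encode, which gives the same result before the padding is stripped, so the replace step is only needed where the standard library has plain base64 alone.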

Besides base64url, there is HMAC-SHA256 and JSON -- not all of which are built in -- but they are becoming more built in as time goes on, and if OAuth signing uses base64url, I would expect these will all be part of standard distributions in the future ...

(... in case you are unfamiliar with my background, I have delivered Perl, Python and Tcl distributions in the past -- what goes into a package is what is heavily used or what we thought was a good thing to promote -- and assuming that making base64url more available is a good thing, then using it in OAuth is a good thing to do. :)
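
To make the three pieces concrete, here is a minimal sketch that combines JSON, HMAC-SHA256 and base64url using only the Python standard library. The token layout (signature, a ".", then the payload) and the function names are assumptions for illustration, not the format under discussion:

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    # Unpadded base64url (padding stripped as an assumption).
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(payload: dict, secret: bytes) -> str:
    """Serialize payload as JSON, base64url it, and prepend an HMAC-SHA256."""
    body = _b64url(json.dumps(payload, sort_keys=True).encode("utf-8"))
    sig = hmac.new(secret, body.encode("ascii"), hashlib.sha256).digest()
    return _b64url(sig) + "." + body  # hypothetical layout


def verify(token: str, secret: bytes) -> dict:
    """Check the signature and return the decoded JSON payload."""
    sig_part, _, body = token.partition(".")
    expected = hmac.new(secret, body.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(_b64url_decode(sig_part), expected):
        raise ValueError("signature mismatch")
    return json.loads(_b64url_decode(body))
```

The resulting token contains only base64url characters and ".", so it can be placed in a header or a query string as is.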

>> >> I am unclear on what your point is.
>> >>
>> >> The token would be included as one of the headers. This is often preferable as it separates the authorization layer (in header) from application layer parameters (query string or message body)
>> >
>> > With our proposal, we were focussed on url parameters (hence the choice of urlencode after it was all put together). I think it makes total sense to not do the encoding as part of the sig spec, and let the transport choice dictate which encoding to use.
>> 
>> I understand what you are saying. having multiple encodings makes libraries harder, and leads to the issues that motivated base64url over url-encoding 
> 
> Glad we agree on that.
> 
> I agree multiple encodings makes libraries harder :)

glad we agree there
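
A small illustration of why a single encoding helps here, as a Python sketch with a hypothetical helper name: an unpadded base64url value passes through URL-encoding unchanged, so the same string works in a header or a query parameter, whereas raw JSON changes with each round of urlencoding.

```python
from urllib.parse import quote


def needs_transport_encoding(value: str) -> bool:
    """True if URL-encoding would alter the value (safe="" encodes '/' too)."""
    return quote(value, safe="") != value


# Raw JSON changes on every pass, which is the "{" / "%7B" / "%257B"
# progression mentioned earlier in the thread.
once = quote("{", safe="")
twice = quote(once, safe="")
```

An unpadded base64url token such as "-_8aGVsbG8h" is returned unchanged by quote(), while '{"a":1}' is not.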


> 
> But I think the stark difference between OAuth1 vs 2 wrt signing actually makes the Authorization header less valuable (again, for signing only). I'm pretty happy with this because I thought this header was more complex for developers anyways (while big corporations with authentication infrastructure love it) :) But the reason I think so is that now the header is not just the signature, but also the signed payload. This means an application isn't just making an HTTP request as before with a bunch of query or post parameters. It's instead making a "JSON request" that may or may not have query/post params. It's just not as separate as before.

I am confused on what point you are trying to make here.