Re: [OAUTH-WG] Understanding the reasoning for Base64

Dick Hardt <dick.hardt@gmail.com> Wed, 07 July 2010 06:45 UTC

From: Dick Hardt <dick.hardt@gmail.com>
In-Reply-To: <AANLkTimoD2DRu2s6QgQUbBPLiYrCqtN02Oa-sk_Ag8yv@mail.gmail.com>
Date: Tue, 06 Jul 2010 23:45:48 -0700
Message-Id: <63E303F0-C3EF-4D56-AFC2-9D5BB06DCFF6@gmail.com>
References: <AANLkTimMruKyblUWROkPMDapFKtTztOXqL64PpQxCmKO@mail.gmail.com> <2625894F-2979-40BD-81E1-05A6EB8723CD@facebook.com> <AANLkTinvLOV0f3I-aWpeAbfIpfGyxZSB2RHu52iw5mDC@mail.gmail.com> <AANLkTilWNneonIRX21U1RZcE80FuVSJWXU7CNm5pV275@mail.gmail.com> <AANLkTin-7PNLv-Hc229JJcOrIBh4fJqY5CMaLCMbmoIk@mail.gmail.com> <AANLkTikh_nQ8dXSp7QXJ79kCdbX1zeyPKAl_kgplb25x@mail.gmail.com> <3DC7AEF8-3283-4970-BB98-3D680A3E2429@gmail.com> <AANLkTimpvWCbCBEWdI1Id5Ig_xCUW2hvKDro5LyhufMV@mail.gmail.com> <FE47FED0-3850-4393-9C79-DE06F0F7B6CA@gmail.com> <BA564125-9FBB-4B1A-93AC-7DD1A754A5E1@facebook.com> <C66A9854-02EB-4CCE-8338-382AEEC7EA61@gmail.com> <AANLkTikiXVruhZSH3Q6rMhdZAHRBPkhE_JVhSNOhCXmN@mail.gmail.com> <6B008ED4-4536-4A95-89B6-917696E6AF79@gmail.com> <AANLkTilTxGBYt2RFrEOqaYoLCV1TQOtonBh5dxL5PQCd@mail.gmail.com> <095B9543-1F20-4DA5-A8EB-48F86CDED9A6@gmail.com> <AANLkTimdWTqFd8UmcnYtYPZ3Dqzsffgn1HHwxPpnHcWY@mail.gmail.com> <AANLkTimoD2DRu2s6QgQUbBPLiYrCqtN02Oa-sk_Ag8yv@mail.gmail.com>
To: Naitik Shah <n@daaku.org>
X-Mailer: Apple Mail (2.1081)
Cc: OAuth WG <oauth@ietf.org>
Subject: Re: [OAUTH-WG] Understanding the reasoning for Base64

Yeah!

On 2010-07-06, at 11:12 PM, Naitik Shah wrote:

> I was hoping to avoid needing str_replace -- but I've been convinced. I'm happy with base64url :)
> 
> 
> Thanks,
> -Naitik
> 
> On Tue, Jul 6, 2010 at 9:17 PM, Evan Gilbert <uidude@google.com> wrote:
> Hi all - having a little bit of a hard time following the full thread, but I'm strongly in favor of base64url encoding.
> 
> A big advantage of this encoding is that, if token is base64url encoded, then urlencode(token) == token. 
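
A minimal Python sketch of the property described above, assuming the token is base64url without '=' padding (a padded token would still have its '=' percent-encoded). The helper name b64url_encode is hypothetical and only for illustration:

```python
import base64
from urllib.parse import quote


def b64url_encode(data: bytes) -> str:
    """Encode bytes as base64url without '=' padding (hypothetical helper)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


raw = b"\xfb\xff"  # bytes chosen so that plain base64 yields '+', '/' and '='
token = b64url_encode(raw)
plain = base64.b64encode(raw).decode("ascii")

# Percent-encoding leaves the base64url token untouched...
token_unchanged = quote(token, safe="") == token
# ...but rewrites the plain base64 form, which is where double encoding can start.
plain_changed = quote(plain, safe="") != plain
```

Because encoding the token again is a no-op, it does not matter how many layers of URL encoding a library applies on the way out.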
> 
> This allows developers to avoid a large class of problems in dealing with URL encoding / decoding issues - it is very easy to accidentally double encode / decode values, and also easy to get tripped up on the different encoding rules in different parts of a URL. For example, different characters are OK before and after the hash, and not all browsers decode the hash the same way.
> 
> Also being able to copy a value from a URL and use it directly in a tool or Authorization header is invaluable for debugging.
> 
> Per notes above, the transformation is very straightforward.
> 
> Evan 
> 
> On Sat, Jul 3, 2010 at 12:45 PM, Dick Hardt <dick.hardt@gmail.com> wrote:
> 
> On 2010-07-03, at 12:14 PM, Naitik Shah wrote:
> 
>> On Sat, Jul 3, 2010 at 9:42 AM, Dick Hardt <dick.hardt@gmail.com> wrote:
>> 
>> On 2010-07-03, at 9:13 AM, Naitik Shah wrote:
>>> > I think Naitik is saying that accidentally doing base64 and not base64url will send some '+'s along.
>>> 
>>> if there are '+'s in the token, then it is easy for someone helping to spot the problem. also easy for servers to send back an error message saying, "hey, looks like you are using base64 instead of base64url encoding"
>>> 
>>> ie, it is easy to detect the error -- urlencoding / decoding is hard to detect as an error
>>> 
>>> The pluses are not guaranteed. They may or may not be there depending on the data stream you're encoding. If you don't urlencode the JSON, you'll get a "{", if you do it once, you'll get a "%7B", if you do it twice, you'll get a "%257B" -- seems easier to detect.
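
The "{" / "%7B" / "%257B" progression above can be sketched in Python as follows. This assumes the payload is a JSON object sent as a urlencoded string; count_urlencode_layers is a hypothetical helper, not part of any proposal in this thread:

```python
import json
from urllib.parse import quote, unquote


def count_urlencode_layers(value: str, max_layers: int = 5) -> int:
    """Return how many times a JSON object string appears to be urlencoded.

    Hypothetical helper: peels one layer at a time until the value starts
    with "{". Raises ValueError if that never happens within max_layers.
    """
    for layers in range(max_layers + 1):
        if value.startswith("{"):
            return layers
        value = unquote(value)
    raise ValueError("value does not look like urlencoded JSON")


blob = json.dumps({"user": "alice"})
once = quote(blob, safe="")   # starts with "%7B"
twice = quote(once, safe="")  # starts with "%257B"
```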
>> 
>> Your earlier point was that developers may incorrectly use base64 instead of base64url. If they used base64, and if there is a + / = or % in the string, the server can send a useful note saying what is wrong. There may not be one of those characters depending on the input string, but if there is, then the server can suggest that the error might be using base64 instead of base64url. If the token contains ANY character that is not in base64url, then the server can say that it is not base64url encoded.
>> 
>> That seems pretty foolproof to detect. Note that you should never get any %7B or other encoding in the token as it is url safe.
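
A rough Python sketch of the server-side check described above. It assumes tokens are unpadded base64url, so '=' is treated as a sign of plain base64; the function name and the error messages are hypothetical:

```python
import re

# The base64url alphabet: letters, digits, '-' and '_'.
BASE64URL_RE = re.compile(r"[A-Za-z0-9_-]+")


def diagnose_token(token: str) -> str:
    """Return "ok" or a hint about what is probably wrong with the token."""
    if BASE64URL_RE.fullmatch(token):
        return "ok"
    if "%" in token:
        # A url-safe token should never need percent-encoding.
        return "token appears to be url-encoded; send the base64url value as-is"
    if any(ch in token for ch in "+/="):
        return "looks like base64 instead of base64url encoding"
    return "token is not base64url encoded"
```

For example, diagnose_token("+/8=") returns the base64 hint, while diagnose_token("-_8") returns "ok".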
>> 
>> The thing I was trying to say was that it's less predictable. That it might work just fine when you're experimenting with the API because at that point your token did not contain any pluses, but then suddenly started failing after you sent a link to your app to someone because their encoded token contains a plus. This hit-or-miss to me is worse than being able to tell by looking at the first few characters of the urlencoded JSON blob which will give a definitive answer as to how many times the token has been urlencoded.
> 
> I understand your point. I still think base64url encoding makes it really clear that it is encoded (nothing is legible anymore), allows there to be one encoding format for all data, and makes it easy to support encryption.
> 
> 
>> 
>>  
>>> 
>>> When I wrote a sample in Perl, it was pretty easy to make it base64url which then provides a consistent encoding.
>>> 
>>> Did it involve a string replace call? Or a third party library?
>> 
>> I used a standard CPAN library.
>> 
>> 
>> Exactly :) I'm imagining our documentation where we want to be library agnostic, and have almost pseudocode-like code snippets. I said this earlier -- while base64 may be common in standard libraries built into languages, the base64url version isn't. In order to not have a "cpan install base64url" (and gem install, easy_install, mvn install..) -- we'd most likely document a str_replace() call in addition to a base64 call. And I'm worried that developers will miss this detail.
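
The "str_replace() call in addition to a base64 call" approach might look like this in Python. It is a sketch that assumes unpadded base64url, using only the standard base64 encoder plus string replacement; the function names are hypothetical:

```python
import base64


def base64url_encode(data: bytes) -> str:
    """Standard base64, then swap the two unsafe characters and drop padding."""
    standard = base64.b64encode(data).decode("ascii")
    return standard.replace("+", "-").replace("/", "_").rstrip("=")


def base64url_decode(token: str) -> bytes:
    """Reverse the swap, restore '=' padding, then decode as standard base64."""
    standard = token.replace("-", "+").replace("_", "/")
    padded = standard + "=" * (-len(standard) % 4)
    return base64.b64decode(padded)
```

The detail that is easy to miss is the padding: the decoder has to add the '=' characters back before calling the standard base64 routine.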
> 
> Likely they will install an OAuth library that will deal with it if they are going to have to sign rather than using a bearer token (I believe most people will use a bearer token if they can -- soooo much easier!)
> 
> Besides base64url, there is HMAC256 and JSON -- not all of which are built in -- but are becoming more built in as time goes on, and if OAuth signing uses base64url, I would expect these will all be part of standard distributions in the future ... 
> 
> (... in case you are unfamiliar with my background, I have delivered Perl, Python and Tcl distributions in the past -- what goes into a package is what is heavily used or what we thought was a good thing to promote -- and assuming that making base64url more available is a good thing, then using it in OAuth is a good thing to do. :)
> 
>> 
>>  
>>>  
>>> 
>>> 
>>> 
>>> >
>>> >> I am unclear on what your point is.
>>> >>
>>> >> The token would be included as one of the headers. This is often preferable as it separates the authorization layer (in header) from application layer parameters (query string or message body)
>>> >
>>> > With our proposal, we were focussed on url parameters (hence the choice of urlencode after it was all put together). I think it makes total sense to not do the encoding as part of the sig spec, and let the transport choice dictate which encoding to use.
>>> 
>>> I understand what you are saying. having multiple encodings makes libraries harder, and leads to the issues that motivated base64url over url-encoding 
>> 
>> Glad we agree on that.
>> 
>> I agree multiple encodings makes libraries harder :)
> 
> glad we agree there
> 
> 
>> 
>> But I think the stark difference between OAuth1 vs 2 wrt signing actually makes the Authorization header less valuable (again, for signing only). I'm pretty happy with this because I thought this header was more complex for developers anyway (while big corporations with authentication infrastructure love it) :) But the reason I think so is that now the header is not just the signature, but also the signed payload. This means an application isn't just making an http request as before with a bunch of query or post parameters. It's instead making a "JSON request" that may or may not have query/post params. It's just not as separate as before.
> 
> I am confused on what point you are trying to make here. 
> 
> 