Re: [OAUTH-WG] Understanding the reasoning for Base64

Naitik Shah <n@daaku.org> Wed, 07 July 2010 06:12 UTC

From: Naitik Shah <n@daaku.org>
Date: Tue, 06 Jul 2010 23:12:22 -0700
To: Evan Gilbert <uidude@google.com>
Cc: OAuth WG <oauth@ietf.org>
Subject: Re: [OAUTH-WG] Understanding the reasoning for Base64

I was hoping to avoid needing str_replace -- but I've been convinced. I'm
happy with base64url :)

Thanks,
-Naitik
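
A minimal sketch of the two points discussed in the quoted thread below, assuming Python and only its standard library: base64url produced as a plain base64 call plus the str_replace step, and the property that urlencode(token) == token. The helper names base64url_encode and base64url_decode are hypothetical, and dropping the '=' padding is an assumption rather than something the thread settles.

```python
import base64
from urllib.parse import quote


def base64url_encode(data: bytes) -> str:
    # Plain base64 first, then the "str_replace" step:
    # '+' -> '-', '/' -> '_', and (by assumption) strip the '=' padding.
    b64 = base64.b64encode(data).decode("ascii")
    return b64.replace("+", "-").replace("/", "_").rstrip("=")


def base64url_decode(token: str) -> bytes:
    # Restore the stripped padding, then undo the substitution.
    padded = token + "=" * (-len(token) % 4)
    return base64.b64decode(padded.replace("-", "+").replace("_", "/"))


# These three bytes happen to encode to '+' and '/' in plain base64.
raw = b"\xfb\xff\xfe"
plain = base64.b64encode(raw).decode("ascii")
token = base64url_encode(raw)

# A base64url token survives URL encoding unchanged; plain base64 does not.
token_is_url_safe = quote(token, safe="") == token
plain_is_url_safe = quote(plain, safe="") == plain

# URL-encoding a JSON blob once and twice, as described in the thread.
once = quote("{", safe="")
twice = quote(once, safe="")
```

Python also ships base64.urlsafe_b64encode, which performs the same character substitution but keeps the '=' padding; the explicit replace calls are shown here only because they mirror the str_replace approach under discussion.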

On Tue, Jul 6, 2010 at 9:17 PM, Evan Gilbert <uidude@google.com> wrote:

> Hi all - having a little bit of a hard time following the full thread, but
> I'm strongly in favor of base64url encoding.
>
> A big advantage of this encoding is that, if token is base64url encoded,
> then urlencode(token) == token.
>
> This allows developers to avoid a large class of problems in dealing with
> URL encoding / decoding issues - it is very easy to accidentally double
> encode / decode values, and also easy to get tripped up on the different
> encoding rules in different parts of a URL. For example, different
> characters are OK before and after the hash, and not all browsers decode the
> hash the same way.
>
> Also being able to copy a value from a URL and use it directly in a tool or
> Authorization header is invaluable for debugging.
>
> Per notes above, the transformation is very straightforward.
>
> Evan
>
> On Sat, Jul 3, 2010 at 12:45 PM, Dick Hardt <dick.hardt@gmail.com> wrote:
>
>>
>> On 2010-07-03, at 12:14 PM, Naitik Shah wrote:
>>
>> On Sat, Jul 3, 2010 at 9:42 AM, Dick Hardt <dick.hardt@gmail.com> wrote:
>>
>>>
>>> On 2010-07-03, at 9:13 AM, Naitik Shah wrote:
>>>
>>> > I think Naitik is saying that accidentally doing base64 and not
>>>> base64url will send some '+'s along.
>>>>
>>>> if there are '+'s in the token, then it is easy for someone helping to
>>>> spot the problem. also easy for servers to send back an error message
>>>> saying, "hey, looks like you are using base64 instead of base64url encoding"
>>>>
>>>> ie, it is easy to detect the error -- urlencoding / decoding is hard to
>>>> detect as an error
>>>>
>>>
>>> The pluses are not guaranteed. They may or may not be there depending on
>>> the data stream you're encoding. If you don't urlencode the JSON, you'll get
>>> a "{", if you do it once, you'll get a "%7B", if you do it twice, you'll get
>>> a "%257B" -- seems easier to detect.
>>>
>>>
>>> Your earlier point was that developers may incorrectly use base64 instead
>>> of base64url. If they used base64, and if there is a + / = or % in the
>>> string, the server can send a useful note saying what is wrong. There may
>>> not be one of those characters depending on the input string, but if there
>>> is, then the server can suggest what the error might be using base64 instead
>>> of base64url. If the token contains ANY character that is not in base64url,
>>> then the server can say that it is not base64url encoded.
>>>
>>> That seems pretty foolproof to detect. Note that you should never get
>>> any %7B or other encoding in the token as it is url safe.
>>>
>>
>> The thing I was trying to say was that it's less predictable. That it
>> might work just fine when you're experimenting with the API because at that
>> point your token did not contain any pluses, but then suddenly started
>> failing after you sent a link to your app to someone because their encoded
>> token contains a plus. This hit-or-miss to me is worse than being able to
>> tell by looking at the first few characters of the urlencoded JSON blob
>> which will give a definitive answer as to how many times the token has been
>> urlencoded.
>>
>>
>> I understand your point. I still think base64url encoding makes it really
>> clear that it is encoded (nothing is legible anymore), allows there to be
>> one encoding format for all data, and makes it easy to support encryption.
>>
>>
>>
>>
>>
>>>
>>>>  When I wrote a sample in Perl, it was pretty easy to make it base64url
>>>> which then provides a consistent encoding.
>>>>
>>>
>>> Did it involve a string replace call? Or a third party library?
>>>
>>>
>>> I used a standard CPAN library.
>>>
>>>
>> Exactly :) I'm imagining our documentation where we want to be library
>> agnostic, and have almost pseudocode-like code snippets. I said this
>> earlier -- while base64 may be common in standard libraries built into
>> languages, the base64url version isn't. In order to not have a "cpan install
>> base64url" (and gem install, easy_install, mvn install..) -- we'd most
>> likely document a str_replace() call in addition to a base64 call. And I'm
>> worried that developers will miss this detail.
>>
>>
>> Likely they will install an OAuth library that will deal with it if they
>> are going to have to sign rather than using a bearer token (I believe most
>> people will use a bearer token if they can -- soooo much easier!)
>>
>> Besides base64url, there is HMAC256 and JSON -- not all of which are built
>> in -- but are becoming more built in as time goes on, and if OAuth signing
>> uses base64url, I would expect these will all be part of standard
>> distributions in the future ...
>>
>> (... in case you are unfamiliar with my background, I have delivered Perl,
>> Python and Tcl distributions in the past -- what goes into a package is
>> what is heavily used or what we thought was a good thing to promote -- and
>> assuming that making base64url more available is a good thing, then using
>> it in OAuth is a good thing to do. :)
>>
>>
>>
>>
>>>
>>>
>>>
>>>
>>>> >
>>>> >> I am unclear on what your point is.
>>>> >>
>>>> >> The token would be included as one of the headers. This is often
>>>> preferable as it separates the authorization layer (in header) from
>>>> application layer parameters (query string or message body)
>>>> >
>>>> > With our proposal, we were focussed on url parameters (hence the
>>>> choice of urlencode after it was all put together). I think it makes total
>>>> sense to not do the encoding as part of the sig spec, and let the transport
>>>> choice dictate which encoding to use.
>>>>
>>>> I understand what you are saying. having multiple encodings makes
>>>> libraries harder, and leads to the issues that motivated base64url over
>>>> url-encoding
>>>
>>>
>>> Glad we agree on that.
>>>
>>
>> I agree multiple encodings makes libraries harder :)
>>
>>
>> glad we agree there
>>
>>
>>
>> But I think the stark difference between OAuth1 vs 2 wrt signing
>> actually makes the Authorization header less valuable (again, for signing
>> only). I'm pretty happy with this because I thought this header was more
>> complex for developers anyways (while big corporations with authentication
>> infrastructure love it) :) But the reason I think so is that now the header
>> is not just the signature, but also the signed payload. This means an
>> application isn't just making a http request as before with a bunch of query
>> or post parameters. It's instead making a "JSON request" that may or may not
>> have query/post params. It's just not as separate as before.
>>
>>
>> I am confused on what point you are trying to make here.
>>
>>
>>
>>
>
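
The server-side check Dick describes in the thread above (reject any token containing a character outside the base64url alphabet, and hint at the likely cause) could look roughly like the sketch below. It assumes Python, assumes tokens are sent without '=' padding, and uses a hypothetical function name and hypothetical message strings; none of this comes from a spec.

```python
import re

# Unpadded base64url alphabet: letters, digits, '-' and '_'.
BASE64URL_RE = re.compile(r"[A-Za-z0-9_-]+")


def diagnose_token(token: str) -> str:
    """Return "ok" or a hint about what probably went wrong."""
    if BASE64URL_RE.fullmatch(token):
        return "ok"
    if "%" in token:
        # A base64url token never needs percent-encoding.
        return "token looks URL-encoded"
    if any(c in token for c in "+/="):
        return "token looks like plain base64, not base64url"
    return "token contains characters outside the base64url alphabet"
```

This also illustrates Naitik's hit-or-miss concern: a client that wrongly uses plain base64 passes this check whenever its output happens to contain no '+', '/' or '=', so the mistake only surfaces once some token does contain one of them.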