Re: [OAUTH-WG] Understanding the reasoning for Base64

Naitik Shah <n@daaku.org> Sat, 03 July 2010 16:13 UTC

From: Naitik Shah <n@daaku.org>
Date: Sat, 03 Jul 2010 09:13:39 -0700
To: Dick Hardt <dick.hardt@gmail.com>
Cc: OAuth WG <oauth@ietf.org>
Subject: Re: [OAUTH-WG] Understanding the reasoning for Base64

On Sat, Jul 3, 2010 at 9:02 AM, Dick Hardt <dick.hardt@gmail.com> wrote:

>
> On 2010-07-02, at 5:04 PM, Paul Tarjan wrote:
>
> >>> We don't think base64url will work, because the most common error
> >>> we'll see is that developers forget the "url" part and just do plain
> >>> base64, and that's not sufficient because the stock set includes +.
> >>
> >> I think forgetting to url-decode is more likely than doing the wrong
> >> base64 encoding. At least with the wrong base64 encoding, what was done
> >> wrong is more obvious right away. The + will not be in the string.
> >
> > Most web frameworks that I know of urldecode the inputs before they even
> > hit application code.
> >
> >
> >
> >>>
> >>> So it will maybe work, maybe not. Maybe they'll do urlencoding after
> >>> anyways, since if they are passing this as a query param, or post data,
> >>> client libraries will take a dict and try to "do the right thing". And
> >>> we end up with pluses, and we're not quite sure if they should be
> >>> urldecoded or not.
> >>
> >> we won't have pluses
> >
> > I think Naitik is saying that accidentally doing base64 and not base64url
> > will send some '+'s along.
>
> if there are '+'s in the token, then it is easy for someone helping to spot
> the problem. also easy for servers to send back an error message saying,
> "hey, looks like you are using base64 instead of base64url encoding"
>
> ie, it is easy to detect the error -- urlencoding / decoding is hard to
> detect as an error
>

The pluses are not guaranteed. They may or may not be there depending on the
data stream you're encoding. If you don't urlencode the JSON, you'll get a
"{", if you do it once, you'll get a "%7B", if you do it twice, you'll get a
"%257B" -- seems easier to detect.
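
To make the comparison concrete, here is a minimal Python sketch of both
points. The helper name percent_encoding_depth and the sample payload are
made up for illustration and are not part of any proposal; the sketch only
assumes the value being sent is a JSON object.

```python
import base64
import json
from urllib.parse import quote


def percent_encoding_depth(value):
    """Guess how many times a JSON object was urlencoded, from its prefix.

    Hypothetical helper: it only recognises depths 0, 1 and 2.
    """
    if value.startswith("{"):
        return 0
    if value.startswith("%7B"):
        return 1
    if value.startswith("%257B"):
        return 2
    return None  # does not look like a JSON object at depth 0-2


# Illustrative payload, not taken from the thread.
payload = json.dumps({"user": "naitik"})
once = quote(payload, safe="")   # urlencoded once
twice = quote(once, safe="")     # urlencoded twice by mistake

# Stock base64 only emits '+' for some inputs, so a missing '+'
# does not prove that base64url was used.
with_plus = base64.b64encode(b"\xfb\xef")    # b'++8='
without_plus = base64.b64encode(b"payload")  # b'cGF5bG9hZA=='
```

The leading "{" is always present in a JSON object, so the prefix check
works for every payload, whereas the '+' check only fires for some of them.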




> >
> >
> >
> >
> >> why hex? ... why not base64url?
> >
> > It seems to be the encoding format in languages:
> >
> > python:
> > >>> hmac.new('secret', 'payload', hashlib.sha256).hexdigest()
> > 'b82fcb791acec57859b989b430a826488ce2e479fdf92326bd0a2e8375a42ba4'
> >
> > php:
> > print hash_hmac('sha256', 'payload', 'secret');
> > b82fcb791acec57859b989b430a826488ce2e479fdf92326bd0a2e8375a42ba4
> >
> > ruby:
> > >> HMAC::SHA256.hexdigest('secret', 'payload')
> > => "b82fcb791acec57859b989b430a826488ce2e479fdf92326bd0a2e8375a42ba4"
>
> When I wrote a sample in Perl, it was pretty easy to make it base64url
> which then provides a consistent encoding.
>

Did it involve a string replace call? Or a third party library?
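
For reference, a rough Python 3 sketch of the two routes being asked about.
It reuses the 'secret' and 'payload' placeholders from the examples quoted
above. Stripping the trailing "=" padding is my assumption here, not
something the thread has settled.

```python
import base64
import hashlib
import hmac

# Placeholder key and message, as in the quoted examples.
digest = hmac.new(b"secret", b"payload", hashlib.sha256).digest()

# Hex: one call, available out of the box.
hex_sig = digest.hex()

# base64url via the standard library: '-' and '_' replace '+' and '/'.
# Dropping the '=' padding is an assumption made for this sketch.
b64url_sig = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

# base64url via string replace, for languages that only ship stock base64.
replaced = (
    base64.b64encode(digest)
    .decode("ascii")
    .replace("+", "-")
    .replace("/", "_")
    .rstrip("=")
)
```

Both routes give the same string in Python; the question is whether every
language has the first one built in, or has to fall back to the second.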




> >
> >> I am unclear on what your point is.
> >>
> >> The token would be included as one of the headers. This is often
> >> preferable as it separates the authorization layer (in header) from
> >> application layer parameters (query string or message body)
> >
> > With our proposal, we were focussed on url parameters (hence the choice
> > of urlencode after it was all put together). I think it makes total sense
> > to not do the encoding as part of the sig spec, and let the transport
> > choice dictate which encoding to use.
>
> I understand what you are saying. having multiple encodings makes libraries
> harder, and leads to the issues that motivated base64url over url-encoding




> >
> > Therefore, I think we should make the signature:
> >
> >    hash + '.' + json string
> >
> > And then if you are putting it in a url parameter, you should urlencode
> > the whole thing. If you are putting it in an HTTP header you should remove
> > all the "\r" and "\n" in the json output (which are only whitespace as
> > they aren't allowed inside strings, and most language encoders won't even
> > output them by default).
> >
> > This way, this is a general signature spec, regardless of how it is being
> > sent. You could send it as a DNS record and do the proper encoding for
> > that scenario, or carrier pigeon encoded in Navajo, etc.
> >
> >
> >
> > So to sum up:
> >
> > * We'd like the signature first (so you can left split instead of right
> >   split)
>
> What are the advantages of left split vs right split?
>

A built-in split function that takes a limit is more common than a
right-split one, which makes the left split easier.
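
A minimal Python sketch of the signature-first layout (hash + '.' + json
string) and the left split. It assumes a hex HMAC-SHA256 signature; the
function names sign and verify and the sample fields are hypothetical, and
transport encoding such as urlencoding is left out.

```python
import hashlib
import hmac
import json


def sign(secret, data):
    """Return hex signature + '.' + JSON body (layout assumed from the thread)."""
    body = json.dumps(data)
    sig = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return sig + "." + body


def verify(secret, signed):
    """Split on the first '.', check the signature, and return the parsed JSON."""
    # Left split with a limit of 1: a hex signature never contains '.',
    # so everything after the first '.' is the JSON, dots included.
    sig, body = signed.split(".", 1)
    expected = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    return json.loads(body)


# Sample fields are illustrative; both values contain a '.'.
signed = sign(b"secret", {"host": "example.com", "ratio": 1.5})
data = verify(b"secret", signed)
```

With the signature last, the split would have to come from the right, and
languages without a right-split need extra string handling to do that.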



-Naitik