Re: [secdir] secdir review of draft-ietf-httpauth-basicauth-update-06 -- transition strategy for charset parameter

Julian Reschke <> Fri, 20 February 2015 15:50 UTC

On 2015-02-19 21:15, Daniel Kahn Gillmor wrote:
> ...
>> That being said, the heuristics are quite simple: try to decode the
>> octets with a strict UTF-8 decoder, and if that doesn't fail the input
>> was likely encoded in UTF-8.
> Some binary strings are valid in both character encodings, though,
> right?  For example, "c3 a1 62 63" in UTF-8 is "ábc", but in ISO-8859-1,
> it is "Ã¡bc".  So if my password is non-ASCII in the first place, it could
> very well match the UTF-8 encoding even though i've intended another
> one.
> So maybe the heuristic should be: even if the UTF-8 decode succeeds, the
> server could try its fallback decoding mechanism if the UTF-8 version of
> the password doesn't match.  (fwiw, my understanding is that facebook
> checks common accidental variations on the entered password during their
> (non-basic-auth) login process.  so if my password is b4nanAs, but i
> type B4NANaS or b5nanAs, facebook might let me in anyway)
> Is this advisable?  What are the risks of testing two variants of the
> password against the password table?  I haven't thought this through
> fully, but it seems like it would be a relevant consideration.
> In the absence of a signal from the client about their choice of
> encoding, documenting these heuristics and recommending them seems like
> a useful way to facilitate adoption and uniformity among servers
> implementing this spec.
> ...
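
To make the ambiguity concrete, here is a minimal Python sketch of the
strict-decode heuristic, applied to the octets from the example above.
The helper name looks_like_utf8 is made up for illustration and is not
part of the draft.

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Hypothetical helper: True if the octets pass a strict UTF-8 decode."""
    try:
        raw.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# The octets from the example: valid in both encodings, different text.
raw = bytes.fromhex("c3a16263")
as_utf8 = raw.decode("utf-8", errors="strict")
as_latin1 = raw.decode("iso-8859-1")  # ISO-8859-1 maps every octet, so this never fails

# "ábc" as a legacy ISO-8859-1 client would send it (e1 62 63):
# 0xe1 opens a three-octet UTF-8 sequence that is never completed.
legacy_octets = "ábc".encode("iso-8859-1")
```

So a successful strict decode only says the input is plausibly UTF-8;
it cannot rule out a legacy-encoded value that happens to be valid
UTF-8 as well.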

I looked at this some more, and it seems that the general question 
applies to the third paragraph in B.2:

    Finally, origin servers that need to support non-US-ASCII characters
    and can use the UTF-8 character encoding scheme can opt in as
    described above.  In the worst case, they'll continue to see either
    broken credentials or no credentials at all (depending on how legacy
    clients handle characters they cannot encode).

I propose to expand the problem description, and to mention that servers 
that need to support a mix of clients can attempt a "try both" strategy:

    Finally, origin servers that need to support non-US-ASCII characters
    and can use the UTF-8 character encoding scheme can opt in by
    specifying the charset parameter in the authentication challenge.
    Clients that do understand the charset parameter will then start to
    use UTF-8, while other clients will continue to send credentials in
    their default encoding, broken credentials, or no credentials at all.
    Until all clients are upgraded to support UTF-8, servers are likely
    to see both UTF-8 and "legacy" encodings in requests.  When
    processing as UTF-8 fails (due to a failure to decode as UTF-8 or a
    mismatch of user-id/password), a server might try a fallback to the
    previously supported legacy encoding in order to accommodate these
    legacy clients.  Note that implicit retries need to be done
    carefully; for instance, some subsystems might detect repeated login
    failures and treat them as a potential credentials-guessing attack.
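
As a rough illustration of the "try both" strategy, a server-side check
might look like the Python sketch below.  Everything here is an
assumption made for the example: the function names, ISO-8859-1 as the
legacy encoding, and the comparison against a stored plain-text
password (a real server would compare against a salted hash).  The
point is that both decodings are checked within a single login attempt,
so a failure is counted once.

```python
import hmac
from typing import List


def candidate_decodings(raw: bytes, legacy_encoding: str = "iso-8859-1") -> List[str]:
    """Return the decodings worth checking: strict UTF-8 first, then legacy."""
    candidates: List[str] = []
    try:
        candidates.append(raw.decode("utf-8", errors="strict"))
    except UnicodeDecodeError:
        pass  # not valid UTF-8; only the legacy decoding remains
    try:
        legacy = raw.decode(legacy_encoding)
    except UnicodeDecodeError:
        legacy = None
    if legacy is not None and legacy not in candidates:
        candidates.append(legacy)
    return candidates


def check_password(raw: bytes, stored_password: str) -> bool:
    """Hypothetical check: one login attempt, however many decodings are tried."""
    expected = stored_password.encode("utf-8")
    return any(
        hmac.compare_digest(candidate.encode("utf-8"), expected)
        for candidate in candidate_decodings(raw)
    )
```

With this, a password of "ábc" is accepted whether the client sent it as
UTF-8 (c3 a1 62 63) or as ISO-8859-1 (e1 62 63).  The caller would
record at most one failed attempt per request.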



Best regards, Julian

PS: slightly related: this section talks about origin servers instead of 
servers (-> proxy authentication); I'll fix that as well.