Re: [Lager] Stephen Farrell's Discuss on draft-ietf-lager-specification-11: (with DISCUSS and COMMENT)

"Asmus Freytag (c)" <asmusf@ix.netcom.com> Mon, 16 May 2016 17:21 UTC

To: Stephen Farrell <stephen.farrell@cs.tcd.ie>, Alexey Melnikov <aamelnikov@fastmail.fm>
Archived-At: <http://mailarchive.ietf.org/arch/msg/lager/sja-Zpv_zxK8zCfaUPgovBIZ4HI>
Cc: draft-ietf-lager-specification@ietf.org, audric.schiltknecht@viagenie.ca, The IESG <iesg@ietf.org>, lager@ietf.org

On 5/16/2016 5:48 AM, Stephen Farrell wrote:
> Hiya,
>
> This is the promised (and delayed, sorry;-) follow up...
>
> On 28/04/16 20:58, Asmus Freytag (c) wrote:
>>> (4) section 12: I don't think this is at all sufficient.
>>> Missing aspects include: Imprecise LGRs could result in
>>> registration of identifiers that are unexpected in many other
>>> protocols, leading to new vulnerabilities; LGRs could be
>>> deliberately manipulated so as to create such imprecision, and
>>> if I could feed one such to a registry (e.g. via some nice
>>> friendly looking git repo) then I could exploit the vuln later
>>> for fun and profit - that seems to call for some
>>> interoperable form of data integrity and origin
>>> authentication (is the lager WG doing that?) and lastly (for
>>> now), the XML language defined here is very flexible as noted
>>> earlier - I would expect there to be many implementation bugs
>>> in new code that attempts to parse this language. So I think
>>> the security considerations needs to be re-done really.
>> My reading of this comment is that it presupposes a particular use-case.
> Isn't that use-case the reason why this language was developed?
> The charter says it is "primarily to support" that anyway, so
> I don't accept that this is just one amongst many use cases,
> if that's what you mean, but do correct me if I'm getting what
> you meant wrong.

The use case I was referring to is your scenario of registries 
"grabbing" random LGR definitions from open source repositories and 
making them policy without substantial review.

I think the more fundamental issue to cover is that the use of the LGR 
format - by itself - offers no guarantee of suitability of the LGR 
content (more on that below). That is a fair point. The security issues 
ultimately stem from the use of unsuitable policies, not from how they 
are expressed.
>
>> The use case we've identified so far is that these will substitute for
>> IDN tables.
>>
>> IDN tables are controlled by registries and define registry policy. As
>> such it would be up to the registry to have a policy that is well-defined,
>> and the registries would be in charge of developing or adopting an LGR
>> that matches their policies.
>>
>> The LGR data format (and that's what it is) doesn't change anything in who
>> can set and enforce registry policy.
> Correct. But doesn't this change who is likely to develop the
> (fragments of) documents that describe those policies and
> in doing so it could (IMO fairly easily) create unexpected
> side effects for other protocols.
>
>> I also do not understand what an "imprecise" LGR would be. In contrast
>> to what they replace (IDN tables), LGRs are way more definite.
> By "imprecise" I meant an LGR that allows some identifiers
> to end up in protocols where those identifiers cause problems.

Thanks for the clarification. That puts a different spin on things. An 
"imprecise" statement of label generation rules usually means you can't 
tell reliably which labels are being generated. However, let's take your 
definition of "doesn't automatically prevent unsuitable labels".

I fully agree. Nothing in the LGR format prevents unsuitable labels. 
It's just a format. Deciding which labels are suitable is the role 
of the LGR content. The draft makes few, if any, recommendations for 
the contents of label generation rules.

> If there's a good argument that that can never happen then
> that'd be a fine counter-argument, but ISTM that the LGR
> language would allow that. Yes, a registry who did that could
> be called out for it, but fixing such things afterwards can
> be hard and some of the subtleties might take a while to get
> noticed.

Perhaps it would be useful to spell out that the draft (by specifying a 
format) does not by itself change what labels are generated.

It enables a number of techniques that might increase security: the 
format guarantees that each label will be definitely associated with a 
disposition, and many properties of a set of label generation rules can 
be mechanically verified.
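
To illustrate what "mechanically verified" could mean, here is a minimal 
sketch in Python. It is not taken from the draft. It assumes a 
hypothetical in-memory variant table (a dict from code point to a set of 
variant code points) rather than the XML format itself, and it checks 
two properties one might want from a variant set: symmetry and 
transitivity. The function names and the sample table are made up for 
the example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical in-memory form of a variant table (an assumption for this
# sketch, not the draft's XML): code point -> set of variant code points.
Variants = Dict[int, Set[int]]


def missing_symmetric(variants: Variants) -> List[Tuple[int, int]]:
    """Return (source, target) mappings that would have to be added so
    that every mapping a -> b has a reverse mapping b -> a."""
    missing = []
    for src, targets in variants.items():
        for tgt in targets:
            if src not in variants.get(tgt, set()):
                missing.append((tgt, src))
    return sorted(missing)


def missing_transitive(variants: Variants) -> List[Tuple[int, int]]:
    """Return (source, target) mappings that would have to be added so
    that a -> b and b -> c imply a -> c (checked one step deep)."""
    missing = set()
    for a, bs in variants.items():
        for b in bs:
            for c in variants.get(b, set()):
                if c != a and c not in bs:
                    missing.add((a, c))
    return sorted(missing)


# Made-up sample: U+0061 <-> U+0430 and U+0430 <-> U+0251 are declared,
# but U+0061 <-> U+0251 is not.
sample: Variants = {
    0x0061: {0x0430},
    0x0430: {0x0061, 0x0251},
    0x0251: {0x0430},
}

if __name__ == "__main__":
    print("missing symmetric:", missing_symmetric(sample))
    print("missing transitive:", missing_transitive(sample))
```

A reviewer could run checks like these over a proposal before reading it 
in detail. Whether the code points listed are suitable in the first 
place remains a question of content, which no such check can answer.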

In the root zone LGR process we found that having a consistent and 
unambiguous format greatly helps in understanding and reviewing LGR 
proposals, arguably a security benefit.
>> Issues of deployment of LGRs in IDN registration seem out of scope for
>> this document.
> I agree the specifics of any given LGR or registry policy
> aren't in scope but the fact that an "imprecise" (in the sense
> above) LGR could cause wider trouble in perhaps unexpected ways,
> is definitely worth noting here I think, unless that's just
> not really possible. (See above.)
>
> As a near-aside, you didn't tell me if the WG are considering
> issues such as origin authentication or provenance of LGR
> fragments in some other document. I'm not asking that this
> one be held up pending development of that, but I'd be
> interested in knowing. And if the WG aren't ever going to do
> that, which may be reasonable if nobody would deploy it,
> then that also argues for recognising the potential threats
> here. (E.g. that taking a snippet of an LGR from stackexchange
> might be problematic.)

I think it's fair to note that the use of the format places very few 
constraints on what can be expressed as an LGR - and, in particular, that 
the use of the format in and of itself does not guarantee that the 
results are suitable.

However, I would argue that this is not the place to discuss all the 
unsuitable ways labels can be defined - because there are other, 
existing formats for label generation rules (aka IDN tables) which also 
allow the definition of policies that range from good to bad.
>
> Lastly, I'm sure you're correct that LGRs are probably more
> likely to be precise, compared to previous technologies such
> as IDN tables, but that does not mean that there's no need to
> recognise potential threats arising from LGRs even if those
> threats existed and were more likely to occur with tables.

For comparison, here is the language used in RFC 3743 (which defines one 
of the IDN table formats with which this draft is compatible):

    As discussed in the Introduction, substantially-unrestricted use of
    international (non-ASCII) characters in domain name labels may cause
    user confusion and invite various types of attacks.  In particular,
    in the case of CJK languages, an attacker has an opportunity to
    divert or confuse users as a result of different characters (or, more
    specifically, assigned code points) with identical or similar
    semantics.  These Guidelines provide a partial remedy for those risks
    by supplying a framework for prohibiting inappropriate characters
    from being registered at all and for permitting "variant" characters
    to be grouped together and reserved, so that they can only be
    registered in the DNS by the same owner.  However, the system it
    suggests is no better or worse than the per-zone and per-language
    tables whose format and use this document specifies.  Specific
    tables, and any additional local processing, will reflect per-zone
    decisions about the balance between risk and flexibility of
    registrations.   And, of course, errors in construction of those
    tables may significantly reduce the quality of protection provided.

We could certainly write equivalent language - but note that the RFC 
3743 text also does not go into detail on what kinds of bad tables one 
can write, only that it is possible. (And, unlike RFC 3743, the draft 
does not contain "guidelines".)

Will post suggested language when we have it,

A./
>
> Cheers,
> S.
>