Re: [Anima] 2nd WGLC for draft-ietf-anima-constrained-join-proxy-12, ends September 20th 2022

Toerless Eckert <tte@cs.fau.de> Thu, 03 November 2022 06:52 UTC

Date: Thu, 03 Nov 2022 07:51:55 +0100
From: Toerless Eckert <tte@cs.fau.de>
To: Esko Dijk <esko.dijk@iotconsultancy.nl>
Cc: Michael Richardson <mcr@sandelman.ca>, Anima WG <anima@ietf.org>, "anima-chairs@ietf.org" <anima-chairs@ietf.org>, stokcons <stokcons@bbhmail.nl>
Archived-At: <https://mailarchive.ietf.org/arch/msg/anima/GRU9wuSLC5cQct6rOiZxKWZvlRI>

On Wed, Oct 12, 2022 at 07:49:06AM +0000, Esko Dijk wrote:
> For the stateless proxy, it is hard to determine a good rate limit number. 1% is not good because it would slow down the joining process of a genuine Pledge to a crawl. Some strategies that could work better:

What is the typical lowest bitrate that you are worried about,
and what is the total amount of data in an enrolment process?

How would you deal with proxies that are on frequencies where the duty cycle
is limited by law? For example, devices on my 868 MHz home automation network need to maintain
a 1%/hour duty cycle.
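
As a rough sketch of the arithmetic behind that question: a 1%/hour duty cycle leaves 36 seconds of airtime per hour. The bitrate and enrolment size in the usage part below are placeholders invented for illustration, not measurements, and the calculation ignores framing, retransmissions and MAC overhead.

```python
def airtime_budget_s(duty_cycle: float, window_s: float = 3600.0) -> float:
    """Seconds of transmit time permitted per window under a duty-cycle limit."""
    return duty_cycle * window_s


def payload_budget_bytes(duty_cycle: float, bitrate_bps: float,
                         window_s: float = 3600.0) -> float:
    """Upper bound on bytes one sender can transmit per window (no overhead)."""
    return airtime_budget_s(duty_cycle, window_s) * bitrate_bps / 8.0


def enrolments_per_window(duty_cycle: float, bitrate_bps: float,
                          enrolment_bytes: float,
                          window_s: float = 3600.0) -> int:
    """How many complete enrolments of the given size fit into one window."""
    budget = payload_budget_bytes(duty_cycle, bitrate_bps, window_s)
    return int(budget // enrolment_bytes)


if __name__ == "__main__":
    # Both values are assumptions for illustration only.
    ASSUMED_BITRATE_BPS = 50_000
    ASSUMED_ENROLMENT_BYTES = 8 * 1024

    print("airtime per hour [s]:", airtime_budget_s(0.01))
    print("payload per hour [bytes]:",
          payload_budget_bytes(0.01, ASSUMED_BITRATE_BPS))
    print("enrolments per hour:",
          enrolments_per_window(0.01, ASSUMED_BITRATE_BPS,
                                ASSUMED_ENROLMENT_BYTES))
```

Plugging in the real bitrate and the real size of an enrolment exchange would show how many enrolments (or how much attack traffic) a single proxy can relay before it hits its own limit.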

The problem, it seems to me, is that under those regulations badly behaving nodes
can force proxy and registrar into exhausting their regulatory limit as
well, unless proxy and/or registrar do something against that.

> 1. Initially allow a (near) unlimited use of "uplink" bandwidth, to get a fast enrollment in the common case. When the relayed traffic persists for some time (either by same Pledge or other Pledge, no difference) apply a stricter rate limit.

This is where a badly behaving pledge would saturate the regulatory
limit of transmissions at least on a pledge unless this is done
carefully. 

> 2. Apply a (rough) 'data budget' for upstream traffic to the Registrar. Only if sufficient "downstream" traffic comes back from the Registrar to Pledge(s), the upstream data is allowed at a high rate. If that's not the case, a strict rate limit is applied.

Yes, this would be nice. I just wonder if we're not giving up the benefits
of stateless on a proxy when we do have to do this per-pledge state.
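
One way to look at the trade-off is a data budget kept as a single aggregate counter instead of per pledge. The sketch below is hypothetical (the class name, parameters and the credit rule are all invented here, none of it is from the draft): upstream relaying spends credit, and traffic coming back from the registrar earns credit. It needs no per-pledge state, but the price is that one misbehaving pledge can use up the budget shared by all pledges.

```python
class AggregateDataBudget:
    """Single shared upstream budget, replenished by downstream traffic.

    Hypothetical sketch: one counter for all pledges, no per-pledge state.
    """

    def __init__(self, initial_credit: int,
                 credit_per_downstream_byte: float,
                 max_credit: int) -> None:
        self.credit = float(initial_credit)
        self.gain = credit_per_downstream_byte
        self.max_credit = float(max_credit)

    def on_downstream(self, nbytes: int) -> None:
        """Registrar sent data back: grant upstream credit, capped at max."""
        self.credit = min(self.max_credit, self.credit + nbytes * self.gain)

    def allow_upstream(self, nbytes: int) -> bool:
        """Return True and spend credit if the packet fits into the budget.

        On False the caller would fall back to the strict rate limit.
        """
        if nbytes <= self.credit:
            self.credit -= nbytes
            return True
        return False


if __name__ == "__main__":
    budget = AggregateDataBudget(initial_credit=1000,
                                 credit_per_downstream_byte=1.0,
                                 max_credit=2000)
    print(budget.allow_upstream(600))   # spends from the initial credit
    print(budget.allow_upstream(600))   # not enough credit left
    budget.on_downstream(500)           # registrar answered
    print(budget.allow_upstream(600))   # credit was replenished
```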

It almost feels as if radio networks with strict duty-cycle
limits require per-pledge state on the proxy if the proxy wants to
defend itself against an attacking pledge exhausting the proxy's own
duty-cycle. Unless the proxy function itself strictly operates,
independent of any pledge, on a cycle that is below the overall permitted
duty-cycle for the proxy.
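
The pledge-independent variant can be sketched as a token bucket that meters airtime for the proxy function as a whole. Everything here is an assumption for illustration: the class, the 0.2% share reserved for relaying, and the idea that airtime per frame is known to the caller. Note that a token bucket that starts full can, in the worst case, send about twice its nominal share within one window, so the share has to be chosen with the real regulatory measurement rule in mind.

```python
class AirtimeBucket:
    """Token bucket over seconds of airtime, shared by all relayed traffic."""

    def __init__(self, duty_cycle: float, window_s: float,
                 now: float = 0.0) -> None:
        self.rate = duty_cycle                  # airtime earned per second
        self.capacity = duty_cycle * window_s   # maximum stored airtime
        self.tokens = self.capacity             # start with a full bucket
        self.last = now

    def try_send(self, airtime_s: float, now: float) -> bool:
        """Refill for the elapsed time, then spend airtime if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if airtime_s <= self.tokens:
            self.tokens -= airtime_s
            return True
        return False  # drop or delay the relayed frame


if __name__ == "__main__":
    # Assumed: relaying may use 0.2% of each hour, leaving the rest of a
    # 1% regulatory limit to the proxy's other functions.
    bucket = AirtimeBucket(duty_cycle=0.002, window_s=3600.0, now=0.0)
    print(bucket.try_send(7.0, now=0.0))    # burst from the full bucket
    print(bucket.try_send(1.0, now=0.0))    # bucket nearly empty
    print(bucket.try_send(1.0, now=500.0))  # refilled after 500 s
```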

Maybe add text to admit some degree of defeat?

Proxy operations as described in this document are not necessarily sufficient
to protect proxy and/or registrar against misbehaving pledges that attack
proxy/registrar with too much data, especially when using (radio) networks
with regulatory limitations on the volume permitted per sender (such as
1% duty-cycle per hour limitations).

Instead, the use of radio networks with such regulations
will require additional considerations. Proxies should implement measures
to limit the proxy function traffic such that other functions on the
proxy are not starved by badly behaving pledges.

This could go into the security considerations.

I think this is true, if somewhat hand-waving, but it is impossible to
be more specific given the potentially unbounded variety of regulatory
limitations.

> 3. Send the radio frames associated to relayed "upstream" traffic with the lowest priority.  I.e. have a scheduler with packet priority. I know e.g. OpenThread has this to prioritize mesh management-traffic.

See RFC8622 (Lower-Effort per-hop behavior for Differentiated Services).

> A combination of approaches is also possible.
> Solution 3 seems easiest to implement, if the mesh network stack already has a working priority-based scheduler. But it will probably need to be combined with some rate-limit approach too.
> 
> So one solution is:
> Proxy SHOULD locally schedule the "upstream" IP packets to be sent with lowest priority. 

I hesitate to reduce the priority of an important function solely as a proactive
mechanism against attacks. Usually, the control-plane in networks gives
itself the highest priority. I am surprised that OpenThread would
"down-prioritize" management traffic. That may be useful for non-time-critical
management functions, but I would think that enrolment in general is something that
should be performed as fast as possible.
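
For concreteness, a strict-priority scheduler of the kind discussed above can be sketched as follows. The priority levels and names are invented for this example and do not describe how OpenThread or any other stack actually implements it. The sketch also makes the concern visible: relayed join traffic is only sent when nothing of higher priority is queued.

```python
import heapq
import itertools
from typing import Optional

# Assumed priority levels; lower number is served first.
PRIO_MESH_MGMT = 0
PRIO_DATA = 1
PRIO_RELAYED_JOIN = 2


class StrictPriorityScheduler:
    """Always dequeues the highest-priority packet; FIFO within a priority."""

    def __init__(self) -> None:
        self._heap: list = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO order

    def enqueue(self, priority: int, packet: bytes) -> None:
        heapq.heappush(self._heap, (priority, next(self._seq), packet))

    def dequeue(self) -> Optional[bytes]:
        if not self._heap:
            return None
        _, _, packet = heapq.heappop(self._heap)
        return packet


if __name__ == "__main__":
    sched = StrictPriorityScheduler()
    sched.enqueue(PRIO_RELAYED_JOIN, b"join-1")
    sched.enqueue(PRIO_DATA, b"data-1")
    sched.enqueue(PRIO_MESH_MGMT, b"mgmt-1")
    while (pkt := sched.dequeue()) is not None:
        print(pkt)
```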
 
> > If the proxy does not discover a registrar, then of course it can not forward enrolment
> > requests. We should at least specify that proxies need to correctly support that
> > registrar (announcements) are switched on/off. What do you think ?
> 
> Yes, the key part here is that when a Proxy once discovers a Registrar it cannot assume that this Registrar will be alive on that IP address forever. So it needs to rediscover in case the original discovery information expired. If rediscovery fails, it will not forward traffic anymore to the old IP address.
> This aspect may not be obvious so including that in the security considerations sounds good.   (Exact timing out mechanism will depend on the discovery method used. DNS-SD may be also an option for a future extension of the draft. )
> In the GRASP case it's announcements being switched on/off.  In the CoAP discovery case, the Max-Age Option in the response determines for how long the result is valid. (Default max-age is 60 secs). The latter may be good to point out explicitly as none of the CoAP RFCs mention this use of Max-Age yet; though it was raised on the core list and is in draft-lenders-dns-over-coap-04.

Usually, the discovery mechanisms themselves are not used/useful to discover
failure of the announced function. Instead, the absence of reaction to
communications is what triggers selection of alternative resources. This
can only work reasonably well when the absence of replies is NOT per-pledge.

Ideally, resource announcements should have something like
"consider my service dead when I do not respond to invocations for more than x [seconds]"

This would be a much shorter time than the TTL of the announcement, because the
latter just indicates when to delete cached service information in the absence
of attempts to use it.

Alas, I have never seen such a dead-service-time being announced. It seems to me
all these service announcements are still logically built with non-constrained
networks in mind.
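
To illustrate the difference between the two timers, here is a minimal sketch of a cached service record. The `dead_after_s` field is the hypothetical "dead-service-time"; as said above, no existing discovery mechanism announces it, and all names here are invented. Liveness is tracked in aggregate across all relayed requests, not per pledge.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceRecord:
    address: str
    learned_at: float       # when the announcement was received
    ttl_s: float            # how long to cache the announcement
    dead_after_s: float     # hypothetical announced dead-service-time
    first_unanswered_at: Optional[float] = None

    def on_request_sent(self, now: float) -> None:
        """Remember the start of the current run of unanswered requests."""
        if self.first_unanswered_at is None:
            self.first_unanswered_at = now

    def on_reply(self, now: float) -> None:
        """Any reply, for any pledge, proves the service is alive."""
        self.first_unanswered_at = None

    def is_expired(self, now: float) -> bool:
        """Cache lifetime is over; rediscovery is needed."""
        return now - self.learned_at > self.ttl_s

    def is_dead(self, now: float) -> bool:
        """No reply for longer than the announced dead-service-time."""
        return (self.first_unanswered_at is not None
                and now - self.first_unanswered_at > self.dead_after_s)


if __name__ == "__main__":
    reg = ServiceRecord("2001:db8::1", learned_at=0.0,
                        ttl_s=600.0, dead_after_s=10.0)
    reg.on_request_sent(100.0)
    print(reg.is_dead(111.0), reg.is_expired(111.0))
```

With these example values the record is considered dead well before its cache lifetime runs out, which is the point of having a separate, shorter timer.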

> > Instead of stopping service announcements (registrar and proxy), I would then love to see the service
> > announcements with some "service status" flag/field. For example "off hours" or the like. Workflow:
> > Device to be enrolled has a single-color LED. You connect it (west coast) to the network, and it would
> > indicate "off hours" through e.g. repeating three short blinks. This validates that network connectivity
> > works, and that enrolment will proceed once someone switches "BRSKI on" (next morning).
> >
> > Does that make sense ?
> 
> That makes sense, and sounds like possible future work! For now we can just rely on discovery and the presence/absence of a service, and its advertised lifetime. All of the current discovery methods support that at least.
> For CoAP discovery at least some "service availability parameter" could be defined in general so that it can be used for any types of services/resources advertised.

Right. Likewise for DNS-SD (with or without GRASP).

Cheers
    toerless
> 
> Regards
> Esko
> 
> -----Original Message-----
> From: Toerless Eckert <tte@cs.fau.de> 
> Sent: Monday, September 26, 2022 21:24
> To: Esko Dijk <esko.dijk@iotconsultancy.nl>
> Cc: Michael Richardson <mcr@sandelman.ca>; Anima WG <anima@ietf.org>; anima-chairs@ietf.org; stokcons <stokcons@bbhmail.nl>
> Subject: Re: [Anima] 2nd WGLC for draft-ietf-anima-constrained-join-proxy-12, ends September 20th 2022
> 
> Thanks, Esko
> 
> inline
> 
> On Wed, Sep 21, 2022 at 10:52:47AM +0000, Esko Dijk wrote:
> > Thanks,
> > 
> > One item I forgot to include which was also on my mind - a security consideration. If an autonomous bootstrap method like BRSKI is left "always-on" in a mesh network, it means that at any time an off-mesh attacker can contact its nearby Join Proxies and flood them with traffic. E.g. using different LL addresses to pretend it is multiple Pledges.
> > This will cause relayed traffic on the mesh, potentially overloading it.
> > 
> > * One solution component is clearly rate-limiting of relayed traffic. But, this is not even mentioned in the security considerations. And not in 8995 as far as I can tell.
> 
> For the stateful proxy, the pull request from my review that I sent last Friday suggests the
> following text:
> 
>    To protect itself and the Registrar against malfunctioning Pledges
>    and/or denial of service attacks, the join proxy SHOULD limit the
>    number of simultaneous mapping states for each IP_p%IF to 2 and the
>    number of simultaneous mapping states per interface to 10.  When
>    mapping state can not be built, the proxy SHOULD return an ICMP error
>    (1), "Destination Port Unreachable" message with code (1),
>    "Communication with destination administratively prohibited".
> 
> Do you think these are useful numbers ?
> 
> The whole idea of the stateless proxy is of course to remove the need for intelligence
> from the proxy and only have it on the registrar. The best DoS protection I could
> think of on the proxy is therefore just a total packet rate limiter. Is it possible
> to come up with good recommendations on such packet rate limiters ? For example
> 1% of the "uplink" bitrate ? Can you think of mesh networks where this would not
> be a good enough number ? If this (or another number) makes sense we could suggest
> adding it to the stateless proxy section.
> 
> > * Another solution component is being able (by an admin) to "turn on" and "turn off" the entire option of BRSKI bootstrapping.  This could also be mentioned as a security advice: turn it off when not needed i.e. when the operator knows for sure there are no new Pledges to be bootstrapped.  The method of "turning on/off" could be implementation-specific as we don't define any APIs for control of Join Proxies.  The intended behavior of any Join Proxy is then as follows:
> >    1. If BRSKI is "on", respond to discovery requests by Pledges as usual and do relay any (DTLS) records they may send to the join-port.
> >    2. If BRSKI is "off" , don't respond to discovery requests by Pledges and don't relay any data sent to the join-port. (Effectively, close it.)
> 
> If the proxy does not discover a registrar, then of course it can not forward enrolment
> requests. We should at least specify that proxies need to correctly support that
> registrar (announcements) are switched on/off. What do you think ?
> 
> > Some networks may have a "BRSKI always on" policy because it's needed for their application and for convenience, but for a majority of networks I expect that isn't needed.
> 
> 
> I can already see a BRSKI scenario in the USA, where the manager of the east-coast NOC went home at
> 5PM and some IT folks on the west coast still want to enroll new equipment in an installation and
> wonder what happens.
> 
> But if this is what customers want (and I think you say some of them likely will want this), then I would
> like to see appropriate diagnostics for the local installer:
> 
> Instead of stopping service announcements (registrar and proxy), I would then love to see the service
> announcements with some "service status" flag/field. For example "off hours" or the like. Workflow:
> Device to be enrolled has a single-color LED. You connect it (west coast) to the network, and it would
> indicate "off hours" through e.g. repeating three short blinks. This validates that network connectivity
> works, and that enrolment will proceed once someone switches "BRSKI on" (next morning).
> 
> Does that make sense ?
> 
> > Maybe such practical security related solutions are already described in other documents e.g. 6TiSCH joining or GRASP documents but unfortunately I didn't read enough of those documents to know this. If so we can also refer to it as security consideration.
> > For a mesh network, avoiding an outsider to be able to "load" the mesh links with random data is especially important.
> 
> Indeed. Hopefully the #state and rate limiters proposed above would be sufficient to get past
> this point ?
> 
> Cheers
>     Toerless
> > 
> > Regards
> > Esko
> > 
> > -----Original Message-----
> > From: Michael Richardson <mcr@sandelman.ca> 
> > Sent: Wednesday, September 21, 2022 12:31
> > To: Esko Dijk <esko.dijk@iotconsultancy.nl>
> > Cc: Anima WG <anima@ietf.org>; anima-chairs@ietf.org; stokcons <stokcons@bbhmail.nl>
> > Subject: Re: [Anima] 2nd WGLC for draft-ietf-anima-constrained-join-proxy-12, ends September 20th 2022
> > 
> > Okay, thank you. I'll crunch through your comments on Friday.
> 
> -- 
> ---
> tte@cs.fau.de

-- 
---
tte@cs.fau.de