Re: [Gen-art] Gen-art LC review: draft-mm-netconf-time-capability-05

Andy Bierman <andy@yumaworks.com> Wed, 05 August 2015 01:02 UTC

In-Reply-To: <a788b8d09b104d9a9f48a8486fbdb33c@IL-EXCH01.marvell.com>
References: <60322a704b1e4d1cbc85f6a3b6a33b8e@IL-EXCH01.marvell.com> <55BFEDC8.6040800@nostrum.com> <03c295837c984138bb30bd9aacf21999@IL-EXCH01.marvell.com> <55C0FDD7.1050203@nostrum.com> <a788b8d09b104d9a9f48a8486fbdb33c@IL-EXCH01.marvell.com>
Date: Tue, 04 Aug 2015 18:02:52 -0700
Message-ID: <CABCOCHSBs2qXqxb=VCNPVHg6KOARK7oaUE=MyFv2hWU66=3NMw@mail.gmail.com>
From: Andy Bierman <andy@yumaworks.com>
To: Tal Mizrahi <talmi@marvell.com>
Content-Type: multipart/alternative; boundary="001a11c2640e52828f051c85f6a9"
Archived-At: <http://mailarchive.ietf.org/arch/msg/gen-art/DgbiWAtN2FbknTPLjmL4pyUCp3o>
Cc: General Area Review Team <gen-art@ietf.org>, "draft-mm-netconf-time-capability.all@ietf.org" <draft-mm-netconf-time-capability.all@ietf.org>, "ietf@ietf.org" <ietf@ietf.org>
Subject: Re: [Gen-art] Gen-art LC review: draft-mm-netconf-time-capability-05

Hi,

The draft adds the invoke-at-time capability to a small set
of NETCONF operations (via augment-stmt).
The mechanism cannot be used for any other operations.
It appears this is the entire list of operations supported:

   - get-config
   - get
   - copy-config
   - edit-config
   - delete-config
   - lock
   - unlock
   - close-session
   - kill-session
   - commit

Why was this subset of all operations selected?

I cannot find any text in the draft that says what happens
if the client session terminates for any reason.  There are
commands that support the 'execution-time' parameter
like <lock> that explicitly require a session to be maintained.
Not sure a delayed <close-session> even makes sense.

If the session is gone when the scheduled operation is about
to be executed, does the server cancel it or attempt it?
Without a session, the server cannot send an <rpc-reply>,
so it should not attempt the command.
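
To make the "do not attempt" option concrete, here is a minimal Python
sketch of a server-side table of pending scheduled operations that drops
everything owned by a session when that session goes away. This is only
one possible behavior, not something the draft specifies; the names
ScheduleTable, PendingOp and on_session_closed are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PendingOp:
    schedule_id: int   # id the server handed back when the RPC was scheduled
    session_id: int    # session that sent the scheduling <rpc>
    operation: str     # e.g. "lock", "commit"
    run_at: float      # scheduled time, seconds on the server clock


@dataclass
class ScheduleTable:
    # Assumption: one table per server, keyed by schedule id.
    pending: dict = field(default_factory=dict)

    def add(self, op: PendingOp) -> None:
        self.pending[op.schedule_id] = op

    def on_session_closed(self, session_id: int) -> list:
        """Cancel every pending operation owned by the closed session.

        Returns the cancelled schedule ids in ascending order.
        """
        cancelled = sorted(
            sid for sid, op in self.pending.items()
            if op.session_id == session_id
        )
        for sid in cancelled:
            del self.pending[sid]
        return cancelled


table = ScheduleTable()
table.add(PendingOp(schedule_id=10, session_id=1, operation="lock", run_at=100.0))
table.add(PendingOp(schedule_id=11, session_id=1, operation="commit", run_at=105.0))
table.add(PendingOp(schedule_id=20, session_id=2, operation="get-config", run_at=100.0))

cancelled_ids = table.on_session_closed(1)
remaining_ids = sorted(table.pending)
```

With this policy, a delayed <lock> from a dead session is never executed,
and no <rpc-reply> is owed to anyone.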

What if commands are scheduled at the same time?
Is the server expected to serialize these commands or
invoke them in parallel?  Note that operations within
a single session MUST be invoked in order, but this only
seems to apply to the original <rpc> to schedule the delayed
operation.
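
If the answer is "serialize", one way to do it is sketched below in
Python: pending operations are ordered by scheduled time, with ties
broken by the order in which the scheduling <rpc> requests arrived. This
is an assumption about a possible server design, not text from the
draft; SerialScheduler and its methods are hypothetical names.

```python
import heapq
import itertools


class SerialScheduler:
    """Runs operations one at a time, ordered by (scheduled time, arrival)."""

    def __init__(self):
        self._heap = []
        # Monotonic counter: preserves arrival order among equal timestamps.
        self._arrival = itertools.count()

    def schedule(self, run_at, session_id, operation):
        entry = (run_at, next(self._arrival), session_id, operation)
        heapq.heappush(self._heap, entry)

    def pop_due(self, now):
        """Return all operations due at or before `now`, in execution order."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, _, session_id, operation = heapq.heappop(self._heap)
            due.append((session_id, operation))
        return due


sched = SerialScheduler()
sched.schedule(100.0, 1, "lock")
sched.schedule(100.0, 2, "edit-config")
sched.schedule(100.0, 1, "commit")
sched.schedule(105.0, 1, "unlock")

first_batch = sched.pop_due(100.0)
nothing_due = sched.pop_due(104.0)
second_batch = sched.pop_due(105.0)
```

Tie-breaking on arrival order keeps the per-session ordering of the
original requests, which is the property the in-order rule seems to be
after.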


Andy

On Tue, Aug 4, 2015 at 4:41 PM, Tal Mizrahi <talmi@marvell.com> wrote:

> Hi Robert,
>
> Thanks again for the prompt responses.
>
>
> >Well, those are just a subset of the things that could change in command's
> >context that would cause the command to be erroneous or even damaging if
> >it were run, and you're not addressing the other security issues that come
> >with very long scheduling (overflowing buffers, or having lots of time to
> >schedule a massive number of commands to all try to happen at once). I
> >suspect there are other things that pressured adding the "near future"
> >restriction that haven't been captured well yet.
>
> Well, the thing is that 15 seconds (or 'a few seconds' for that matter) is
> a long enough time to send thousands (or more) of scheduled RPCs, so I am
> not sure the sched-max-future mitigates the buffer overflow threat.
> Generally speaking, Section 3.6 discusses erroneous scenarios, and not
> security threats.
>
> I would suggest adding some text to the security considerations section
> that discusses the overflow attack you mentioned here. Would this address
> your concern?
>
>
> >I think you're saying that in production deployments today, the
> >authorization policy is "the peer was able to send me a packet". Is that
> >wrong?
>
> I can't comment about what is deployed in production today, although I am
> sure there are operators out there who can comment about that. RFC 6536,
> which defines a NETCONF access control model, is cited by 6 other RFCs, so
> I do not think access control has been overlooked by the community.
> Nevertheless, I believe that (much like RFC 6241) the access control
> specifics are not within the scope of the current draft.
>
>
> Thanks,
> Tal.
>
>
> >-----Original Message-----
> >From: Robert Sparks [mailto:rjsparks@nostrum.com]
> >Sent: Tuesday, August 04, 2015 9:01 PM
> >To: Tal Mizrahi
> >Cc: ietf@ietf.org; General Area Review Team;
> >draft-mm-netconf-time-capability.all@ietf.org
> >Subject: Re: Gen-art LC review: draft-mm-netconf-time-capability-05
> >
> >
> >
> >On 8/4/15 11:19 AM, Tal Mizrahi wrote:
> >> Hi Robert,
> >>
> >> Thanks for the comments.
> >>
> >>
> >>>> A typical example of using near-future scheduling is a coordinated
> >>>> commit; a client needs to trigger a commit at n servers, so that the
> >>>> n servers perform the commit as close as possible to simultaneously.
> >>>> Without the time capability, the client sends a sequence of n commit
> >>>> messages, and thus each server performs the commit at a different
> >>>> time. By using the time capability, the client can send commit
> >>>> messages that are scheduled to take place at time Ts, which is 5
> >>>> seconds in the future, causing the servers to invoke the commit as
> >>>> close as possible to time Ts.
> >>> I'm interested in your response to Andy's point on this paragraph.
> >> Okay, so here is Andy's point:
> >>
> >>>> You should pick a different example because the NETCONF
> >>>> confirmed-commit procedure is designed to be loose-coupled.  The
> >>>> default timeout is 10 minutes.
> >>>> Since the client needs sessions open with all servers involved in
> >>>> the network-wide commit, there is no advantage in staging the
> >>>> <commit> operations 15 sec. in advance, to make sure the servers are
> >>>> reachable.
> >> And here is our response from 02-Aug-2015:
> >>
> >>> Right, confirmed-commit is loose-coupled. But the example quoted
> >>> above (Example
> >>> 1 in the draft) is not intended to replace the confirmed commit. The
> >>> purpose in this example is different: the client wants the commit
> >>> RPCs to be executed at the same time in all servers.
> >>> The confirmed-commit serves a different purpose, which is to make
> >>> sure that everyone either commits or rolls back. BTW, a confirmed
> >>> commit can be sent with the scheduled-time element, allowing the
> >>> client to enjoy the best of both worlds.
> >>
> >> Please let us know if you have further concerns about this point.
> >>
> >>
> >>>> The default value of sched-max-future is defined to be 15 seconds.
> >>>> This duration is long enough to allow the scheduled RPC to be sent
> >>>> by the client, potentially to multiple servers, and in some cases to
> >>>> send a cancellation message, as described in Section 3.2. On the
> >>>> other hand, the 15 second duration yields a very low probability of a
> >>>> reboot or a permission change.
> >>> I'm not finding the explanation terribly persuasive, but it's at
> >>> least _some_ explanation - thanks for that.  I'll leave it to the ADs
> >>> and other reviewers in the field to see if it's sufficient for an
> >>> experimental protocol.
> >> (*) Please see comment (**) below.
> >>
> >>>> Note that we did not define a maximal value for sched-max-future,
> >>>> since one of the goals was to define a generic tool that can be used
> >>>> for various different environments. The draft clearly states the
> >>>> intention of using near-future-scheduling, but the requirements and
> >>>> constraints of different environments may require the
> >>>> sched-max-future to have a different value, potentially higher than
> >>>> 30 seconds. Hence, we prefer not to define a maximal value. Indeed, in
> >>>> draft 06 there is a more detailed discussion about the issues we are
> >>>> trying to prevent by using near-future scheduling (Section 3.6).
> >>> Without a maximal value, I think you need more of a discussion
> >>> guiding the choice of sched-max-future. Otherwise, you are just
> >>> waving your hands at not addressing the problems with far-future
> >>> scheduling, and potentially well-meaning but uninformed people are
> >>> going to go step in them anyway. There was a point to choosing the
> >>> near-future limit.
> >>> Enforce it or explain it with more vigor please.
> >> (**) Your point is well taken. What we suggest, regarding this point
> >> and the previous point (*), is that we add more text explaining the
> >> factors that affect sched-max-future to Section 3.6.
> >>
> >> Here is the new text we suggest. Please let us know if this addresses
> >> your comment:
> >>
> >>
> >> The challenge in far future scheduling is that during the long period
> >> between the time at which the RPC is sent and the time at which it is
> >> scheduled to be executed the following erroneous events may occur:
> >> - The server may restart.
> >> - The client's authorization level may be changed.
> >> - The client may restart and send a conflicting RPC.
> >> - A different client may send a conflicting RPC.
> >Well, those are just a subset of the things that could change in command's
> >context that would cause the command to be erroneous or even damaging if
> >it were run, and you're not addressing the other security issues that come
> >with very long scheduling (overflowing buffers, or having lots of time to
> >schedule a massive number of commands to all try to happen at once). I
> >suspect there are other things that pressured adding the "near future"
> >restriction that haven't been captured well yet.
> >>
> >> In these cases if the server performs the scheduled operation it may
> >> perform an action that is inconsistent with the current network policy,
> >> or inconsistent with the currently active clients.
> >>
> >> Near future scheduling guarantees that external events such as the
> >> examples above have a low probability of occurring during the
> >> sched-max-future period, and even when they do, the period of
> >> inconsistency is limited to sched-max-future, which is a short period
> >> of time.
> >>
> >> Hence, sched-max-future should be configured to a value that is high
> >> enough to allow the client to:
> >> 1. Send the scheduled RPC, potentially to multiple servers.
> >> 2. Receive notifications or rpc-error messages from the server(s), or
> >>    wait for a timeout and decide that if no response has arrived then
> >>    something is wrong.
> >> 3. If necessary, send a cancellation message, potentially to multiple
> >>    servers.
> >>
> >> On the other hand, sched-max-future should be configured to a value
> >> that is low enough to allow a low probability of the erroneous events
> >> above, typically on the order of a few seconds. Note that even if
> >> sched-max-future is configured to a low value, it is still possible
> >> (with a low probability) that an erroneous event will occur. However,
> >> this short potentially hazardous period is not significantly worse
> >> than in conventional (unscheduled) RPCs, as even a conventional RPC may
> >> in some cases be executed a few seconds after it was sent by the
> >> client.
> >>
> >> The default value of sched-max-future is defined to be 15 seconds. This
> >> duration is long enough to allow the scheduled RPC to be sent by the
> >> client, potentially to multiple servers, and in some cases to send a
> >> cancellation message, as described in Section 3.2. On the other hand,
> >> the 15 second duration yields a very low probability of a reboot or a
> >> permission change.
> >I still think, especially while this is at experimental, you should scope
> >this with an absolute max. But I'm just one reviewer. Work it out with
> >your AD.
> >
> >>
> >>
> >>>> This YANG module defines the <cancel-schedule> RPC. This RPC may
> >>>> be considered sensitive or vulnerable in some network environments.
> >>>> Since the value of the <schedule-id> is known to all the clients that
> >>>> are subscribed to notifications from the server, the
> >>>> <cancel-schedule> RPC may be used maliciously to attack servers by
> >>>> canceling their pending RPCs.
> >>>> This attack is addressed in two layers: (i) security at the transport
> >>>> layer, limiting the attack only to clients that have successfully
> >>>> initiated a secure session with the server, and (ii) the
> >>>> authorization level required to cancel an RPC should be the same as
> >>>> the level required to schedule it.
> >>> To help me along, point me to the specifics of what you use to set and
> >>> verify such an authorization level?
> >> Indeed, there is a need for an authorization scheme, which is able to
> >> set and verify the authorization level.
> >> NETCONF (RFC 6241) does not explicitly define an authorization scheme,
> >> and it is probably not within the scope of the current draft to define
> >> such a scheme either.
> >> Quoting RFC 6241:
> >>
> >>     This document does not specify an authorization scheme, as such a
> >>     scheme will likely be tied to a meta-data model or a data model.
> >>     Implementors SHOULD provide a comprehensive authorization scheme
> >>     with NETCONF.
> >>     ...
> >>     Different environments may well allow different rights prior to and
> >>     then after authentication.  Thus, an authorization model is not
> >>     specified in this document.  When an operation is not properly
> >>     authorized, a simple "access denied" is sufficient.
> >I think you're saying that in production deployments today, the
> >authorization policy is "the peer was able to send me a packet". Is that
> >wrong?
> >>
> >>
> >>
> >> Please let us know if you have further comments or concerns about any
> >> of the issues above.
> >>
> >> Thanks,
> >> Tal.
>
>