Re: [Gen-art] Gen-ART Last Call review of draft-ietf-anima-grasp-api-07

Brian E Carpenter <brian.e.carpenter@gmail.com> Sat, 07 November 2020 22:56 UTC

Return-Path: <brian.e.carpenter@gmail.com>
X-Original-To: gen-art@ietfa.amsl.com
Delivered-To: gen-art@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id EEC9A3A0D69; Sat, 7 Nov 2020 14:56:23 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.099
X-Spam-Level:
X-Spam-Status: No, score=-2.099 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, NICE_REPLY_A=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id mr5ark__2Ukw; Sat, 7 Nov 2020 14:56:22 -0800 (PST)
Received: from mail-pg1-x52f.google.com (mail-pg1-x52f.google.com [IPv6:2607:f8b0:4864:20::52f]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 1CEDF3A0D67; Sat, 7 Nov 2020 14:56:21 -0800 (PST)
Received: by mail-pg1-x52f.google.com with SMTP id e21so3821542pgr.11; Sat, 07 Nov 2020 14:56:21 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:subject:to:cc:references:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=xp1bpm9gAg22KirzyjDOdC+V7wT9xC1d4FJlQYPgIgQ=; b=dZ+gLricBYHXJtdBmckX5y+U4x3YY3CQR0v/OHdcSckb1kWOS2x6nM4IMwi5UuXk6D X9rwitGPmAYSYIEdLzRzbTptINVqImfj8oU+Lrwys40p5J36Ob4AomCdWVvT2pFMSbUn Vop0fj6zhrwEsB8qbVGSwU5dg8lYMpFH1UqFE6bymfwglW11u9EwmofqZ9jVWqGuxuuY KGD/wjeHtRe4l8YOIjfPviAAozRui6KPZih1eaYp1xjvGSWueJfnLbyv2bqrvZG5yB42 kXa+q45P3yRtJM+a1KMqSAoFUWgOw4axMtHVqi5DR6JbD14sqzEy9vt9QMGRGIjfUq+d n2Yg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:subject:to:cc:references:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=xp1bpm9gAg22KirzyjDOdC+V7wT9xC1d4FJlQYPgIgQ=; b=TmRvQ3JldZCqqhIdg6OiSjjY+y1+esHBsSd2WfFqbZVxaUrkTG60fuJtbCskyuPsVH Mb/cHkXeXuDHJRtawAhKzH6W62iSmcwCIBZBshKy089cV+x5JhEMAo+7qGAQOGdzkPni AgH5iNFT1S7f9VOsUZjUxD1tWAHzBI9wEDpdHoVV0QdA5wfy9Nc1Xo6U25/Kr0F3MSNc hWIs6VsF390v/0UehO9oHnteP4t5T8mbYspWFC8e5E+EnSok99DnQbUETVF79P6qHHou 3olJPqUyxbmdkAKGS1CC5TplP5Hkzt4st27Sz7Huh6hGesVX6ne6l3aG8K4lowvMbFkN LL7g==
X-Gm-Message-State: AOAM531MM5HcojJoHrvf9skVdeaiChbMniqZMVHL1Ou18Ino84afdwo4 +s6+qyWHEW4TxP04YJU1oQHogCZ8p43t2A==
X-Google-Smtp-Source: ABdhPJwTaB9MvVWuDk4l7ioxY5QthphqeI7sTmklgYA1b/Hr93kPge7AAIld3yPQPMlAuksxpEJNuQ==
X-Received: by 2002:a63:3c1b:: with SMTP id j27mr7189270pga.79.1604789780834; Sat, 07 Nov 2020 14:56:20 -0800 (PST)
Received: from [192.168.178.20] ([151.210.130.0]) by smtp.gmail.com with ESMTPSA id j11sm6524619pfh.143.2020.11.07.14.56.17 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sat, 07 Nov 2020 14:56:19 -0800 (PST)
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
To: Paul Kyzivat <pkyzivat@alum.mit.edu>, draft-ietf-anima-grasp-api.all@ietf.org
Cc: General Area Review Team <gen-art@ietf.org>
References: <d71097bc-c37a-c322-d478-7435c35cc5c4@alum.mit.edu>
Message-ID: <2fde1476-bd5e-ed74-122d-2cac17e96ee8@gmail.com>
Date: Sun, 8 Nov 2020 11:56:15 +1300
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.1
MIME-Version: 1.0
In-Reply-To: <d71097bc-c37a-c322-d478-7435c35cc5c4@alum.mit.edu>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/gen-art/nbRHg6NSZYW-f3HNg0_WIE3Hd2U>
Subject: Re: [Gen-art] Gen-ART Last Call review of draft-ietf-anima-grasp-api-07
X-BeenThere: gen-art@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "GEN-ART: General Area Review Team" <gen-art.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/gen-art>, <mailto:gen-art-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/gen-art/>
List-Post: <mailto:gen-art@ietf.org>
List-Help: <mailto:gen-art-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/gen-art>, <mailto:gen-art-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 07 Nov 2020 22:56:24 -0000

Hi Paul,

What a careful review. Quite a few of your comments are complicated
to respond to in-line, so even where there is little or nothing below,
there will be corresponding text changes in the next version.

On 29-Oct-20 04:50, Paul Kyzivat wrote:
> I am the assigned Gen-ART reviewer for this draft. The General Area
> Review Team (Gen-ART) reviews all IETF documents being processed
> by the IESG for the IETF Chair.  Please treat these comments just
> like any other last call comments.
> 
> For more information, please see the FAQ at
> 
> <https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.
> 
> Document: draft-ietf-anima-grasp-api-07
> Reviewer: Paul Kyzivat
> Review Date: 2020-10-28
> IETF LC End Date: 2020-10-28
> IESG Telechat date: ?
> 
> This draft is on the right track but has open issues, described in the 
> review.
> 
> General:
> 
> I began this review without any knowledge of Anima. I did read several 
> of the related documents for context, but my overall understanding 
> remains somewhat sketchy. So take my comments with with that in mind. 
> (At least you get a fresh, untainted perspective.)

In fact, if the API is intended for programmers who aren't necessarily
protocol experts, that's exactly the right persepective.
 
> Issues:
> 
> Major: 5
> Minor: 6
> Nits:  2
> 
> 1) MAJOR: Unclear model for API function usage
> 
> As I read through sections 2.3.2 through 2.3.6 these I got progressively 
> more confused. I finally concluded that a big picture of how API 
> functions interact is lacking.
> 
> For one thing, there isn't a clear delineation between those calls used 
> by requesters and those used by responders. And then valid sequencing of 
> calls is hard to tease out.
> 
> It would be helpful to have one state model for requesters and another 
> for responders. It may also be helpful to break this out for threaded 
> and event loop variants.

OK, there will be quite a few changes/additions and a new diagram to help
with this. (We do assume familiarity with the GRASP spec, but that
is quite a big ask and probably unreasonable for the average ASA
programmer.)

> 
> My subsequent issues regarding these sections are reflective of this 
> confusion.
> 
> 2) MAJOR: What state does GRASP retain for Objectives?
> 
> It is clear that the GRASP must retain the names of registered 
> objectives. There are implications that it must keep more. Seemingly it 
> must retain received flooded objectives, on a per-source basis. It is 
> unclear if it is only for registered objectives, or also for 
> unregistered objectives. This could potentially become a large resource 
> drain if there are lots of nodes flooding values.

Yes, there's a scaling issue there (one reason that a pub/sub extension
to reduce flooding is being considered in draft-ietf-anima-grasp-distribution).
 
> Negotiated objectives seem to be treated differently. It isn't clear to 
> me if GRASP needs to retain copies of their values.

Indeed, as far as GRASP is concerned those values pass in the night.

> 
> 3) MAJOR: Consistency of Objective definitions
> 
> In section 2.3.1.2 and elsewhere, presumably all parties that use a 
> particular objective must agree on the values of synch, neg, dry, and 
> the format of the value.
> 
> Apparently you are relying on each caller getting this right. What 
> happens when this isn't the case? How can blame be ascribed so that the 
> problem can be fixed?

Every objective requires its own spec. See draft-ietf-anima-prefix-management
for example. Will clarify.

There will be numerous fixes for your items 4) through 6)...

> 4) MAJOR: Confusing semantics of 'request_negotiate'
> 
> In section 2.3.4 I don't understand the following:
> 
>           1.  The 'session_nonce' parameter is null.  In this case the
>               negotiation has succeeded immediately (the peer has
>               accepted the request).  The returned 'proffered_objective'
>               contains the value accepted by the peer.
> 
> IIUC this requires a network exchange with the peer. I don't see how 
> this can complete *immediately*. ISTM that this could only complete 
> immediately if it were satisfied from a local cache. That doesn't seem 
> appropriate for this function.
> 
> Similarly, in bullet 2 I don't see how the proffered_objective would be 
> available in the initial call.
> 
> Or is this text only applicable to a threaded implementation with 
> synchronous calls?
> 
> Bullet 2 also says:
> 
>               ... The
>               returned 'proffered_objective' contains the first value
>               proffered by the negotiation peer.  The contents of this
>               instance of the objective must be used in the subsequent
>               negotiation call because it contains the updated loop
>               count, sent by the negotiation peer.
> 
> Do you mean this must be used in the call to negotiate_step? But that 
> would mean asking for what has already been granted. For the the 
> negotiation to be useful I would expect that the next round would ask 
> for a value somewhere between what was originally requested and what was 
> initially offered.
> 
> I think you need to say more about the whole concept of negotiation and 
> how it is expected to work. Also, additional discussion of the purpose 
> and semantics of dry run negotiation.
> 
> One more thing: you don't explain the semantics of 'timeout'. Is it only 
> used locally, to terminate the synchronous form of the call? Or is it 
> transmitted and used by the responder to control how long it might spend 
> trying to fulfill the request before giving up?
> 
> 5) MAJOR: Confusing semantics of 'negotiate_step'
> 
> In section 2.3.4, I'm confused by:
> 
>              Called in the same thread as the
>              preceding 'request_negotiate' or 'listen_negotiate'
> 
> I understand use of this following request_negotiate, but not with 
> listen_negotiate. A negotiation should consist of an ordered sequence of 
> "rounds" of bidding. Allowing both requester and responder to initiate a 
> next step can lead to a disruption of the ordering. I would expect this 
> to only be used by the caller of request_negotiate. What am I missing?
> 
> 6) MINOR(MAJOR?): Confusing semantics of 'negotiate_wait'
> 
> Regarding section 2.3.4: I understand that the GRASP Confirm Waiting 
> message (from [I-D.ietf-anima-grasp]) and hence 'negotiate_wait' is only 
> used by the responder. How is the effect manifested in the requester? Is 
> a synchronous requester just silently extended? What about an event loop 
> requester?
> 
> 7) MINOR(MAJOR?): Role of 'peer' in 'synchronize'
> 
> The description of 'synchronize' (in section 2.3.5) says:
> 
>        -  If the objective was already flooded, the flooded value is
>           returned immediately in the 'result' parameter.  In this case,
>           the 'peer' and 'timeout' are ignored.
> 
> Do you really want to ignore 'peer' in this case?

No. I believe you are correct, and the presence of the 'peer' should
prevent automatic use of the flooded value. So this will be a 
technical change, unless anyone in the WG objects. The new text
would read:

  - If the 'peer' parameter is null, and the objective is already
    available in the local cache, the flooded objective is returned
    immediately in the 'result' parameter. In this case, the 'timeout'
    is ignored.

  - Otherwise, synchronization with a discovered ASA is performed...

> I infer from the description of get_flood that the GRASP retains flooded 
> values separately for each flood source. So you should have the 
> capability to return only a value flooded by the requested peer.

You can do that by using get_flood() and picking the result you
you want. I think that with that feature, and synchronize() fixed
as above, we cover all possibilities, and a simple ASA could use
synchronize() without caring about anything except the objective's
name.

> 
> 8) MINOR(MAJOR?): Confusing semantics of 'flood'
> 
> The descriptions of 'flood' in section 2.3.5 seem to imply that the 
> GRASP in each of the nodes receiving the flood saves the value, for use 
> in get_flood and synchronize. Is that right? (Actually, the description 
> of get_flood implies that flooded values are saved separately for every 
> flood source!)
> 
> Are only registered objectives saved? Or must the GRASP save all the 
> values from every flood that it receives?
> 
> What if one flood contains X, Y, and Z. Then a later flood from the same 
> source contains only X and Y. Presumably the new values of X and Y 
> replace the old ones. Does GRASP retain the value of Z, or discard it?
> 
> ISTM (based on my limited understanding) that it would make more sense 
> for the GRASP to simply cache all objectives that have been registered 
> and/or received as a result of synchronization, together with their TTL. 
> And then subsequent requests would be satisfied from the cache as long 
> as the TTL hasn't expired.

Your reasoning is correct. Will clarify.

> 
> 9) MINOR: Clarify Session
> 
> Section 2.2.3 implies that Session is a full fledged entity. It would be 
> helpful to flesh this out in more detail. E.g., describe it as a 
> distinct entity in the API and include a state model.

Yes, we should define 'session'. The state model varies from case
to case. Negotiation is the only complicated one, really (as per
your comments above).

> 10) MINOR: Representing mutually exclusive items with booleans
> 
> In section 2.3.1.3, 'synch' and 'neg' are mutually exclusive. By 
> representing them as independent booleans you introduce the possibility 
> of invalidly setting them both or neither. ISTM it would be better to 
> represent these as a single value (enumeration).
> 
> A similar situation occurs in section 2.3.1.4.
> 
> 11) MINOR: Managing thread for 'listen_negotiate'
> 
> In section 2.3.4, suppose the ASA wants try to ensure that it always has 
> a listener ready for an incoming request. IIUC the only way to achieve 
> that is to create the maximum number of threads it is willing to devote 
> and set them all to listening. That is sub-optimal. It would be better 
> if it could somehow learn that another one is needed and create it. Some 
> additional API mechanisms would be needed to support that.

This is an issue, but I think the solution is very dependent on the
programming environment. The way I tackled it in demo ASAs was to
have a lightweight listener that fields each incoming request and
immediately kicks it to a new thread, then goes back to listening:

while True:
    err, snonce, answer = grasp.listen_negotiate(asa_nonce, obj1)
    if err:
        grasp.tprint("listen_negotiate error:",grasp.etext[err])
        time.sleep(5) #to calm things if there's a looping error
    else:
        #got a new negotiation request; kick off a separate negotiator
        #so that multiple requests can be handled in parallel
        negotiator(snonce,answer).start()

I think that in *any* solution, there will be short intervals where
no thread is listening, in which case the other end will get an
error code and should retry. Anyway we will try to cover this in
the text.

> 
> 12) NIT: Idempotence
> 
> Bullet #2 of section 2.2.1 says:
> 
>         To facilitate this, the API implementation would provide non-
>         blocking versions of all the functions that otherwise involve
>         blocking and queueing.  In these calls, a 'noReply' code will be
>         returned by each call instead of blocking, until such time as the
>         event for which it is waiting (or a failure) has occurred.  Thus,
>         for example, discover() would return 'noReply' instead of waiting
>         until discovery has succeeded or timed out.  The discover() call
>         would be repeated in every cycle of the main loop until it
>         completes.  Effectively, it becomes a polling call.
> 
> This seems to contradict the earlier statement (in 2.2) that the calls 
> are not idempotent. Section 2.2.2 tries to clarify this, but it is not 
> very clear.

Ack, will try harder.

> 
> 13) NIT: Confusing terminology in 'discover'
> 
> In the description of 'discover()' (in section 2.3.3) the 'age_limit' 
> parameter is confusing. At the least the name seems wrong for the 
> described semantics. The implication is that an expiration date/time is 
> what GRASP stores, and that remaining time until expiration is what is 
> being used here. It is hard to see how it is appropriate to call that an 
> "age". ISTM that would better be called a TTL, and that is parameter 
> might then be called 'minimum_TTL'.

Hmm. We added this after a specific discussion in the WG. (I see that
I never implemented it in my prototype code, which just has a 'flush'
parameter instead.) For me the semantics of "age limit" and "minimum TTL"
are the same. On reflection, I think you're correct and minimum TTL is
clearer.

Again many thanks,
     Brian